FTXS 2015 - International Workshop on Fault Tolerance for HPC at eXtreme Scale

Date: 2015-06-15 - 2015-06-18

Deadline: 2015-02-15

Venue: Portland, OR, USA

Website: http://www.hpdc.org/2015/workshops

Topics/Call for Papers

Assuming hardware and software errors will be inescapable at extreme scale, this workshop will consider aspects of fault tolerance particular to extreme scale that include, but are not limited to:
Quantitative assessments of the cost, in terms of power, performance, and resource impacts, of fault-tolerant techniques, such as checkpoint/restart, that are redundant in space, time, or information
Novel fault-tolerance techniques and implementations of emerging hardware and software technologies that guard against silent data corruption (SDC) in memory, logic, and storage and provide end-to-end data integrity for running applications
Studies of hardware / software tradeoffs in error detection, failure prediction, error preemption, and recovery
Advances in monitoring, analysis, and control of highly complex systems
Highly scalable fault-tolerant programming models
Metrics and standards for measuring, improving and enforcing the need for and effectiveness of fault-tolerance
Failure modeling and scalable methods of reliability, availability, performability and failure prediction for fault-tolerant HPC systems
Scalable Byzantine fault tolerance and security from single-fault and fail-silent violations
Benchmarks and experimental environments, including fault-injection and accelerated lifetime testing, for evaluating performance of resilience techniques under stress
Frameworks and APIs for fault tolerance and fault management

Last modified: 2014-12-30 14:50:32