Publisher DOI: 10.1016/j.parco.2016.12.001
arXiv ID: 1510.08334v2
Title: Toward fault-tolerant parallel-in-time integration with PFASST
Language: English
Authors: Speck, Robert; Ruprecht, Daniel
Keywords: Algorithm-based fault tolerance; Boussinesq equations; Gray–Scott model; Parallel-in-time integration; Resilience; Computer Science - Distributed, Parallel, and Cluster Computing
Issue Date: Feb-2017
Source: Parallel Computing 62: 20-37 (2017-02)
Journal: Parallel Computing 
Abstract (English):
We introduce and analyze different strategies for the parallel-in-time integration method PFASST to recover from hard faults and subsequent data loss. Since PFASST stores solutions at multiple time steps on different processors, information from adjacent steps can be used to recover after a processor has failed. PFASST's multi-level hierarchy allows the coarse level to be used for correcting the reconstructed solution, which can help to minimize overhead. A theoretical model is devised linking overhead to the number of additional PFASST iterations required for convergence after a fault. The potential efficiency of different strategies is assessed in terms of required additional iterations for examples of diffusive and advective type.
ISSN: 1872-7336
Document Type: Article
Peer Reviewed: Yes
Appears in Collections: Publications without fulltext

Items in TORE are protected by copyright, with all rights reserved, unless otherwise indicated.