Fault-tolerant meshes with minimal numbers of spares
Abstract
Presents several techniques for adding fault-tolerance to distributed memory parallel computers. More formally, given a target graph with n nodes, the authors create as fault-tolerant graph with n+k nodes such that given any set of k or fewer faulty nodes, the remaining graph is guaranteed to contain the target graph as a fault-free subgraph. As a result, any algorithm designed for the target graph will run with no slowdown in the presence of k or fewer node faults, regardless of their distribution. The authors present fault-tolerant graphs for target graphs which are 2-dimensional meshes, tori, eight-connected meshes and hexagonal meshes. In all cases these fault-tolerant graphs have smaller degree than any previously known graphs with the same properties.