Partial disk failures: Using software to analyze physical damage
Abstract
A good understanding of disk failures is crucial to ensuring reliable data storage. Numerous studies have characterized disk failures under the common assumption that a failed disk is generally unusable. Contrary to this assumption, partial disk failures are very common, for example when a head crash leaves a small number of disk sectors inaccessible. Even so, the damage can be catastrophic if file system metadata is among the affected sectors. As disk density rapidly increases, the likelihood of losing data also rises. This paper describes our experience in analyzing partial disk failures, using the physical locations of damaged disk sectors to assess the extent and characteristics of the damage on disk platter surfaces. Based on our findings, we propose several fault-tolerance techniques to proactively guard against permanent data loss due to partial disk failures. © 2007 IEEE.