SWEEPER: An efficient disaster recovery point identification mechanism
Abstract
Data corruption is one of the key problems on the radar of most CIOs. Continuous Data Protection (CDP) technologies help enterprises deal with data corruption by maintaining multiple versions of data and by allowing an administrator to restore to an earlier clean version. The aim of the recovery process after data corruption is to quickly traverse the backup copies (old versions) and retrieve a clean copy of the data. Currently, data recovery is an ad hoc, time-consuming and frustrating process that relies on sequential brute-force approaches, in which recovery time is proportional to the number of backup copies examined and to the time needed to check each backup copy for corruption. In this paper, we present the design and implementation of the SWEEPER architecture and backup copy selection algorithms that specifically tackle the problem of quickly and systematically identifying a good recovery point. We monitor various system events and generate checkpoint records that help in quickly identifying a clean backup copy. The SWEEPER methodology dynamically determines the selection algorithm based on user-specified recovery time and recovery point objectives, and thus allows system administrators to trade off recovery time against data currentness. We have implemented our solution as part of a popular Storage Resource Manager product and evaluated SWEEPER under many diverse settings. Our study clearly establishes the effectiveness of SWEEPER as a robust strategy for significantly reducing recovery time.
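To illustrate why the number of backup copies examined dominates recovery time, the following minimal sketch contrasts a sequential newest-to-oldest scan with a bisection over the same copies. It is not the SWEEPER selection algorithm. The names `sequential_scan`, `bisect_scan` and `is_clean` are hypothetical, and the bisection assumes that once corruption is introduced it is present in every later copy.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def sequential_scan(
    copies: Sequence[T], is_clean: Callable[[T], bool]
) -> Tuple[Optional[int], int]:
    """Brute force: check copies from newest to oldest.

    Returns (index of the newest clean copy or None, number of checks).
    """
    checks = 0
    for i in range(len(copies) - 1, -1, -1):
        checks += 1
        if is_clean(copies[i]):
            return i, checks
    return None, checks


def bisect_scan(
    copies: Sequence[T], is_clean: Callable[[T], bool]
) -> Tuple[Optional[int], int]:
    """Bisection over copies ordered oldest to newest.

    Assumption (not from the paper): every copy taken before the
    corruption is clean and every copy taken after it is corrupt.
    """
    lo, hi = 0, len(copies) - 1
    best: Optional[int] = None
    checks = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        checks += 1
        if is_clean(copies[mid]):
            best = mid      # clean: the answer is here or in a newer copy
            lo = mid + 1
        else:
            hi = mid - 1    # corrupt: the answer is in an older copy
    return best, checks


if __name__ == "__main__":
    # 100 hypothetical backup copies; corruption enters at version 40.
    copies = list(range(100))
    is_clean = lambda version: version < 40
    print(sequential_scan(copies, is_clean))
    print(bisect_scan(copies, is_clean))
```

Both functions return the same recovery point, but the bisection needs a number of checks logarithmic in the number of copies, whereas the sequential scan needs one check per copy taken since the corruption. Because each check may be expensive, reducing the number of copies examined reduces recovery time directly.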