Publication
COMSNET 2024
Demo paper
A Chaos Recommendation Tool for Reliability Testing in Large-Scale Cloud-Native Systems
Abstract
With the proliferation of cloud-native systems supported by container technology and the widespread deployment of 5G and Edge use-cases, modern applications have become increasingly distributed and complex, often consisting of hundreds of components. As a consequence, ensuring the reliability of these workloads has grown increasingly difficult, and the continuous evolution of systems under CI/CD practices complicates it further. In this context, Chaos Engineering can play a crucial role in assessing the reliability of these large-scale systems by intentionally introducing adverse conditions and gauging their resilience in inter-connected environments. This controlled approach enables organizations to identify and learn from potential failure points before they escalate into full-blown service degradation and production outages. Yet the effectiveness of chaos testing hinges on the relevance of the targeted fault scenarios, and in practice fault injection is often chosen arbitrarily or by intuition, leading to inefficiencies and suboptimal outcomes. To address these challenges, we have developed a chaos recommendation tool. This tool assesses the real-time behavior and characteristics of workloads and suggests fault injections that can cause disruptions. In this demo, we will illustrate how the chaos recommendation tool can be used to automatically identify potential failure points for a system and suggest corresponding chaos test cases. This tool, part of Red Hat's Chaos Engineering project Kraken, is open-source and available at: https://github.com/redhat-chaos/krkn/blob/main/utils/chaos_recommender/README.md
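To make the idea of behavior-driven recommendation concrete, the following is a minimal illustrative sketch and not the tool's actual implementation. It assumes that per-service resource utilization has already been collected from a monitoring system, and it flags services whose usage of a resource is an outlier relative to the other services (using a z-score, which is an assumption made here for illustration), then suggests a fault category targeting that resource. The function name `recommend_faults`, the `FAULT_FOR_RESOURCE` mapping, the threshold value, and the sample data are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical mapping from a resource dimension to a chaos test category.
FAULT_FOR_RESOURCE = {
    "cpu": "CPU stress test",
    "memory": "memory stress test",
    "network": "network latency or packet-loss test",
}


def recommend_faults(usage, threshold=1.5):
    """Suggest fault categories for services with outlying resource usage.

    usage: {service_name: {"cpu": float, "memory": float, "network": float}}
    threshold: z-score above which a service counts as an outlier
               (an illustrative value, not taken from the tool).
    Returns {service_name: [fault_category, ...]}.
    """
    recommendations = {}
    for resource, fault in FAULT_FOR_RESOURCE.items():
        values = [metrics[resource] for metrics in usage.values()]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            # All services behave identically on this resource: no outliers.
            continue
        for service, metrics in usage.items():
            z_score = (metrics[resource] - mu) / sigma
            if z_score > threshold:
                recommendations.setdefault(service, []).append(fault)
    return recommendations


# Made-up utilization figures (fraction of the allocated resource in use).
usage = {
    "frontend": {"cpu": 0.20, "memory": 0.30, "network": 0.25},
    "cart": {"cpu": 0.22, "memory": 0.28, "network": 0.20},
    "checkout": {"cpu": 0.95, "memory": 0.31, "network": 0.22},
    "catalog": {"cpu": 0.18, "memory": 0.29, "network": 0.90},
    "payments": {"cpu": 0.21, "memory": 0.30, "network": 0.24},
}

for service, faults in recommend_faults(usage).items():
    print(service, "->", ", ".join(faults))
```

In this sample, only `checkout` (CPU-heavy) and `catalog` (network-heavy) stand out from their peers, so they are the only services for which a targeted test is suggested. The README linked above describes the tool's actual inputs and behavior.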