STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds. Yinfang Chen, Jiaqi Pan, et al. NeurIPS 2025.
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks. Saurabh Jha, Rohan Arora, et al. ICML 2025.