MicroLens: A Performance Analysis Framework for Microservices Using Hidden Metrics With BPF
Abstract
Determining the root cause of a performance regression in microservices is challenging. Performance problems cascade along the service topology, which hides the source of the problem. In addition, without knowledge of application phases, a service can be falsely flagged as critical, and service resource utilization is an imperfect proxy for application performance, which can produce further false positives. In this work, we therefore propose MicroLens, a new performance testing framework that leverages hidden Berkeley Packet Filter (BPF) kernel metrics to locate the root causes of performance regressions. The framework applies a systematic multi-level approach to analyze microservice performance without intrusive code instrumentation: it constructs an attributed graph from microservice requests, scores the services to identify the critical paths, and ranks the low-level metrics to highlight the root cause of the regression. Through judiciously designed experiments, we evaluated the metric collection overhead, which added less than 18% latency when the application ran across hosts and 9% within the same host. For some applications, no overhead was observed, whereas the state-of-the-art approach added up to 1060% latency. The microservice benchmark evaluation shows that MicroLens can successfully identify the set of root causes and that the causes vary when the application runs on different infrastructures.
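The three analysis levels named above (attributed graph, critical path, metric ranking) can be illustrated with a minimal Python sketch. It is not the MicroLens implementation: the request records, the service names, the metric names, the use of summed mean latency as the path score, and the use of relative change as the ranking criterion are all assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical request records: (caller, callee, latency in ms).
requests = [
    ("gateway", "cart", 12.0),
    ("gateway", "cart", 18.0),
    ("cart", "db", 9.0),
    ("gateway", "search", 4.0),
]


def build_graph(records):
    """Aggregate requests into an attributed graph: edge -> count and mean latency."""
    totals = defaultdict(lambda: [0, 0.0])
    for caller, callee, latency_ms in records:
        entry = totals[(caller, callee)]
        entry[0] += 1
        entry[1] += latency_ms
    return {
        edge: {"count": n, "mean_ms": total / n}
        for edge, (n, total) in totals.items()
    }


def critical_path(graph, root):
    """Return (cost, path) for the root-to-leaf path with the largest
    summed mean latency. Assumes the call graph is acyclic."""
    children = defaultdict(list)
    for (caller, callee), attrs in graph.items():
        children[caller].append((callee, attrs["mean_ms"]))

    def walk(node):
        best_cost, best_path = 0.0, [node]
        for child, cost in children[node]:
            sub_cost, sub_path = walk(child)
            if cost + sub_cost > best_cost:
                best_cost, best_path = cost + sub_cost, [node] + sub_path
        return best_cost, best_path

    return walk(root)


def rank_metrics(baseline, regressed):
    """Rank metric names by relative change between two runs, largest first.
    Relative change is an assumed criterion, not the one used by the paper."""
    change = {
        name: abs(regressed[name] - baseline[name]) / baseline[name]
        for name in baseline
        if name in regressed and baseline[name] != 0
    }
    return sorted(change, key=change.get, reverse=True)
```

In this sketch, `critical_path(build_graph(requests), "gateway")` follows the `gateway`, `cart`, `db` chain because its summed mean latency exceeds that of the `search` branch; `rank_metrics` would then be applied to the low-level metrics of the services on that path.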