Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices (2212.12180v5)
Abstract: Achieving resource efficiency while preserving end-user experience is non-trivial for cloud application operators. As cloud applications progressively adopt microservices, resource managers are faced with two distinct levels of system behavior: end-to-end application latency and per-service resource usage. Translating between the two levels, however, is challenging because user requests traverse heterogeneous services that collectively (but unevenly) contribute to the end-to-end latency. We present Autothrottle, a bi-level resource management framework for microservices with latency SLOs (service-level objectives). It architecturally decouples application SLO feedback from service resource control, and bridges them through the notion of performance targets. Specifically, an application-wide learning-based controller is employed to periodically set performance targets -- expressed as CPU throttle ratios -- for per-service heuristic controllers to attain. We evaluate Autothrottle on three microservice applications, with workload traces from production scenarios. Results show superior CPU savings, up to 26.21% over the best-performing baseline and up to 93.84% over all baselines.
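The bi-level split described in the abstract can be illustrated with a small Python sketch. It is a hedged illustration under stated assumptions, not Autothrottle's actual algorithm. The throttle ratio is assumed here to be the fraction of CPU scheduling periods in which a service was throttled (for example, `nr_throttled / nr_periods` from cgroup CPU statistics). The multiplicative step rule, the candidate target list, and the names `ServiceController` and `choose_target` are all hypothetical. In particular, `choose_target` is a simple rule standing in for the learning-based application-wide controller.

```python
from dataclasses import dataclass


@dataclass
class ServiceController:
    """Per-service heuristic (hypothetical): nudge the CPU quota so the
    observed throttle ratio tracks the target set by the upper level."""
    quota_cores: float
    target_ratio: float = 0.1
    step: float = 0.1        # assumed multiplicative adjustment per update
    min_cores: float = 0.1   # assumed floor to avoid starving the service

    def update(self, nr_throttled: int, nr_periods: int) -> float:
        # Assumed definition: share of scheduling periods that were throttled.
        ratio = nr_throttled / nr_periods if nr_periods else 0.0
        if ratio > self.target_ratio:
            # Throttled more often than allowed: grant more CPU.
            self.quota_cores *= 1 + self.step
        elif ratio < self.target_ratio:
            # Throttled less often than allowed: reclaim CPU.
            self.quota_cores = max(self.min_cores,
                                   self.quota_cores * (1 - self.step))
        return self.quota_cores


def choose_target(p99_ms: float, slo_ms: float, current: float,
                  candidates=(0.0, 0.05, 0.1, 0.2, 0.3)) -> float:
    """Application-level stand-in (NOT a learned policy): pick the next
    throttle-ratio target from end-to-end latency feedback."""
    idx = candidates.index(current)
    if p99_ms > slo_ms:
        idx = max(0, idx - 1)                    # SLO violated: allow less throttling
    else:
        idx = min(len(candidates) - 1, idx + 1)  # SLO met: allow more throttling
    return candidates[idx]


if __name__ == "__main__":
    svc = ServiceController(quota_cores=2.0, target_ratio=0.1)
    svc.update(nr_throttled=30, nr_periods=100)  # ratio 0.3 exceeds the target
    print(f"quota after update: {svc.quota_cores:.2f} cores")
    svc.target_ratio = choose_target(p99_ms=250, slo_ms=200,
                                     current=svc.target_ratio)
    print(f"new throttle target: {svc.target_ratio}")
```

The point of the sketch is the decoupling: the per-service loop only sees its own throttle counters and a target, while the application-level loop only sees end-to-end latency and emits targets.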