Auto Scaling (Automatic Scaling) is a cloud computing feature that allows the automatic adjustment of computing resources, such as virtual machines or containers, based on changing demand or workloads. The goal is to ensure optimal performance and resource utilization while minimizing costs. Auto scaling systems monitor the performance metrics of applications or services and automatically adjust the number of resources (scaling out or in) to handle changes in demand.
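
As a rough illustration of the monitor-and-adjust loop described above, the following minimal sketch shows a threshold-based scaling decision for a single monitoring interval. The function name `decide_replicas`, the utilization thresholds, and the replica bounds are all hypothetical values chosen for illustration; real autoscalers typically add cooldown periods and smoothing on top of a rule like this.

```python
def decide_replicas(current, cpu_utilization, low=0.30, high=0.70,
                    min_replicas=1, max_replicas=10):
    """Return the replica count to use after one monitoring interval.

    All thresholds and bounds are illustrative assumptions, not values
    taken from any particular cloud provider.
    """
    if cpu_utilization > high:
        target = current + 1   # demand is high: scale out by one instance
    elif cpu_utilization < low:
        target = current - 1   # demand is low: scale in by one instance
    else:
        target = current       # within the comfortable band: do nothing
    # Never go below the minimum or above the maximum number of instances.
    return max(min_replicas, min(max_replicas, target))


if __name__ == "__main__":
    for load in (0.90, 0.50, 0.10):
        print(f"utilization={load:.2f} -> replicas={decide_replicas(3, load)}")
```

The bounds are what keep costs predictable: the upper limit caps spending during a spike, and the lower limit keeps the service available when demand drops.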

Posts

Scale Up while Scaling Out Microservices in Video Analytics Pipelines

Modern video analytics applications comprise multiple microservices chained together as pipelines and executed on container orchestration platforms like Kubernetes. Kubernetes automatically handles the scaling of these microservices for efficient application execution. There are two popular choices for scaling microservices in Kubernetes: scaling out using the Horizontal Pod Autoscaler (HPA) and scaling up using the Vertical Pod Autoscaler (VPA). Both have been studied independently, but there is little prior work on scaling the two jointly. This paper investigates joint scaling, i.e., scaling up while scaling out (HPA) is in action. In particular, we focus on scaling up the CPU resources allocated to the application microservices. We show that allocating fixed resources does not work well across the different workloads of video analytics pipelines. We also show that Kubernetes’ VPA in conjunction with HPA does not work well for varying application workloads. As a remedy, we propose DataX AutoScaleUp, which efficiently scales up the CPU resources allocated to microservices in video analytics pipelines while Kubernetes’ HPA is operational. DataX AutoScaleUp uses novel techniques to adjust the computing resources allocated to different microservices in video analytics pipelines to improve overall application performance. Through real-world video analytics applications like Face Recognition and Human Attributes, we show that DataX AutoScaleUp can achieve up to 1.45X improvement in application processing rate when compared to alternative approaches with fixed CPU allocation and dynamic CPU allocation using VPA.
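
For context on the scaling-out side, the replica calculation documented for the Kubernetes HPA can be sketched as below. This is only a simplified illustration of that documented formula, not the DataX AutoScaleUp technique, which the abstract does not detail. The function name `hpa_desired_replicas` is hypothetical, and the sketch leaves out minimum and maximum replica bounds, stabilization windows, and the handling of unready pods.

```python
import math


def hpa_desired_replicas(current_replicas, current_metric, target_metric,
                         tolerance=0.1):
    """Simplified form of the documented HPA calculation:

        desired = ceil(current_replicas * current_metric / target_metric)

    No change is made when the ratio is within `tolerance` of 1.0
    (0.1 is the documented default tolerance).
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no scaling
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Metrics here are average CPU usage per pod in millicores (assumed units).
    print(hpa_desired_replicas(4, 200, 100))  # usage is double the target
    print(hpa_desired_replicas(4, 50, 100))   # usage is half the target
    print(hpa_desired_replicas(4, 105, 100))  # within tolerance
```

Because this calculation changes only the number of pods, the CPU allocated to each pod stays whatever was requested, which is the gap that scaling up is meant to address.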

DataXe: A System for Application Self-optimization in Serverless Edge Computing Environments

A key barrier to building performant, remotely managed and self-optimizing multi-sensor, distributed stream processing edge applications is high programming complexity. We recently proposed DataX [1], a novel platform that improves programmer productivity by enabling easy exchange, transformation, and fusion of data streams on virtualized edge computing infrastructure. This paper extends DataX to include (a) serverless computing that automatically scales stateful and stateless analytics units (AUs) on virtualized edge environments, (b) novel communication mechanisms that efficiently communicate data among analytics units, and (c) new techniques to promote automatic reuse and sharing of analytics processing across multiple applications in a lights-out, serverless computing environment. Synthesizing these capabilities into a single platform has been substantially more transformative than any available stream processing system for the edge. We refer to this enhanced and efficient version of DataX as DataXe. To the best of our knowledge, this is the first serverless system for stream processing. For a real-world video analytics application, we observed that the DataXe implementation of the application is about 3X faster than a standalone implementation with custom, handcrafted communication, multiprocessing and allocation of edge resources.