IoT stands for the “Internet of Things.” It refers to the interconnected network of physical devices, vehicles, appliances, and other objects embedded with sensors, software, and network connectivity, allowing them to collect and exchange data. The fundamental idea behind the Internet of Things is to enable everyday objects to communicate with each other and with central systems through the internet.

Posts

Scale Up while Scaling Out Microservices in Video Analytics Pipelines

Modern video analytics applications comprise multiple microservices chained together as pipelines and executed on container orchestration platforms like Kubernetes. Kubernetes automatically handles the scaling of these microservices for efficient application execution. There are two popular choices for scaling microservices in Kubernetes i.e. scaling Out using Horizontal Pod Autoscaler (HPA) and scaling Up using Vertical Pod Autoscaler (VPA). Both these have been studied independently, but there isn’t much prior work studying the joint scaling of these two. This paper investigates joint scaling, i.e., scaling up while scaling out (HPA) is in action. In particular, we focus on scaling up CPU resources allocated to the application microservices. We show that allocating fixed resources does not work well for different workloads for video analytics pipelines. We also show that Kubernetes’ VPA in conjunction with HPA does not work well for varying application workloads. As a remedy to this problem, in this paper, we propose DataX AutoScaleUp, which performs efficiently scaling up of CPU resources allocated to microservices in video analytics pipelines while Kubernetes’ HPA is operational. DataX AutoScaleUp uses novel techniques to adjust the allocated computing resources to different microservices in video analytics pipelines to improve overall application performance. Through real-world video analytics applications like Face Recognition and Human Attributes, we show that DataX AutoScaleUp can achieve up to 1.45X improvement in application processing rate when compared to alternative approaches with fixed CPU allocation and dynamic CPU allocation using VPA.

DataX Allocator: Dynamic resource management for stream analytics at the Edge

Serverless edge computing aims to deploy and manage applications so that developers are unaware of challenges associated with dynamic management, sharing, and maintenance of the edge infrastructure. However, this is a non-trivial task because the resource usage by various edge applications varies based on the content in their input sensor data streams. We present a novel reinforcement-learning (RL) technique to maximize the processing rates of applications by dynamically allocating resources (like CPU cores or memory) to microservices in these applications. We model applications as analytics pipelines consisting of several microservices, and a pipeline’s processing rate directly impacts the accuracy of insights from the application. In our unique problem formulation, the state space or the number of actions of RL is independent of the type of workload in the microservices, the number of microservices in a pipeline, or the number of pipelines. This enables us to learn the RL model only once and use it many times to improve the accuracy of insights for a diverse set of AI/ML engines like action recognition or face recognition and applications with varying microservices. Our experiments with real-world applications, i.e., face recognition and action recognition, show that our approach outperforms other widely-used alternative approaches and achieves up to 2.5X improvement in the overall application processing rate. Furthermore, when we apply our RL model trained on a face recognition pipeline to a different and more complex action recognition pipeline, we obtain a 2X improvement in processing rate, thus showing the versatility and robustness of our RL model to pipeline changes.

ROMA: Resource Orchestration for Microservices-based 5G Applications

With the growth of 5G, Internet of Things (IoT), edge computing and cloud computing technologies, the infrastructure (compute and network) available to emerging applications (AR/VR, autonomous driving, industry 4.0, etc.) has become quite complex. There are multiple tiers of computing (IoT devices, near edge, far edge, cloud, etc.) that are connected with different types of networking technologies (LAN, LTE, 5G, MAN, WAN, etc.). Deployment and management of applications in such an environment is quite challenging. In this paper, we propose ROMA, which performs resource orchestration for microservices-based 5G applications in a dynamic, heterogeneous, multi-tiered compute and network fabric. We assume that only application-level requirements are known, and the detailed requirements of the individual microservices in the application are not specified. As part of our solution, ROMA identifies and leverages the coupling relationship between compute and network usage for various microservices and solves an optimization problem in order to appropriately identify how each microservice should be deployed in the complex, multi-tiered compute and network fabric, so that the end-to-end application requirements are optimally met. We implemented two real-world 5G applications in video surveillance and intelligent transportation system (ITS) domains. Through extensive experiments, we show that ROMA is able to save up to 90%, 55% and 44% compute and up to 80%, 95% and 75% network bandwidth for the surveillance (watchlist) and transportation application (person and car detection), respectively. This improvement is achieved while honoring the application performance requirements, and it is over an alternative scheme that employs a static and overprovisioned resource allocation strategy by ignoring the resource coupling relationships.

SmartSlice: Dynamic, Self-optimization of Application’s QoS requests to 5G networks

Applications can tailor a network slice by specifying a variety of QoS attributes related to application-specific performance, function or operation. However, some QoS attributes like guaranteed bandwidth required by the application do vary over time. For example, network bandwidth needs of video streams from surveillance cameras can vary a lot depending on the environmental conditions and the content in the video streams. In this paper, we propose a novel, dynamic QoS attribute prediction technique that assists any application to make optimal resource reservation requests at all times. Standard forecasting using traditional cost functions like MAE, MSE, RMSE, MDA, etc. don’t work well because they do not take into account the direction (whether the forecasting of resources is more or less than needed), magnitude (by how much the forecast deviates, and in which direction), or frequency (how many times the forecast deviates from actual needs, and in which direction). The direction, magnitude and frequency have a direct impact on the application’s accuracy of insights, and the operational costs. We propose a new, parameterized cost function that takes into account all three of them, and guides the design of a new prediction technique. To the best of our knowledge, this is the first work that considers time-varying application requirements and dynamically adjusts slice QoS requests to 5G networks in order to ensure a balance between application’s accuracy and operational costs. In a real-world deployment of a surveillance video analytics application over 17 cameras, we show that our technique outperforms other traditional forecasting methods, and it saves 34% of network bandwidth (over a ~24 hour period) when compared to a static, one-time reservation.