Ensuring high-quality video content for wireless users has become increasingly vital. Nevertheless, maintaining a consistent level of video quality faces challenges due to the fluctuating encoded bitrate, primarily caused by dynamic video content, especially in live streaming scenarios. Video compression is typically employed to eliminate unnecessary redundancies within and between video frames, thereby reducing the required bandwidth for video transmission. The encoded bitrate and the quality of the compressed video depend on encoder parameters, specifically, the quantization parameter (QP). Poor choices of encoder parameters can result in reduced bandwidth efficiency and high likelihood of non-conformance. Non-conformance refers to the violation of the peak signal-to-noise ratio (PSNR) constraint for an encoded video segment. To address these issues, a real-time deep learning-based H.264 controller is proposed. This controller dynamically estimates the optimal encoder parameters based on the content of a video chunk with minimal delay. The objective is to maintain video quality in terms of PSNR above a specified threshold while minimizing the average bitrate of the compressed video. Experimental results, conducted on both QCIF dataset and a diverse range of random videos from public datasets, validate the effectiveness of this approach. Notably, it achieves improvements of up to 2.5 times in average bandwidth usage compared to the state-of-the-art adaptive bitrate video streaming, with a negligible non-conformance probability below 10?2.
Data crowdsourcing is an increasingly pervasive and lifestyle-changing technology due to the flywheel effect that results from the interaction between the Internet of Things and Cloud Computing. This paper presents the Citizen Science for the Sea with Information Technologies (C4Sea-IT) framework. It is an open platform for gathering marine data from leisure boat instruments. C4Sea-IT aims to provide a coastal marine data gathering, moving, processing, exchange, and sharing platform using the existing navigation instruments and sensors for todays leisure and professional vessels. In this work, a use case for the detection and tracking of marine litter is shown. The final goal is weather/ocean forecasts argumentation with Artificial Intelligence prediction models trained with crowdsourced data.
Deep Video Codec Control Lossy video compression is commonly used when transmitting and storing video data. Unified video codecs (e.g., H.264 or H.265) remain the emph(Unknown sysvar: (de facto)) standard, despite the availability of advanced (neural) compression approaches. Transmitting videos in the face of dynamic network bandwidth conditions requires video codecs to adapt to vastly different compression strengths. Rate control modules augment the codec’s compression such that bandwidth constraints are satisfied and video distortion is minimized. While, both standard video codes and their rate control modules are developed to minimize video distortion w.r.t. human quality assessment, preserving the downstream performance of deep vision models is not considered. In this paper, we present the first end-to-end learnable deep video codec control considering both bandwidth constraints and downstream vision performance, while not breaking existing standardization. We demonstrate for two common vision tasks (semantic segmentation and optical flow estimation) and on two different datasets that our deep codec control better preserves downstream performance than using 2-pass average bit rate control while meeting dynamic bandwidth constraints and adhering to standardizations.
Enabling Cooperative Hybrid Beamforming in TDD-based Distributed MIMO Systems Distributed massive MIMO networks are envisioned to realize cooperative multi-point transmission in next-generation wireless systems. For efficient cooperative hybrid beamforming, the cluster of access points (APs) needs to obtain precise estimates of the uplink channel to perform reliable downlink precoding. However, due to the radio frequency (RF) impairments between the transceivers at the two en-points of the wireless channel, full channel reciprocity does not hold which results in performance degradation in the cooperative hybrid beamforming (CHBF) unless a suitable reciprocity calibration mechanism is in place. We propose a two-step approach to calibrate any two hybrid nodes in the distributed MIMO system. We then present and utilize the novel concept of reciprocal tandem to propose a low-complexity approach for jointly calibrating the cluster of APs and estimating the downlink channel. Finally, we validate our calibration technique’s effectiveness through numerical simulation.
Blind Cyclic Prefix-based CFO Estimation in MIMO-OFDM Systems Low-complexity estimation and correction of carrier frequency offset (CFO) are essential in orthogonal frequency division multiplexing (OFDM). In this paper, we propose a low-overhead blind CFO estimation technique based on cyclic prefix (CP), in multi-input multi-output (MIMO)-OFDM systems. We propose to use antenna diversity for CFO estimation. Given that the RF chains for all antenna elements at a communication node share the same clock, the carrier frequency offset (CFO) between two points may be estimated by using the combination of the received signal at all antennas. We improve our method by combining the antenna diversity with time diversity by considering the CP for multiple OFDM symbols. We provide a closed-form expression for CFO estimation and present algorithms that can considerably improve the CFO estimation performance at the expense of a linear increase in computational complexity. We validate the effectiveness of our estimation scheme via extensive numerical analysis.
Semantic Multi-Resolution Communications Deep learning based joint source-channel coding (JSCC) has demonstrated significant advancements in data reconstruction compared to separate source-channel coding (SSCC). This superiority arises from the suboptimality of SSCC when dealing with finite block-length data. Moreover, SSCC falls short in reconstructing data in a multi-user and/or multi-resolution fashion, as it only tries to satisfy the worst channel and/or the highest quality data. To overcome these limitations, we propose a novel deep learning multi-resolution JSCC framework inspired by the concept of multi-task learning (MTL). This proposed framework excels at encoding data for different resolutions through hierarchical layers and effectively decodes it by leveraging both current and past layers of encoded data. Moreover, this framework holds great potential for semantic communication, where the objective extends beyond data reconstruction to preserving specific semantic attributes throughout the communication process. These semantic features could be crucial elements such as class labels, essential for classification tasks, or other key attributes that require preservation. Within this framework, each level of encoded data can be carefully designed to retain specific data semantics. As a result, the precision of a semantic classifier can be progressively enhanced across successive layers, emphasizing the preservation of targeted semantics throughout the encoding and decoding stages. We conduct experiments on MNIST and CIFAR10 dataset. The experiment with both datasets illustrates that our proposed method is capable of surpassing the SSCC method in reconstructing data with different resolutions, enabling the extraction of semantic features with heightened confidence in successive layers. This capability is particularly advantageous for prioritizing and preserving more crucial semantic features within the datasets.
Retrospective : A dynamically configurable coprocessor for convolutional neural networks In 2008, parallel computing posed significant challenges due to the complexities of parallel programming and the bottlenecks associated with efficient parallel execution. Inspired by the remarkable scalability achieved by networking and storage systems in handling extensive packet traffic and persistent data respectively by leveraging best-effort service, we proposed a new and fundamentally different approach of best-effort computing.Having observed that a broad spectrum of existing and emerging computing workloads were from applications that had an inherent forgiving nature , , we proposed best effort computing. The new approach resulted in disproportionate gains in power, energy and latency, while improving performance. While contemplating the concept of best-effort computing , we noticed the resurgence of convolutional neural networks, which generated approximate but acceptable outcomes for numerous recognition, mining, and synthesis tasks. The lead author of this retrospective had previously conducted research on neural networks for his doctoral dissertation over a decade ago, and the reemergence of neural networks proved both surprising and exciting. Recognizing the connection between best-effort computing and convolutional neural networks, in 2008 we embarked on developing a programmable and dynamically reconfigurable convolutional neural network capable of performing best effort computing for various machine learning tasks that inherently allow for multiple acceptable answers. This combination of our thoughts on best-effort computing and the gradual evolution of convolutional neural networks (deep neural networks emerged much later) culminated in our 2010 ISCA work on dynamically reconfigurable convolutional neural networks.
Elixir: A system to enhance data quality for multiple analytics on a video stream IoT sensors, especially video cameras, are ubiquitously deployed around the world to perform a variety of computer vision tasks in several verticals including retail, health- care, safety and security, transportation, manufacturing, etc. To amortize their high deployment effort and cost, it is desirable to perform multiple video analytics tasks, which we refer to as Analytical Units (AUs), off the video feed coming out of every camera. As AUs typically use deep learning-based AI/ML models, their performance depend on the quality of the input video, and recent work has shown that dynamically adjusting the camera setting exposed by popular network cameras can help improve the quality of the video feed and hence the AU accuracy, in a single AU setting. In this paper, we first show that in a multi-AU setting, changing the camera setting has disproportionate impact on different AUs performance. In particular, the optimal setting for one AU may severely degrade the performance for another AU, and further the impact on different AUs varies as the environmental condition changes. We then present Elixir, a system to enhance the video stream quality for multiple analytics on a video stream. Elixir leverages Multi-Objective Reinforcement Learning (MORL), where the RL agent caters to the objectives from different AUs and adjusts the camera setting to simultaneously enhance the performance of all AUs. To define the multiple objectives in MORL, we develop new AU-specific quality estimator values for each individual AU. We evaluate Elixir through real-world experiments on a testbed with three cameras deployed next to each other (overlooking a large enterprise parking lot) running Elixir and two baseline approaches, respectively. Elixir correctly detects 7.1% (22,068) and 5.0% (15,731) more cars, 94% (551) and 72% (478) more faces, and 670.4% (4975) and 158.6% (3507) more persons than the default-setting and time-sharing approaches, respectively. It also detects 115 license plates, far more than the time-sharing approach (7) and the default setting (0).
FactionFormer: Context-Driven Collaborative Vision Transformer Models for Edge Intelligence Edge Intelligence has received attention in the recent times for its potential towards improving responsiveness, reducing the cost of data transmission, enhancing security and privacy, and enabling autonomous decisions by edge devices. However, edge devices lack the power and compute resources necessary to execute most Al models. In this paper, we present FactionFormer, a novel method to deploy resource-intensive deep-learning models, such as vision transformers (ViT), on resource-constrained edge devices. Our method is based on a key observation: edge devices are often deployed in settings where they encounter only a subset of the classes that the resource intensive Al model is trained to classify, and this subset changes across deployments. Therefore, we automatically identify this subset as a faction, devise on-the fly a bespoke resource-efficient ViT called a modelette for the faction and set up an efficient processing pipeline consisting of a modelette on the device, a wireless network such as 5G, and the resource-intensive ViT model on an edge server, all of which work collaboratively to do the inference. For several ViT models pre-trained on benchmark datasets, FactionFormer’s modelettes are up to 4× smaller than the corresponding baseline models in terms of the number of parameters, and they can infer up to 2.5× faster than the baseline setup where every input is processed by the resource-intensive ViT on the edge server. Our work is the first of its kind to propose a device-edge collaborative inference framework where bespoke deep learning models for the device are automatically devised on-the-fly for most frequently encountered subset of classes.
AnB: Application-in-a-Box to rapidly deploy and self-optimize 5G apps We present Application in a Box (AnB) product concept aimed at simplifying the deployment and operation of remote 5G applications. AnB comes pre-configured with all necessary hardware and software components, including sensors like cameras, hardware and software components for a local 5G wireless network, and 5G-ready apps. Enterprises can easily download additional apps from an App Store. Setting up a 5G infrastructure and running applications on it is a significant challenge, but AnB is designed to make it fast, convenient, and easy, even for those without extensive knowledge of software, computers, wireless networks, or AI-based analytics. With AnB, customers only need to open the box, set up the sensors, turn on the 5G networking and edge computing devices, and start running their applications. Our system software automatically deploys and optimizes the pipeline of microservices in the application on a tiered computing infrastructure that includes device, edge, and cloud computing. Dynamic resource management, placement of critical tasks for low-latency response, and dynamic network bandwidth allocation for efficient 5G network usage are all automatically orchestrated. AnB offers cost savings, simplified setup and management, and increased reliability and security. We’ve implemented several real-world applications, such as collision prediction at busy traffic light intersections and remote construction site monitoring using video analytics. With AnB, deployment and optimization effort can be reduced from several months to just a few minutes. This is the first-of-its-kind approach to easing deployment effort and automating self-optimization of the application during system operation.
StreetAware: A High-Resolution Synchronized Multimodal Urban Scene Dataset Access to high-quality data is an important barrier in the digital analysis of urban settings, including applications within computer vision and urban design. Diverse forms of data collected from sensors in areas of high activity in the urban environment, particularly at street intersections, are valuable resources for researchers interpreting the dynamics between vehicles, pedestrians, and the built environment. In this paper, we present a high-resolution audio, video, and LiDAR dataset of three urban intersections in Brooklyn, New York, totaling almost 8 unique hours. The data were collected with custom Reconfigurable Environmental Intelligence Platform (REIP) sensors that were designed with the ability to accurately synchronize multiple video and audio inputs. The resulting data are novel in that they are inclusively multimodal, multi-angular, high-resolution, and synchronized. We demonstrate four ways the data could be utilized — (1) to discover and locate occluded objects using multiple sensors and modalities, (2) to associate audio events with their respective visual representations using both video and audio modes, (3) to track the amount of each type of object in a scene over time, and (4) to measure pedestrian speed using multiple synchronized camera views. In addition to these use cases, our data are available for other researchers to carry out analyses related to applying machine learning to understanding the urban environment (in which existing datasets may be inadequate), such as pedestrian-vehicle interaction modeling and pedestrian attribute recognition. Such analyses can help inform decisions made in the context of urban sensing and smart cities, including accessibility-aware urban design and Vision Zero initiatives.
RIS-aided mmWave Beamforming for Two-way Communications of Multiple Pairs Millimeter‑wave (mmWave) communications is a key enabler towards realizing enhanced Mobile Broadband (eMBB) as a key promise of 5G and beyond, due to the abundance of bandwidth available at mmWave bands. An mmWave coverage map consists of blind spots due to shadowing and fading especially in dense urban environments. Beamformingemploying massive MIMO is primarily used to address high attenuation in the mmWave channel. Due to their ability in manipulating the impinging electromagnetic waves in an energy‑efficient fashion, Reconfigurable Intelligent Surfaces (RISs) are considered a great match to complement the massive MIMO systems in realizing the beamforming task and therefore effectively filling in the mmWave coverage gap. In this paper, we propose a novel RIS architecture, namely RIS‑UPA where the RIS elements are arranged in a Uniform Planar Array (UPA). We show how RIS‑UPA can be used in an RIS‑aided MIMO system to fill the coverage gap in mmWave by forming beams of a custom footprint, with optimized main lobe gain, minimum leakage, and fairly sharp edges. Further, we propose a configuration for RIS‑UPA that can support multiple two‑way communication pairs, simultaneously. We theoretically obtain closed‑form low‑complexity solutions for our design and validate our theoretical findings by extensive numerical experiments.
Channel Reciprocity Calibration for Hybrid Beamforming in Distributed MIMO Systems Time Division Duplex (TDD)-based distributed massive MIMO systems are envisioned as candidate solution for the physical layer of 6G multi-antenna systems supporting cooperative hybrid beamforming that heavily relies on the obtained uplink channel estimates for efficient coherent downlink precoding. However, due to the hardware impairment between the transmitter and the receiver, full channel reciprocity does not hold between the downlink and uplink direction. Such reciprocity mismatch deteriorates the performance of mm-Wave hybrid beamforming and has to be estimated and compensated for, to avoid performance degradation in the co-operative hybrid beamforming. In this paper, we address the channel reciprocity calibration between any two nodes at two levels. We decompose the problem into two sub-problems. In the first sub-problem, we calibrate the digital chain, i.e. obtain the mismatch coefficients of the (DAC/ADC) up to a constant scaling factor. In the second subproblem, we obtain the (PA/LNA) mismatch coefficients. At each step, we formulate the channel reciprocity calibration as a least square optimization problem that can efficiently be solved via conventional methods such as alternative optimization with high accuracy. Finally, we verify the performance of our channel reciprocity calibration approach through extensive numerical experiments.
Content-aware auto-scaling of stream processing applications on container orchestration platforms Modern applications are designed as an interacting set of microservices, and these applications are typically deployed on container orchestration platforms like Kubernetes. Several attractive features in Kubernetes make it a popular choice for deploying applications, and automatic scaling is one such feature. The default horizontal scaling technique in Kubernetes is the Horizontal Pod Autoscaler (HPA). It scales each microservice independently while ignoring the interactions among the microservices in an application. In this paper, we show that ignoring such interactions by HPA leads to inefficient scaling, and the optimal scaling of different microservices in the application varies as the stream content changes. To automatically adapt to variations in stream content, we present a novel system called DataX AutoScaler that leverages knowledge of the entire stream processing application pipeline to efficiently auto-scale different microservices by taking into account their complex interactions. Through experiments on real-world video analytics applications, such as face recognition and pose classification, we show that DataX AutoScaler adapts to variations in stream content and achieves up to 43% improvement in overall application performance compared to a baseline system that uses HPA.
DyCo: Dynamic, Contextualized AI Models Devices with limited computing resources use smaller AI models to achieve low-latency inferencing. However, model accuracy is typically much lower than the accuracy of a bigger model that is trained and deployed in places where the computing resources are relatively abundant. We describe DyCo, a novel system that ensures privacy of stream data and dynamically improves the accuracy of small models used in devices. Unlike knowledge distillation or federated learning, DyCo treats AI models as black boxes. DyCo uses a semi-supervised approach to leverage existing training frameworks and network model architectures to periodically train contextualized, smaller models for resource-constrained devices. DyCo uses a bigger, highly accurate model in the edge-cloud to auto-label data received from each sensor stream. Training in the edge-cloud (as opposed to the public cloud) ensures data privacy, and bespoke models for thousands of live data streams can be designed in parallel by using multiple edge-clouds. DyCo uses the auto-labeled data to periodically re-train, stream-specific, bespoke small models. To reduce the periodic training costs, DyCo uses different policies that are based on stride, accuracy, and confidence information.We evaluate our system, and the contextualized models, by using two object detection models for vehicles and people, and two datasets (a public benchmark and another real-world proprietary dataset). Our results show that DyCo increases the mAP accuracy measure of small models by an average of 16.3% (and up to 20%) for the public benchmark and an average of 19.0% (and up to 64.9%) for the real-world dataset. DyCo also decreases the training costs for contextualized models by more than an order of magnitude.
DataX Allocator: Dynamic resource management for stream analytics at the Edge Serverless edge computing aims to deploy and manage applications so that developers are unaware of challenges associated with dynamic management, sharing, and maintenance of the edge infrastructure. However, this is a non-trivial task because the resource usage by various edge applications varies based on the content in their input sensor data streams. We present a novel reinforcement-learning (RL) technique to maximize the processing rates of applications by dynamically allocating resources (like CPU cores or memory) to microservices in these applications. We model applications as analytics pipelines consisting of several microservices, and a pipeline’s processing rate directly impacts the accuracy of insights from the application. In our unique problem formulation, the state space or the number of actions of RL is independent of the type of workload in the microservices, the number of microservices in a pipeline, or the number of pipelines. This enables us to learn the RL model only once and use it many times to improve the accuracy of insights for a diverse set of AI/ML engines like action recognition or face recognition and applications with varying microservices. Our experiments with real-world applications, i.e., face recognition and action recognition, show that our approach outperforms other widely-used alternative approaches and achieves up to 2.5X improvement in the overall application processing rate. Furthermore, when we apply our RL model trained on a face recognition pipeline to a different and more complex action recognition pipeline, we obtain a 2X improvement in processing rate, thus showing the versatility and robustness of our RL model to pipeline changes.
APT: Adaptive Perceptual quality based camera Tuning using reinforcement learning Cameras are increasingly being deployed in cities, enterprises and roads world-wide to enable many applications in public safety, intelligent transportation, retail, healthcare and manufacturing. Often, after initial deployment of the cameras, the environmental conditions and the scenes around these cameras change, and our experiments show that these changes can adversely impact the accuracy of insights from video analytics. This is because the camera parameter settings, though optimal at deployment time, are not the best settings for good-quality video capture as the environmental conditions and scenes around a camera change during operation. Capturing poor-quality video adversely affects the accuracy of analytics. To mitigate the loss in accuracy of insights, we propose a novel, reinforcement-learning based system APT that dynamically, and remotely (over 5G networks), tunes the camera parameters, to ensure a high-quality video capture, which mitigates any loss in accuracy of video analytics. As a result, such tuning restores the accuracy of insights when environmental conditions or scene content change. APT uses reinforcement learning, with no-reference perceptual quality estimation as the reward function. We conducted extensive real-world experiments, where we simultaneously deployed two cameras side-by-side overlooking an enterprise parking lot (one camera only has manufacturer-suggested default setting, while the other camera is dynamically tuned by APT during operation). Our experiments demonstrated that due to dynamic tuning by APT, the analytics insights are consistently better at all times of the day: the accuracy of object detection video analytics application was improved on average by ∼ 42%. Since our reward function is independent of any analytics task, APT can be readily used for different video analytics tasks.
Enhancing Video Analytics Accuracy via Real-time Automated Camera Parameter Tuning In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition running on remote servers critically rely on surveillance cameras to capture high-quality video streams in order to achieve high accuracy. Modern IP cameras come with a large number of camera parameters that directly affect the quality of the video stream capture. While a few of such parameters, e.g., exposure, focus, white balance are automatically adjusted by the camera internally, the remaining ones are not. We denote such camera parameters as non-automated (NAUTO) parameters. In this paper, we first show that environmental condition changes can have significant adverse effect on the accuracy of insights from the AUs, but such adverse impact can potentially be mitigated by dynamically adjusting NAUTO camera parameters in response to changes in environmental conditions. We then present CamTuner, to our knowledge, the first framework that dynamically adapts NAUTO camera parameters to optimize the accuracy of AUs in a VAP in response to adverse changes in environmental conditions. CamTuner is based on SARSA reinforcement learning and it incorporates two novel components: a light-weight analytics quality estimator and a virtual camera that drastically speed up offline RL training. Our controlled experiments and real-world VAP deployment show that compared to a VAP using the default camera setting, CamTuner enhances VAP accuracy by detecting 15.9% additional persons and 2.6%–4.2% additional cars (without any false positives) in a large enterprise parking lot and 9.7% additional cars in a 5G smart traffic intersection scenario, which enables a new usecase of accurate and reliable automatic vehicle collision prediction (AVCP). CamTuner opens doors for new ways to significantly enhance video analytics accuracy beyond incremental improvements from refining deep-learning models.
Why is the video analytics accuracy fluctuating, and what can we do about it? It is a common practice to think of a video as a sequence of images (frames), and re-use deep neural network models that are trained only on images for similar analytics tasks on videos. In this paper, we show that this “leap of faith” that deep learning models that work well on images will also work well on videos is actually flawed. We show that even when a video camera is viewing a scene that is not changing in any human-perceptible way, and we control for external factors like video compression and environment (lighting), the accuracy of video analytics application fluctuates noticeably. These fluctuations occur because successive frames produced by the video camera may look similar visually but are perceived quite differently by the video analytics applications. We observed that the root cause for these fluctuations is the dynamic camera parameter changes that a video camera automatically makes in order to capture and produce a visually pleasing video. The camera inadvertently acts as an “unintentional adversary” because these slight changes in the image pixel values in consecutive frames, as we show, have a noticeably adverse impact on the accuracy of insights from video analytics tasks that re-use image-trained deep learning models. To address this inadvertent adversarial effect from the camera, we explore the use of transfer learning techniques to improve learning in video analytics tasks through the transfer of knowledge from learning on image analytics tasks. Our experiments with a number of different cameras, and a variety of different video analytics tasks, show that the inadvertent adversarial effect from the camera can be noticeably offset by quickly re-training the deep learning models using transfer learning. In particular, we show that our newly trained Yolov5 model reduces fluctuation in object detection across frames, which leads to better tracking of objects (∼40% fewer mistakes in tracking). Our paper also provides new directions and techniques to mitigate the camera’s adversarial effect on deep learning models used for video analytics applications.
Efficient Compression Method for Roadside LiDAR Data Roadside LiDAR (Light Detection and Ranging) sensors are recently being explored for intelligent transportation systems aiming at safer and faster traffic management and vehicular operations. A key challenge in such systems is to efficiently transfer massive point-cloud data from the roadside LiDAR devices to the edge connected through a 5G network for real-time processing. In this paper, we consider the problem of compressing roadside (i.e. static) LiDAR data in real-time that provides a unique condition unexplored by current methods. Existing point-cloud compression methods assume moving LiDARs (that are mounted on vehicles) and do not exploit spatial consistency across frames over time.To this end, we develop a novel grouped wavelet technique for static roadside LiDAR data compression (i.e. SLiC). Our method compresses LiDAR data both spatially and temporally using a kd-tree data structure based on Haar wavelet coefficients. Experimental results show that SLiC can compress up to 1.9× more effectively than the state-of-the-art compression method can do. Moreover, SLiC is computationally more efficient to achieve 2× improvement in bandwidth usage over the best alternative. Even with this impressive gain in communication and storage efficiency, SLiC retains down-the-pipeline application’s accuracy.
5GLoR: 5G LAN Orchestration for enterprise IoT applications 5G-LAN is an enterprise local area network (LAN) that leverages 5G technology for wireless connectivity instead of WiFi. 5G technology is unique: it uses network slicing to distinguish customers in the same traffic class using new QoS technologies in the RF domain. This unique ability is not supported by most enterprise LANs, which rely primarily on DiffServ-like technologies that distinguish among traffic classes rather than customers. We first show that this mismatch in QoS between the 5G network and the LAN affects the accuracy of insights from the LAN-resident analytics applications. We systematically analyze the root causes of the QoS mismatch and propose a first-of-a-kind 5G-LAN orchestrator (5GLoR). 5GLoR is a middleware that applications can use to preserve the QoS of their 5G data streams through the enterprise LAN. In most cases, the loss of QoS is not due to the oversubscription of LAN switches but primarily due to the inefficient assignment of 5G data to queues at ingress and egress ports. 5GLoR periodically analyzes the status of these queues, provides suitable DSCP identifiers to the application, and installs relevant switch re-write rules (to change DSCP identifiers between switches) to continuously preserve the QoS of the 5G data through the LAN. 5GLoR improves the RTP frame level delay and inter-frame delay by 212% and 122%, respectively, for the WebRTC application. Additionally, with 5GLoR, the accuracy of two example applications (face detection and recognition) improved by 33%, while the latency was reduced by about 25%. Our experiments show that the performance (accuracy and latency) of applications on a 5G-LAN performs well with the proposed 5GLoR compared to the same applications on MEC. This is significant because 5G-LAN offers an order of magnitude more computing, networking, and storage resources to the applications than the resource-constrained MEC, and mature enterprise technologies can be used to deploy, manage, and update IoT applications.
DataXc: Flexible and efficient communication in microservices-based stream analytics pipelines A big challenge in changing a monolithic application into a performant microservices-based application is the design of efficient mechanisms for microservices to communicate with each other. Prior proposals range from custom point-to-point communication among microservices using protocols like gRPC to service meshes like Linkerd to a flexible, many-to-many communication using broker-based messaging systems like NATS. We propose a new communication mechanism, DataXc, that is more efficient than prior proposals in terms of message latency, jitter, message processing rate and use of network resources. To the best of our knowledge, DataXc is the first communication design that has the desirable flexibility of a broker-based messaging systems like NATS and the high-performance of a rigid, custom point-to-point communication method. DataXc proposes a novel “pull” based communication method (i.e consumers fetch messages from producers). This is unlike prior proposals like NATS, gRPC or Linkerd, all of which are “push” based (i.e. producers send messages to consumers). Such communication methods make it difficult to take advantage of differential processing rates of consumers like video analytics tasks. In contrast, DataXc proposes a “pull” based design that avoids unnecessary communication of messages that are eventually discarded by the consumers. Also, unlike prior proposals, DataXc successfully addresses several key challenges in streaming video analytics pipelines like non-uniform processing of frames from multiple cameras, and high variance in latency of frames processed by consumers, all of which adversely affect the quality of insights from streaming video analytics. We report results on two popular real-world, streaming video analytics pipelines (video surveillance, and video action recognition). Compared to NATS, DataXc is just as flexible, but it has far superior performance: upto 80% higher processing rate, 3X lower latency, 7.5X lower jitter and 4.5X lower network bandwidth usage. Compared to gRPC or Linkerd, DataXc is highly flexible, achieves up to 2X higher processing rate, lower latency and lower jitter, but it also consumes more network bandwidth.
Application-specific, Dynamic Reservation of 5G Compute and Network Resources by using Reinforcement Learning 5G services and applications explicitly reserve compute and network resources in today’s complex and dynamic infrastructure of multi-tiered computing and cellular networking to ensure application-specific service quality metrics, and the infrastructure providers charge the 5G services for the resources reserved. A static, one-time reservation of resources at service deployment typically results in extended periods of under-utilization of reserved resources during the lifetime of the service operation. This is due to a plethora of reasons like changes in content from the IoT sensors (for example, change in number of people in the field of view of a camera) or a change in the environmental conditions around the IoT sensors (for example, time of the day, rain or fog can affect data acquisition by sensors). Under-utilization of a specific resource like compute can also be due to temporary inadequate availability of another resource like the network bandwidth in a dynamic 5G infrastructure. We propose a novel Reinforcement Learning-based online method to dynamically adjust an application’s compute and network resource reservations to minimize under-utilization of requested resources, while ensuring acceptable service quality metrics. We observe that a complex application-specific coupling exists between the compute and network usage of an application. Our proposed method learns this coupling during the operation of the service, and dynamically modulates the compute and network resource requests to mimimize under-utilization of reserved resources. Through experimental evaluation using real-world video analytics application, we show that our technique is able to capture complex compute-network coupling relationship in an online manner i.e. while the application is running, and dynamically adapts and saves up to 65% compute and 93% network resources on average (over multiple runs), without significantly impacting application accuracy.
Cosine Similarity based Few-Shot Video Classifier with Attention-based Aggregation Meta learning algorithms for few-shot video recognition use complex, episodic training but they often fail to learn effective feature representations. In contrast, we propose a new and simpler few-shot video recognition method that does not use meta-learning, but its performance compares well with the best meta-learning proposals. Our new few-shot video classification pipeline consists of two distinct phases. In the pre-training phase, we learn a good video feature extraction network that generates a feature vector for each video. After a sparse sampling strategy selects frames from the video, we generate a video feature vector from the sampled frames. Our proposed video feature extractor network, which consists of an image feature extraction network followed by a new transformer encoder, is trained end-to-end by including a classifier head that uses cosine similarity layer instead of the traditional linear layer to classify a corpus of labeled video examples. Unlike prior work in meta learning, we do not use episodic training to learn the image feature vector. Also, unlike prior work that averages frame-level feature vectors into a single video feature vector, we combine individual frame-level feature vectors by using a new Transformer encoder that explicitly captures the key, temporal properties in the sequence of sampled frames. End-to-end training of the video feature extractor ensures that the proposed Transformer encoder captures important temporal properties in the video, while the cosine similarity layer explicitly reduces the intra-class variance of videos that belong to the same class. Next, in the few-shot adaptation phase, we use the learned video feature extractor to train a new video classifier by using the few available examples from novel classes. Results on SSV2-100 and Kinetics-100 benchmarks show that our proposed few-shot video classifier outperforms the meta-learning-based methods and achieves the best state-of-the-art accuracy. We also show that our method can easily discern between actions and their inverse (for example, picking something up vs. putting something down), while prior art, which averages image feature vectors, is unable to do so.
Chimera: Context-Aware Splittable Deep Multitasking Models for Edge Intelligence Design of multitasking deep learning models has mostly focused on improving the accuracy of the constituent tasks, but the challenges of efficiently deploying such models in a device-edge collaborative setup (that is common in 5G deployments) has not been investigated. Towards this end, in this paper, we propose an approach called Chimera 1 for training (done Offline) and deployment (done Online) of multitasking deep learning models that are splittable across the device and edge. In the offline phase, we train our multi-tasking setup such that features from a pre-trained model for one of the tasks (called the Primary task) are extracted and task-specific sub-models are trained to generate the other (Secondary) tasks’ outputs through a knowledge distillation like training strategy to mimic the outputs of pre-trained models for the tasks. The task-specific sub-models are designed to be significantly lightweight than the original pre-trained models for the Secondary tasks. Once the sub-models are trained, during deployment, for given deployment context, characterized by the configurations, we search for the optimal (in terms of both model performance and cost) deployment strategy for the generated multitasking model, through finding one or multiple suitable layer(s) for splitting the model, so that inference workloads are distributed between the device and the edge server and the inference is done in a collaborative manner. Extensive experiments on benchmark computer vision tasks demonstrate that Chimera generates splittable multitasking models that are at least ~ 3 x parameter efficient than the existing such models, and the end-to-end device-edge collaborative inference becomes ~ 1.35 x faster with our choice of context-aware splitting decisions.
ROMA: Resource Orchestration for Microservices-based 5G Applications With the growth of 5G, Internet of Things (IoT), edge computing and cloud computing technologies, the infrastructure (compute and network) available to emerging applications (AR/VR, autonomous driving, industry 4.0, etc.) has become quite complex. There are multiple tiers of computing (IoT devices, near edge, far edge, cloud, etc.) that are connected with different types of networking technologies (LAN, LTE, 5G, MAN, WAN, etc.). Deployment and management of applications in such an environment is quite challenging. In this paper, we propose ROMA, which performs resource orchestration for microservices-based 5G applications in a dynamic, heterogeneous, multi-tiered compute and network fabric. We assume that only application-level requirements are known, and the detailed requirements of the individual microservices in the application are not specified. As part of our solution, ROMA identifies and leverages the coupling relationship between compute and network usage for various microservices and solves an optimization problem in order to appropriately identify how each microservice should be deployed in the complex, multi-tiered compute and network fabric, so that the end-to-end application requirements are optimally met. We implemented two real-world 5G applications in video surveillance and intelligent transportation system (ITS) domains. Through extensive experiments, we show that ROMA is able to save up to 90%, 55% and 44% compute and up to 80%, 95% and 75% network bandwidth for the surveillance (watchlist) and transportation application (person and car detection), respectively. This improvement is achieved while honoring the application performance requirements, and it is over an alternative scheme that employs a static and overprovisioned resource allocation strategy by ignoring the resource coupling relationships.
DataXe: A System for Application Self-optimization in Serverless Edge Computing Environments A key barrier to building performant, remotely managed and self-optimizing multi-sensor, distributed stream processing edge applications is high programming complexity. We recently proposed DataX , a novel platform that improves programmer productivity by enabling easy exchange, transformations, and fusion of data streams on virtualized edge computing infrastructure. This paper extends DataX to include (a) serverless computing that automatically scales stateful and stateless analytics units (AUs) on virtualized edge environments, (b) novel communication mechanisms that efficiently communicate data among analytics units, and (c) new techniques to promote automatic reuse and sharing of analytics processing across multiple applications in a lights out, serverless computing environment. Synthesizing these capabilities into a single platform has been substantially more transformative than any available stream processing system for the edge. We refer to this enhanced and efficient version of DataX as DataXe. To the best of our knowledge, this is the first serverless system for stream processing. For a real-world video analytics application, we observed that the performance of the DataXe implementation of the analytics application is about 3X faster than a standalone implementation of the analytics application with custom, handcrafted communication, multiprocessing and allocation of edge resources.
Read edge based fever screening system over private 5G (arXiv). Edge computing and 5G have made it possible to perform analytics closer to the source of data and achieve super low latency response times, which is not possible with centralized cloud deployment. In this paper, we present a novel fever screening system, which uses edge machine learning techniques and leverages private 5G to accurately identify and screen individuals with fever in real time. Particularly, we present deep learning based novel techniques for fusion and alignment of cross spectral visual and thermal data streams at the edge. Our novel Cross Spectral Generative Adversarial Network (CS GAN) synthesizes visual images that have the key, representative object level features required to uniquely associate objects across visual and thermal spectrum. Two key features of CS GAN are a novel, feature preserving loss function that results in high quality pairing of corresponding cross spectral objects, and dual bottleneck residual layers with skip connections (a new, network enhancement) to not only accelerate real time inference, but to also speed up convergence during model training at the edge. To the best of our knowledge, this is the first technique that leverages 5G networks and limited edge resources to enable real time feature level association of objects in visual and thermal streams (30 ms per full HD frame on an Intel Core i7 8650 4 core, 1.9GHz mobile processor). To the best of our knowledge, this is also the first system to achieve real time operation, which has enabled fever screening of employees and guests in arenas, theme parks, airports and other critical facilities. By leveraging edge computing and 5G, our fever screening system is able to achieve 98.5% accuracy and is able to process about 5X more people when compared to a centralized cloud deployment.
Read ROMA: Resource Orchestration for Microservices based 5G Applications (arXiv). With the growth of 5G, Internet of Things (IoT), edge computing and cloud computing technologies, the infrastructure (compute and network) available to emerging applications (AR/VR, autonomous driving, industry 4.0, etc.) has become quite complex. There are multiple tiers of computing (IoT devices, near edge, far edge, cloud, etc.) that are connected with different types of networking technologies (LAN, LTE, 5G, MAN, WAN, etc.). Deployment and management of applications in such an environment is quite challenging. In this paper, we propose ROMA, which performs resource orchestration for microservices based 5G applications in a dynamic, heterogeneous, multi tiered compute and network fabric. We assume that only application level requirements are known, and the detailed requirements of the individual microservices in the application are not specified. As part of our solution, ROMA identifies and leverages the coupling relationship between compute and network usage for various microservices and solves an optimization problem in order to appropriately identify how each microservice should be deployed in the complex, multi tiered compute and network fabric, so that the end to end application requirements are optimally met. We implemented two real world 5G applications in video surveillance and intelligent transportation system (ITS) domains. Through extensive experiments, we show that ROMA is able to save up to 90%, 55% and 44% compute and up to 80%, 95% and 75% network bandwidth for the surveillance (watchlist) and transportation application (person and car detection), respectively. This improvement is achieved while honoring the application performance requirements, and it is over an alternative scheme that employs a static and overprovisioned resource allocation strategy by ignoring the resource coupling relationships.
AQuA: Analytical Quality Assessment for Optimizing Video Analytics Systems Millions of cameras at edge are being deployed to power a variety of different deep learning applications. However, the frames captured by these cameras are not always pristine – they can be distorted due to lighting issues, sensor noise, compression etc. Such distortions not only deteriorate visual quality, they impact the accuracy of deep learning applications that process such video streams. In this work, we introduce AQuA, to protect application accuracy against such distorted frames by scoring the level of distortion in the frames. It takes into account the analytical quality of frames, not the visual quality, by learning a novel metric, classifier opinion score, and uses a lightweight, CNN-based, object-independent feature extractor. AQuA accurately scores distortion levels of frames and generalizes to multiple different deep learning applications. When used for filtering poor quality frames at edge, it reduces high-confidence errors for analytics applications by 17%. Through filtering, and due to its low overhead (14ms), AQuA can also reduce computation time and average bandwidth usage by 25%.
Edge-based fever screening system over private 5G Edge computing and 5G have made it possible to perform analytics closer to the source of data and achieve super-low latency response times, which isn’t possible with centralized cloud deployment. In this paper, we present a novel fever screening system, which uses edge machine learning techniques and leverages private 5G to accurately identify and screen individuals with fever in real-time. Particularly, we present deep-learning based novel techniques for fusion and alignment of cross-spectral visual and thermal data streams at the edge. Our novel Cross-Spectral Generative Adversarial Network (CS-GAN) synthesizes visual images that have the key, representative object level features required to uniquely associate objects across visual and thermal spectrum. Two key features of CS-GAN are a novel, feature-preserving loss function that results in high-quality pairing of corresponding cross-spectral objects, and dual bottleneck residual layers with skip connections (a new, network enhancement) to not only accelerate real-time inference, but to also speed up convergence during model training at the edge. To the best of our knowledge, this is the first technique that leverages 5G networks and limited edge resources to enable real-time feature-level association of objects in visual and thermal streams (30 ms per full HD frame on an Intel Core i7-8650 4-core, 1.9GHz mobile processor). To the best of our knowledge, this is also the first system to achieve real-time operation, which has enabled fever screening of employees and guests in arenas, theme parks, airports and other critical facilities. By leveraging edge computing and 5G, our fever screening system is able to achieve 98.5% accuracy and is able to process ∼ 5X more people when compared to a centralized cloud deployment.
Magic-Pipe: Self-optimizing video analytics pipelines Microservices-based video analytics pipelines routinely use multiple deep convolutional neural networks. We observe that the best allocation of resources to deep learning engines (or microservices) in a pipeline, and the best configuration of parameters for each engine vary over time, often at a timescale of minutes or even seconds based on the dynamic content in the video. We leverage these observations to develop Magic-Pipe, a self-optimizing video analytic pipeline that leverages AI techniques to periodically self-optimize. First, we propose a new, adaptive resource allocation technique to dynamically balance the resource usage of different microservices, based on dynamic video content. Then, we propose an adaptive microservice parameter tuning technique to balance the accuracy and performance of a microservice, also based on video content. Finally, we propose two different approaches to reduce unnecessary computations due to unavoidable mismatch of independently designed, re-usable deep-learning engines: a deep learning approach to improve the feature extractor performance by filtering inputs for which no features can be extracted, and a low-overhead graph-theoretic approach to minimize redundant computations across frames. Our evaluation of Magic-Pipe shows that pipelines augmented with self-optimizing capability exhibit application response times that are an order of magnitude better than the original pipelines, while using the same hardware resources, and achieving similar high accuracy.
SmartSlice: Dynamic, self-optimization of application’s QoS requests to 5G networks Applications can tailor a network slice by specifying a variety of QoS attributes related to application-specific performance, function or operation. However, some QoS attributes like guaranteed bandwidth required by the application do vary over time. For example, network bandwidth needs of video streams from surveillance cameras can vary a lot depending on the environmental conditions and the content in the video streams. In this paper, we propose a novel, dynamic QoS attribute prediction technique that assists any application to make optimal resource reservation requests at all times. Standard forecasting using traditional cost functions like MAE, MSE, RMSE, MDA, etc. don’t work well because they do not take into account the direction (whether the forecasting of resources is more or less than needed), magnitude (by how much the forecast deviates, and in which direction), or frequency (how many times the forecast deviates from actual needs, and in which direction). The direction, magnitude and frequency have a direct impact on the application’s accuracy of insights, and the operational costs. We propose a new, parameterized cost function that takes into account all three of them, and guides the design of a new prediction technique. To the best of our knowledge, this is the first work that considers time-varying application requirements and dynamically adjusts slice QoS requests to 5G networks in order to ensure a balance between application’s accuracy and operational costs. In a real-world deployment of a surveillance video analytics application over 17 cameras, we show that our technique outperforms other traditional forecasting methods, and it saves 34% of network bandwidth (over a ~24 hour period) when compared to a static, one-time reservation.
Read SmartSlice: Dynamic, self optimization of application’s QoS requests to 5G networks (arXiv). Applications can tailor a network slice by specifying a variety of QoS attributes related to application specific performance, function or operation. However, some QoS attributes like guaranteed bandwidth required by the application do vary over time. For example, network bandwidth needs of video streams from surveillance cameras can vary a lot depending on the environmental conditions and the content in the video streams. In this paper, we propose a novel, dynamic QoS attribute prediction technique that assists any application to make optimal resource reservation requests at all times. Standard forecasting using traditional cost functions like MAE, MSE, RMSE, MDA, etc. don’t work well because they do not take into account the direction (whether the forecasting of resources is more or less than needed), magnitude (by how much the forecast deviates, and in which direction), or frequency (how many times the forecast deviates from actual needs, and in which direction). The direction, magnitude and frequency have a direct impact on the application’s accuracy of insights, and the operational costs. We propose a new, parameterized cost function that takes into account all three of them, and guides the design of a new prediction technique. To the best of our knowledge, this is the first work that considers time varying application requirements and dynamically adjusts slice QoS requests to 5G networks in order to ensure a balance between application’s accuracy and operational costs. In a real world deployment of a surveillance video analytics application over 17 cameras, we show that our technique outperforms other traditional forecasting methods, and it saves 34% of network bandwidth (over a ~24 hour period) when compared to a static, one time reservation.
Read DataX: A system for Data eXchange and transformation of streams (arXiv). The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi sensor, distributed stream processing applications is high programming complexity. We propose DataX, a novel platform that improves programmer productivity by enabling easy exchange, transformations, and fusion of data streams. DataX abstraction simplifies the application’s specification and exposes parallelism and dependencies among the application functions (microservices). DataX runtime automatically sets up appropriate data communication mechanisms, enables effortless reuse of microservices and data streams across applications, and leverages serverless computing to transform, fuse, and auto scale microservices. DataX makes it easy to write, deploy and reliably operate distributed applications at scale. Synthesizing these capabilities into a single platform is substantially more transformative than any available stream processing system.
CamTuner: Reinforcement Learning based System for Camera Parameter Tuning to enhance Analytics Video analytics systems critically rely on video cameras, which capture high quality video frames, to achieve high analytics accuracy. Although modern video cameras often expose tens of configurable parameter settings that can be set by end users, deployment of surveillance cameras today often uses a fixed set of parameter settings because the end users lack the skill or understanding to reconfigure these parameters. In this paper, we first show that in a typical surveillance camera deployment, environmental condition changes can significantly affect the accuracy of analytics units such as person detection, face detection and face recognition, and how such adverse impact can be mitigated by dynamically adjusting camera settings. We then propose CAMTUNER, a framework that can be easily applied to an existing video analytics pipeline (VAP) to enable automatic and dynamic adaptation of complex camera settings to changing environmental conditions, and autonomously optimize the accuracy of analytics units (AUs) in the VAP. CAMTUNER is based on SARSA reinforcement learning (RL) and it incorporates two novel components: a light weight analytics quality estimator and a virtual camera. CAMTUNER is implemented in a system with AXIS surveillance cameras and several VAPs (with various AUs) that processed day long customer videos captured at airport entrances. Our evaluations show that CAMTUNER can adapt quickly to changing environments. We compared CAMTUNER with two alternative approaches where either static camera settings were used, or a strawman approach where camera settings were manually changed every hour (based on human perception of quality). We observed that for the face detection and person detection AUs, CAMTUNER is able to achieve up to 13.8% and 9.2% higher accuracy, respectively, compared to the best of the two approaches (average improvement of 8% for both AUs).
UAC: An Uncertainty-Aware Face Clustering Algorithm We investigate ways to leverage uncertainty in face images to improve the quality of the face clusters. We observe that popular clustering algorithms do not produce better quality clusters when clustering probabilistic face representations that implicitly model uncertainty – these algorithms predict up to 9.6X more clusters than the ground truth for the IJB-A benchmark. We empirically analyze the causes for this unexpected behavior and identify excessive false-positives and false-negatives (when comparing face-pairs) as the main reasons for poor quality clustering. Based on this insight, we propose an uncertainty-aware clustering algorithm, UAC, which explicitly leverages uncertainty information during clustering to decide when a pair of faces are similar or when a predicted cluster should be discarded. UAC considers (a) uncertainty of faces in face-pairs, (b) bins face-pairs into different categories based on an uncertainty threshold, (c) intelligently varies the similarity threshold during clustering to reduce false-negatives and false-positives, and (d) discards predicted clusters that exhibit a high measure of uncertainty. Extensive experimental results on several popular benchmarks and comparisons with state-of-the-art clustering methods show that UAC produces significantly better clusters by leveraging uncertainty in face images – predicted number of clusters is up to 0.18X more of the ground truth for the IJB-A benchmark.
AppSlice: A system for application-centric design of 5G and edge computing applications Applications that use edge computing and 5G to improve response times consume both compute and network resources. However, 5G networks manage only network resources without considering the application’s compute requirements, and container orchestration frameworks manage only compute resources without considering the application’s network requirements. We observe that there is a complex coupling between an application’s compute and network usage, which can be leveraged to improve application performance and resource utilization. We propose a new, declarative abstraction called app slice that jointly considers the application’s compute and network requirements. This abstraction leverages container management systems to manage edge computing resources, and 5G network stacks to manage network resources, while the joint consideration of coupling between compute and network usage is explicitly managed by a new runtime system, which delivers the declarative semantics of the app slice. The runtime system also jointly manages the edge compute and network resource usage automatically across different edge computing environments and 5G networks by using two adaptive algorithms. We implement a complex, real-world, real-time monitoring application using the proposed app slice abstraction, and demonstrate on a private 5G/LTE testbed that the proposed runtime system significantly improves the application performance and resource usage when compared with the case where the coupling between the compute and network resource usage is ignored.
DataX: A system for Data eXchange and transformation of streams The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi-sensor, distributed stream processing applications is high programming complexity. We propose DataX, a novel platform that improves programmer productivity by enabling easy exchange, transformations, and fusion of data streams. DataX abstraction simplifies the application’s specification and exposes parallelism and dependencies among the application functions (microservices). DataX runtime automatically sets up appropriate data communication mechanisms, enables effortless reuse of microservices and data streams across applications, and leverages serverless computing to transform, fuse, and auto-scale microservices. DataX makes it easy to write, deploy and reliably operate distributed applications at scale. Synthesizing these capabilities into a single platform is substantially more transformative than any available stream processing system.
Read ECO: Edge Cloud Optimization of 5G applications (arXiv). Centralized cloud computing with 100+ milliseconds network latencies cannot meet the tens of milliseconds to sub millisecond response times required for emerging 5G applications like autonomous driving, smart manufacturing, tactile internet, and augmented or virtual reality. We describe a new, dynamic runtime that enables such applications to make effective use of a 5G network, computing at the edge of this network, and resources in the centralized cloud, at all times. Our runtime continuously monitors the interaction among the microservices, estimates the data produced and exchanged among the microservices, and uses a novel graph min cut algorithm to dynamically map the microservices to the edge or the cloud to satisfy application specific response times. Our runtime also handles temporary network partitions, and maintains data consistency across the distributed fabric by using microservice proxies to reduce WAN bandwidth by an order of magnitude, all in an (Unknown sysvar: (it application specific manner)) by leveraging knowledge about the application’s functions, latency critical pipelines and intermediate data. We illustrate the use of our runtime by successfully mapping two complex, representative real world video analytics applications to the AWS/Verizon Wavelength edge cloud architecture, and improving application response times by 2x when compared with a static edge cloud implementation.
Read F3S: Free Flow Fever Screening (arXiv). Identification of people with elevated body temperature can reduce or dramatically slow down the spread of infectious diseases like COVID 19. We present a novel fever screening system, F3S, that uses edge machine learning techniques to accurately measure core body temperatures of multiple individuals in a free flow setting. F3S performs real time sensor fusion of visual camera with thermal camera data streams to detect elevated body temperature, and it has several unique features: (a) visual and thermal streams represent very different modalities, and we dynamically associate semantically equivalent regions across visual and thermal frames by using a new, dynamic alignment technique that analyzes content and context in real time, (b) we track people through occlusions, identify the eye (inner canthus), forehead, face and head regions where possible, and provide an accurate temperature reading by using a prioritized refinement algorithm, and (c) we robustly detect elevated body temperature even in the presence of personal protective equipment like masks, or sunglasses or hats, all of which can be affected by hot weather and lead to spurious temperature readings. F3S has been deployed at over a dozen large commercial establishments, providing contact less, free flow, real time fever screening for thousands of employees and customers in indoor and outdoor settings.
Read AppSlice: A system for application centric design of 5G and edge computing applications (arXiv) that use edge computing and 5G to improve response times consume both compute and network resources. However, 5G networks manage only network resources without considering the application’s compute requirements, and container orchestration frameworks manage only compute resources without considering the application’s network requirements. We observe that there is a complex coupling between an application’s compute and network usage, which can be leveraged to improve application performance and resource utilization. We propose a new, declarative abstraction called AppSlice that jointly considers the application’s compute and network requirements. This abstraction leverages container management systems to manage edge computing resources, and 5G network stacks to manage network resources, while the joint consideration of coupling between compute and network usage is explicitly managed by a new runtime system, which delivers the declarative semantics of the app slice. The runtime system also jointly manages the edge compute and network resource usage automatically across different edge computing environments and 5G networks by using two adaptive algorithms. We implement a complex, real world, real time monitoring application using the proposed app slice abstraction, and demonstrate on a private 5G/LTE testbed that the proposed runtime system significantly improves the application performance and resource usage when compared with the case where the coupling between the compute and network resource usage is ignored.
F3S: Free Flow Fever Screening Identification of people with elevated body temperature can reduce or dramatically slow down the spread of infectious diseases like COVID-19. We present a novel fever-screening system, F 3 S, that uses edge machine learning techniques to accurately measure core body temperatures of multiple individuals in a free-flow setting. F 3 S performs real-time sensor fusion of visual camera with thermal camera data streams to detect elevated body temperature, and it has several unique features: (a) visual and thermal streams represent very different modalities, and we dynamically associate semantically-equivalent regions across visual and thermal frames by using a new, dynamic alignment technique that analyzes content and context in real-time, (b) we track people through occlusions, identify the eye (inner canthus), forehead, face and head regions where possible, and provide an accurate temperature reading by using a prioritized refinement algorithm, and (c) we robustly detect elevated body temperature even in the presence of personal protective equipment like masks, or sunglasses or hats, all of which can be affected by hot weather and lead to spurious temperature readings. F 3 S has been deployed at over a dozen large commercial establishments, providing contact-less, free-flow, real-time fever screening for thousands of employees and customers in indoors and outdoor settings.
ECO: Edge-Cloud Optimization of 5G applications Centralized cloud computing with 100+ milliseconds network latencies cannot meet the tens of milliseconds to sub-millisecond response times required for emerging 5G applications like autonomous driving, smart manufacturing, tactile internet, and augmented or virtual reality. We describe a new, dynamic runtime that enables such applications to make effective use of a 5G network, computing at the edge of this network, and resources in the centralized cloud, at all times. Our runtime continuously monitors the interaction among the microservices, estimates the data produced and exchanged among the microservices, and uses a novel graph min-cut algorithm to dynamically map the microservices to the edge or the cloud to satisfy application-specific response times. Our runtime also handles temporary network partitions, and maintains data consistency across the distributed fabric by using microservice proxies to reduce WAN bandwidth by an order of magnitude, all in an application-specific manner by leveraging knowledge about the application’s functions, latency-critical pipelines and intermediate data. We illustrate the use of our runtime by successfully mapping two complex, representative real-world video analytics applications to the AWS/Verizon Wavelength edge-cloud architecture, and improving application response times by 2x when compared with a static edge-cloud implementation.
BAFFLE: Decentralized Blockchain based Aggregator-Free Federated Learning A key aspect of Federated Learning (FL) is the requirement of a centralized aggregator to maintain and update the global model. However, in many cases orchestrating a centralized aggregator might be infeasible due to numerous operational constraints. In this paper, we introduce BAFFLE, an aggregator free, blockchain driven, FL environment that is inherently decentralized. BAFFLE leverages Smart Contracts (SC) to coordinate the round delineation, model aggregation and update tasks in FL. BAFFLE boosts computational performance by decomposing the global parameter space into distinct chunks followed by a score and bid strategy. In order to characterize the performance of BAFFLE, we conduct experiments on a private Ethereum network and use the centralized and aggregator driven methods as our benchmark. We show that BAFFLE significantly reduces the gas costs for FL on the blockchain as compared to a direct adaptation of the aggregator based method. Our results also show that BAFFLE achieves high scalability and computational efficiency while delivering similar accuracy as the benchmark methods.
Stochastic Decision-Making Model for Aggregation of Residential Units with PV-Systems and Storages Many residential energy consumers have installed photovoltaic (PV) panels and energy storage systems. These residential users can aggregate and participate in the energy markets. A stochastic decision making model for an aggregation of these residential units for participation in two-settlement markets is proposed in this paper. Scenarios are generated using Seasonal Autoregressive Integrated Moving Average (SARIMA) model and joint probability distribution function of the forecast errors to model the uncertainties of the real-time prices, PV generations and demands. The proposed scenario generation model of this paper treats forecast errors as random variable, which allows to reflect new information observed in the real-time market into scenario generation process without retraining SARIMA or re-fitting probability distribution functions over the forecast errors. This approach significantly improves the computational time of the proposed model. A simulation study is conducted for an aggregation of 6 residential units, and the results highlights the benefits of aggregation as well as the proposed stochastic decision-making model.
Beam Training Optimization in Millimeter-wave Systems under Beamwidth, Modulation and Coding Constraints Millimeter-wave (mmWave) bands have the potential to enable significantly high data rates in wireless systems. In order to overcome intense path loss and severe shadowing in these bands, it is essential to employ directional beams for data transmission. Furthermore, it is known that the mmWave channel incorporates a few number of spatial clusters necessitating additional time to align the corresponding beams with the channel prior to data transmission. This procedure is known as beam training (BT). While a longer BT leads to more directional beams (equivalently higher beamforming gains), there is less time for data communication. In this paper, this trade-off is investigated for a time slotted system under practical constraints such as finite beamwidth resolution and discrete modulation and coding schemes. At each BT time slot, the access point (AP) scans a region of uncertainty by transmitting a probing packet and refines angle of arrival (AoA) estimate based on user equipment (UE) feedback. Given a total number time slots, the objective is to find the optimum allocation between BT and data transmission and a feasible beamwidth for the estimation of AoA at each BT time slot such that the expected throughput is maximized. It is shown that the problem satisfies the optimal substructure property enabling the use of a backward dynamic programming approach to find the optimal solution with polynomial computational complexity. Simulation results reveal that in practical scenarios, the proposed approach outperforms existing techniques such as exhaustive and bisection search.
Opportunistic Temporal Fair Mode Selection and User Scheduling for Full-duplex Systems In-band full-duplex (FD) communications – enabled by recent advances in antenna and RF circuit design – has emerged as one of the promising techniques to improve data rates in wireless systems. One of the major roadblocks in enabling high data rates in FD systems is the inter-user interference (IUI) due to activating pairs of uplink and downlink users at the same time-frequency resource block. Opportunistic user scheduling has been proposed as a means to manage IUI and fully exploit the multiplexing gains in FD systems. In this paper, scheduling under long-term and short-term temporal fairness for single-cell FD wireless networks is considered. Temporal fair scheduling is of interest in delay-sensitive applications, and leads to predictable latency and power consumption. The feasible region of user temporal demand vectors is derived, and a scheduling strategy maximizing the system utility while satisfying long-term temporal fairness is proposed. Furthermore, a short-term temporal fair scheduling strategy is devised which satisfies user temporal demands over a finite window-length. It is shown that the strategy achieves optimal average system utility as the window-length is increased asymptotically. Subsequently, practical construction algorithms for long-term and short-term temporal fair scheduling are introduced. Simulations are provided to verify the derivations and investigate the multiplexing gains. It is observed that using successive interference cancellation at downlink users improves FD gains significantly in the presence of strong IUI.
Robust Beam Tracking and Data Communication in Millimeter Wave Mobile Networks Millimeter-wave (mmWave) bands have shown the potential to enable high data rates for next generation mobile networks. In order to cope with high path loss and severe shadowing in mmWave frequencies, it is essential to employ massive antenna arrays and generate narrow transmission patterns (beams). When narrow beams are used, mobile user tracking is indispensable for reliable communication. In this paper, a joint beam tracking and data communication strategy is proposed in which, the base station (BS) increases the beamwidth during data transmission to compensate for location uncertainty caused by user mobility. In order to evade low beamforming gains due to widening the beam pattern, a probing scheme is proposed in which the BS transmits a number of probing packets to refine the estimation of angle of arrival based on the user feedback, which enables reliable data transmission through narrow beams again. In the proposed scheme, time is divided into similar frames each consisting of a probing phase followed by a data communication phase. A steady state analysis is provided based on which, the duration of data transmission and probing phases are optimized. Furthermore, the results are generalized to consider practical constraints such as minimum feasible beamwidth. Simulation results reveal that the proposed method outperforms well-known approaches such as optimized beam sweeping.
TrackIO: Tracking First Responders Inside-Out First responders, a critical lifeline of any society, often find themselves in precarious situations. The ability to track them real-time in unknown indoor environments, would significantly contributes to the success of their mission as well as their safety. In this work, we present the design, implementation and evaluation of TrackIO–a system capable of accurately localizing and tracking mobile responders real-time in large indoor environments. TrackIO leverages the mobile virtual infrastructure offered by unmanned aerial vehicles (UAVs), coupled with the balanced penetration-accuracy tradeoff offered by ultra-wideband (UWB), to accomplish this objective directly from outside, without relying on access to any indoor infrastructure. Towards a practical system, TrackIO incorporates four novel mechanisms in its design that address key challenges to enable tracking responders (i) who are mobile with potentially non-uniform velocities (e.g. during turns), (ii) deep indoors with challenged reachability, (iii) in real-time even for a large network, and (iv) with high accuracy even when impacted by UAVs position error. TrackIOs real-world performance reveals that it can track static nodes with a median accuracy of about 11.5m and mobile (even running) nodes with a median accuracy of 22.5m in large buildings in real-time.
SkyRAN: A Self-Organizing LTE RAN in the Sky We envision a flexible, dynamic airborne LTE infrastructure built upon Unmanned Autonomous Vehicles (UAVs) that will provide on-demand, on-time, network access, anywhere. In this paper, we design, implement and evaluate SkyRAN, a self-organizing UAV-based LTE RAN (Radio Access Network) that is a key component of this UAV LTE infrastructure network. SkyRAN determines the UAV’s operating position in 3D airspace so as to optimize connectivity to all the UEs on the ground. It realizes this by overcoming various challenges in constructing and maintaining radio environment maps to UEs that guide the UAV’s position in real-time. SkyRAN is designed to be scalable in that it can be quickly deployed to provide efficient connectivity even over a larger area. It is adaptive in that it reacts to changes in the terrain and UE mobility, to maximize LTE coverage performance while minimizing operating overhead. We implement SkyRAN on a DJI Matrice 600 Pro drone and evaluate it over a 90 000 m2 operating area. Our testbed results indicate that SkyRAN can place the UAV in the optimal location with about 30 secs of a measurement flight. On an average, SkyRAN achieves a throughput of 0.9 – 0.95X of optimal, which is about 1.5 – 2X over other popular baseline schemes.
SkyCore: Moving Core to the Edge for Untethered and Reliable UAV-based LTE Networks The advances in unmanned aerial vehicle (UAV) technology have empowered mobile operators to deploy LTE base stations (BSs) on UAVs, and provide on-demand, adaptive connectivity to hotspot venues as well as emergency scenarios. However, today’s evolved packet core (EPC) that orchestrates the LTE RAN faces fundamental limitations in catering to such a challenging, wireless and mobile UAV environment, particularly in the presence of multiple BSs (UAVs). In this work, we argue for and propose an alternate, radical edge EPC design, called SkyCore that pushes the EPC functionality to the extreme edge of the core network – collapses the EPC into a single, light-weight, self-contained entity that is co-located with each of the UAV BS. SkyCore incorporates elements that are designed to address the unique challenges facing such a distributed design in the UAV environment, namely the resource-constraints of UAV platforms, and the distributed management of pronounced UAV and UE mobility. We build and deploy a fully functional version of SkyCore on a two-UAV LTE network and showcase its (i) ability to interoperate with commercial LTE BSs as well as smartphones, (ii) support for both hotspot and standalone multi-UAV deployments, and (iii) superior control and data plane performance compared to other EPC variants in this environment.
ELI: Empowering LTE with Interference Awareness in Unlicensed Spectrum The advent of LTE into the unlicensed spectrum has necessitated the understanding of its operational efficiency when sharing spectrum with different radio access technologies. Our study reveals that LTE, owing to its inherent transmission characteristics, suffers significant performance degradation in the presence of interference caused by hidden terminals. This motivates the need for interference-awareness in LTE’s channel access in unlicensed spectrum. To address this problem, we propose ELI. ELI’s three-pronged solution equips the LTE base station with novel techniques to: (a) accurately detect and measure interference caused by hidden terminals, (b) collect interference statistics from clients across different channels with affordable overhead, and (c) leverage interference-awareness to improve its channel access performance. Our evaluations show that ELI can achieve 1.5-2x throughput gains over baseline schemes. Finally, ELI is LTE-LAA/MulteFire-standard compliant and can be deployed over the existing LTE-LAA implementation without any modifications.
SkyLiTE: End-to-End Design of Low-altitutde UAV Networks for Providing LTE Connectivity Un-manned aerial vehicle (UAVs) have the potential to change the landscape of wide-area wireless connectivity by bringing them to areas where connectivity was sparing or non-existent (e.g. rural areas) or has been compromised due to disasters. While Google’s Project Loon and Facebook’s Project Aquila are examples of high-altitude, long-endurance UAV-based connectivity efforts in this direction, the telecom operators (e.g. AT&T and Verizon) have been exploring low-altitude UAV-based LTE solutions for on-demand deployments. Understandably, these projects are in their early stages and face formidable challenges in their realization and deployment. The goal of this document is to expose the reader to both the challenges as well as the potential offered by these unconventional connectivity solutions. We aim to explore the end-to-end design of such UAV-based connectivity networks particularly in the context of low-altitude UAV networks providing LTE connectivity. Specifically, we aim to highlight the challenges that span across multiple layers (access, core network, and backhaul) in an inter-twined manner as well as the richness and complexity of the design space itself. To help interested readers navigate this complex design space towards a solution, we also articulate the overview of one such end-to-end design, namely SkyLiTE– a self-organizing network of low-altitude UAVs that provide optimized LTE connectivity in a desired region.