Using Global Fiber Networks for Environmental Sensing

We review recent advances in distributed fiber optic sensing (DFOS) and their applications. The scattering mechanisms in glass that are exploited for reflectometry-based DFOS are Rayleigh, Brillouin, and Raman scattering. These are sensitive to strain, temperature, or both, allowing optical fiber cables to monitor their ambient environment in addition to serving their conventional role as a medium for telecommunications. Recently, DFOS has leveraged technologies developed for telecommunications, such as coherent detection, digital signal processing, coding, and spatial/frequency diversity, to achieve improved performance in terms of measurand resolution, reach, spatial resolution, and bandwidth. We review the theory and architecture of commonly used DFOS methods. We provide recent experimental and field-trial results where DFOS was used in wide-ranging applications, such as geohazard monitoring, seismic monitoring, traffic monitoring, and infrastructure health monitoring. Events of interest often have unique signatures in the spatial, temporal, frequency, or wavenumber domain. Based on the raw temperature and strain data obtained from DFOS, downstream postprocessing allows the detection, classification, and localization of events. By combining DFOS with machine learning methods, it is possible to realize complete sensor systems that are compact and low cost and can operate in harsh environments and difficult-to-access locations, facilitating increased public safety and smarter cities.
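To make the postprocessing step concrete, here is a minimal sketch of how an event might be detected and localized from a DFOS strain "waterfall" (a time × distance array): each fiber channel's spectrum is examined, and channels whose band-limited energy stands out from a robust noise estimate are flagged. The data shapes, frequency band, and threshold are our illustrative assumptions, not the paper's processing chain.

```python
# Illustrative event detection on a DFOS strain waterfall (time x distance).
# Shapes, band, and threshold are assumptions for this sketch only.
import numpy as np

def detect_events(waterfall, fs, band=(1.0, 20.0), k=5.0):
    """Return distance-channel indices whose energy in `band` (Hz)
    exceeds a robust median + k*MAD threshold across channels."""
    spectrum = np.abs(np.fft.rfft(waterfall, axis=0))      # per-channel spectra
    freqs = np.fft.rfftfreq(waterfall.shape[0], d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    energy = (spectrum[in_band] ** 2).sum(axis=0)          # band energy per channel
    med = np.median(energy)
    mad = np.median(np.abs(energy - med)) + 1e-12          # robust spread
    return np.where(energy > med + k * mad)[0]

# Toy data: 10 s at 1 kHz over 500 channels (e.g., ~5 km of fiber at
# 10 m spacing), with an 8 Hz vibration injected at channel 230.
rng = np.random.default_rng(0)
waterfall = rng.normal(0.0, 1e-3, size=(10_000, 500))
t = np.arange(10_000) / 1_000.0
waterfall[:, 230] += 5e-3 * np.sin(2 * np.pi * 8.0 * t)
print(detect_events(waterfall, fs=1_000.0))   # typically -> [230]
```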

APT: Adaptive Perceptual quality based camera Tuning using reinforcement learning

Cameras are increasingly being deployed in cities, enterprises, and along roads worldwide to enable many applications in public safety, intelligent transportation, retail, healthcare, and manufacturing. Often, after the initial deployment of the cameras, the environmental conditions and the scenes around them change, and our experiments show that these changes can adversely impact the accuracy of insights from video analytics. This is because the camera parameter settings, though optimal at deployment time, are no longer the best settings for good-quality video capture once the environmental conditions and scenes around a camera change during operation, and capturing poor-quality video degrades the accuracy of analytics. To mitigate this loss in accuracy of insights, we propose APT, a novel reinforcement-learning-based system that dynamically and remotely (over 5G networks) tunes the camera parameters to ensure high-quality video capture, restoring the accuracy of insights when environmental conditions or scene content change. APT uses reinforcement learning with no-reference perceptual quality estimation as the reward function. We conducted extensive real-world experiments in which we deployed two cameras side by side overlooking an enterprise parking lot: one camera used only the manufacturer-suggested default settings, while the other was dynamically tuned by APT during operation. Our experiments demonstrate that, owing to dynamic tuning by APT, the analytics insights are consistently better at all times of the day: the accuracy of an object-detection video analytics application improved by ∼42% on average. Since our reward function is independent of any analytics task, APT can be readily used for different video analytics tasks.
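The design point worth emphasizing is that the reward is computed from the captured frame alone. Below is a minimal sketch assuming simple sharpness and contrast proxies in place of the paper's learned no-reference perceptual-quality estimator, which we do not reproduce.

```python
# Minimal sketch of a no-reference reward for camera tuning. The
# sharpness/contrast proxies and the 0.1 weighting are our assumptions;
# APT's actual reward is a learned perceptual-quality estimator.
import numpy as np

def no_reference_quality(gray: np.ndarray) -> float:
    # Second differences approximate a Laplacian: higher -> sharper.
    sharpness = (np.abs(np.diff(gray, 2, axis=0)).mean()
                 + np.abs(np.diff(gray, 2, axis=1)).mean())
    contrast = gray.std()                      # global contrast proxy
    return float(sharpness + 0.1 * contrast)

def reward(frame_rgb: np.ndarray) -> float:
    gray = frame_rgb.mean(axis=2)              # crude grayscale conversion
    return no_reference_quality(gray)
```

Because a score of this kind never consults the downstream detector, the same tuned camera stream can serve different analytics tasks, which is the property the abstract highlights.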

DataX Allocator: Dynamic resource management for stream analytics at the Edge

Serverless edge computing aims to deploy and manage applications so that developers are relieved of the challenges associated with dynamic management, sharing, and maintenance of the edge infrastructure. However, this is a non-trivial task because the resource usage of edge applications varies with the content of their input sensor data streams. We present a novel reinforcement-learning (RL) technique to maximize the processing rates of applications by dynamically allocating resources (such as CPU cores or memory) to the microservices in these applications. We model applications as analytics pipelines consisting of several microservices, where a pipeline's processing rate directly impacts the accuracy of insights from the application. In our unique problem formulation, neither the RL state space nor the number of actions depends on the type of workload in the microservices, the number of microservices in a pipeline, or the number of pipelines. This enables us to train the RL model once and reuse it many times to improve the accuracy of insights for a diverse set of AI/ML engines, such as action recognition or face recognition, and for applications with varying microservices. Our experiments with real-world applications, i.e., face recognition and action recognition, show that our approach outperforms other widely used alternatives and achieves up to a 2.5X improvement in the overall application processing rate. Furthermore, when we apply our RL model trained on a face recognition pipeline to a different and more complex action recognition pipeline, we obtain a 2X improvement in processing rate, demonstrating the versatility and robustness of our RL model to pipeline changes.
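As one hedged illustration of how a control step can be made independent of pipeline size, the sketch below always acts on the current bottleneck microservice, so the agent chooses only among a fixed set of moves (grant a core, reclaim a core, hold) regardless of how many microservices the pipeline contains. The formulation, names, and numbers are ours, not the paper's.

```python
# Pipeline-size-independent control step (illustrative, not the paper's
# actual RL formulation): observe normalized features of the bottleneck
# stage, act only on that stage.
from dataclasses import dataclass
from typing import List

@dataclass
class Microservice:
    name: str
    cores: int
    rate_per_core: float          # measured processing rate per core
    @property
    def rate(self) -> float:
        return self.cores * self.rate_per_core

def bottleneck(pipeline: List[Microservice]) -> Microservice:
    return min(pipeline, key=lambda m: m.rate)

def observe(pipeline: List[Microservice], free_cores: int):
    # Normalized state with the same dimensionality for any pipeline.
    b = bottleneck(pipeline)
    return (b.rate / max(m.rate for m in pipeline), free_cores > 0)

def act(pipeline: List[Microservice], free_cores: int, action: str):
    b = bottleneck(pipeline)
    if action == "grant" and free_cores > 0:
        b.cores += 1; free_cores -= 1
    elif action == "reclaim" and b.cores > 1:
        b.cores -= 1; free_cores += 1
    # The pipeline rate is set by its slowest stage; maximize this.
    return min(m.rate for m in pipeline), free_cores

pipeline = [Microservice("decode", 1, 30.0),
            Microservice("detect_faces", 1, 8.0),
            Microservice("recognize", 1, 12.0)]
rate, free = act(pipeline, free_cores=2, action="grant")
print(rate)  # detect_faces got a core: min rate rises from 8.0 to 12.0
```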

Availability Analysis for Reliable Distributed Fiber Optic Sensors Placement

We perform an availability analysis of various reliable distributed fiber optic sensor placement schemes under multiple-failure conditions. The study can help network carriers select the optimal protection scheme for their network sensing services, considering both service availability and hardware cost.

Distributed Optical Fiber Sensing Using Specialty Optical Fibers

Distributed fiber optic sensing systems use long sections of optical fiber as the sensing medium. The fiber's characteristics therefore determine the sensing capability and performance. In this presentation, various types of specialty optical fibers and their sensing applications will be introduced and discussed.

A Multi-sensor Feature Fusion Network Model for Bearings Grease Life Assessment in Accelerated Experiments

This paper presents a multi-sensor feature fusion (MSFF) neural network comprising two inception-style multiple-channel feature fusion (MCFF) networks for inner-sensor and cross-sensor feature fusion, in conjunction with a deep residual neural network (ResNet), for accurate grease life assessment and bearing health monitoring. Each MCFF network is designed for low-level feature extraction and fusion of either vibration or acoustic emission signals at multiple scales. The concatenation of the MCFF networks serves as a cross-sensor feature fusion layer that combines the extracted features from both the vibration and acoustic emission sources. A ResNet is developed for high-level feature extraction from the fused feature maps and for prediction. In addition, to handle the large volume of collected data, the original time-series data are transformed into the frequency domain with different sampling intervals and targeted ranges. The proposed MSFF network outperforms other models based on different fusion methods, fully connected network predictors, and/or a single sensor source.
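A compact PyTorch sketch of an architecture matching this description, under our own assumptions about tensor shapes (batch × 1 × frequency bins per sensor); the layer widths and kernel sizes are illustrative, not the paper's.

```python
# Illustrative MSFF-style network: one multi-scale ("inception-type")
# conv block per sensor, channel concatenation for cross-sensor fusion,
# and a small residual stack for prediction. Sizes are assumptions.
import torch
import torch.nn as nn

class MCFF(nn.Module):
    """Multi-scale 1-D conv block for one sensor's spectrum."""
    def __init__(self, in_ch=1, branch_ch=8):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, branch_ch, k, padding=k // 2)
            for k in (3, 5, 7)                   # multiple scales
        ])
    def forward(self, x):
        return torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.conv1 = nn.Conv1d(ch, ch, 3, padding=1)
        self.conv2 = nn.Conv1d(ch, ch, 3, padding=1)
    def forward(self, x):
        return torch.relu(x + self.conv2(torch.relu(self.conv1(x))))

class MSFF(nn.Module):
    def __init__(self, branch_ch=8):
        super().__init__()
        self.vib = MCFF(branch_ch=branch_ch)     # vibration branch
        self.ae = MCFF(branch_ch=branch_ch)      # acoustic-emission branch
        fused = 2 * 3 * branch_ch                # cross-sensor concatenation
        self.res = nn.Sequential(ResBlock(fused), ResBlock(fused))
        self.head = nn.Linear(fused, 1)          # grease-life estimate
    def forward(self, vib, ae):
        z = torch.cat([self.vib(vib), self.ae(ae)], dim=1)
        z = self.res(z).mean(dim=-1)             # global average pooling
        return self.head(z)

model = MSFF()
vib = torch.randn(4, 1, 256)   # 4 samples, 256 frequency bins each
ae = torch.randn(4, 1, 256)
print(model(vib, ae).shape)    # torch.Size([4, 1])
```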

Enhancing Video Analytics Accuracy via Real-time Automated Camera Parameter Tuning

In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition running on remote servers critically rely on surveillance cameras to capture high-quality video streams in order to achieve high accuracy. Modern IP cameras come with a large number of camera parameters that directly affect the quality of the captured video stream. While a few of these parameters, e.g., exposure, focus, and white balance, are automatically adjusted by the camera internally, the remaining ones are not. We denote such camera parameters as non-automated (NAUTO) parameters. In this paper, we first show that changes in environmental conditions can have a significant adverse effect on the accuracy of insights from the AUs, but that this adverse impact can potentially be mitigated by dynamically adjusting NAUTO camera parameters in response to those changes. We then present CamTuner, to our knowledge the first framework that dynamically adapts NAUTO camera parameters to optimize the accuracy of AUs in a VAP in response to adverse changes in environmental conditions. CamTuner is based on SARSA reinforcement learning and incorporates two novel components: a lightweight analytics quality estimator and a virtual camera, which together drastically speed up offline RL training. Our controlled experiments and real-world VAP deployment show that, compared to a VAP using the default camera settings, CamTuner enhances VAP accuracy by detecting 15.9% additional persons and 2.6%–4.2% additional cars (without any false positives) in a large enterprise parking lot, and 9.7% additional cars in a 5G smart traffic intersection scenario, which enables a new use case of accurate and reliable automatic vehicle collision prediction (AVCP). CamTuner opens the door to significantly enhancing video analytics accuracy beyond the incremental improvements obtained from refining deep-learning models.
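At CamTuner's core is a SARSA update, which is on-policy: it bootstraps from the action actually taken next, rather than Q-learning's max. The self-contained toy below tunes a single NAUTO parameter against a synthetic quality reward peaked at an optimum unknown to the agent; the real system replaces this reward with its analytics quality estimator and trains against the virtual camera. All levels, rates, and the reward shape are our assumptions.

```python
# Toy SARSA loop for one discretized camera parameter (illustrative).
import random

LEVELS, ACTIONS = range(0, 11), (-1, 0, +1)   # parameter levels; moves
Q = {(s, a): 0.0 for s in LEVELS for a in ACTIONS}
alpha, gamma, eps = 0.2, 0.9, 0.1

def reward(level):                    # synthetic quality, peaked at 7
    return 1.0 - abs(level - 7) / 10.0

def policy(s):                        # epsilon-greedy over Q
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

s = 2
a = policy(s)
for _ in range(2000):
    s2 = min(max(s + a, 0), 10)       # apply the parameter change
    r = reward(s2)
    a2 = policy(s2)                   # on-policy: pick next action first...
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])  # ...then update
    s, a = s2, a2

best = max(LEVELS, key=lambda s: max(Q[(s, a)] for a in ACTIONS))
print(best)                           # typically a level near 7
```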

Semi-supervised Identification and Mapping of Water Accumulation Extent using Street-level Monitoring Videos

Urban flooding is becoming a common and devastating hazard that causes loss of life and economic damage. Monitoring and understanding urban flooding at a highly localized scale is a challenging task due to the complicated urban landscape, the intricate hydraulic processes involved, and the lack of high-quality, high-resolution data. Emerging smart-city technologies such as monitoring cameras provide an unprecedented opportunity to address this data issue. However, estimating water ponding extents on land surfaces from monitoring footage is unreliable with traditional segmentation techniques, because the boundary of the water ponding, under the influence of varying weather, background, and illumination, is usually too fuzzy to identify, and the oblique angle and image distortion in video monitoring data prevent georeferencing and object-based measurements. This paper presents a novel semi-supervised segmentation scheme for recognizing surface water extent from the footage of an oblique monitoring camera. The semi-supervised segmentation algorithm was found suitable for determining the water boundary, and the monoplotting method was successfully applied to georeference the pixels of the monitoring video for virtual quantification of the local drainage process. Correlation and mechanism-based analyses demonstrate the value of the proposed method in advancing the understanding of local drainage hydraulics. The workflow and methods created in this study hold great potential for studying other street-level and earth-surface processes.
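To illustrate the georeferencing step in simplified form: on a locally flat street surface, a homography fitted to a few surveyed ground control points maps water-labeled pixels to map coordinates, from which ponding extent can be measured in metric units. The paper uses monoplotting with a full camera model and terrain data; this flat-ground homography is a hedged stand-in, and the point values below are invented for the example.

```python
# Flat-ground georeferencing sketch: fit a homography from pixel/world
# control points (DLT), then map segmented water pixels to meters.
import numpy as np

def fit_homography(px, world):
    """Direct linear transform from >= 4 point pairs, both (N, 2)."""
    A = []
    for (u, v), (x, y) in zip(px, world):
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    _, _, Vt = np.linalg.svd(np.array(A, dtype=float))
    return Vt[-1].reshape(3, 3)        # null-space vector as 3x3 H

def to_world(H, pts_px):
    pts = np.c_[pts_px, np.ones(len(pts_px))] @ H.T
    return pts[:, :2] / pts[:, 2:3]    # perspective divide

# Four surveyed control points: image pixels -> meters in a local grid.
px = np.array([[100, 400], [540, 410], [520, 200], [130, 190]])
world = np.array([[0, 0], [8, 0], [8, 12], [0, 12]])
H = fit_homography(px, world)
# Map one water-labeled pixel to street coordinates (meters); ponding
# area can then be estimated by rasterizing mapped pixels onto a grid.
print(to_world(H, np.array([[320, 300]])))
```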

The Trade-off between Scanning Beam Penetration and Transmission Beam Gain in mmWave Beam Alignment

Beam search algorithms have been proposed to align the beams from an access point to a user equipment. The process relies on sending beams from a set of scanning beams (SBs) and tailoring a transmission beam (TB) using the received feedback. In this paper, we discuss a fundamental trade-off between the gains of SBs and TBs: the higher the gain of an SB, the better its penetration; the higher the gain of the TB, the better the communication-link performance. However, the TB depends on the set of SBs, and by increasing the coverage of each SB, and in turn reducing its penetration, there is more opportunity to find a sharper TB with higher beamforming gain. We define a quantitative measure of this trade-off in the form of a trade-off curve. We introduce an SB set design, named Tulip design, and formally prove that it achieves this fundamental trade-off curve for channels with a single dominant path. We also find closed-form solutions for the trade-off curve in special cases, and we provide an algorithm, together with performance evaluation results, to find the trade-off curve, revealing the need for further optimization of the SB sets used in state-of-the-art beam search algorithms.
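A back-of-envelope way to see the tension, under an idealized flat-top beam model; this is our illustration, not the paper's formal trade-off measure.

```latex
% For an idealized flat-top beam covering solid angle \Omega,
% directivity obeys
\[
  G \;\approx\; \frac{4\pi}{\Omega},
\]
% so $K$ equal scanning beams tiling a sector $\Omega_{\mathrm{sec}}$ have
\[
  G_{\mathrm{SB}} \;\approx\; \frac{4\pi K}{\Omega_{\mathrm{sec}}},
\]
% while the transmission beam, tailored to the post-feedback uncertainty
% region $\Omega_{\mathrm{TB}} \le \Omega_{\mathrm{SB}}$, attains
\[
  G_{\mathrm{TB}} \;\approx\; \frac{4\pi}{\Omega_{\mathrm{TB}}}.
\]
% Wider SBs (smaller K) lower SB penetration but, when the feedback is
% informative, can shrink $\Omega_{\mathrm{TB}}$ and raise the TB gain;
% the trade-off curve characterizes the attainable
% $(G_{\mathrm{SB}}, G_{\mathrm{TB}})$ pairs.
```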

Exploiting Unlabeled Data with Vision and Language Models for Object Detection

Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets. However, it is prohibitively costly to acquire annotations for thousands of categories at a large scale. We propose a novel method that leverages the rich semantics available in recent vision and language models to localize and classify objects in unlabeled images, effectively generating pseudo labels for object detection. Starting with a generic and class-agnostic region proposal mechanism, we use vision and language models to categorize each region of an image into any object category required for downstream tasks. We demonstrate the value of the generated pseudo labels in two specific tasks: open-vocabulary detection, where a model needs to generalize to unseen object categories, and semi-supervised object detection, where additional unlabeled images can be used to improve the model. Our empirical evaluation shows the effectiveness of the pseudo labels in both tasks, where we outperform competitive baselines and achieve a new state of the art for open-vocabulary object detection. Our code is available at https://github.com/xiaofeng94/VL-PLM.
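A hedged sketch of the pseudo-labeling recipe described above: the proposal, cropping, and embedding helpers are hypothetical stand-ins for a region proposal network and a CLIP-style vision-language model, and the actual VL-PLM pipeline in the repository additionally fuses RPN scores and refines boxes.

```python
# Pseudo-label unlabeled images by scoring class-agnostic proposals
# against text embeddings of category names. `propose_regions`, `crop`,
# `embed_image`, and `embed_texts` are hypothetical stand-ins.
import numpy as np

def pseudo_label(image, class_names, embed_image, embed_texts,
                 propose_regions, crop, thresh=0.8, temp=0.01):
    # Unit-norm text embeddings, one per candidate category.
    text_feats = embed_texts([f"a photo of a {c}" for c in class_names])
    labels = []
    for box in propose_regions(image):            # class-agnostic proposals
        feat = embed_image(crop(image, box))      # unit-norm image embedding
        logits = (text_feats @ feat) / temp       # scaled cosine similarities
        logits -= logits.max()                    # numerically stable softmax
        probs = np.exp(logits) / np.exp(logits).sum()
        k = int(np.argmax(probs))
        if probs[k] >= thresh:                    # keep confident regions only
            labels.append((box, class_names[k], float(probs[k])))
    return labels
```

The returned (box, class, score) triples can then be mixed with ground-truth annotations to train a standard detector in the open-vocabulary or semi-supervised setting.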