Content-aware auto-scaling of stream processing applications on container orchestration platforms

Modern applications are designed as an interacting set of microservices, and these applications are typically deployed on container orchestration platforms like Kubernetes. Several attractive features in Kubernetes make it a popular choice for deploying applications, and automatic scaling is one such feature. The default horizontal scaling technique in Kubernetes is the Horizontal Pod Autoscaler (HPA). It scales each microservice independently while ignoring the interactions among the microservices in an application. In this paper, we show that ignoring such interactions by HPA leads to inefficient scaling, and the optimal scaling of different microservices in the application varies as the stream content changes. To automatically adapt to variations in stream content, we present a novel system called DataX AutoScaler that leverages knowledge of the entire stream processing application pipeline to efficiently auto-scale different microservices by taking into account their complex interactions. Through experiments on real-world video analytics applications, such as face recognition and pose classification, we show that DataX AutoScaler adapts to variations in stream content and achieves up to 43% improvement in overall application performance compared to a baseline system that uses HPA.
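To make the per-microservice behavior of HPA concrete, the following is a minimal sketch of the replica rule described in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), applied independently to each stage of a pipeline. The stage names and metric values are hypothetical, and the sketch omits min/max replica bounds and stabilization windows. It is not the DataX AutoScaler algorithm, which the abstract does not specify.

```python
import math

def hpa_desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count suggested by the documented HPA rule for ONE workload."""
    ratio = current_metric / target_metric
    # HPA skips scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

# Hypothetical pipeline: (current replicas, observed CPU %, target CPU %).
pipeline = {
    "decoder": (2, 90.0, 50.0),
    "face_detector": (4, 50.0, 50.0),
    "face_matcher": (3, 20.0, 50.0),
}

# Each microservice is scaled in isolation; no stage sees its neighbors' load.
decisions = {name: hpa_desired_replicas(*state) for name, state in pipeline.items()}
print(decisions)
```

Because every decision depends only on that stage's own metric, a bottleneck upstream or downstream never enters the calculation, which is the limitation the abstract points to.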

Exploring the limits of ChatGPT for Query or Aspect based Text Summarization

Text summarization has been a crucial problem in natural language processing (NLP) for several decades. It aims to condense lengthy documents into shorter versions while retaining the most critical information. Various methods have been proposed for text summarization, including extractive and abstractive summarization. The emergence of large language models (LLMs) like GPT3 and ChatGPT has recently created significant interest in using these models for text summarization tasks. Recent studies (Goyal et al., 2022; Zhang et al., 2023) have shown that LLM-generated news summaries are already on par with human-written ones. However, the performance of LLMs on more practical applications like aspect- or query-based summarization is underexplored. To fill this gap, we conducted an evaluation of ChatGPT’s performance on four widely used benchmark datasets, encompassing diverse summaries from Reddit posts, news articles, dialogue meetings, and stories. Our experiments reveal that ChatGPT’s performance is comparable to traditional fine-tuning methods in terms of ROUGE scores. Moreover, we highlight some unique differences between ChatGPT-generated summaries and human references, providing valuable insights into the superpower of ChatGPT for diverse text summarization tasks. Our findings call for new directions in this area, and we plan to conduct further research to systematically examine the characteristics of ChatGPT-generated summaries through extensive human evaluation.
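For readers unfamiliar with the metric, the sketch below illustrates how a ROUGE-N score compares a generated summary with a reference through clipped n-gram overlap. It is a simplified illustration, assuming lowercase whitespace tokenization with no stemming or stopword handling, and is not the official ROUGE toolkit or the evaluation code used in the paper. The example sentences are made up.

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, F1) of clipped n-gram overlap."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1

p1, r1, f1 = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
p2, r2, f2 = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
print(round(f1, 3), round(f2, 3))
```

Overlap metrics of this kind reward shared wording rather than faithfulness or relevance to a query, which is one reason the abstract also calls for human evaluation.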

DAS over 1,007-km Hybrid Link with 10-Tb/s DP-16QAM Co-propagation using Frequency-Diverse Chirped Pulses

We report the first distributed acoustic sensing (DAS) experiment with >1,000 km reach on a hybrid link comprising a mixture of field and lab fibers with bi-directional inline Raman amplification after each span. We used 20× frequency-diversity chirped pulses for the probe signal, and recovered the Rayleigh backscatter using a coherent receiver with correlation detection and diversity combining. A measurand resolution of ∼100 pε/√Hz at a gauge length of 20 meters was achieved in the offline experiment. We also demonstrate the first real-time FPGA implementation of chirped-pulse DAS without frequency diversity over a range of 210 km.
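The correlation-detection step can be illustrated with a toy pulse-compression example: a received trace is correlated against the transmitted chirp, so each reflection collapses into a sharp peak at its delay. This is a minimal sketch under strong assumptions (a single complex baseband chirp, two ideal point reflectors, no noise, no Rayleigh fading, no frequency diversity or diversity combining). All sample counts, delays, and amplitudes are invented for illustration and do not come from the experiment.

```python
import cmath
import math

def linear_chirp(n_samples, f_start, f_stop):
    """Complex baseband linear chirp; frequencies are in cycles per sample."""
    rate = (f_stop - f_start) / n_samples
    return [cmath.exp(2j * math.pi * (f_start * k + 0.5 * rate * k * k))
            for k in range(n_samples)]

def correlate(trace, template):
    """Magnitude of the cross-correlation of trace with template at each lag."""
    m = len(template)
    return [abs(sum(trace[d + k] * template[k].conjugate() for k in range(m)))
            for d in range(len(trace) - m + 1)]

chirp = linear_chirp(64, -0.25, 0.25)

# Synthetic received trace: two delayed, scaled copies of the chirp.
trace = [0j] * 256
for delay, amplitude in ((40, 1.0), (120, 0.5)):
    for k, sample in enumerate(chirp):
        trace[delay + k] += amplitude * sample

profile = correlate(trace, chirp)
strongest = max(range(len(profile)), key=profile.__getitem__)
print(strongest, round(profile[40], 3), round(profile[120], 3))
```

The 64-sample pulse is compressed to a peak roughly as wide as the inverse of the chirp bandwidth, which is why a long, energetic chirp can still give fine spatial resolution.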

Time Series Contrastive Learning with Information-Aware Augmentations

Various contrastive learning approaches have been proposed in recent years and have achieved significant empirical success. While effective and prevalent, contrastive learning has been less explored for time series data. A key component of contrastive learning is to select appropriate augmentations, imposing some priors to construct feasible positive samples, such that an encoder can be trained to learn robust and discriminative representations. Unlike image and language domains where “desired” augmented samples can be generated with the rule of thumb guided by prefabricated human priors, the ad-hoc manual selection of time series augmentations is hindered by their diverse and human-unrecognizable temporal structures. How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question. In this work, we address the problem by encouraging both high fidelity and variety based on information theory. A theoretical analysis leads to the criteria for selecting feasible data augmentations. On top of that, we propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning. Experiments on various datasets show highly competitive performance with up to a 12.0% reduction in MSE on forecasting tasks and up to 3.7% relative improvement in accuracy on classification tasks over the leading baselines.

Adversarial Alignment for Source Free Object Detection

Source-free object detection (SFOD) aims to transfer a detector pre-trained on a label-rich source domain to an unlabeled target domain without seeing source data. While most existing SFOD methods generate pseudo labels via a source-pretrained model to guide training, these pseudo labels usually contain substantial noise due to heavy domain discrepancy. To obtain better pseudo supervision, we divide the target domain into source-similar and source-dissimilar parts and align them in the feature space by adversarial learning. Specifically, we design a detection variance-based criterion to divide the target domain. This criterion is motivated by the finding that larger detection variances denote higher recall and larger similarity to the source domain. We then incorporate an adversarial module into a mean teacher framework to make the feature spaces of these two subsets indistinguishable. Extensive experiments on multiple cross-domain object detection datasets demonstrate that our proposed method consistently outperforms the compared SFOD methods. Our implementation is available at https://github.com/ChuQiaosong

Ambient Noise based Weakly Supervised Manhole Localization Methods over Deployed Fiber Networks

We present a manhole localization method based on distributed fiber optic sensing and weakly supervised machine learning techniques. For the first time to our knowledge, ambient environment data is used for underground cable mapping, with the promise of enhancing operational efficiency and reducing field work. To effectively accommodate the weak informativeness of ambient data, a selective data sampling scheme and an attention-based deep multiple instance classification model are adopted, which require only weakly annotated data. The proposed approach is validated on field data collected by a fiber sensing system over multiple existing fiber networks.
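As an illustration of how an attention-based multiple instance model can learn from bag-level labels only, the sketch below implements one common form of attention pooling, a_k = softmax_k(w · tanh(V h_k)) and z = Σ_k a_k h_k. This specific pooling form is an assumption and the abstract does not state which variant the authors use. The instance features and the parameters V and w are hypothetical fixed numbers rather than learned values.

```python
import math

def attention_pool(instances, V, w):
    """Pool a bag of instance feature vectors into one bag-level vector."""
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Numerically stable softmax over the instances in the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return weights, pooled

# Hypothetical bag: one strongly informative instance among weak ambient ones.
bag = [[0.1, 0.0], [0.0, 0.2], [2.0, 1.5], [0.1, 0.1]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [2.0, 2.0]

weights, pooled = attention_pool(bag, V, w)
print([round(a, 3) for a in weights])
```

Only the bag carries a label during training, while the attention weights indicate which instances drove the decision, which suits data where most samples are weakly informative.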

Drone Detection and Localization using Enhanced Fiber-Optic Acoustic Sensor and Distributed Acoustic Sensing Technology

In recent years, the widespread use of drones has led to serious concerns about safety and privacy. Drone detection using microphone arrays has proven to be a promising method. However, it is challenging for microphones to serve large-scale applications due to the issues of synchronization, complexity, and data management. Moreover, distributed acoustic sensing (DAS) using optical fibers has demonstrated its advantages in monitoring vibrations over long distances but does not have the necessary sensitivity for weak airborne acoustics. In this work, we present, to the best of our knowledge, the first fiber-optic quasi-distributed acoustic sensing demonstration for drone surveillance. We develop enhanced fiber-optic acoustic sensors (FOASs) for DAS to detect drone sound. The FOAS shows an ultra-high measured sensitivity of −101.21 re. 1 rad/µPa, as well as the capability for high-fidelity speech recovery. A single DAS can interrogate a series of FOASs over a long distance via optical fiber, enabling intrinsic synchronization and centralized signal processing. We demonstrate the field test of drone detection and localization by concatenating four FOASs as DAS. Both the waveforms and spectral features of the drone sound are recognized. With acoustic field mapping and data fusion, accurate drone localization is achieved with a root-mean-square error (RMSE) of 1.47 degrees. This approach holds great potential in large-scale sound detection applications, such as drone detection or city event monitoring.

Distributed fiber optic sensing over readily available telecom fiber networks

Distributed Fiber Optic Sensing (DFOS) systems rely on measuring and analyzing different properties of the backscattered light of an optical pulse propagating along a fiber cable. DFOS systems can measure temperature, strain, vibrations, or acoustic excitations on the fiber cable and, due to their unique specifications, they have many applications and advantages over competing technologies. In this talk, we will focus on the challenges and applications of DFOS systems using outdoor-grade telecom fiber networks instead of standard indoor or some specialty fiber cables.

Binding Peptide Generation for MHC Class I Proteins with Deep Reinforcement Learning

Motivation: MHC Class I proteins play an important role in immunotherapy by presenting immunogenic peptides to anti-tumor immune cells. The repertoires of peptides for various MHC Class I proteins are distinct, which can be reflected by their diverse binding motifs. To characterize binding motifs for MHC Class I proteins, in vitro experiments have been conducted to screen peptides with high binding affinities to hundreds of given MHC Class I proteins. However, considering tens of thousands of known MHC Class I proteins, conducting in vitro experiments for extensive MHC proteins is infeasible, and thus a more efficient and scalable way to characterize binding motifs is needed. Results: We presented a de novo generation framework, coined PepPPO, to characterize binding motifs for any given MHC Class I protein by generating repertoires of peptides presented by it. PepPPO leverages a reinforcement learning agent with a mutation policy to mutate random input peptides into positive presented ones. Using PepPPO, we characterized binding motifs for around 10 000 known human MHC Class I proteins with and without experimental data, for the rapid screening of neoantigens at a much lower time cost than previous deep-learning methods.

Real-time Concealed Weapon Detection on 3D Radar Images for Walk-through Screening System

This paper presents a framework for real-time concealed weapon detection (CWD) on 3D radar images for walk-through screening systems. The walk-through screening system aims to ensure security in crowded areas by performing CWD on walking persons; hence, it requires an accurate and real-time detection approach. To ensure accuracy, a weapon needs to be detected irrespective of its 3D orientation, so we use 3D radar images as the detection input. To achieve real-time performance, we reformulate classic U-Net based segmentation networks to perform 3D detection tasks. Our 3D segmentation network predicts a peak-shaped probability map, instead of voxel-wise masks, to enable position inference by an elementary peak detection operation on the predicted map. In the peak-shaped probability map, the peak marks the weapon’s position, so the weapon detection task translates to peak detection on the probability map. A Gaussian function is used to model weapons in the probability map. We experimentally validate our approach on realistic 3D radar images obtained from a walk-through weapon screening system prototype. Extensive ablation studies verify the effectiveness of our proposed approach over existing conventional approaches. The experimental results demonstrate that our proposed approach can perform accurate and real-time CWD, making it suitable for practical applications of walk-through screening.
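The peak-shaped formulation can be sketched as follows: each weapon is modeled as a 3D Gaussian centered at its position, and detection reduces to finding local maxima above a threshold. This is a minimal illustration, assuming an isotropic Gaussian, a 26-voxel neighborhood, and a hypothetical grid size, sigma, and threshold. The Gaussian target volume stands in for the network output, since the abstract does not give the network or its post-processing details.

```python
import math
from itertools import product

def gaussian_target(shape, centers, sigma=1.5):
    """Peak-shaped map: per-voxel maximum over Gaussians placed at each center."""
    volume = {}
    for idx in product(*(range(s) for s in shape)):
        volume[idx] = max(
            math.exp(-sum((a - b) ** 2 for a, b in zip(idx, c)) / (2 * sigma ** 2))
            for c in centers
        )
    return volume

def detect_peaks(volume, threshold=0.5):
    """Return voxels above threshold that exceed all 26 neighbors."""
    offsets = [o for o in product((-1, 0, 1), repeat=3) if any(o)]
    peaks = []
    for idx, value in volume.items():
        if value < threshold:
            continue
        # Voxels outside the grid count as zero probability.
        neighbors = (volume.get(tuple(i + d for i, d in zip(idx, o)), 0.0)
                     for o in offsets)
        if all(value > n for n in neighbors):
            peaks.append(idx)
    return sorted(peaks)

# Hypothetical scene: two weapons in a 12 x 12 x 12 voxel grid.
target = gaussian_target((12, 12, 12), centers=[(3, 3, 3), (8, 7, 5)])
print(detect_peaks(target))
```

Because the post-processing is a single local-maximum scan rather than mask analysis, position inference stays cheap, which fits the real-time goal stated in the abstract.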