Attentive Variational Information Bottleneck for TCR–peptide interaction prediction

We present a multi-sequence generalization of Variational Information Bottleneck and call the resulting model Attentive Variational Information Bottleneck (AVIB). Our AVIB model leverages multi-head self-attention to implicitly approximate a posterior distribution over latent encodings conditioned on multiple input sequences. We apply AVIB to a fundamental immuno-oncology problem: predicting the interactions between T-cell receptors (TCRs) and peptides.ResultsExperimental results on various datasets show that AVIB significantly outperforms state-of-the-art methods for TCR–peptide interaction prediction. Additionally, we show that the latent posterior distribution learned by AVIB is particularly effective for the unsupervised detection of out-of-distribution amino acid sequences.

Towards Robust Graph Neural Networks via Adversarial Contrastive Learning

Graph Neural Network (GNN), as a powerful representation learning model on graph data, attracts much attention across various disciplines. However, recent studies show that GNN is vulnerable to adversarial attacks. How to make GNN more robust? What are the key vulnerabilities in GNN? How to address the vulnerabilities and defend GNN against the adversarial attacks? Adversarial training has shown to be effective in improving the robustness of traditional Deep Neural Networks (DNNs). However, existing adversarial training works mainly focus on the image data, which consists of continuous features, while the features and structures of graph data are often discrete. Moreover, rather than assuming each sample is independent and identically distributed as in DNN, GNN leverages the contextual information across the graph (e.g., neighborhoods of a node). Thus, existing adversarial training techniques cannot be directly applied to defend GNN. In this paper, we propose ContrastNet, an effective adversarial defense framework for GNN. In particular, we propose an adversarial contrastive learning method to train the GNN over the adversarial space. To further improve the robustness of GNN, we investigate the latent vulnerabilities in every component of a GNN encoder and propose corresponding refining strategies. Extensive experiments on three public datasets demonstrate the effectiveness of ContrastNet in improving the robustness of popular GNN variants, such as Graph Convolutional Network and GraphSage, under various types of adversarial attacks.

Deep Federated Anomaly Detection for Multivariate Time Series Data

Although many anomaly detection approaches have been developed for multivariate time series data, limited effort has been made in federated settings in which multivariate time series data are heterogeneously distributed among different edge devices while data sharing is prohibited. In this paper, we investigate the problem of federated unsupervised anomaly detection and present a Federated Exemplar-based Deep Neural Network (Fed-ExDNN) to conduct anomaly detection for multivariate time series data on different edge devices. Specifically, we first design an Exemplar-based Deep Neural network (ExDNN) for learning local time series representations based on their compatibility with an exemplar module which consists of hidden parameters learned to capture varieties of normal patterns on each edge device. Next, a constrained clustering mechanism (FedCC) is employed on the centralized server to align and aggregate the parameters of different local exemplar modules to obtain a unified global exemplar module. Finally, the global exemplar module is deployed together with a shared feature encoder to each edge device, and anomaly detection is conducted by examining the compatibility of testing data to the exemplar module. Fed-ExDNN captures local normal time series patterns with ExDNN and aggregates these patterns by FedCC, and thus can handle the heterogeneous data distributed over different edge devices simultaneously. Thoroughly empirical studies on six public datasets show that ExDNN and Fed-ExDNN can outperform state-of-the-art anomaly detection algorithms and federated learning techniques, respectively.

KGxBoard: Explainable and Interactive Leaderboard for Evaluation of Knowledge Graph Completion Models

Knowledge Graphs (KGs) store information in the form of (head, predicate, tail)-triples. To augment KGs with new knowledge, researchers proposed models for KG Completion (KGC) tasks such as link prediction, i.e., answering (h, p, ?) or (?, p, t) queries. Such models are usually evaluated with averaged metrics on a held-out test set. While useful for tracking progress, averaged single-score metrics cannotreveal what exactly a model has learned — or failed to learn. To address this issue, we propose KGxBoard: an interactive framework for performing fine-grained evaluation on meaningful subsets of the data, each of which tests individual and interpretable capabilities of a KGC model. In our experiments, we highlight the findings that we discovered with the use of KGxBoard, which would have been impossible to detect with standard averaged single-score metrics.

Personalized Federated Learning via Heterogeneous Modular Networks

Personalized Federated Learning (PFL) which collaboratively trains a federated model while considering local clients under privacy constraints has attracted much attention. Despite its popularity, it has been observed that existing PFL approaches result in sub-optimal solutions when the joint distribution among local clients diverges. To address this issue, we present Federated Modular Network (FedMN), a novel PFL approach that adaptively selects sub-modules from a module pool to assemble heterogeneous neural architectures for different clients. FedMN adopts a light-weighted routing hypernetwork to model the joint distribution on each client and produce the personalized selection of the module blocks for each client. To reduce the communication burden in existing FL, we develop an efficient way to interact between the clients and the server. We conduct extensive experiments on the real-world test beds and the results show both effectiveness and efficiency of the proposed FedMN over the baselines.

DeepGAR: Deep Graph Learning for Analogical Reasoning

Analogical reasoning is the process of discovering and mapping correspondences from a target subject to a base subject. As the most well-known computational method of analogical reasoning, Structure-Mapping Theory (SMT) abstracts both target and base subjects into relational graphs and forms the cognitive process of analogical reasoning by finding a corresponding subgraph (i.e., correspondence) in the target graph that is aligned with the base graph. However, incorporating deep learning for SMT is still under-explored due to several obstacles: 1) the combinatorial complexity of searching for the correspondence in the target graph, 2) the correspondence mining is restricted by various cognitive theory-driven constraints. To address both challenges, we propose a novel framework for Analogical Reasoning (DeepGAR) that identifies the correspondence between source and target domains by assuring cognitive theory-driven constraints. Specifically, we design a geometric constraint embedding space to induce subgraph relation from node embeddings for efficient subgraph search. Furthermore, we develop novel learning and optimization strategies that could end-to-end identify correspondences that are strictly consistent with constraints driven by the cognitive theory. Extensive experiments are conducted on synthetic and real-world datasets to demonstrate the effectiveness of the proposed DeepGAR over existing methods. The code and data are available at: https://github.com/triplej0079/DeepGAR.

Using Global Fiber Networks for Environmental Sensing

We review recent advances in distributed fiber optic sensing (DFOS) and their applications. The scattering mechanisms in glass, which are exploited for reflectometry-based DFOS, are Rayleigh, Brillouin, and Raman scatterings. These are sensitive to either strain and/or temperature, allowing optical fiber cables to monitor their ambient environment in addition to their conventional role as a medium for telecommunications. Recently, DFOS leveraged technologies developed for telecommunications, such as coherent detection, digital signal processing, coding, and spatial/frequency diversity, to achieve improved performance in terms of measurand resolution, reach, spatial resolution, and bandwidth. We review the theory and architecture of commonly used DFOS methods. We provide recent experimental and field trial results where DFOS was used in wide-ranging applications, such as geohazard monitoring, seismic monitoring, traffic monitoring, and infrastructure health monitoring. Events of interest often have unique signatures either in the spatial, temporal, frequency, or wavenumber domains. Based on the temperature and strain raw data obtained from DFOS, downstream postprocessing allows the detection, classification, and localization of events. Combining DFOS with machine learning methods, it is possible to realize complete sensor systems that are compact, low cost, and can operate in harsh environments and difficult-to-access locations, facilitating increased public safety and smarter cities.

DataX Allocator: Dynamic resource management for stream analytics at the Edge

Serverless edge computing aims to deploy and manage applications so that developers are unaware of challenges associated with dynamic management, sharing, and maintenance of the edge infrastructure. However, this is a non-trivial task because the resource usage by various edge applications varies based on the content in their input sensor data streams. We present a novel reinforcement-learning (RL) technique to maximize the processing rates of applications by dynamically allocating resources (like CPU cores or memory) to microservices in these applications. We model applications as analytics pipelines consisting of several microservices, and a pipeline’s processing rate directly impacts the accuracy of insights from the application. In our unique problem formulation, the state space or the number of actions of RL is independent of the type of workload in the microservices, the number of microservices in a pipeline, or the number of pipelines. This enables us to learn the RL model only once and use it many times to improve the accuracy of insights for a diverse set of AI/ML engines like action recognition or face recognition and applications with varying microservices. Our experiments with real-world applications, i.e., face recognition and action recognition, show that our approach outperforms other widely-used alternative approaches and achieves up to 2.5X improvement in the overall application processing rate. Furthermore, when we apply our RL model trained on a face recognition pipeline to a different and more complex action recognition pipeline, we obtain a 2X improvement in processing rate, thus showing the versatility and robustness of our RL model to pipeline changes.

APT: Adaptive Perceptual quality based camera Tuning using reinforcement learning

Cameras are increasingly being deployed in cities, enterprises and roads world-wide to enable many applications in public safety, intelligent transportation, retail, healthcare and manufacturing. Often, after initial deployment of the cameras, the environmental conditions and the scenes around these cameras change, and our experiments show that these changes can adversely impact the accuracy of insights from video analytics. This is because the camera parameter settings, though optimal at deployment time, are not the best settings for good-quality video capture as the environmental conditions and scenes around a camera change during operation. Capturing poor-quality video adversely affects the accuracy of analytics. To mitigate the loss in accuracy of insights, we propose a novel, reinforcement-learning based system APT that dynamically, and remotely (over 5G networks), tunes the camera parameters, to ensure a high-quality video capture, which mitigates any loss in accuracy of video analytics. As a result, such tuning restores the accuracy of insights when environmental conditions or scene content change. APT uses reinforcement learning, with no-reference perceptual quality estimation as the reward function. We conducted extensive real-world experiments, where we simultaneously deployed two cameras side-by-side overlooking an enterprise parking lot (one camera only has manufacturer-suggested default setting, while the other camera is dynamically tuned by APT during operation). Our experiments demonstrated that due to dynamic tuning by APT, the analytics insights are consistently better at all times of the day: the accuracy of object detection video analytics application was improved on average by ∼ 42%. Since our reward function is independent of any analytics task, APT can be readily used for different video analytics tasks.

NEC Labs America’s Time Series Data Research Drives Space Systems Innovation

With decreasing hardware costs and increasing demand for autonomic management, many of today’s physical systems are equipped with an extensive network of sensors, generating a considerable amount of time series data daily. A highly valuable source of information, time series data is used by businesses and governments to measure and analyze change over time in complex systems. Organizations must consolidate, integrate and organize a vast amount of time series data from multiple sources to generate insights and business value.