Entries by NEC Labs America

DiCE-M: Distributed Code Generation and Execution for Marine Applications – An Edge-Cloud Approach

Edge computing has emerged as a transformative technology that reduces application latency, improves cost efficiency, enhances security, and enables large-scale deployment of applications across various domains. In environmental monitoring, systems such as MegaSense[49], use low-cost sensors to gather and process real-time air quality data through edge-cloud collaboration, highlighting the critical role of edge computing in enabling scalable, efficient solutions. Similarly, marine science increasingly requires real-time processing and analysis of marine data from remote, resource-constrained environments. In this paper, we extend the power of edge computing by integrating it with Generative Artificial Intelligence(GenAI),specifically large language models (LLMs), to address challenges in marine science applications. We propose DiCE-M (Distributed Code generation and Execution for Marine applications), a robust system that uses LLM to generate distributed code for marine applications and then utilizes a runtime to efficiently execute it on an edge+cloud computing infrastructure. Specifically, DiCE-M leverages edge computing to execute lightweight AI models locally on unmanned surface vehicles(USVs)while offloading complex tasks to the cloud, thus balancing computational load and enabling realtime monitoring in marine environments. We use marine litter identification as an example application to demonstrate the utility of DiCE-M. Our results show that DiCE-M reduces latency by more than 2X when marine litter is not detected and cuts cloud computing costs by more than half compared to traditional cloud-based approaches. By selectively cropping and transmitting relevant image portions, DiCE-M further improves bandwidth efficiency, making it a reliable and cost-effective solution for deploying AI-driven applications on resource-constrained USVs in dynamic marine environments.

Understanding Transcriptional Regulatory Redundancy by Learnable Global Subset Perturbations

Transcriptional regulation through cis-regulatory elements (CREs) is crucial for numerous biological functions, with its disruption potentially leading to various diseases. It is well-known that these CREs often exhibit redundancy, allowing them to compensate for each other in response to external disturbances, highlighting the need for methods to identify CRE sets that collaboratively regulate gene expression effectively. To address this, we introduce GRIDS, an in silico computational method that approaches the task as a global feature explanation challenge to dissect combinatorial CRE effects in two phases. First, GRIDS constructs a differentiable surrogate function to mirror the complex gene regulatory process, facilitating cross-translations in single-cell modalities. It then employs learnable perturbations within a state transition framework to offer global explanations, efficiently navigating the combinatorial feature landscape. Through comprehensive bench marks, GRIDS demonstrates superior explanatory capabilities compared to other leading methods. Moreover, GRIDS s global explanations reveal intricate regulatory redundancy across cell types and states, underscoring its potential to advance our understanding ofcellular regulation in biological research.

NEC Labs America Team Attending NeurIPS24 in Vancouver

NEC Labs America is proud to attend NeurIPS 2024 in Vancouver, Canada from December 10-15. Zachary Izzo will present Subgroup Discovery with the Cox Model, Shaobo Han will present VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks and Jonathan Warrell will present Discrete-Continuous Variational Optimization with Local Gradients.

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection

Action detection aims to detect (recognize and localize) human actions spatially and temporally in videos. Existing approaches focus on the closed-set setting where an action detector is trained and tested on videos from a fixed set of action categories. However, this constrained setting is not viable in an open world where test videos inevitably come beyond the trained action categories. In this paper, we address the practical yet challenging Open-Vocabulary Action Detection (OVAD) problem. It aims to detect any action in test videos while training a model on a fixed set of action categories. To achieve such an open-vocabulary capability, we propose a novel method OpenMixer that exploits the inherent semantics and localizability of large vision-language models (VLM) within the family of query-based detection transformers (DETR). Specifically, the OpenMixer is developed by spatial and temporal OpenMixer blocks (S-OMB and T-OMB), and a dynamically fused alignment (DFA) module. The three components collectively enjoy the merits of strong generalization from pre-trained VLMs and end-to-end learning from DETR design. Moreover, we established OVAD benchmarks under various settings, and the experimental results show that the OpenMixer performs the best over baselines for detecting seen and unseen actions.

Matching Confidences and Softened Target Occurrences for Calibration

The problem of calibrating deep neural networks (DNNs) is gaining attention, as these networks are becoming central to many real-world applications. Different attempts have been made to counter the poor calibration of DNNs. Amongst others, train-time calibration methods have unfolded as an effective class for improving model calibration. Motivated by this, we propose a novel train-time calibration method that is built on a new auxiliary loss formulation, namely multiclass alignment of confidences with the gradually softened ground truth occurrences (MACSO). It is developed on the intuition that, for a class, the gradually softened ground truth occurrences distribution is a suitable non-zero entropy signal whose better alignment withthe predicted confidences distribution is positively correlated with reducing the model calibration error. In our train-time approach, besides simply aligning the two distributions, e.g., via their means or KL divergence, we propose to quantify the linear correlation between the two distributions, which preserves the relations among them, thereby further improving the calibration performance. Finally, we also reveal that MACSO posses desirable theoretical properties. Extensive results on several challenging datasets, featuring in and out-of-domain scenarios, class imbalanced problem, and a medical image classification task, validate the efficacy of our method against state-of-the-art train-time calibration methods.

Apply for a Summer 2025 Internship

Our exciting internship opportunities for this Summer 2025 are now available. We are looking for students pursuing advanced degrees in Computer Science and Electrical Engineering. Internships are typically 3 months long in duration. The benefits of working for us include the opportunity to quickly become part of a project team applying cutting-edge technology to industry-leading concepts. We have opportunities in Data Science & System Security, Integrated Systems, Media Analytics, Machine Learning, and Optical Networking & Sensing.

A Variational Graph Partitioning Approach to Modeling Protein Liquid-liquid Phase Separation

Graph neural networks (GNNs) have emerged as powerful tools for representation learning. Their efficacy depends on their having an optimal underlying graph. In many cases, the most relevant information comes from specific subgraphs. In this work, we introduce a GNN-based framework (graph-partitioned GNN [GP-GNN]) to partition the GNN graph to focus on the most relevant subgraphs. Our approach jointly learns task-dependent graph partitions and node representations, making it particularly effective when critical features reside within initially unidentified subgraphs. Protein liquid-liquid phase separation (LLPS) is a problem especially well-suited to GP-GNNs because intrinsically disordered regions (IDRs) are known to function as protein subdomains in it, playing a key role in the phase separation process. In this study, we demonstrate how GP-GNN accurately predicts LLPS by partitioning protein graphs into task-relevant subgraphs consistent with known IDRs. Our model achieves state-of-the-art accuracy in predicting LLPS and offers biological insights valuable for downstream investigation.

Variational methods for Learning Multilevel Genetic Algorithms using the Kantorovich Monad

Levels of selection and multilevel evolutionary processes are essential concepts in evolutionary theory, and yet there is a lack of common mathematical models for these core ideas. Here, we propose a unified mathematical framework for formulating and optimizing multilevel evolutionary processes and genetic algorithms over arbitrarily many levels based on concepts from category theory and population genetics. We formulate a multilevel version of the Wright-Fisher process using this approach, and we show that this model can be analyzed to clarify key features of multilevel selection. Particularly, we derive an extended multilevel probabilistic version of Price’s Equation via the Kantorovich Monad, and we use this to characterize regimes of parameter space within which selection acts antagonistically or cooperatively across levels. Finally, we show how our framework can provide a unified setting for learning genetic algorithms (GAs), and we show how we can use a Variational Optimization and a multi-level analogue of coalescent analysis to fit multilevel GAs to simulated data.

Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024

Separating disinformation from fact on the web has long challenged both the search and the reasoning powers of humans. We show that the reasoning power of large language models (LLMs) and the retrieval power of modern search engines can be combined to automate this process and explainably verify claims. We integrate LLMs and search under a multi-hop evidence pursuit strategy. This strategy generates an initial question based on an input claim using a sequence to sequence model, searches and formulates an answer to the question, and iteratively generates follow-up questions to pursue the evidence that is missing using an LLM. We demonstrate our system on the FEVER 2024 (AVeriTeC) shared task. Compared to a strategy of generating all the questions at once, our method obtains .045 higher label accuracy and .155 higher AVeriTeC score (evaluating the adequacy of the evidence). Through ablations, we show the importance of various design choices, such as the question generation method, medium-sized context, reasoning with one document at a time, adding metadata, paraphrasing, reducing the problem to two classes, and reconsidering the final verdict. Our submitted system achieves .510 AVeriTeC score on the dev set and .477 AVeriTec score on the test set.

Large Language Models Can Be Contextual Privacy Protection Learners

The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Direct fine-tuning LLMs on this data without privacy protection poses a risk of data leakage of sensitive PII during inference time. To address this challenge, we introduce Contextual Privacy Protection Language Models (CPPLM), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy. Our work offers a theoretical analysis for model design and delves into various techniques such as corpus curation, penalty-based unlikelihood in training loss, and instruction-based tuning, etc. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples, stands out as a promising method, effectively protecting private data while enhancing the model s knowledge. Our work underscores the potential for Large Language Models as robust contextual privacy protection learners.