machine learning Archives | NEC Labs America

Our Machine Learning team has been at the forefront of machine learning developments, including deep learning, support vector machines, and semantic analysis, for over a decade. We develop innovative technologies integrated into NEC’s products and services. Machine learning is the critical technology for data analytics and artificial intelligence. Recent progress in this field opens opportunities for various new applications.

Deep learning will maintain prominence with more robust model architectures, training methods, and optimization techniques. Enhanced interpretability and explainability will be imperative, especially for AI systems in critical domains like healthcare and finance. Addressing bias and ensuring fairness in AI algorithms will be a top priority, leading to the development of tools and guidelines for ethical AI. Federated learning, quantum computing’s potential impact, and the growth of edge computing will diversify ML applications.

Natural language processing will continue to advance, driving progress in conversational AI, while healthcare, finance, education, and creative industries will witness profound AI integration. As quantum computing matures, it could revolutionize machine learning, while edge computing and federated learning will expand AI’s reach across various domains. Our machine learning research will produce innovation across industries, including more accurate medical diagnoses, safer autonomous systems, and efficient energy use while enabling personalized education and AI-generated creativity.

Read news and publications from the world-class researchers in our Machine Learning department.

Posts

On Synthesizing Data for Context Attribution in Question Answering

April 7, 2025/in Publications/by NEC Labs America

Question Answering (QA) accounts for a significant portion of LLM usage “in the wild”. However, LLMs sometimes produce false or misleading responses, also known as “hallucinations”. Therefore, grounding the generated answers in contextually provided information — i.e., providing evidence for the generated text — is paramount for LLMs’ trustworthiness. Providing this information is the task of context attribution. In this paper, we systematically study LLM-based approaches for this task, namely we investigate (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning of small LMs on synthetic data generated by larger LLMs. Our key contribution is SynQA: a novel generative strategy for synthesizing context attribution data. Given selected context sentences, an LLM generates QA pairs that are supported by these sentences. This leverages LLMs’ natural strengths in text generation while ensuring clear attribution paths in the synthetic training data. We show that the attribution data synthesized via SynQA is highly effective for fine-tuning small LMs for context attribution in different QA tasks and domains. Finally, with a user study, we validate the usefulness of small LMs (fine-tuned on synthetic data from SynQA) in context attribution for QA.

Enhancing EDFAs Greybox Modeling in Optical Multiplex Sections Using Few-Shot Learning

April 3, 2025/in Publications/by NEC Labs America

We combine few-shot learning and grey-box modeling for EDFAs in optical lines, training a single EDFA model on 500 spectral loads and transferring it to other EDFAs using 4-8 samples, maintaining low OSNR prediction error.

A Smart Sensing Grid for Road Traffic Detection Using Terrestrial Optical Networks and Attention-Enhanced Bi-LSTM

March 31, 2025/in Publications/by NEC Labs America

We demonstrate the use of existing terrestrial optical networks as a smart sensing grid, employing a bidirectional long short-term memory (Bi-LSTM) model enhanced with an attention mechanism to detect road vehicles. The main idea of our approach is to deploy a fast, accurate, and reliable trained deep learning model in each network element that constantly monitors the state of polarization (SOP) of data signals traveling through the optical line system (OLS). This deployment approach enables the creation of a smart sensing grid that can continuously monitor wide areas and respond with notifications/alerts for road traffic situations. The model is trained on a synthetic dataset and tested on a real dataset obtained from a deployed metropolitan fiber cable in the city of Turin. Our model achieves 99% accuracy on both the synthetic and real datasets.

Attribute-Centric Compositional Text-to-Image Generation

March 13, 2025/in Publications/by NEC Labs America

Despite the recent impressive breakthroughs in text-to-image generation, generative models have difficulty capturing the data distribution of underrepresented attribute compositions while over-memorizing overrepresented attribute compositions, which raises public concerns about their robustness and fairness. To tackle this challenge, we propose ACTIG, an attribute-centric compositional text-to-image generation framework. We present an attribute-centric feature augmentation and a novel image-free training scheme, which greatly improve the model’s ability to generate images with underrepresented attributes. We further propose an attribute-centric contrastive loss to avoid overfitting to overrepresented attribute compositions. We validate our framework on the CelebA-HQ and CUB datasets. Extensive experiments show that the compositional generalization of ACTIG is outstanding, and our framework outperforms previous works in terms of image quality and text-image consistency.

Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation

March 4, 2025/in Publications/by NEC Labs America

We consider the conditional generation of 3D drug-like molecules with explicit control over molecular properties, such as drug-like properties (e.g., Quantitative Estimate of Druglikeness or Synthetic Accessibility score) and effective binding to specific protein sites. To tackle this problem, we propose an E(3)-equivariant Wasserstein autoencoder and factorize the latent space of our generative model into two disentangled aspects: molecular properties and the remaining structural context of 3D molecules. Our model ensures explicit control over these molecular attributes while maintaining equivariance of coordinate representation and invariance of data likelihood. Furthermore, we introduce a novel alignment-based coordinate loss to adapt equivariant networks for auto-regressive de novo 3D molecule generation from scratch. Extensive experiments validate our model’s effectiveness on property-guided and context-guided molecule generation, both for de novo 3D molecule design and structure-based drug discovery against protein targets.

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection

March 4, 2025/in Publications/by NEC Labs America

Action detection aims to detect (recognize and localize) human actions spatially and temporally in videos. Existing approaches focus on the closed-set setting, where an action detector is trained and tested on videos from a fixed set of action categories. However, this constrained setting is not viable in an open world, where test videos inevitably contain actions beyond the trained action categories. In this paper, we address the practical yet challenging Open-Vocabulary Action Detection (OVAD) problem. It aims to detect any action in test videos while training a model on a fixed set of action categories. To achieve such an open-vocabulary capability, we propose a novel method, OpenMixer, that exploits the inherent semantics and localizability of large vision-language models (VLM) within the family of query-based detection transformers (DETR). Specifically, OpenMixer is built from spatial and temporal OpenMixer blocks (S-OMB and T-OMB) and a dynamically fused alignment (DFA) module. The three components collectively enjoy the merits of strong generalization from pre-trained VLMs and end-to-end learning from the DETR design. Moreover, we establish OVAD benchmarks under various settings, and the experimental results show that OpenMixer outperforms the baselines in detecting both seen and unseen actions.

Domain-Guided Weight Modulation for Semi-Supervised Domain Generalization

March 3, 2025/in Publications/by NEC Labs America

Deep learning models capable of generalizing to unseen domain data while leveraging a few labels are unarguably of great practical significance due to low developmental costs. In pursuit of this goal, we study the challenging problem of semi-supervised domain generalization (SSDG), where the goal is to learn a domain-generalizable model while using only a small fraction of labeled data and a relatively large fraction of unlabeled data. Domain generalization (DG) methods show subpar performance under the SSDG setting, whereas semi-supervised learning (SSL) methods demonstrate relatively better performance; however, they remain considerably worse than fully supervised DG methods. To handle this new but challenging problem of SSDG, we propose a novel method that facilitates the generation of accurate pseudo-labels under various domain shifts. This is accomplished by retaining, during training, the domain-level specialism in the classifier corresponding to each source domain. Specifically, we first create domain-level information vectors on the fly, which are then used to learn a domain-aware mask for modulating the classifier’s weights. We provide a mathematical interpretation for the effect of this modulation procedure on both pseudo-labeling and model training. Our method is plug-and-play and can be readily applied to different SSL baselines for SSDG. Extensive experiments on six challenging datasets in two different SSDG settings show that our method provides visible gains over various strong SSL-based SSDG baselines. Our code is available at github.com/DGWM.
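The weight-modulation idea in the abstract above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the use of a batch mean as the "domain-level information vector", the sigmoid gate, and the projection matrix `mask_proj` (a stand-in for parameters that would be learned) are all hypothetical choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def domain_information_vector(features):
    """Assumed summary of one source domain: the mean feature vector
    of that domain's samples in the current batch. Shape: (feat_dim,)."""
    return features.mean(axis=0)

def domain_aware_mask(domain_vec, mask_proj):
    """Map a domain vector to a per-feature gate in (0, 1).
    mask_proj (feat_dim x feat_dim) is a hypothetical learned projection."""
    return sigmoid(mask_proj @ domain_vec)

def modulated_logits(features, classifier_weights, mask):
    """Scale each column of the classifier weights by the mask, then classify.
    features: (n, feat_dim); classifier_weights: (num_classes, feat_dim)."""
    modulated = classifier_weights * mask[None, :]
    return features @ modulated.T
```

With an all-zero projection the gate is 0.5 for every feature, so the logits are exactly half of the unmodulated ones; a trained projection would instead emphasize different features for different source domains.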

Reducing Hallucinations of Medical Multimodal Large Language Models with Visual Retrieval-Augmented Generation

February 25, 2025/in Publications/by NEC Labs America

Multimodal Large Language Models (MLLMs) have shown impressive performance in vision and text tasks. However, hallucination remains a major challenge, especially in fields like healthcare where details are critical. In this work, we show how MLLMs may be enhanced to support Visual RAG (V-RAG), a retrieval-augmented generation framework that incorporates both text and visual data from retrieved images. On the MIMIC-CXR chest X-ray report generation and Multicare medical image caption generation datasets, we show that Visual RAG improves the accuracy of entity probing, which asks whether a medical entity is grounded by an image. We show that the improvements extend to both frequent and rare entities, the latter of which may have less positive training data. Downstream, we apply V-RAG with entity probing to correct hallucinations and generate more clinically accurate X-ray reports, obtaining a higher RadGraph-F1 score.

Discrete-Continuous Variational Optimization with Local Gradients

December 15, 2024/in Publications/by NEC Labs America

Variational optimization (VO) offers a general approach for handling objectives that may involve discontinuities or whose gradients are difficult to calculate. By introducing a variational distribution over the parameter space, such objectives are smoothed and rendered amenable to VO methods. However, local gradient information may be available in certain problems, and such an approach neglects it. We therefore consider a general method for incorporating local information via an augmented VO objective function to accelerate convergence and improve accuracy. We show how our augmented objective can be viewed as an instance of multilevel optimization. Finally, we show that our method can train a genetic algorithm simulator using a recursive Wasserstein distance objective.
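The smoothing step described in the abstract above can be sketched in a few lines. The example covers plain VO only and does not include the paper's augmented objective with local gradients. It assumes a Gaussian variational distribution with fixed standard deviation, a score-function gradient estimate with a mean baseline, and a hypothetical piecewise-constant test objective; all function names and hyperparameters are illustrative.

```python
import numpy as np

def step_objective(x):
    """Piecewise-constant objective: its gradient is zero almost everywhere,
    and it attains its minimum value 0 for x in [2, 3)."""
    return np.abs(np.floor(x) - 2.0)

def smoothed_grad(f, mu, sigma, n_samples, rng):
    """Score-function estimate of d/dmu E_{x ~ N(mu, sigma^2)}[f(x)].
    Uses grad_mu log N(x; mu, sigma^2) = (x - mu) / sigma^2."""
    x = rng.normal(mu, sigma, size=n_samples)
    fx = f(x)
    baseline = fx.mean()  # subtracting a baseline reduces variance
    return np.mean((fx - baseline) * (x - mu) / sigma**2)

def minimize_vo(f, mu0=-3.0, sigma=1.0, lr=0.2, steps=300,
                n_samples=2000, seed=0):
    """Gradient descent on the Gaussian-smoothed objective, over mu only."""
    rng = np.random.default_rng(seed)
    mu = mu0
    for _ in range(steps):
        mu -= lr * smoothed_grad(f, mu, sigma, n_samples, rng)
    return mu
```

Calling `minimize_vo(step_objective)` moves the mean of the variational distribution into the flat minimum region even though the raw objective offers no usable gradient, which is the property that makes the smoothed objective attractive for discontinuous problems.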

Subgroup Discovery with the Cox Model

December 15, 2024/in Publications/by NEC Labs America

We study the problem of subgroup discovery with Cox regression models and introduce a method for finding an interpretable subset of the data on which a Cox model is highly accurate. Our method relies on two technical innovations: the expected prediction entropy, a novel metric for evaluating survival models which predict a hazard function; and the conditional rank distribution, a statistical object which quantifies the deviation of an individual point from the distribution of survival times in an existing subgroup. Because the discovered subgroups are interpretable, they not only improve the predictive accuracy of the model but can also form meaningful, data-driven patient cohorts for further study in a clinical setting.

