On Synthesizing Data for Context Attribution in Question Answering

Question Answering (QA) accounts for a significant portion of LLM usage “in the wild”. However, LLMs sometimes produce false or misleading responses, also known as hallucinations. Therefore, grounding the generated answers in contextually provided information (i.e., providing evidence for the generated text) is paramount for LLMs’ trustworthiness. Providing this information is the task of context attribution. In this paper, we systematically study LLM-based approaches for this task; namely, we investigate (i) zero-shot inference, (ii) LLM ensembling, and (iii) fine-tuning of small LMs on synthetic data generated by larger LLMs. Our key contribution is SYNQA: a novel generative strategy for synthesizing context attribution data. Given selected context sentences, an LLM generates QA pairs that are supported by these sentences. This leverages LLMs’ natural strengths in text generation while ensuring clear attribution paths in the synthetic training data. We show that the attribution data synthesized via SYNQA is highly effective for fine-tuning small LMs for context attribution in different QA tasks and domains. Finally, with a user study, we validate the usefulness of small, efficient LMs (fine-tuned on synthetic data from SYNQA) in context attribution for QA.
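
The data-synthesis idea in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the prompt wording, the random sentence selection, and the `fake_llm` callable that stands in for a real LLM are hypothetical and not taken from the paper. What it shows is that, because the supporting sentences are chosen before the QA pair is generated, the attribution labels are known by construction.

```python
import random
from typing import Callable, Dict, List


def build_prompt(selected: List[str]) -> str:
    """Hypothetical prompt asking for a QA pair grounded in the given sentences."""
    numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(selected))
    return (
        "Write one question and its answer that are supported only by "
        "these sentences:\n" + numbered
    )


def synthesize_example(
    context: List[str],
    generate: Callable[[str], Dict[str, str]],
    k: int = 2,
    seed: int = 0,
) -> Dict[str, object]:
    """Pick k context sentences first, then ask the generator for a QA pair.

    The selected indices double as attribution labels, because they are
    fixed before any text is generated.
    """
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(context)), k))
    qa = generate(build_prompt([context[i] for i in idx]))
    return {
        "context": context,
        "question": qa["question"],
        "answer": qa["answer"],
        "attribution": idx,
    }


def fake_llm(prompt: str) -> Dict[str, str]:
    """Stand-in for a real LLM call (assumption: returns a question/answer dict)."""
    return {"question": "placeholder question", "answer": "placeholder answer"}


example = synthesize_example(
    ["Sentence A.", "Sentence B.", "Sentence C.", "Sentence D."], fake_llm
)
```

Records of this shape could then serve as supervised examples for a small attribution model. How the paper actually selects sentences and prompts the LLM is not stated in the abstract.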

Feasibility study on scour monitoring for subsea cables of offshore wind turbines using distributed fiber optic sensors

Subsea cables are critical components of offshore wind turbines and are subjected to scour. Monitoring the scour conditions of subsea cables plays a significant role in improving safety and operation efficiency and reducing the levelized cost of electricity. This paper presents a feasibility study on monitoring subsea cables using distributed fiber optic sensors (DFOS), aiming to evaluate the technical and economic performance of utilizing DFOS to detect, locate, and quantify scour conditions. Laboratory experiments were conducted to test the response of DFOS measurements to changes in support conditions, which were used to simulate scour effects, and a finite element model was developed to investigate the impact of scour on the mechanical responses of subsea cables in different scour scenarios. An economic analysis of three methods, involving the use of DFOS, discrete sensors, and underwater robots, was performed via a case study. The results showed that the proposed method has technical and economic benefits for monitoring subsea cables. This research offers insights into monitoring subsea structures for offshore wind turbines.

Group Relative Augmentation for Data Efficient Action Detection

Adapting large Video-Language Models (VLMs) for action detection using only a few examples poses challenges like overfitting and the granularity mismatch between scene-level pre-training and required person-centric understanding. We propose an efficient adaptation strategy combining parameter-efficient tuning (LoRA) with a novel learnable internal feature augmentation. Applied within the frozen VLM backbone using FiLM, these augmentations generate diverse feature variations directly relevant to the task. Additionally, we introduce a group-weighted loss function that dynamically modulates the training contribution of each augmented sample based on its prediction divergence relative to the group average. This promotes robust learning by prioritizing informative yet reasonable augmentations. We demonstrate our method’s effectiveness on complex multi-label, multi-person action detection datasets (AVA, MOMA), achieving strong mAP performance and showcasing significant data efficiency for adapting VLMs from limited examples.

Uncertainty Propagation on LLM Agent

Large language models (LLMs) integrated into multi-step agent systems enable complex decision-making processes across various applications. However, their outputs often lack reliability, making uncertainty estimation crucial. Existing uncertainty estimation methods primarily focus on final-step outputs, which fail to account for cumulative uncertainty over the multi-step decision-making process and the dynamic interactions between agents and their environments. To address these limitations, we propose SAUP (Situation Awareness Uncertainty Propagation), a novel framework that propagates uncertainty through each step of an LLM-based agent’s reasoning process. SAUP incorporates situational awareness by assigning situational weights to each step’s uncertainty during the propagation. Our method, compatible with various one-step uncertainty estimation techniques, provides a comprehensive and accurate uncertainty measure. Extensive experiments on benchmark datasets demonstrate that SAUP significantly outperforms existing state-of-the-art methods, achieving up to 20% improvement in AUROC.
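
To make the idea of weighting per-step uncertainty concrete, here is a minimal Python sketch. The abstract does not give SAUP's aggregation formula or say how situational weights are computed, so this uses a normalized weighted average purely as a placeholder. The function name and arguments are hypothetical.

```python
from typing import Sequence


def propagate_uncertainty(
    step_uncertainties: Sequence[float],
    situational_weights: Sequence[float],
) -> float:
    """Aggregate per-step uncertainties into one trajectory-level score.

    Assumption: a normalized weighted average. This is an illustrative
    stand-in, not the aggregation rule from the paper.
    """
    if len(step_uncertainties) != len(situational_weights):
        raise ValueError("need exactly one weight per step")
    if not step_uncertainties:
        raise ValueError("need at least one step")
    total_weight = sum(situational_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(u * w for u, w in zip(step_uncertainties, situational_weights))
    return weighted / total_weight


# A two-step trajectory where the second step is weighted three times as heavily.
score = propagate_uncertainty([0.2, 0.4], [1.0, 3.0])
```

Unlike a final-step-only estimate, which would use just the last value, the aggregate reflects every step, with the weights controlling how much each one contributes.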

Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery

Causal discovery is an imperative foundation for decision-making across domains, such as smart health, AI for drug discovery, and AIOps. Traditional statistical causal discovery methods, while well-established, predominantly rely on observational data and often overlook the semantic cues inherent in cause-and-effect relationships. The advent of Large Language Models (LLMs) has ushered in an affordable way of leveraging the semantic cues for knowledge-driven causal discovery, but the development of LLMs for causal discovery lags behind other areas, particularly in the exploration of multimodal data. To bridge the gap, we introduce MATMCD, a multi-agent system powered by tool-augmented LLMs. MATMCD has two key agents: a Data Augmentation agent that retrieves and processes modality-augmented data, and a Causal Constraint agent that integrates multi-modal data for knowledge-driven reasoning. The proposed design of the inner workings ensures successful cooperation of the agents. Our empirical study across seven datasets suggests the significant potential of multi-modality enhanced causal discovery.

Integration of Fiber Optic Sensing and Sparse Grid Sensors for Accurate Fault Localization in Distribution Systems

Fault localization in power distribution networks is essential for rapid recovery and enhancing system resilience. While Phasor Measurement Units (PMUs or μPMUs) provide high-resolution measurements for precise fault localization, their widespread deployment is cost-prohibitive. Distributed Fiber Optic Sensing (DFOS) offers a promising alternative for event detection along power lines using collocated optical fiber; however, it cannot independently differentiate between events and pinpoint exact fault locations. This paper introduces an innovative framework that combines DFOS with sparsely deployed PMUs for accurate fault localization. The proposed approach first utilizes a Graph Attention Network (GAT) model to capture spatial and temporal correlations from synchronized PMU and DFOS measurements, effectively identifying fault zones. High-spatial-resolution DFOS measurements further refine the fault location within the identified zone. Singular Value Decomposition (SVD) is applied to extract feature vectors from DFOS measurements, enhancing the convergence speed of the GAT model. This integrated solution significantly improves localization accuracy while minimizing reliance on extensive deployment of PMUs.

EcoDoc: A Cost-Efficient Multimodal Document Processing System for Enterprises Using LLMs

Enterprises are increasingly adopting Generative AI applications to extract insights from large volumes of multimodal documents in domains such as finance, law, healthcare, and industry. These documents contain structured and unstructured data (images, charts, handwritten texts, etc.) requiring robust AI systems for effective retrieval and comprehension. Recent advancements in Retrieval-Augmented Generation (RAG) frameworks and Vision-Language Models (VLMs) have improved retrieval performance on multimodal documents by processing pages as images. However, large-scale deployment remains challenging due to the high cost of LLM API usage and the slower inference speed of image-based processing of pages compared to text-based processing. To address these challenges, we propose EcoDoc, a cost-effective multimodal document processing system that dynamically selects the processing modalities for each page as an image or text based on page characteristics and query intent. Our experimental evaluation on TAT-DQA and DocVQA benchmarks shows that EcoDoc reduces average query processing latency by up to 2.29× and cost by up to 10×, without compromising accuracy.
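
The per-page modality selection described above can be sketched as a small rule-based router in Python. The abstract does not say which page characteristics EcoDoc uses or how it determines query intent, so the fields, thresholds, and keyword list below are hypothetical placeholders chosen only to show the shape of such a decision.

```python
from dataclasses import dataclass


@dataclass
class Page:
    text_chars: int          # characters recovered by text extraction
    image_area_ratio: float  # fraction of the page covered by images (0 to 1)
    has_handwriting: bool    # whether handwriting was detected on the page


# Hypothetical keywords suggesting the query is about visual content.
VISUAL_HINTS = ("chart", "figure", "diagram", "signature", "handwritten")


def query_needs_visuals(query: str) -> bool:
    """Crude keyword-based proxy for query intent (an assumption)."""
    q = query.lower()
    return any(hint in q for hint in VISUAL_HINTS)


def choose_modality(page: Page, query: str) -> str:
    """Return "image" or "text" for one page, given the query.

    Assumed rules: pages whose text cannot be extracted reliably go to the
    image path; image-heavy pages go there only when the query asks about
    visual content; everything else takes the cheaper text path.
    """
    if page.has_handwriting or page.text_chars < 200:
        return "image"
    if page.image_area_ratio > 0.3 and query_needs_visuals(query):
        return "image"
    return "text"
```

For example, a text-dense page queried about a revenue figure would be routed to the text path, while the same page with a large chart and a query mentioning "chart" would be routed to the image path.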

National Intern Day at NEC Laboratories America: Celebrating the Next Generation of Innovators

On National Intern Day, NEC Laboratories America celebrates the bright minds shaping tomorrow’s technology. Each summer, interns from top universities work side-by-side with our researchers on real-world challenges in AI, cybersecurity, data science, and more. From groundbreaking research to team-building events, our interns contribute fresh ideas and bold thinking that power NEC’s innovation engine.

XPF: Agentic AI System for Business Workflow Automation

In this paper, we propose a novel agentic AI system called XPF, which enables users to create “agents” using just natural language, where each agent is capable of executing complex, real-world business workflows in an accurate and reliable manner. XPF provides an interface to develop and iterate over the agent creation process and then deploy the agent in production when satisfactory results are produced consistently. The key components of XPF are: (a) a planner, which leverages an LLM to generate a step-by-step plan that can be further edited by a human; (b) a compiler, which leverages an LLM to compile the plan into a flow graph; (c) an executor, which handles distributed execution of the flow graph (using LLMs, tools, RAG, etc.) on an underlying cluster; and (d) a verifier, which helps verify the output (through human-generated tests or tests auto-generated by an LLM). We develop five different agents using XPF and conduct experiments to evaluate one particular aspect, i.e., the difference in accuracy and reliability of the five agents with “human-generated” vs. “auto-generated” plans. Our experiments show that we get a much more accurate and reliable response for a business workflow when step-by-step instructions (in natural language) are given by a human familiar with the workflow, rather than letting the LLM figure out the execution plan steps. In particular, we observe that a “human-generated” plan almost always gives 100% accuracy, whereas an “auto-generated” plan almost never gives 100% accuracy. In terms of reliability, we observe through ROUGE-L, BLEU, and METEOR scores that the output from a “human-generated” plan is much more reliable than that from an “auto-generated” plan.
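
The plan, compile, execute, verify flow described above can be sketched in a few lines of Python. This is an illustration under strong assumptions, not XPF's implementation: plain functions replace the LLM calls, the flow graph is a simple dependency dictionary in which each step depends only on the previous one, execution is local rather than distributed, and all names (`compile_plan`, `execute`, `verify`) are hypothetical. It relies on the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set

State = Dict[str, object]


def compile_plan(steps: List[str]) -> Dict[str, Set[str]]:
    """Turn an ordered plan into a flow graph: step -> set of prerequisites.

    Assumption: a linear chain with unique step names.
    """
    graph: Dict[str, Set[str]] = {}
    for i, step in enumerate(steps):
        graph[step] = {steps[i - 1]} if i > 0 else set()
    return graph


def execute(
    graph: Dict[str, Set[str]],
    actions: Dict[str, Callable[[State], State]],
    state: State,
) -> State:
    """Run each step's action in dependency order, threading the state through."""
    for step in TopologicalSorter(graph).static_order():
        state = actions[step](state)
    return state


def verify(output: State, tests: List[Callable[[State], bool]]) -> bool:
    """Check the final output against a list of test predicates."""
    return all(test(output) for test in tests)


# A human-written plan with one toy action per step.
plan = ["load invoice", "extract total", "apply tax"]
actions = {
    "load invoice": lambda s: {**s, "lines": [40.0, 60.0]},
    "extract total": lambda s: {**s, "total": sum(s["lines"])},
    "apply tax": lambda s: {**s, "total_with_tax": round(s["total"] * 1.1, 2)},
}
result = execute(compile_plan(plan), actions, {})
ok = verify(result, [lambda out: out["total_with_tax"] == 110.0])
```

Keeping the plan as an explicit, editable list mirrors the abstract's point that a human can review or rewrite the steps before they are compiled and run.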

Quantitative Bounds for Length Generalization in Transformers

We provide quantitative bounds on the length of sequences required to be observed during training for a transformer to length generalize, i.e., to continue to perform well on sequences unseen during training. Our results improve on Huang et al. [8], who show that there is a finite training length beyond which length generalization is guaranteed, but who do not provide quantitative bounds on that length.