Apply for a Summer 2025 Internship

Our exciting internship opportunities for Summer 2025 are now available. We are looking for students pursuing advanced degrees in Computer Science and Electrical Engineering. Internships typically last 3 months. The benefits of working for us include the opportunity to quickly become part of a project team applying cutting-edge technology to industry-leading concepts. We have opportunities in Data Science & System Security, Integrated Systems, Media Analytics, Machine Learning, and Optical Networking & Sensing.

Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024

Separating disinformation from fact on the web has long challenged both the search and the reasoning powers of humans. We show that the reasoning power of large language models (LLMs) and the retrieval power of modern search engines can be combined to automate this process and explainably verify claims. We integrate LLMs and search under a multi-hop evidence pursuit strategy. This strategy generates an initial question from an input claim using a sequence-to-sequence model, searches for and formulates an answer to the question, and then uses an LLM to iteratively generate follow-up questions that pursue the missing evidence. We demonstrate our system on the FEVER 2024 (AVeriTeC) shared task. Compared to a strategy of generating all the questions at once, our method obtains .045 higher label accuracy and a .155 higher AVeriTeC score (which evaluates the adequacy of the evidence). Through ablations, we show the importance of various design choices, such as the question generation method, medium-sized context, reasoning with one document at a time, adding metadata, paraphrasing, reducing the problem to two classes, and reconsidering the final verdict. Our submitted system achieves an AVeriTeC score of .510 on the dev set and .477 on the test set.
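The multi-hop loop described above can be sketched as follows. This is a minimal illustration, not the Team Papelo system: the function names, the `max_hops` limit, the toy facts, and the two verdict labels are all assumptions, and the question generator, search step, and LLM are replaced by plain Python callables so the sketch runs on its own.

```python
from typing import Callable, List, Optional, Tuple

QA = Tuple[str, str]  # one (question, answer) evidence pair


def pursue_evidence(
    claim: str,
    first_question: Callable[[str], str],
    answer: Callable[[str], str],
    follow_up: Callable[[str, List[QA]], Optional[str]],
    verdict: Callable[[str, List[QA]], str],
    max_hops: int = 3,  # assumed hop limit, not taken from the abstract
) -> Tuple[str, List[QA]]:
    """Gather question-answer evidence one hop at a time, then label the claim."""
    evidence: List[QA] = []
    question: Optional[str] = first_question(claim)
    for _ in range(max_hops):
        if question is None:  # nothing left to ask
            break
        evidence.append((question, answer(question)))
        # The next question depends on what has been found so far.
        question = follow_up(claim, evidence)
    return verdict(claim, evidence), evidence


# Toy stand-ins (hypothetical) for the seq2seq model, search engine, and LLM.
FACTS = {"Who founded Acme?": "Jane Doe", "When was Acme founded?": "1999"}


def toy_first_question(claim: str) -> str:
    return "Who founded Acme?"


def toy_answer(question: str) -> str:
    return FACTS.get(question, "unknown")


def toy_follow_up(claim: str, evidence: List[QA]) -> Optional[str]:
    asked = {q for q, _ in evidence}
    remaining = [q for q in FACTS if q not in asked]
    return remaining[0] if remaining else None


def toy_verdict(claim: str, evidence: List[QA]) -> str:
    return "Supported" if all(a in claim for _, a in evidence) else "Refuted"


label, trail = pursue_evidence(
    "Jane Doe founded Acme in 1999",
    toy_first_question, toy_answer, toy_follow_up, toy_verdict,
)
```

The point of the design is that each follow-up question is conditioned on the evidence already collected, in contrast to the baseline of generating all questions at once.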

Large Language Models Can Be Contextual Privacy Protection Learners

The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII). Directly fine-tuning LLMs on this data without privacy protection poses a risk of leaking sensitive PII at inference time. To address this challenge, we introduce Contextual Privacy Protection Language Models (CPPLM), a novel paradigm for fine-tuning LLMs that effectively injects domain-specific knowledge while safeguarding inference-time data privacy. Our work offers a theoretical analysis for model design and delves into various techniques such as corpus curation, penalty-based unlikelihood in the training loss, and instruction-based tuning. Extensive experiments across diverse datasets and scenarios demonstrate the effectiveness of our approaches. In particular, instruction tuning with both positive and negative examples stands out as a promising method, effectively protecting private data while enhancing the model's knowledge. Our work underscores the potential for Large Language Models as robust contextual privacy protection learners.
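To make the idea of a penalty-based unlikelihood term concrete, here is a small sketch. It assumes the generic unlikelihood form, in which ordinary tokens contribute -log p and tokens marked as sensitive contribute -log(1 - p); the abstract does not give the exact loss used in CPPLM, so the function names, the `penalty_weight` parameter, and the example probabilities are illustrative only.

```python
import math
from typing import Sequence


def token_loss(prob: float, is_sensitive: bool, eps: float = 1e-12) -> float:
    """Loss for one target token given the model's probability for it."""
    if is_sensitive:
        # Unlikelihood term: penalize probability mass placed on a PII token.
        return -math.log(max(1.0 - prob, eps))
    # Standard likelihood term: reward probability mass on a normal token.
    return -math.log(max(prob, eps))


def sequence_loss(
    probs: Sequence[float],
    sensitive_mask: Sequence[bool],
    penalty_weight: float = 1.0,  # assumed knob for scaling the penalty
) -> float:
    """Mean per-token loss over a sequence with some tokens marked sensitive."""
    total = 0.0
    for prob, sensitive in zip(probs, sensitive_mask):
        weight = penalty_weight if sensitive else 1.0
        total += weight * token_loss(prob, sensitive)
    return total / len(probs)


# The third token is marked as PII; the model should NOT predict it confidently.
mask = [False, False, True]
low_leak = sequence_loss([0.9, 0.8, 0.05], mask)   # PII token given low probability
high_leak = sequence_loss([0.9, 0.8, 0.95], mask)  # PII token given high probability
```

Under this form, a model that assigns high probability to the sensitive token (`high_leak`) incurs a larger loss than one that suppresses it (`low_leak`), while the non-sensitive tokens are trained as usual.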

InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration (EMNLP 2024)

Large Language Models (LLMs) have achieved exceptional capabilities in open generation across various domains, yet they encounter difficulties with tasks that require intensive knowledge. To address these challenges, methods for integrating knowledge have been developed, which augment LLMs with domain-specific knowledge graphs through external modules. These approaches, however, face data inefficiency issues as they necessitate the processing of both known and unknown knowledge for fine-tuning. Thus, our research focuses on a novel problem: efficiently integrating unknown knowledge into LLMs without unnecessary overlap of known knowledge. A risk of introducing new knowledge is the potential forgetting of existing knowledge. To mitigate this risk, we propose the innovative InfuserKI framework. This framework employs transformer internal states to determine when to enrich LLM outputs with additional information, effectively preventing knowledge forgetting. Performance evaluations using the UMLS-2.5k and MetaQA domain knowledge graphs reveal that InfuserKI not only successfully integrates new knowledge but also outperforms state-of-the-art baselines, reducing knowledge forgetting by 9% and 6%, respectively.
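One way to picture "using internal states to decide when to enrich the output" is a learned gate on a knowledge adapter. The sketch below is a generic gated-adapter illustration under that assumption; the abstract does not specify InfuserKI's architecture, so `infuser_gate`, `gated_update`, the weights, and the toy vectors are all hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def infuser_gate(hidden: List[float], weights: List[float], bias: float = 0.0) -> float:
    """Score in (0, 1) from an internal state: near 1 means 'knowledge is missing'."""
    score = sum(h * w for h, w in zip(hidden, weights)) + bias
    return sigmoid(score)


def gated_update(hidden: List[float], adapter_out: List[float], gate: float) -> List[float]:
    """Add the adapter's contribution only to the extent the gate allows."""
    return [h + gate * a for h, a in zip(hidden, adapter_out)]


# Toy internal states (hypothetical): one for a fact the model already knows,
# one for a fact it does not know.
weights = [-3.0, 3.0]
hidden_known = [2.0, -1.0]
hidden_unknown = [-2.0, 1.0]
adapter_out = [0.5, 0.5]

gate_known = infuser_gate(hidden_known, weights)
gate_unknown = infuser_gate(hidden_unknown, weights)

kept = gated_update(hidden_known, adapter_out, gate_known)        # almost unchanged
enriched = gated_update(hidden_unknown, adapter_out, gate_unknown)  # adapter applied
```

With a gate close to zero for known facts, the original representation passes through nearly untouched, which is the intuition behind limiting forgetting of existing knowledge.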

A Survey on Detection of LLMs-Generated Content

The burgeoning capabilities of advanced large language models (LLMs) such as ChatGPT have led to an increase in synthetic content generation with implications across a variety of sectors, including media, cybersecurity, public discourse, and education. As such, the ability to detect LLMs-generated content has become of paramount importance. We aim to provide a detailed overview of existing detection strategies and benchmarks, scrutinizing their differences and identifying key challenges and prospects in the field, advocating for more adaptable and robust models to enhance detection accuracy. We also posit the necessity for a multi-faceted approach to defend against various attacks to counter the rapidly advancing capabilities of LLMs. To the best of our knowledge, this work is the first comprehensive survey on detection in the era of LLMs. We hope it will provide a broad understanding of the current landscape of LLMs-generated content detection, and we maintain a website that is consistently updated with the latest research as a guiding reference for researchers and practitioners.

Exploring the Role of Reasoning Structures for Constructing Proofs in Multi-Step Natural Language Reasoning with Large Language Models

When performing complex multi-step reasoning tasks, the ability of Large Language Models (LLMs) to derive structured intermediate proof steps is important for ensuring that the models truly perform the desired reasoning and for improving models’ explainability. This paper is centered around a focused study: whether the current state-of-the-art generalist LLMs can leverage the structures in a few examples to better construct the proof structures with in-context learning. Our study specifically focuses on structure-aware demonstration and structure-aware pruning. We demonstrate that they both help improve performance. A detailed analysis is provided to help understand the results.

Field Verification of Fault Localization with Integrated Physical-Parameter-Aware Methodology

We report the first field verification of fault localization in an optical line system (OLS) by integrating digital longitudinal monitoring and OLS calibration, highlighting changes in physical metrics and parameters. Use cases shown are degradation of a fiber span loss and optical amplifier noise figure.

Enhancing Optical Multiplex Section QoT Estimation Using Scalable Gray-box DNN

Within an Optical Multiplex Section (OMS) control and optimization framework, end-to-end (Global) and span-by-span (Local) DNN gray-box strategies are compared in terms of scalability and accuracy of the output signal and noise power predictions. Experimental measurements are carried out in OMSs with an increasing number of spans.

Characterization and Modeling of the Noise Figure Ripple in a Dual-Stage EDFA

The noise figure ripple of a dual-stage EDFA is studied starting from experimental measurements under full spectral load conditions and defining device characteristics. A semi-analytical model is then proposed showing 0.1 dB standard deviation on the error distribution in all cases of operation.

DiCE: Distributed Code generation and Execution

Generative artificial intelligence (GenAI), and specifically Large Language Models (LLMs), has shown tremendous potential in automating several tasks and improving human productivity. Recent works have shown them to be quite useful in writing and summarizing text (articles, blogs, poems, stories, songs, etc.), answering questions, brainstorming ideas, and even writing code. Several LLMs have emerged specifically targeting code generation. Given a prompt, these LLMs can generate code in any desired programming language. Many tools like ChatGPT, CoPilot, CodeWhisperer, Cody, DeepSeek Coder, StarCoder, etc. are now routinely being used by software developers. However, most of the prior work in automatic code generation using LLMs is focused on obtaining “correct” and working code that mainly runs on a single computer (serial code). In this paper, we take this to the next level, where LLMs are leveraged to generate code for execution on a distributed infrastructure. We propose a novel system called DiCE, which takes serial code as input, automatically generates a distributed version of the code, and efficiently executes it on a distributed setup. DiCE consists of two main components: (a) an LLM-based tool (Synthia) that understands dependencies in serial code and automatically generates a distributed version of the code using a specialized programming model and semantics, and (b) a runtime (Hermod) that understands the semantics in the distributed code and realizes efficient execution on a cluster of machines (distributed infrastructure). DiCE currently focuses on visual programs synthesized by tools like ViperGPT [1] and VisReP [2] (serial code), automatically identifies higher-level task parallelism opportunities (e.g., parallel object detection), transforms the code to exploit the parallelism, and finally efficiently executes it on a cluster of machines.
Through our experiments using 100 examples from the GQA dataset [3], we show that the serial codes generated by ViperGPT are successfully transformed into distributed codes which are then efficiently executed on a cluster of machines by DiCE. We note that DiCE correctly identifies opportunities for parallelism and distributes tasks on separate GPUs within the cluster. We observe an average speed-up of 2X, 2.95X, and 3.7X, and an average efficiency of 1, 0.74 and 0.48 for a cluster of 2 nodes, 4 nodes, and 8 nodes, respectively.
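The kind of task parallelism mentioned above (independent object detections that do not depend on each other) can be illustrated with a small sketch. It uses Python's standard `concurrent.futures` thread pool on a single machine purely as a stand-in; it is not Synthia's programming model or the Hermod runtime, and `detect`, `serial_program`, and `parallel_program` are hypothetical names with a stubbed detector.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def detect(image: str, label: str) -> List[str]:
    """Hypothetical detector stub: returns two fake detections per object class."""
    return [f"{label}@{image}#{i}" for i in range(2)]


def serial_program(image: str) -> Dict[str, int]:
    """Serial version: the two detections run one after the other."""
    people = detect(image, "person")
    dogs = detect(image, "dog")
    return {"person": len(people), "dog": len(dogs)}


def parallel_program(image: str) -> Dict[str, int]:
    """Task-parallel version: the detections are independent, so run them concurrently."""
    labels = ["person", "dog"]
    with ThreadPoolExecutor(max_workers=len(labels)) as pool:
        # pool.map preserves input order, so results line up with labels.
        results = list(pool.map(lambda label: detect(image, label), labels))
    return {label: len(found) for label, found in zip(labels, results)}


serial_counts = serial_program("img.jpg")
parallel_counts = parallel_program("img.jpg")
```

Both versions return the same counts; the transformation only changes how the independent calls are scheduled. In a distributed setting the same dependency analysis would decide which calls can be placed on separate GPUs or nodes.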