Publications
Read our published work with NEC, partner organizations and across our five departments: data science and system security, integrated systems, machine learning, media analytics, and optical networking and sensing.
Read our published work with NEC, partner organizations and across our five departments: data science and system security, integrated systems, machine learning, media analytics, and optical networking and sensing.
Procurement is a key function in supply chain management that involves acquiring goods and services to meet organizational needs. Efficient procurement is crucial for minimizing costs, ensuring timely delivery, and maintaining quality standards. This paper explores the integration of automated negotiation
In satellite applications, user queries often take the form of open-ended natural language, extending beyond a fixed set of predefined categories. This open-vocabulary nature poses significant challenges for retrieving relevant image tiles, as the retrieval system must generalize to a wide range of unseen
We develop closed-form formulas for PDL-induced SNR margins using solutions based on central limit theorem. Experimental validations confirm accurate and conservative performancepredictions, enabling precise quality of transmission assessment and margin-aware design in optical networks.
Humans solve problems by executing targeted plans, yet large language models (LLMs) remain unreliable for structured workflow execution. We propose RunAgent, a multiagent plan execution platform that interprets natural-language plans while enforcing stepwise execution through constraints and rubrics.
Distributed multichannel acoustic sensing (DMAS) enables large-scale sound event classification (SEC), but performance drops when many channels are degraded and when sensor layouts at test time differ from training layouts. We propose a learning-free, physics-informed inpainting frontend based on reverse
Accurately modeling optical signal transmission is critical foroptimizing network performance, particularly in large-scalefiber optic networks operated by Internet Service Providers.In this work, we develop a Gaussian Noise model for a NewYork state ISPs optical backbone. Our model accounts for allmajor
Real-world deployment requires sound event and acoustic scene classification systems to remain reliable in noisy, diverse environments on resource-constrained devices. Although contrastive language-audio pretraining (CLAP) models with Transformer-based audio encoders achieve strong zero-shot performance,
Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebound, and material responses seldom match their underlying properties. We present PhyCo, a framework that introduces continuous, interpretable, and physically
The evolution toward open and partially disaggregated optical networks has introduced new, to our knowledge,requirements on how transmission performance is evaluated and compared across technologies, vendors, and deployment scenarios. In this context, sound benchmarking practices are essential to ensure
Diffusion models (DMs) have proven to be effective in modeling high-dimensional distributions, leading to their widespread adoption for representing complex priors in Bayesian inverse problems (BIPs). However, current DM-based posterior sampling methods proposed for solving common BIPs rely on heuristic
Humans solve problems by executing targeted plans, yet large language models (LLMs) remain unreliable for structured workflow execution. We propose RunAgent, a multi-agent plan execution platform that interprets natural-language plans while enforcing stepwise execution through constraints and rubrics.
We study the problem of length generalization (LG) in transformers: the ability of a model trained on shorter sequences to maintain performance when evaluated on much longer, previously unseen inputs. Prior work by Huang et al. (2024) established that transformers eventually achieve length generalization
Deploying microservices across the computing continuum (edgecloud) requires placement decisions that adapt to workload variation and heterogeneous infrastructure, yet existing solutions often rely on static policies or opaque heuristics. We present Bellona a system for reliable and auditable Large
Large Language Models (LLMs) have shown remarkable performance on general Question Answering (QA), yet they often struggle in domain-specific scenarios where accurate and up-to-date information is required. Retrieval-Augmented Generation (RAG) addresses this limitation by enriching LLMs with external
Clinical diagnosis requires sequential evidence acquisition under uncertainty. However, most Large Language Model (LLM) based diagnostic systems assume fully observed patient information and therefore do not explicitly model how clinical evidence should be sequentially acquired over time. Even when diagnosis
Distributed fiber optic sensing (DFOS) has emerged as a promising technology for wide-area monitoring by utilizing existing telecom cables as large-scale sensing media. This paper explores three sensing modalities, backscattering-based sensing, forward-transmission-based sensing, and hybrid sensing,
Ensuring safety in autonomous driving requires scalable generation of realistic, controllable driving scenes beyond what real-world testing provides. Yet existing instruction guided image editors, trained on object-centric or artistic data, struggle with dense, safety-critical driving layouts. We propose
Knowledge distillation establishes a learning paradigm that leverages both data supervision and teacher guidance. However, determining the optimal balance between learning from data and learning from the teacher is challenging, as some samples may be noisy while others are subject to teacher uncertainty.
Time series data is ubiquitous across various domains, including manufacturing, finance, and healthcare. High-quality annotations are essential for effectively understanding time series and facilitating downstream tasks. However, obtaining such annotations is challenging, particularly in mission-critical
Large Language Models (LLMs) excel at many reasoning tasks but struggle with knowledge-intensive queries due to their inability to dynamically access up-to-date or domain-specific information. Retrieval-Augmented Generation (RAG) has emerged as a promising solution, enabling LLMs to ground their responses
Automatically extracting workflows as procedural graphs from natural language is promising yet underexplored, demanding both structural validity and logical alignment. While recent large language models (LLMs) show potential for procedural graph extraction, they often produce ill-formed structures or
We introduce a framework to analyse interpretability in deep learning, by drawing on a formal notion of model semantics from the philosophy of science. We argue that interpretability is only one aspect of a models semantics and illustrate our framework with examples from biomedicine.
We demonstrate agnostic QoT probing for datacenter exchange in a metro network via receiver-side ASE loading. Knowing BER telemetry and the progressive ASEload, the device estimates GSNR, enabling IPoWDM operations and digital-twin calibration.
We report the first field study of the phase and polarization dynamics of deployed antiresonant hollow core fiber cable in a data center interconnect for real-world vibration sensing,revealing enhanced phase sensitivity and significantly faster polarization angular rate compared with standard single
We propose a method to compensate the phase offset between samples from different tributaries in time-interleaved phase OTDR using nested phase reference channels. We demonstrate our method for a four-span bidirectional link with high-loss loopback.
We propose a mobile orbital domain-based hierarchical routing scheme which addresses the challenges posed by constant satellite movement and the resulting dynamicnetwork topology, thus significantly improving the routing scalability and efficiency in satellite networks.
Vision Transformers (ViTs) have achieved state-of-the-art performance in offline video action detection, but their reliance on processing fixed-size clips with full spatio-temporal attention makes them computationally expensive and ill-suited for real-time streaming applications due to massive computational
Vision transformer-based models bring significant improvements for image segmentation tasks. Although these architectures offer powerful capabilities irrespective of specific segmentation tasks, their use of computational resources can be taxing on deployed devices. One way to overcome this challenge
We propose LOGDIFF (Logical Guidance for the Exact Composition of Diffusion Models), a guidance framework for diffusion models that enables principled constrained generation with complex logical expressions at inference time. We study when exact score-based guidance for complex logical formulas can
Controllable driving scene generation is critical for realistic and scalable autonomous driving simulation, yet existing approaches struggle to jointly achieve photorealism and precise control. We introduce HorizonForge, a unified framework that reconstructs scenes as editable Gaussian Splats and Meshes,
We present methods and field trial results demonstrating an integrated distributed acoustic sensing (DAS) and distributed temperature sensing (DTS) system for manhole localization, condition diagnostics, and anomaly detection in pre-deployed telecommunication fiber networks. The proposed system leverages
Vibration sensing based on forward transmission is an emerging topic for network protection and environmental monitoring, especially in long-haul submarine cables and urban fiber networks. However, previous field trials of this approach have mainly focused on localizing strong events under controlled
Uncovering causal structures from observational data is crucial for understanding complex systems and making informed decisions. While reinforcement learning (RL) has shown promise in identifying these structures in the form of a directed acyclic graph (DAG), existing methods often lack efficiency, making
Geological CO2 storage (GCS) involves injecting captured CO2 into deep sub-surface formations to support climate goals. The effective management of GCS relies on adaptive injection planning to dynamically control injection rates and well pressures to balance both storage safety and efficiency. Prior
In this talk, we will present recent technological advances in fiber sensing applications with long monitoring distances orextending multiple fiber spans. In forward-transmission-based sensing, adaptive beamforming techniques weredemonstrated to achieve multi-event vibration sensing in environments with
Recent advances in video diffusion models have enabled the generation of high-quality videos. However, these videos still suffer from unrealistic deformations, semantic violations, and physical inconsistencies that are largely rooted in the absence of 3D physical priors. To address these challenges,
Radiology report generation requires advanced medical image analysis, effective temporal reasoning, and accurate text generation. Although recent innovations, particularly multimodal large language models, have shown improved performance, their supervised fine-tuning (SFT) objective is not explicitly
Radiology Report Generation (RRG) is a critical step toward automating healthcare workflows, facilitating accurate patient assessments, and reducing the workload of medical professionals. Despite recent progress in Large Medical Vision-Language Models (Med-VLMs), generating radiology reports that are
Optical link tomography (OLT) is a rapidly evolving field that allows the multi-span, end-to-end visualization of optical power along fiber links in multiple dimensions from network endpoints, solely by processing signals received at coherent receivers. This paper has two objectives: (1) to report the
Root Cause Analysis (RCA) is essential for pinpointing the root causes of failures in microservice systems. Traditional data-driven RCA methods are typically limited to offline applications due to high computational demands, and existing online RCA methods handle only single-modal data, overlooking complex
The rapid advancement of large language models (LLMs) such as ChatGPT, DeepSeek, and Claude has significantly increased the presence of AI-generated text in digital communication. This trend has heightened the need for reliable detection methods to distinguish between human-authored and machine-generated
Time series, typically represented as numerical sequences, can also be transformed into images and texts, offering multi-modal views (MMVs) of the same underlying signal. These MMVs can reveal complementary patterns and enable the use of powerful pre-trained large models, such as large vision models
Large Language Models (LLMs) offer promising capabilities for tackling complex reasoning tasks, including optimization problems. However, existing methods either rely on prompt engineering, which leads to poor generalization across problem types, or require costly supervised training. We introduce SolverLLM,
Time series analysis provides essential insights for real-world system dynamics and informs downstream decision-making, yet most existing methods often overlook the rich contextual signals present in auxiliary modalities. To bridge this gap, we introduce TimeXL, a multi-modal prediction framework that
Large language models (LLMs) are becoming the centerpiece in the design and deployment of Agentic artificial intelligence (AI) systems. AI agents typically have (a) reasoning ability to analyze and think through the given task, (b) context/memory to remember things in the short-term and long-term, and
How many mistakes do published AI papers contain? Peer-reviewed publications form the foundation upon which new research and knowledge are built. Errors that persist in the literature can propagate unnoticed, creating confusion in follow-up studies and complicating reproducibility. The accelerating pace
Grounding large language models (LLMs) in domain-specific tasks like post-hoc dash-cam driving video analysis is challenging due to their general-purpose training and lack of structured inductive biases. As vision is often the sole modality available for such analysis (i.e. no LiDAR, GPS, etc.), existing
Low-Rank Adaptation (LoRA) has become the de facto parameter-efficient fine-tuning (PEFT) method for large language models (LLMs) by constraining weight updates to low-rank matrices. Recent works such as Tied-LoRA, VeRA, and VB-LoRA push efficiency further by introducing additional constraints to reduce
Silicon photonic neural networks can achieve higher throughputs and lower latencies than digital electronic alternatives.However, recently reported implementations of such networks have lacked integrated signal gain, instead utilizingoff-chip amplifiers or co-processors to complete the signal processing
Inference scaling methods for LLMs often rely on decomposing problems into steps (or groups of tokens), followed by sampling and selecting the best next steps. However, these steps and their sizes are often predetermined or manually designed based on domain knowledge. We propose dynamic decomposition,
Extreme events frequently occur in real-world time series and often carry significant practical implications. In domains such as climate and healthcare, these events, such as floods, heatwaves, or acute medical episodes, can lead to serious consequences. Accurate forecasting of such events is therefore
Change point detection aims to identify abrupt shifts occurring at multiple points within a data sequence. This task becomes particularly challenging in the online setting, where different types of change can occur, including shifts in both the marginal and joint distributions of the data. In this paper,
We study the problem of length generalization (LG) in transformers: the ability of a model trained on shorter sequences to maintain performance when evaluated on much longer, previously unseen inputs. Prior work by Huang et al. (2025) established that transformers eventually achieve length generalization
We present near-field radio-frequency (RF) sensing using microwave photonic canceler (MPC) for residual signal recovery and neuromorphic photonic recurrent neural network (PRNN)chip and FPGA hardware to implement machine learning for high-bandwidth and low-latency classification.
Automatic modulation classification (AMC) is becoming increasingly critical in the context of growing demands for ultra-wideband, low-latency signal intelligence in 5G/6G systems, with photonics addressing the bandwidth and real-time adaptability limitations faced by traditional radio-frequency (RF)
Distributed Fiber-Optic Sensing (DFOS) is a promising technique for large-scale acoustic monitoring. However, its wide variation in installation environments and sensor characteristics causes spatial heterogeneity. This heterogeneity makes it difficult to collect representative training data. It also
Creating effective slide presentations requires adapting both content and structure to match the communication context e.g. whether the presentation is for summarizing to executives, or reporting progress to research supervisors. In research and enterprise environments, this need for context-sensitive
Identifying subject-matter experts within organizations remains a challenging task due to the scale, heterogeneity, and unstructured nature of enterprise knowledge assets. We present TalentScout, an AI-driven expert identification system that constructs a unified, skill-centric knowledge graph by ingesting
This paper proposes AutoScape, a long-horizon driving scene generation framework. At its core is a novel RGB-D diffusion model that iteratively generates sparse, geometrically consistent keyframes, serving as reliable anchors for the scenes appearance and geometry. To maintain long-range geometric consistency,
Visual reasoning (VR), which is crucial in many fields for enabling human-like visual understanding, remains highly challenging. Recently, compositional visual reasoning approaches, which leverage the reasoning abilities of large language models (LLMs) with integrated tools to solve problems, have shown
Evaluating autonomous vehicles with controllability enables scalable testing in counterfactual or structured settings, enhancing both efficiency and safety. We introduce LangTraj, a language-conditioned scene-diffusion model that simulates the joint behavior of all agents in traffic scenarios. By conditioning
Obtaining high-quality fine-grained annotations for traffic signs is critical for accurate and safe decision-making in autonomous driving. Widely used datasets, such as Mapillary, often provide only coarse-grained labels without distinguishing semantically important types such as stop signs or speed
Since the inception of integrated photonics, multimaterial integration has served as a primary avenue for new technology innovations. Now, with an ever-increasing demand for integrated photonics as a platform for both high-performance links from/within datacenters and AI acceleration, multimaterial integration
Transformer-based methods have demonstrated strong potential in hyperspectral pansharpening by modeling long-range dependencies. However, their effectiveness is often limited by redundant token representations and a lack of multiscale feature modeling. Hyperspectral images exhibit intrinsic spectral
Grounding large language models (LLMs) in domain-specific tasks like post-hoc dash-cam driving video analysis is challenging due to their general-purpose training and lack of structured inductive biases. As vision is often the sole modality available for such analysis (i.e., no LiDAR, GPS, etc.), existing
This tutorial presents an architecture and methods for all-photonics networks-as-a-service in distributed Al data center infrastructures. We discuss server-based coherent transceiver architectures, remote transponder control, rapid end-to-end lightpath provisioning, digital longitudinal monitoring,
We introduce anchor vectors to monitor Rayleigh-backscattering variability in a fiber-optic computing system that performs nonlinear random projection for image classification. With a ~0.4-s calibration interval, system stability can be maintained with a linear decoder, achieving an average accuracy
GNPy advancements enable accurate and efficient modeling of multiband optical networks for digital twin applications. The developed solvers for Kerr nonlinearity and SRS have been validated through simulation and experimentally in C+L transmission, supporting real-world network planning, design, and
For the first time, we present an end-to-end AI framework for data analysis in distributed fiber optic sensing. The proposed model eliminates the need for optical phase computation and outperforms traditional data processing pipelines, achieving over 96% recognition accuracy on a diverse acoustic dataset.
Distributed fiber-optic sensing combined with machine learning enables continuous monitoring of telecom infrastructure. We employ generative modeling for event classification, supporting semi supervised learning, uncertainty calibration, and noise resilience. Our approach offers a scalable, data-efficient
We experimentally investigated variable spectral loading in an OMS, identifying performance under best and worst transmission conditions. Metrics and data visualization allowed correlation between channel configurations and OSNR variations, enabling the derivation of a simple spectrum allocation rule.
We report the first trial of network tomography over a live network in a multi-domain environment. We visualize end-to-end optical powers along multiple routes across multiple domains solely from a commercial B00G transponder, enabling performance bottleneck localization, power and routing optimization,
The 2021 emergence of Brood X cicadas was monitored in situ in our testbed using a DAS system connected to an outdoor telecom fiber over a 16-day period. The spectral and energy characteristics of the cicada calling signal has been measured and analyzed.
We report a record long 200.6 km distributed acoustic sensing (DAS) link without inline ampli-fication, 28.6% improvement of sensing range has been achieved by using three segments of enhanced-scattering fibre (ESF) with progressively higher scattering enhancements.
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused
Fiber sensing function was introduced in 2020 as one of the key technology features for the OpenAPN (all photonics network) developed by IOWN GF (Innovative Optical and Wireless NetworkGlobal Forum) in 2020.To our best knowledge, IOWN GF is the first global standard developmentorganization or technology
Agentic AI systems rely on Large Language Models (LLMs) to execute complex tasks by invoking external functions. The efficiency of these systems depends on how well function execution is managed, especially under heterogeneous and high-variance workloads, where function execution times can range from
Large Language Models (LLMs) have demonstrated remarkable capabilities in understanding and generating human-like text, yet they largely operate as reactive agents, responding only when directly prompted. This passivity creates an “awareness gap,” limiting their potential as truly collaborative partners
Time series analysis has witnessed the inspiring development from traditional autoregressive models, deep learning models, to recent Transformers and Large Language Models (LLMs). Efforts in leveraging vision models for time series analysis have also been made along the way but are less visible to the
Multi-modal time series analysis has recently emerged as a prominent research area, driven by the increasing availability of diverse data modalities, such as text, images, and structured tabular data from real-world sources. However, effective analysis of multi-modal time series is hindered by data heterogeneity,
Anomaly detection is essential for identifying unusual system behaviors and has wide-ranging applications, from fraud detection to system monitoring. In web servers, anomalies are typically detected using two types of data: metrics (numerical indicators of performance) and logs (records of system events).
Cell fate decisions are highly coordinated processes governed bycomplex interactions among numerous regulatory genes, whiledisruptions in these mechanisms can lead to developmental abnormalitiesand disease. Traditional methods often fail to capture suchcombinatorial interactions, limiting their ability
Roadside LiDAR (Light Detection and Ranging) sensors promise safer and faster traffic management and vehicular operations. However, occlusion and small view angles are significant challenges to widespread use of roadside LiDARs. We consider fusing data from multiple LiDARs at a traffic intersection to
Subsea cables are critical components of offshore wind turbines and are subjected to scour. Monitoring the scour conditions of subsea cables plays significant roles in improving safety and operation efficiency and reducing the levelized cost of electricity. This paper presents a feasibility study on
Question Answering (QA) accounts for a significantportion of LLM usage “in the wild”.However, LLMs sometimes produce false ormisleading responses, also known as hallucinations.Therefore, grounding the generatedanswers in contextually provided information—i.e., providing evidence for the generated
Adapting large Video-Language Models (VLMs) for action detection using only a few examples poses challenges like overfitting and the granularity mismatch between scene-level pre-training and required person-centric understanding. We propose an efficient adaptation strategy combining parameter-efficient
Large language models (LLMs) integrated into multi-step agent systems enable complex decision-making processes across various applications. However, their outputs often lack reliability, making uncertainty estimation crucial. Existing uncertainty estimation methods primarily focus on final-step outputs,
Causal discovery is an imperative foundation for decision-making across domains, such as smart health, AI for drug discovery and AIOps. Traditional statistical causal discovery methods, while well-established, predominantly rely on observational data and often overlook the semantic cues inherent in cause-and-effect
Enterprises are increasingly adopting Generative AI applications to extract insights from large volumes of multimodal documents in domains such as finance, law, healthcare, and industry. These documents contain structured and unstructured data (images, charts, handwritten texts, etc.) requiring robust
Fault localization in power distribution networks is essential for rapid recovery and enhancing system resilience. While Phasor Measurement Units (PMUs or ?PMUs) providehigh-resolution measurements for precise fault localization, their widespread deployment is cost-prohibitive. Distributed Fiber Optic
In this paper, we propose a novel agentic AI system called XPF, which enables users to create “agents” using just natural language, where each agent is capable of executing complex, real-world business workflows in an accurate and reliable manner. XPF provides an interface to develop and iterate over
We provide quantitative bounds on the length of sequences required to be observed during training for a transformer to length generalize, e.g., to continue to perform well on sequences unseen during training. Our results improve on Huang et al. [8], who show that there is a finite training length beyond
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external knowledge to generate a response within a context with improved accuracy and reduced hallucinations. However, multi-modal RAG systems face unique challenges: (i) the retrieval process may select irrelevant
Symmetry in the parameter space of deep neural networks (DNNs) has proven beneficial for various deep learning applications. A well-known example is the permutation symmetry in Multi-Layer Perceptrons (MLPs), where permuting the rows of weight matrices in one layer and applying the inverse permutation
Designing protein-binding proteins with high affinity is critical in biomedical research and biotechnology. Despite recent advancements targeting specific proteins, the ability to create high-affinity binders for arbitrary protein targets on demand, without extensive rounds of wet-lab testing,remains
Tumor-infiltrating lymphocytes (TILs) are a provocative biomarker in melanoma, influencing diagnosis, prognosis, and immunotherapy outcomes; however, traditional pathologistreadTIL assessment on hematoxylin and eosin–stained slides is prone to interobserver variability, leading to inconsistent clinical
We propose to apply optical circuit switching to enable dynamic AllReduce reconfiguration for accelerating distributed machine learning. With simulated annealing-based optimization, theproposed AllReduce reconfiguration approach achieves 31% less average training time than existing solutions.
We demonstrate real-time fiber risk assessment and dynamic network routing in live metro networks using deployed DASs, satellite imagery, and large-scale AI, achieving the first significantreduction in fiber failures in four years
Distributed fiber optics sensing is applied for traffic management in the intersection. The high-definition fiber sensing data streaming is applied as source and YOLO computer vision model isemployed for event detection classification and localization.
This paper outlines QoT-driven optimization strategies in coherent fiber-optic WDM networks, addressing distinct transmission scenarios, QoT metrics, control-plane methodologies, and emerging trends to enhance network reliability, flexibility and capacity.
We jointly estimate the phase noise power spectral densities of multiple lasers using interferometry between different combinations of laser pairs. We demonstrate a beat-frequency trackingmethod that allows under-sampling of interferometric products without phase jumps.
Polarization-based, multi-span sensing over a link without reflection-back circuits is demonstrated experimentally. It is shown that distributed reflection from Rayleigh scattering can serveas an alternative to reflectors after spatial averaging of received state-of-polarization
Optical transmission networks require intelligent traffic adaptation and efficient spectrum usage. We present scalable machine learning (ML) methods for network performance modeling, andfield trials of distributed fiber sensing and classic optical network traffic coexistence.
Passive-Optical-Networks (PON) have emerged as a pivotal technology for broadband access network and are now expanding to wireless communication, supporting 5G and development of future 6G frameworks. PON systems are expected to find many new applications, including in electrical power grids, modern
We demonstrate a wavelength tunable Distributed-Vibration-Sensing over PON scheme using low-cost ITLA and Enhanced-Scattering-Fibers. Vibrations at frequency grids of 193.40THz and194.60THz in a PON with 1×16 splitter and 21 km feeder-fiber were successfully detected.
Publication Date: 6/29/2025 Event: OECC/PSC 2025 Reference: WC2-1: 1-3, 2025 Authors: Paul S. Westbrook, OFS Labs; Benyuan Zhu, OFS Labs; Kenneth S. Feder, OFS Labs; Zhou Shi, OFS Labs; Tristan Kremp, OFS Labs; Yaowen Li, NEC Laboratories America, Inc.; Ting Wang, NEC Laboratories America, Inc.; David
The recent proliferation of photorealistic images created by generative models has sparked both excitement and concern, as these images are increasingly indistinguishable from real ones to the human eye. While offering new creative and commercial possibilities, the potential for misuse, such as in misinformation
Scene understanding systems analyze visual contexts by detecting objects, their attributes, and the interactions among them to provide a holistic interpretation. Understanding a scene requires analyzing multiple salient regions within a single video frame. Recently, Vision-Language Models (VLMs) have
Diffusion models (DMs) have proven to be effective in modeling high-dimensional distributions, leading to their widespread adoption for representing complex priors in Bayesian inverse problems (BIPs). However, current DM-based posterior sampling methods proposed for solving common BIPs rely on heuristic
Subsea cables include a supervisory system that monitors the health of the amplifier pumps and fiber loss on per span basis. In some of the cables, the monitoring is achieved optically and passively using high-loss loop back paths and wavelength selective reflectors. By sending monitoring pulses through
We present a simple and accurate semi-analytical model predicting the gain of a single-stage erbium-doped fiber amplifier (EDFA) embedded with an unknown gain flattening filter (GFF). Characteristic wavelength-dependent gain coefficients and their scaling laws are extracted with a limited set of simple
We present techniques to extend the sensing range in unrepeatered submarine cable systems by utilizing Enhanced-Scattering Fibre (ESF), large-area ultra-low-loss (ULL) fibre, and a digital Distributed Acoustic Sensing (DAS) interrogator. A DAS sensing range of up to 200.6 km has been achieved using 156km
Transformers, known for their attention mechanisms, have proven highly effective in focusing on critical elements within complex data. This feature can effectively be used to address the time-varying channels in wireless communication systems. In this work, we introduce a channel-aware adaptive framework
We propose a novel Distributed Fiber Optic Sensing (DFOS) placement strategy tailored to the evolving needs of modern power grids, where fiber cables serve dual purposes: communication and real-time sensing. Our approach integrates a heuristic algorithm, PURE (Power Source-aware Route Exploration), with
A 100-meter-long fiber optic cable was installed at the bottom of a water tank at the Davidson Laboratory, together with a hydrophone for reference. The water tank is approximately 2.5 meters deep and 95 meters long; the tank also employs a 6-paddle wavemaker which can generate programmable surface waves.
We propose a novel optical flow processing technique for distributed temperature and strain sensing with the chirped-pulse coherent OTDR. Unlike conventional 1-dimensional cross-correlation methods, the technique treats the 2-dimensional waterfall data as sequential video frames, estimating local shifts
Latency-critical applications demand quick responses. Ideally, detailed insights are preferable for the best decision making and response actions. However, in situations when detailed insights cannot be provided quickly, even basic information goes a long way in tackling the situation effectively. For
Multi-camera multi-object tracking (MCMOT) faces significant challenges in maintaining consistent object identities across varying camera perspectives, particularly when precise calibration and extensive annotations are required. In this paper, we present CalibFree, a self-supervised representation learning
We propose a hybrid remote and distributed vibration sensing system based on phase-sensitive optical time-domain reflectometry with collimator-based sensor heads. We demonstrate dual-laser vibrometers that detects nm-scale displacements of remote targets.
The Out-of-Distribution (OOD) problem in graph-structured data is becoming increasingly important in various areas of research and applications, including social network recommendation [36], protein function detection [9, 21], etc. Furthermore, owing to the inherent multi-label properties of nodes, multi-label
Prompt tuning is highly effective in efficiently extracting knowledge from foundation models, encompassing both language, vision, and vision-language models. However, the efficacy of employing fixed soft prompts with a predetermined position for concatenation with inputs for all instances, irrespective
Large Language Models (LLMs) exhibit potential artificial generic intelligence recently, however, their usage is costly with high response latency. Given mixed LLMs with their own strengths and weaknesses, LLM routing aims to identify the most suitable model for each query in the stream to maximize response
Inference scaling methods often rely on decomposing problems into steps, followed by sampling and selecting the best next steps. However, these steps and their sizes are typically fixed or depend on domain knowledge. We propose dynamic decomposition, a method that adaptively and automatically breaks
Inference scaling methods often rely on decomposing problems into steps, followed by sampling and selecting the best next steps. However, these steps and their sizes are typically fixed or depend on domain knowledge. We propose dynamic decomposition, a method that adaptively and automatically breaks
Recent research has developed a number of eXplainable AI (XAI) techniques, such as gradient-based approaches, input perturbation-base methods, and black-box explanation methods. While these XAI techniques can extract meaningful insights from deep learning models, how to properly evaluate them remains
The advent of large language models (LLMs) has revolutionized the field of text generation, producing outputs that closely mimic human-like writing. Although academic and industrial institutions have developed detectors to prevent the malicious usage of LLM-generated texts, other research has doubt about
We frame code generation as a black-box optimization problem within the code space and demonstrate how optimization-inspired techniques can enhance inference scaling. Based on this perspective, we propose SCATTERED FOREST SEARCH (SFS), a novel approach that improves solution diversity and better exploits
Visual Language Models (VLMs) like GPT-4V have broadened the scope of LLM applications, yet they face significant challenges in accurately processing visual details, particularly in scientific diagrams. This paper explores the necessity of meticulous visual detail collection and region decomposition
A powerful architecture for universal segmentation relies on transformers that encode multi-scale image features and decode object queries into mask predictions. With efficiency being a high priority for scaling such models, we observed that the state-of-the-art method Mask2Former uses >50% of its compute
A major challenge in near-term quantum computing is its application to large real-world datasets due to scarce quantum hardware resources. One approach to enabling tractable quantum models for such datasets involves finding low-dimensional representations that preserve essential information for downstream
Real-world time series data often require analysis or interpretation from domain experts. Some tasks, like time series question answering, involve both time series and natural language questions, posing challenges for single-modality language models to understand their interaction. To this end, we present
Contrastive Language-Audio Pretraining (CLAP) models have demonstrated unprecedented performance in various acoustic signal recognition tasks. Fiber-optic-based acoustic recognition is one of the most important downstream tasks and plays a significant role in environmental sensing. Adapting CLAP for
Recent advancements in unique acoustic sensing devices and large-scale audio recognition models have unlocked new possibilities for environmental sound monitoring and detection. However, applying pretrained models to non-conventional acoustic sensors results in performance degradation due to domain shifts,
Deep neural network (DNN)-based models for environmental sound classification are not robust against a domain to which training data do not belong, that is, out-of-distribution or unseen data. To utilize pretrained models for the unseen domain, adaptation methods, such as finetuning and transfer learning,
The advancement of Generative Artificial Intelligence (AI), particularly Large Language Models (LLMs), is reshaping the software industry by automating code generation. Many LLM-driven distributed processing systems rely on serial code generation constrained by predefined libraries, limiting flexibility
We experimentally demonstrate real time mode division multiplexing free space optical communication with commercial 400G open and disaggregated transponders. As proof of concept,using HG00, HG10, and HG01 modes, we transmit 1.2 Tb/s/l (3´1l´400Gb/s) error free.
We present a generative AI framework based on a conditional diffusion model for distributed acoustic sensing (DAS) data imputation. The proposed DiffOptics model generates high-quality DAS data of various acoustic events using telecom fiber cables.
We propose a memory-augmented model architecture with disaggregated computation infrastructure for fiber sensing event recognition. By leveraging geo-distributed computingresources in optical networks, this approach empowers end-users to customize models while ensuring dual privacy protection.
We combine few-shot learning and grey-box modeling for EDFAs in optical lines, training a single EDFA model on 500 spectral loads and transferring it to other EDFAs using 4-8 samples, maintaining low OSNR prediction error.
This study demonstrates an AI-driven method for detecting road deformations using Distributed Acoustic Sensing (DAS) over existing telecom fiber networks. Utilizingambient traffic noise, it enables real-time, long-term, and scalable monitoring for road safety.
Field trials of ambient noise-based automated methods for manhole localization and condition diagnostics using a real-time DAS/DTS integrated system were conducted. Crossreferencingmultiple sensing data resulted in a 94.7% detection rate and enhanced anomaly identification.
Publication Date: 4/3/2025 Event: OFC 2025 Reference: Th4C.2: 1-3, 2025 Authors: Jian Fang, NEC Laboratories America, Inc.; Ming-Fang Huang, NEC Laboratories America, Inc.; Scott Kotrla, Verizon; Tiejun J. Xia, Verizon; Glenn A. Wellbrock, Verizon; Jeffrey A Mundt, Verizon; Ting Wang, NEC Laboratories
We present adaptive beamforming techniques to forward-transmission multi-event vibration sensing in environments with interference and jamming. Experimental validation over 100km fiber demonstrates significant improvements on signal reconstruction, noise reduction, and interference rejection from other
We implement a cascaded learning framework leveraging three different EDFA and fiber component models for OSNR and GSNR prediction, achieving MAEs of 0.20 and 0.14 dBover a 5-span network under dynamic channel loading.
A differential algorithm is proposed to calibrate the physical digital model of an optical line system from scratch at the commissioning phase, using minimal measurements and maximizing signal and OSNR estimation accuracy.
We propose building a spectrally resolved QoT Digital Twin for optical network domains where models and telemetry are unavailable, by probing transmission on a singlespectral slot, using GNPy, and demonstrating accurate experimental results.
Optical transmission systems require accurate modeling and performance estimation for autonomous adaption and reconfiguration. We present efficient and scalable machine learning (ML) methods for modeling optical networks at component- and network-level with minimizeddata collection.
We experimentally justify the need of analyzing stochastic PDL insertion inboptical metro network nodes. Consequently, we assess conservative OSNR margin comparingdifferent approaches to the case with maxwellian-distributed PDL, through Monte Carlo simulation.
We investigate the growth rate of phase power spectral density in fiber spools in the presence of ambient acoustic noise, observing a complex interplay between spool geometry, shielding effects, and phase cancellation at high acoustic frequencies.
We demonstrate fiber-optic acoustic data transmission using distributed acoustic sensing technology in an underwater environment. An acoustic orthogonal frequencydivisionmultiplexing (OFDM) signal transmitted through a fiber-optic cable deployed in a standard 40-meter-scale underwater testbed.
A simple and accurate semi-analytical model for predicting the gain of a single-stage erbium-doped fiber amplifier embedded with an unknown gain flattening filter is proposed for precise system equalization that is crucial for submarine systems.
LiDAR technology has emerged as a pivotal tool in Intelligent Transportation Systems (ITS), providing unique capabilities that have significantly transformed roadside traffic applications. However, this transformation comes with a distinct challenge: the immense volume of data generated by LiDAR sensors.
In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition operating on remote servers rely heavily on surveillance cameras to capture high-quality video streams to achieve high accuracy. Modern network cameras offer an array of parameters that directly influence
We demonstrate the use of existing terrestrial optical networks as a smart sensing grid, employing a bidirectional long short-term memory (Bi-LSTM) model enhanced with an attention mechanism to detect road vehicles. The main idea of our approach is to deploy a fast, accurate and reliable trained deep
Communication in Millimeter wave (mmWave) band relies on narrow beams due to directionality, high path loss, and shadowing. One can use beam alignment (BA) techniques to find and adjust the direction of these narrow beams. In this paper, BA at the base station (BS) is considered, where the BS sends a
In this work, for the first time, we experimentally demonstrate mode division multiplexing-based bidirectional free space optical communication in real-time using commercial transponders. As proof of concept, via bidirectional pairs of Hermite-Gaussian modes (HG00, HG10, and HG01), using a Telecom Infra
Vector beams are spatial modes that have spatially inhomogeneous states of polarization. Any light beam is a linear combination of vector beams, the coefficients of which comprise a vector beam spectrum. In this work, through numerical calculations, a novel method of free-space optical sensing is
With the advancement of 5G, edge-assisted video analytics has become increasingly popular, driven by the technologys ability to support low-latency, high-bandwidth applications. However, in scenarios where multiple clients competing for network resources, network contention poses a significant challenge.
Despite the recent impressive breakthroughs in text-to-image generation, generative models have difficulty in capturing thedata distribution of underrepresented attribute compositions while over-memorizing overrepresented attribute compositions,which raises public concerns about their robustness and
Marine litter detection is crucial for environmental monitoring, yet the imbalance in existing datasets limits model performance in identifying various types of waste accurately. This paper presents an efficient data augmentation pipeline that combines generative diffusion models (e.g., Stable Diffusion)
Time series data is essential in various applications, including climate modeling, healthcare monitoring, and financial analytics. Understanding the contextual information associated with real-world time series data is often essential for accurate and reliable event predictions. In this paper, we introduce
We consider the conditional generation of 3D drug-like molecules with explicit control over molecular properties such as drug-like properties (e.g., Quantitative Estimate of Druglikenessor Synthetic Accessibility score) and effectively binding to specific protein sites. To tackle this problem, we propose
Action detection aims to detect (recognize and localize) human actions spatially and temporally in videos. Existing approaches focus on the closed-set setting where an action detector is trained and tested on videos from a fixed set of action categories. However, this constrained setting is not viable
The Internet-of-Things (IoT) is widely used in many applications such as smart city, transportation, healthcare, and environment monitoring. A key task of IoT maintenance is to analyze the abnormal sensor records and generate incident report. Traditionally, domain experts engage in such labor intensive
Unarguably deep learning models capable of generalizing to unseen domain data while leveraging a few labels are of great practical significance due to low developmental costs. In search of this endeavor we study the challenging problem of semi-supervised domain generalization (SSDG) where the goal is
Multimodal Large Language Models (MLLMs) have shown impressive performance in vision and text tasks. However, hallucination remains a major challenge, especially in fields like healthcare where details are critical. In this work, we show how MLLMs may be enhanced to support Visual RAG (V-RAG), a retrieval-augmented
Non-small cell lung cancer (NSCLC) shows variable responses to immunotherapy, highlighting the need for biomarkers to guide patient selection. We applied a spatial multi-omics approach to 234 advanced NSCLC patients treated with programmed death 1-based immunotherapy across three cohorts to identify
Spatio-temporal reasoning is essential in understanding real-world environments in various fields, eg, autonomous driving and sports analytics. Recent advances have improved the spatial reasoning ability of Vision-Language Models (VLMs) by introducing large-scale data, but these models still struggle
Contrastive Language-Audio Pretraining (CLAP) models have demonstrated unprecedented performance in various acoustic signal recognition tasks. Fiber optic-based acoustic recognition is one of the most important downstream tasks and plays a significant role in environmental sensing. Adapting CLAP for
Retrieval-augmented generation (RAG) improves large language models (LLMs) by using external knowledge to guide response generation, reducing hallucinations. However, RAG, particularly multi-modal RAG, can introduce new hallucination sources: (i) the retrieval process may select irrelevant pieces (e.g.,
Scalable methods for optical transmission performance prediction using machine learning (ML) are studied in metro reconfigurable optical add-drop multiplexer (ROADM) networks. A cascaded learning framework is introduced to encompass the use of cascaded component models for end-to-end (E2E) optical path
The recent advent of large-scale 3D data, e.g. Objaverse, has led to impressive progress in training pose-conditioned diffusion models for novel view synthesis. However, due to the synthetic nature of such 3D data, their performance drops significantly when applied to real-world images. This paper consolidates
Variational optimization (VO) offers a general approach for handling objectives which may involve discontinuities, or whose gradients are difficult to calculate. By introducing a variational distribution over the parameter space, such objectives are smoothed, and rendered amenable to VO methods. Local
We study the problem of subgroup discovery with Cox regression models and introduce a method for finding an interpretable subset of the data on which a Cox model is highly accurate. Our method relies on two technical innovations: the emph (Unknown sysvar: (expected prediction entropy)), a novel metric
As the adoption of large language models increases and the need for per-user or per-task model customization grows, the parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and its variants, incur substantial storage and transmission costs. To further reduce stored parameters,
The advent of Large Language Models (LLMs) has revolutionized text generation, producing outputs that closely mimic human writing. This blurring of lines between machine- and human-written text presents new challenges in distinguishing one from the other a task further complicated by the frequent
The advent of large language models (LLMs) has revolutionized the field of natural language processing, yet they might be attacked to produce harmful content. Despite efforts to ethically align LLMs, these are often fragile and can be circumvented by jailbreaking attacks through optimized or manual adversarial
Edge computing has emerged as a transformative technology that reduces application latency, improves cost efficiency, enhances security, and enables large-scale deployment of applications across various domains. In environmental monitoring, systems such as MegaSense[49], use low-cost sensors to gather
Transcriptional regulation through cis-regulatory elements (CREs) is crucial for numerous biological functions, with its disruption potentially leading to various diseases. It is well-known that these CREs often exhibit redundancy, allowing them to compensate for each other in response to external
Action detection aims to detect (recognize and localize) human actions spatially and temporally in videos. Existing approaches focus on the closed-set setting where an action detector is trained and tested on videos from a fixed set of action categories. However, this constrained setting is not viable
The problem of calibrating deep neural networks (DNNs) is gaining attention, as these networks are becoming central to many real-world applications. Different attempts have been made to counter the poor calibration of DNNs. Amongst others, train-time calibration methods have unfolded as an effective
Graph neural networks (GNNs) have emerged as powerful tools for representation learning. Their efficacy depends on their having an optimal underlying graph. In many cases, the most relevant information comes from specific subgraphs. In this work, we introduce a GNN-based framework (graph-partitioned
Separating disinformation from fact on the web has long challenged both the search and the reasoning powers of humans. We show that the reasoning power of large language models (LLMs) and the retrieval power of modern search engines can be combined to automate this process and explainably verify claims.
Levels of selection and multilevel evolutionary processes are essential concepts in evolutionary theory, and yet there is a lack of common mathematical models for these core ideas. Here, we propose a unified mathematical framework for formulating and optimizing multilevel evolutionary processes and genetic
The burgeoning capabilities of advanced large language models (LLMs) such as ChatGPT have led to an increase in synthetic content generation with implications across a variety of sectors, including media, cybersecurity, public discourse, and education. As such, the ability to detect LLMs-generated content
Large Language Models (LLMs) have achieved exceptional capabilities in open generation across various domains, yet they encounter difficulties with tasks that require intensive knowledge. To address these challenges, methods for integrating knowledge have been developed, which augment LLMs with domain-specific
The proliferation of Large Language Models (LLMs) has driven considerable interest in fine-tuning them with domain-specific data to create specialized language models. Nevertheless, such domain-specific fine-tuning data often contains contextually sensitive personally identifiable information (PII).
When performing complex multi-step reasoning tasks, the ability of Large Language Models (LLMs) to derive structured intermediate proof steps is important for ensuring that the models truly perform the desired reasoning and for improving models’ explainability. This paper is centered around a focused
The noise figure ripple of a dual-stage EDFA is studied starting from experimental measurements under full spectral load conditions and defining device characteristics. Asemi-analytical model is then proposed showing 0.1 dB standard deviation on the error distribution in all cases of operation.
In Optical Multiplex Section (OMS) control and optimization framework, end-to-end (Global) and span-by-span (Local) DNN gray-box strategies are compared in terms of scalability and accuracy of the output signal and noise power predictions. Experimental measurements are carried out in OMSs with increasing
We report the first field verification of fault localization in an optical line system (OLS) by integrating digital longitudinal monitoring and OLS calibration, highlighting changes in physical metrics and parameters. Use cases shown are degradation of a fiber span loss and optical amplifier noise figure.
Generative artificial intelligence (GenAI), specifically, Large Language Models (LLMs), have shown tremendous potential in automating several tasks and improving human productivity. Recent works have shown them to be quite useful in writing and summarizing text (articles, blogs, poems, stories, songs,
The transformer structure employed in large language models (LLMs), as a specialized category of deep neural networks (DNNs) featuring attention mechanisms, stands out for their ability to identify and highlight the most relevant aspects of input data. Such a capability is particularly beneficial in
Learning to perform physical tasks is ubiquitous yet challenging without expert guidance. While Augmented Reality (AR) has been adopted to overlay instructions directly onto the physical context, the natural authoring of such content remains unexplored. To address this, we developed WizARd and Apprentice,
Retrieval-augmented generation (RAG) systems combine the strengths of language generation and information retrieval to power many real-world applications like chatbots. Use of RAG for understanding of videos is appealing but there are two critical limitations. One-time, upfront conversion of all content
Symmetry breaking has been shown to reveal interesting phenomena in physical systems. A notable example is the fundamental work of Otto Stern and Walther Gerlach [Stern and Zerlach, Z. Physik 9, 349 (1922)] nearly 100 years ago demonstrating a spin angular momentum (SAM) deflection that differed from
Learning to localize temporal boundaries of procedure steps in instructional videos is challenging due to the limited availability of annotated large-scale training videos. Recent works focus on learning the cross-modal alignment between video segments and ASR-transcripted narration texts through contrastive
Evaluating the performance of autonomous vehicle planning algorithms necessitates simulating long-tail safety-critical traffic scenarios. However, traditional methods for generating such scenarios often fall short in terms of controllability and realism; they also neglect the dynamics of agent interactions.
Traffic cameras are essential in urban areas, playing a crucial role in intelligent transportation systems. Multiple cameras at intersections enhance law enforcement capabilities, traffic management, and pedestrian safety. However, efficiently managing and analyzing multi-camera feeds poses challenges
We propose an efficient routing strategy for AllReduce transfers, which compromise of the dominant traffic in machine learning-centric datacenters, to achieve fast parameter synchronization in distributed machine learning, improving the average training time by 9%.
We propose extending the LOGO strategy for launch power settings to multi-band scenarios, maintaining low complexity while addressing key inter-band nonlinear effects and accurate amplifier models. This methodology simplifies multi-band optical multiplex section control, providing an immediate, descriptive
We demonstrate a method for measuring the backscatter coefficient of hollow-core fibre (HCF), and show the feasibility of distributed acoustic sensing (DAS) with simultaneous 9.6-Tb/s DWDM transmission over a 1.6-km field-deployed HCF cable.
Experiments show that machine learning model of an EDFA is capable of modelling spectral hole burning effects accurately. As a result, it significantly outperforms black-box models that neglect inhomogeneous effects. Model achieves a record average RMSE of 0.0165 dB between the model predictions and
We propose a transceiver back-to-back BER-OSNR characterization method that requires only a single VOA; it leverages the receiver SNR degradation caused by received power attenuation. Experiments using commercial transceivers show that the measurement error is less than 0.2 dB in the Q-factor.
For the first time, we demonstrate remote sensing of pole-mounted fuse-cutout blowing in a power grid setup using telecom fiber cable. The proposed frequency-based AI model achieves over 98% detection accuracy using distributed fiber sensing data.
Spatial transcriptomics technologies, which generate a spatial map of gene activity, can deepen the understanding of tissue architecture and its molecular underpinnings in health and disease. However, the high cost makes these technologies difficult to use in practice. Histological images co-registered
Lensless cameras multiplex the incoming light before it is recorded by the sensor. This ability to multiplex the incoming light has led to the development of ultra-thin, high-speed, and single-shot 3D imagers. Recently, there have been various attempts at demonstrating another useful aspect of lensless
Multi-camera tracking plays a pivotal role in various real-world applications. While end-to-end methods have gained significant interest in single-camera tracking, multi-camera tracking remains predominantly reliant on heuristic techniques. In response to this gap, this paper introduces Multi-Camera
Achieving carbon neutrality within industrial operations has become increasingly imperative for sustainable development. It is both a significant challenge and a key opportunity for operational optimization in industry 4.0. In recent years, Deep Reinforcement Learning (DRL) based methods offer promising
Though Large Language Models (LLMs) have shown remarkable open-generation capabilities across diverse domains, they struggle with knowledge-intensive tasks. To alleviate this issue, knowledge integration methods have been proposed to enhance LLMs with domain-specific knowledge graphs using external modules.
In the context of long-tail classification on graphs, the vast majority of existing work primarily revolves around the development of model debiasing strategies, intending to mitigate class imbalances and enhance the overall performance. Despite the notable success, there is very limited literature that
Time series domain adaptation stands as a pivotal and intricate challenge with diverse applications, including but not limited to human activity recognition, sleep stage classification, and machine fault diagnosis. Despite the numerous domain adaptation techniques proposed to tackle this complex problem,
We propose methods and an architecture to conduct measurements and optimize newly installed optical fiber line systems semi-automatically using integrated physics-aware technologies in a data center interconnection (DCI) transmission scenario. We demonstrate, for the first time to our knowledge, digital
Joint entity and relation extraction is a process that identifies entity pairs and their relations using a single model. We focus on the problem of joint extraction in distantly-labeled data,whose labels are generated by aligning entity mentions with the corresponding entity and relation tags using a
We aim to improve the prediction of response or resistance to immunotherapies in patients with melanoma. This goal is based on the hypothesis that current gene signatures predicting immunotherapy outcomes show only modest accuracy due to the lack of spatial information about cellular functions and molecular
Recognizing domain generalization as a commonplace challenge in machine learning, data distribution might progressively evolve across a continuum of sequential domains in practical scenarios. While current methodologies primarily concentrate on bolstering model effectiveness within these new domains,
Optical fiber cables, initially designed for telecommunications, are increasingly repurposed for environmental monitoring using distributed fiber sensing technologies [1,2]. Distributed acoustic sensing (DAS) based on phase optical time domain reflectometry (?-OTDR) of Rayleigh backscatter enables various
A major challenge in near-term quantum computing is its application to large real-world datasets due to scarce quantum hardware resources. One approach to enabling tractable quantum models for such datasets involves compressing the original data to manageable dimensions while still representing essential
This paper introduces the retrieval-augmented large language model with Definite Finite Automaton (DFA-RAG), a novel framework designed to enhance the capabilities of conversational agents using large language models (LLMs). Traditional LLMs face challenges in generating regulated and compliant responses
The objective of change point detection is to identify abrupt changes at potentially multiple points within a data sequence. This task is particularly challenging in the online setting where various types of changes can occur, including shifts in both the marginal and joint distributions of the data.
Analog photonic information processing can be implemented with low chip area using wavelength-division multiplexed systems, which typically manipulate light using micro-ring resonators. Micro-rings are uniquely susceptible to thermal crosstalk, with negative system performance consequences if not addressed.
We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM). Our maximum likelihood model leverages expert-written and AI-generated reference texts to accurately and efficiently examine real-world
We report responsivity measurements of a multiterminal photodetection device in a commercial silicon photonics platform. The ratio of measured responsivities is found to track the relative terminal lengths. This can serve as a highly compact optoelectronic tap/diplexer. More importantly, complex biasing
The GNPy quality-of-transmission estimator has undergone improvements and rigorous experimental validation in a C+L multiband transmission scenario. This includes the incorporation of a disaggregated generalized Gaussian noise model, along with advanced modeling of amplifiers and transceivers. The recently
A control architecture for a partially disaggregated optical network is proposed using a GNPy-based digital twin for QoT estimation. The proposed implementation enables soft failure mitigation by autonomously adjusting the amplifier working points.
Neural language models for commonsense reasoning often formulate the problem as a QA task and make predictions based on learned representations of language after fine-tuning. However, without providing any fine-tuning data and pre-defined answer candidates, can neural language models still answer commonsense
Costs of LLM API usage rise rapidly when proprietary enterprise data is used as context for user queries to generate more accurate responses from LLMs. To reduce costs, we propose LeanContext, which generates query-aware, compact and AI model-friendly summaries of relevant enterprise data context. This
We propose a vision-LLM framework for automating development and deployment of computer vision solutions for pre-defined or custom-defined tasks. A foundational layer is proposed with a code-LLM AI orchestrator self-trained with reinforcement learning to create Python code based on its understanding
In this paper, we introduce efforts to apply large language models (LLMs) to the field of material development. NEC is advancing the development of a material development platform. By applying core technologies corresponding to two material development steps, namely investigation activities (Read paper/patent)
Large Language Models (LLMs) have exhibited remarkable proficiency across a wide array of NLP tasks. However, the escalation in model size also engenders substantial deployment costs. While few efforts have explored model pruning techniques to reduce the size of LLMs, they mainly center on general or
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs) and revolutionized various fields by providing a few task-relevant demonstrations in the prompt. However, trustworthy issues with LLMs response, such as hallucination, have also been actively discussed. Existing
Weakly Supervised Temporal Action Localization (WSTAL) aims to jointly localize and classify action segments in untrimmed videos with only video level annotations. To leverage video level annotations most existing methods adopt the multiple instance learning paradigm where frame/snippet level action
Recent studies have shown promising performance in open-vocabulary object detection (OVD) by utilizing pseudo labels (PLs) from pretrained vision and language models (VLMs). However, teacher-student self-training, a powerful and widely used paradigm to leverage PLs, is rarely explored for OVD.
Autonomous vehicle (AV) systems rely on robust perception models as a cornerstone of safety assurance. However, objects encountered on the road exhibit a long-tailed distribution, with rare or unseen categories posing challenges to a deployed perception model. This necessitates an expensive process of
Visual program synthesis is a promising approach to exploit the reasoning abilities of large language models for compositional computer vision tasks. Previous work has used few-shot prompting with frozen LLMs to synthesize visual programs. Training an LLM to write better visual programs is an attractive
The perception of 3D motion of surrounding traffic participants is crucial for driving safety. While existing works primarily focus on general large motions, we contend that the instantaneous detection and quantification of subtle motions is equally important as they indicate the nuances in driving behavior
Photorealistic simulation plays a crucial role in applications such as autonomous driving, where advances in neural radiance fields (NeRFs) may allow better scalability through the automatic creation of digital 3D assets. However, reconstruction quality suffers on street scenes due to largely collinear
Resource-constrained hardware such as edge devices or cell phones often rely on cloud servers to provide the required computational resources for inference in deep vision models. However transferring image and video data from an edge or mobile device to a cloud server requires coding to deal with network
Standardized lossy video coding is at the core of almost all real-world video processing pipelines. Rate control is used to enable standard codecs to adapt to different network bandwidth conditions or storage constraints. However standard video codecs (e.g. H.264) and their rate control modules aim to
Retrieval-augmented generation (RAG) is used in natural language processing (NLP) to provide query-relevant information in enterprise documents to large language models (LLMs). Such enterprise context enables the LLMs to generate more informed and accurate responses. When enterprise data is primarily
The study of complex human interactions and group activities has become a focal point in human-centric computer vision. However, progress in related tasks is often hindered by the challenges of obtaining large-scale labeled datasets from real-world scenarios. To address the limitation, we introduce M3Act,
In this paper we explore the capability of an agent to construct a logical sequence of action steps thereby assembling a strategic procedural plan. This plan is crucial for navigating from an initial visual observation to a target visual outcome as depicted in real-life instructional videos. Existing
This report aims to provide a comprehensive view on the inference efficiency of DETR-style detection models. We provide the effect of the basic efficiency techniques and identify the factors that are easily applicable yet effectively improve the efficiency-accuracy trade-off. Specifically, we explore
The various sensing technologies such as cameras LiDAR radar and satellites with advanced machine learning models offers a comprehensive approach to environmental perception and understanding. This paper introduces an innovative Distributed Fiber Optic Sensing (DFOS) technology utilizing the existing
The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations. Training such models with a discriminative objective function has proven successful, but requires good positive and negative
We introduce two pioneering applications leveraging Distributed Fiber Optic Sensing (DFOS) and Machine Learning (ML) technologies. These innovations offer substantial benefits forfortifying telecom infrastructures and public safety. By harnessing existing telecom cables, our solutions excel in perimeter
Ensuring high-quality video content for wireless users has become increasingly vital. Nevertheless, maintaining a consistent level of video quality faces challenges due to the fluctuating encoded bitrate, primarily caused by dynamic video content, especially in live streaming scenarios. Video compression
AI/ML techniques have been used to solve systems problems, but their applicability to customize solutions on-the-fly has been limited. Traditionally, any customization required manually changing the AI/ML model or modifying the code, configuration parameters, application settings, etc. This incurs too
Extracting real-time insights from multi-modal data streams from various domains such as healthcare, intelligent transportation, and satellite remote sensing remains a challenge. High computational demands and limited knowledge scope restrict the applicability of Multi-Modal Large Language Models (MM-LLMs)
Motivation Spatial transcriptomics technologies, which generate a spatial map of gene activity, can deepen the understanding of tissue architecture and its molecular underpinnings in health and disease. However, the high cost makes these technologies difficult to use in practice. Histological images
Question-answering (QA) is a significant application of Large Language Models (LLMs), shaping chatbot capabilities across healthcare, education, and customer service. However, widespread LLM integration presents a challenge for small businesses due to the high expenses of LLM API usage. Costs rise rapidly
In today’s world, with its complex global supply chains, the difficulties and uncertainties we face offer both challenges and opportunities for making things better, especially in terms of efficiency and sustainability. These challenges grow due to unpredictable events, such as natural disasters, unexpected
In the field of histopathology, computer-aided systems face significant challenges due to diverse domain shifts. They include variations in tissue source organ, preparation and scanningprotocols. These domain shifts can significantly impact algorithms performance in histopathology tasks, such as cancer
Recent advances in spatial transcriptomics allow spatially resolved gene expression measurements with cellular or even sub-cellular resolution, directly characterizing the complex spatiotemporal gene expression landscape and cell-to-cell interactions in their native microenvironments. Due to technology
The development of next-generation sequencing (NGS) has enabled the discovery of cancer-specific driver gene alternations, making precision medicine possible. However, accurategenetic testing requires a sufficient amount of tumor cells in the specimen. The evaluation of tumor content ratio (TCR) from
Effective root cause analysis (RCA) is vital for swiftly restoring services, minimizing losses, and ensuring the smooth operation and management of complex systems. Previous data-driven RCA methods, particularly those employing causal discovery techniques, have primarily focused on constructing dependency
We aim to address key challenges in long-horizon embodied exploration and navigation by proposing a long-horizon object transport task called Long-HOT and a novel modular framework for temporally extended navigation. Agents in Long-HOT need to efficiently find and pick up target objects that are scattered
Large language models (LLMs) have notably enhanced the fluency and diversity of machine-generated text. However, this progress also presents a significant challenge in detecting the origin of a given text, and current research on detection methods lags behind the rapid evolution of LLMs. Conventional
Open Information Extraction (OIE) task aims at extracting structured facts from unstructured text, typically in the form of (subject, relation, object) triples. Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods
Modern techniques like contrastive learning have been effectively used in many areas, including computer vision, natural language processing, and graph-structured data. Creating positive examples that assist the model in learning robust and discriminative representations is a crucial stage in contrastive
Graph Neural Networks (GNNs) are neural models that leverage the dependency structure in graphical data via message passing among the graph nodes. GNNs have emerged as pivotal architectures in analyzing graph-structured data, and their expansive application in sensitive domains requires a comprehensive
Providing wireless users with high-quality video content has become increasingly important. However, ensuring consistent video quality poses challenges due to variable encodedbitrate caused by dynamic video content and fluctuating channel bitrate caused by wireless fading effects. Suboptimal selection
Camouflaged object detection (COD) is the challenging task of identifying camouflaged objects visually blended into surroundings. Albeit achieving remarkable success, existing COD detectors still struggle to obtain precise results in some challenging cases. To handle this problem, we draw inspiration
For microservices-based real-time stream processing applications, computing at the edge delivers fast responses for low workloads, but as workload increases, the response time starts to slow down due to limited compute capacity. Abundant compute capacity in the cloud delivers fast responses even for
We propose a deep learning algorithm trained on varied spectral loads and EDFA working points to generate a digital twin of an optical line system able to optimize line control and to enhance OSNR predictions.
The ever-increasing demand for data traffic in recent decades has pushed network operators to give importance to the aspect of infrastructure control to facilitate its scalability and maximize its capacity. A generic lightpath (LP) is deployed starting from a traffic request between a given pair of nodes
Retrieval augmented generation (RAG) systems combine the strengths of language generation and information retrieval to power many real-world applications like chatbots. Use of RAG for combined understanding of multimodal data such as text, images and videos is appealing but two critical limitations exist:
Broadband analog signal processors utilizing silicon photonics have demonstrated a significant impact in numerous application spaces, offering unprecedented bandwidths, dynamic range, and tunability. In the past decade, microwave photonic techniques have been applied to neuromorphic processing, resulting
Vision transformer based models bring significant improvements for image segmentation tasks. Although these architectures offer powerful capabilities irrespective of specific segmentation tasks, their use of computational resources can be taxing on deployed devices. One way to overcome this challenge
Optical fiber sensing is a technology wherein audio, vibrations, and temperature are detected using an optical fiber; especially the audio/vibrations-aware sensing is called distributed acoustic sensing (DAS). In DAS, observed data, which is comprised of multichannel data, has suffered from severe noise
In applications involving sensitive data, such as finance and healthcare, the necessity for preserving data privacy can be a significant barrier to machine learning model development.Differential privacy (DP) has emerged as one canonical standard for provable privacy. However, DPs strong theoretical
Over the last decade, silicon photonic neural networks have demonstrated the possibility of photonic-enabled machine learning at the edge. These systems enable low-latency ultra-wideband classifications, channel estimations, and many other signal characterization tasks within wireless environments. While
We report the first field demonstration of 4D link tomography using a commercial transponder, which offers distance, time, frequency, and polarization-resolved monitoring. This scheme enables autonomous transponders that identify locations of multiple QoT degradation causes.
We develop a heuristic solution to effectively optimize the placement of multi-channel distributed fiber optic sensors in mesh optical fiber cable networks. The solution has beenimplemented in a field network to provide continuous monitoring.
We proposed the use of BOTDA as a monitoring tool to identify fiber types present in deployed hybrid-span fiber cables, to assist in network planning, setting optimal launch powers, and selecting correct modulation formats.
We propose a method to estimate the input power dependency of the transceiver BER-OSNR characteristic. Experiments using commercial transceivers show that estimation error in Q-factor is less than 0.2 dB.
We implement a cascaded learning framework using component-level EDFA models for optical power spectrum prediction in multi-span networks, achieving a mean absolute error of 0.17 dB across 6 spans and 12 EDFAs with only one-shot measurement.
A method is proposed in order to improve QoT-E by calibrating the physical model parameters of an optical link post-installation, using only total power monitors integrated into the EDFAs and an OSA at the receiver.
We introduce a novel scheme to detect and localize optical network anomaly using forward transmission sensing, and develop a heuristic algorithm to optimize the route selection. The performance is verified via simulations and network experiments.
In the evolving Artificial Intelligence (AI) era, the need for real-time algorithm processing in marine edge environments has become a crucial challenge. Data acquisition, analysis, and processing in complex marine situations require sophisticated and highly efficient platforms. This study optimizes
One of the key metrics of interest for stream processing applications is latency, which indicates the total time it takes for the application to process and generate insights from streaming input data. For mission-critical video analytics applications like surveillance and monitoring, it is of paramount
Imitation learning, which learns agent policy by mimicking expert demonstration, has shown promising results in many applications such as medical treatment regimes and self-driving vehicles. However, it remains a difficult task to interpret control policies learned by the agent. Difficulties mainly come
Self-consistency has emerged as a powerful method for improving the accuracy of short answers generated by large language models. As previously defined, it only concerns the accuracy of a final answer parsed from generated text. In this work, we extend the idea to open response generation, by integrating
Falling trees or their limbs can cause power lines to break or sag, sometimes resulting in devastating wildfires. Conventional protections such as circuit breakers, overcurrent relays and automatic circuit reclosers may clear short circuits caused by tree contact, but they may not detect cases where
Recent advances in optical fiber sensing have enabled telecom network operators to monitor their fiber infrastructure while generating new revenue in various application scenarios, including data center interconnect, public safety, smart cities, and seismic monitoring. However, given the high utilization
There are increasing requirements for data center interconnection (DCI) services, which use fiber to connect any DC distributed in a metro area and quickly establish high-capacity optical paths between cloud services and mobile edge computing and the users. In such networks, coherent transceivers with
Radio-frequency interference is a growing concern as wireless technology advances, with potentially life-threatening consequences like interference between radar altimeters and 5?G cellular networks. Mobile transceivers mix signals with varying ratios over time, posing challenges for conventional digital
Distributed massive MIMO networks are envisioned to realize cooperative multi-point transmission in next-generation wireless systems. For efficient cooperative hybrid beamforming, the cluster of access points (APs) needs to obtain precise estimates of the uplink channel to perform reliable downlink precoding.
JPEG remains one of the most widespread lossy image coding methods. However, the non-differentiable nature of JPEG restricts the application in deep learning pipelines. Several differentiable approximations of JPEG have recently been proposed to address this issue. This paper conducts a comprehensive
Time series domain adaptation stands as a pivotal and intricate challenge with diverse applications, including but not limited to human activity recognition, sleep stage classification, and machine fault diagnosis. Despite the numerous domain adaptation techniques proposed to tackle this complex problem,
The recent progress in language-based object detection with an open-vocabulary can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations. Training from image captions with grounded bounding boxes (ground truth or pseudo-labeled) enable the models
Modern video analytics applications comprise multiple microservices chained together as pipelines and executed on container orchestration platforms like Kubernetes. Kubernetes automatically handles the scaling of these microservices for efficient application execution. There are two popular choices for
Meta-learning enables quick adaptation of machine learning models to new tasks with limited data. While tasks could come from varying distributions in reality, most of the existing meta-learning methods consider both training and testing tasks as from the same uni-component distribution, overlooking
Visual question answering (VQA) has traditionally been treated as a single-step task where each question receives the same amount of effort, unlike natural human question-answering strategies. We explore a question decomposition strategy for VQA to overcome this limitation. We probe the ability of recently
Open-ended Commonsense Reasoning is defined as solving a commonsense question without providing 1) a short list of answer candidates and 2) a pre-defined answer scope. Conventional ways of formulating the commonsense question into a question-answering form or utilizing external knowledge to learn retrieval-based
In protein biophysics, the separation between the functionally important residues (forming the active site or binding surface) and those that create the overall structure (the fold) is a well-established and fundamental concept. Identifying and modifying those functional sites is critical for protein
Weakly-Supervised Concealed Object Segmentation (WSCOS) aims to segment objects well blended with surrounding environments using sparsely-annotated data for model training. It remains a challenging task since (1) it is hard to distinguish concealed objects from the background due to the intrinsic similarity
Data augmentation techniques, such as image transformations and combinations, are highly effective at improving the generalization of computer vision models, especially when training data is limited. However, such techniques are fundamentally incompatible with differentially private learning approaches,
Evaluating the performance of autonomous vehicle planning algorithms necessitates simulating long-tail traffic scenarios. Traditional methods for generating safety-critical scenarios often fall short in realism and controllability. Furthermore, these techniques generally neglect the dynamics of agent
Although planning is a crucial component of the autonomous driving stack, researchers have yet to develop robust planning algorithms that are capable of safely handling the diverse range of possible driving scenarios. Learning-based planners suffer from overfitting and poor long-tail performance. On
Lensless cameras multiplex the incoming light before it is recorded by the sensor. This ability to multiplex the incoming light has led to the development of ultra-thin, high-speed, and single-shot 3D imagers. Recently, there have been various attempts at demonstrating another useful aspect of lensless
Unmanned aerial vehicles (UAVs), such as drones, can carry high-performance computing devices (e.g., servers) to provide flexible and on-demand data processing services for theusers in the network edge, leading to the so-called mobile edge computing. In mobile edge computing, researchers have already
Low-complexity estimation and correction of carrier frequency offset (CFO) are essential in orthogonal frequency division multiplexing (OFDM). In this paper, we propose a low overhead blind CFO estimation technique based on cyclic prefix (CP), in multi-input multi-output (MIMO)-OFDM systems. We propose
Deep learning based joint source-channel coding (JSCC) has demonstrated significant advancements in data reconstruction compared to separate source-channel coding (SSCC). This superiority arises from the suboptimality of SSCC when dealing with finite block-length data. Moreover, SSCC falls short in reconstructing
Logs play a crucial role in system monitoring and debugging by recording valuable system information, including events and status. Although various methods have been proposed to detect anomalies in log sequences, they often overlook the significance of considering relationships among system components,
Brood X is the largest of the 15 broods of periodical cicadas, and individuals from this brood emerged across the Eastern United States in spring 2021. Using distributed acoustic sensing (DAS) technology, the activity of Brood X cicadas was monitored in their natural environment in Princeton, NJ. Critical
mmWave devices can broadcast multiple spatially-separated data streams simultaneously in order to increase data transfer rates. Data transfer can, however, be compromised by interference. Photonic blind interference cancellation systems offer a power-efficient means of mitigating interference, but previous
We present the field trial of an innovative neural network and DAS-based technique, employing a pre-trained CNN fine-tuning strategy for effective rain detection and classification within two practical scenarios.
A novel acoustic transmission technique using distributed acoustic sensors is introduced. By choosing better incident angles for smaller fading and employing an 8- channel beamformer, over 10KB data is transmitted at a 6.4kbps data rate.
Acoustic data transmission with the Orthogonal Frequency Division Multiplexing (OFDM) signal has been demonstrated using a Distributed Acoustic Sensor (DAS) based on Phase-sensitive Optical Time-Domain Reflectometry (?-OTDR).
For example, in machine translation tasks, to achieve bidirectional translation between two languages, the source corpus is often used as the target corpus, which involves the training of two models with opposite directions. The question of which one can adapt most quickly to a domain shift is of significant
Graph neural networks (GNNs) have achieved great success in dealing with graph-structured data that are prevalent in the real world. The core of graph neural networks is the message passing mechanism that aims to generate the embeddings of nodes by aggregating the neighboring node information. However,
Data crowdsourcing is an increasingly pervasive and lifestyle-changing technology due to the flywheel effect that results from the interaction between the Internet of Things and Cloud Computing. This paper presents the Citizen Science for the Sea with Information Technologies (C4Sea-IT) framework. It
This paper studies source-free domain adaptive fundus image segmentation which aims to adapt a pretrained fundus segmentation model to a target domain using unlabeled images. This is a challenging task because it is highly risky to adapt a model only using unlabeled data. Most existing methods tackle
Heterogeneous image fusion (HIF) aims to enhance image quality by merging complementary information of images captured by different sensors. Early model-based approaches have strong interpretability while being limited by non-adaptive feature extractors with poor generalizability.
Recent few-shot video classification (FSVC) works achieve promising performance by capturing similarity across support and query samples with different temporal alignment strategies or learning discriminative features via Transformer block within each episode. However, they ignore two important issues:
Few-Shot Segmentation FSS (Few-shot segmentation) aims to segment a target class using a small number of labeled images (support set). To extract information relevant to the target class, a dominant approach in best performing FSS methods removes background features using a support mask. We observe that
Federated learning casts a light on the collaboration of distributed local clients with privacy protected to attain a more generic global model. However, significant distribution shift in input/label space across different clients makes it challenging to well generalize to all clients, which motivates
Overfitting to the source domain is a common issue in gradient-based training of deep neural networks. To compensate for the over-parameterized models, numerous regularization techniques have been introduced such as those based on dropout. While these methods achieve significant improvements on classical
We aim to train a multi-task model such that users can adjust the desired compute budget and relative importance of task performances after deployment, without retraining. This enables optimizing performance for dynamically varying user needs, without heavy computational overhead to train and save models
Modern computer vision services often require users to share raw feature descriptors with an untrusted server. This presents an inherent privacy risk, as raw descriptors may be used to recover the source images from which they were extracted. To address this issue, researchers recently proposed privatizing
Language-based object detection is a promising direction towards building a natural interface to describe objects in images that goes far beyond plain category names. While recent methods show great progress in that direction, proper evaluation is lacking. With OmniLabel, we propose a novel task definition,
We report significant noise reduction in distributed acoustic sensing (DAS) link using enhanced-scatter fibre (ESF). The longest reach of 195km DAS link without inline amplifications is also demonstrated. We further present demonstration of simultaneous fibre-optic sensing and 400Gb/s data transmissions
We demonstrated under six minutes automatic provisioning of optical paths over field- deployed alien access links and WDM carrier links using commercial-grade ROADMs, whitebox mux-ponders, and multi-vendor transceivers. With channel probing, transfer learning, and Gaussian noise model, we achieved an
We review various use cases of distributed-fiber-optic-sensing and machine-learning technologies that offer advantages to telecom fiber networks on existing fiber infrastructures. Byleveraging an edge-AI platform, perimeter intrusion detection and impulsive acoustic event classification can be performed
Internet-of-things (IoTs) deploy a massive number of sensors to monitor the system and environment. Anomaly detection on sensor data is an important task for IoT maintenance and operation. In real applications, the occurrence of a system-level incident usually involves hundreds of abnormal sensors, making
Internet-of-things (IoTs) deploy a massive number of sensors to monitor the system and environment. Anomaly detection on sensor data is an important task for IoT maintenance and operation. In real applications, the occurrence of a system-level incident usually involves hundreds of abnormal sensors, making
Deep Video Codec Control Lossy video compression is commonly used when transmitting and storing video data. Unified video codecs (e.g., H.264 or H.265) remain the emph(Unknown sysvar: (de facto)) standard, despite the availability of advanced (neural) compression approaches. Transmitting videos in the
Read AutoTCL: Automated Time Series Contrastive Learning with Adaptive Augmentations publication. Modern techniques like contrastive learning have been effectively used in many areas, including computer vision, natural language processing, and graph-structured data. Creating positive examples that assist
Read FedSkill: Privacy Preserved Interpretable Skill Learning via Imitation publication. Imitation learning that replicates experts’ skills via their demonstrations has shown significant success in various decision-making tasks. However, two critical challenges still hinder the deployment of imitation
The task of root cause analysis (RCA) is to identify the root causes of system faults/failures by analyzing system monitoring data. Efficient RCA can greatly accelerate system failure recovery and mitigate system damages or financial losses. However, previous research has mostly focused on developing
The goal of root cause analysis is to identify the underlying causes of system problems by discovering and analyzing the causal structure from system monitoring data. It is indispensable for maintaining the stability and robustness of large-scale complex systems. Existing methods mainly focus on the
Imitation learning has achieved great success in many sequential decision-making tasks, in which a neural agent is learned by imitating collected human demonstrations. However, existing algorithms typically require a large number of high-quality demonstrations that are difficult and expensive to collect.
With the escalating prevalence of Internet of Things (IoTs) in critical infrastructure, the requirement for efficient and effective anomaly detection solution becomes increasingly important. Unfortunately, most prior research works have largely overlooked to adapt detection criteria for different operational
When faced with a large number of product reviews, it is not clear that a human can remember all of them and weight opinions representatively to write a good reference summary. Wepropose an automatic metric to test the prevalence of the opinions that a summary expresses, based on counting the number
Recent studies show promising performance in open-vocabulary object detection (OVD) using pseudo labels (PLs) from pretrained vision and language models (VLMs). However, PLs generated by VLMs are extremely noisy due to the gap between the pretraining objective of VLMs and OVD, which blocks further advances
The recent trend towards Personalized Federated Learning (PFL) has garnered significant attention as it allows for the training of models that are tailored to each client while maintaining data privacy. However, current PFL techniques primarily focus on modeling the conditional distribution heterogeneity
Confidence and Dispersity Speak: Characterizing Prediction Matrix for Unsupervised Accuracy Estimation This work aims to assess how well a model performs under distribution shifts without using labels. While recent methods study prediction confidence, this work reports prediction dispersity is another
The lack of visibility to behind-the-meter (BTM) PVs causes many challenges to utilities. By constructing a dictionary of typical load patterns based on daily average temperatures and power consumptions, this paper proposes a temperature-informed data-driven approach for disaggregating BTM PV generation.
In 2008, parallel computing posed significant challenges due to the complexities of parallel programming and the bottlenecks associated with efficient parallel execution. Inspired by the remarkable scalability achieved by networking and storage systems in handling extensive packet traffic and persistent
Unsupervised anomaly detection approaches have been widely accepted in applications for industrial systems. Industrial systems often operate with multiple modes since they work for multiple purposes or under different conditions. In order to deal with the difficulty of anomaly detection due to multiple
We present Application in a Box (AnB) product concept aimed at simplifying the deployment and operation of remote 5G applications. AnB comes pre-configured with all necessary hardware and software components, including sensors like cameras, hardware and software components for a local 5G wireless network,
IoT sensors, especially video cameras, are ubiquitously deployed around the world to perform a variety of computer vision tasks in several verticals including retail, health- care, safety and security, transportation, manufacturing, etc. To amortize their high deployment effort and cost, it is desirable
Edge Intelligence has received attention in the recent times for its potential towards improving responsiveness, reducing the cost of data transmission, enhancing security and privacy, and enabling autonomous decisions by edge devices. However, edge devices lack the power and compute resources necessary
Cross-Domain Detection (XDD) aims to train a domain-adaptive object detector using unlabeled images from a target domain and labeled images from a source domain. Existing approaches achieve this either by aligning the feature maps or the region proposals from the two domains, or by transferring the style
Camouflaged object detection (COD) aims to address the tough issue of identifying camouflaged objects visually blended into the surrounding backgrounds. COD is a challenging task due to the intrinsic similarity of camouflaged objects with the background, as well as their ambiguous boundaries. Existing
Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video starting from an image (e.g., a person’s face) and a condition (e.g., an action class label like smile). The key challenge of the cI2V task lies in the simultaneous generation of realistic spatial appearance and temporal
Diffusion probabilistic models have achieved enormous success in the field of image generation and manipulation. In this paper, we explore a novel paradigm of using the diffusion model and classifier guidance in the latent semantic space for compositional visual tasks. Specifically, we train latent diffusion
Monocular 3D object localization in driving scenes is a crucial task, but challenging due to its ill-posed nature. Estimating 3D coordinates for each pixel on the object surface holds great potential as it provides dense 2D-3D geometric constraints for the underlying PnP problem. However, high-quality
Finetuning a large vision language model (VLM) on a target dataset after large scale pretraining is a dominant paradigm in visual question answering (VQA). Datasets for specialized tasks such as knowledge-based VQA or VQA in non natural-image domains are orders of magnitude smaller than those for general-purpose
Source-free domain adaptation (SFDA) is an emerging research topic that studies how to adapt a pretrained source model using unlabeled target data. It is derived from unsupervised domain adaptation but has the advantage of not requiring labeled source data to learn adaptive models. This makes it particularly
Early event detection aims to detect events even before the event is complete. However, most of the existing methods focus on an event with a single label but fail to be applied to cases with multiple labels. Another non-negligible issue for early event detection is a prediction with overconfidence due
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task agnostic foundation for a wide range of applications. The great promise of LLMs as general task solvers motivated people to extend their functionality largely beyond
Semi-Supervised Domain Adaptation (SSDA) is a recently emerging research topic that extends from the widely-investigated Unsupervised Domain Adaptation (UDA) by further having a few target samples labeled, i.e., the model is trained with labeled source samples, unlabeled target samples as well as a few
Utility pole detection and localization is the most fundamental application in aerial-optic cables using distributed acoustic sensing (DAS). The existing pole localization method recognizes the hammer knock signal on DAS traces by learning from knocking vibration patterns. However, it requires many efforts
We demonstrate, for the first time, real-time blind source separation of interfering GHz transmitters using photonic weights controlled by an RF-System-on-Chip FPGA. This analog system achieves multi-antenna signal separation with millisecond execution latency.
T cells monitor the health status of cells by identifying foreign peptides displayed on their surface. T-cell receptors (TCRs), which are protein complexes found on the surface of T cells, are able to bind to these peptides. This process is known as TCR recognition and constitutes a key step for immune
Access to high-quality data is an important barrier in the digital analysis of urban settings, including applications within computer vision and urban design. Diverse forms of data collected from sensors in areas of high activity in the urban environment, particularly at street intersections, are valuable
We review various applications of distributed fiber optic sensing (DFOS) and machine learning (ML) technologies that particularly benefit telecom operators’ fiber networks and businesses. By leveraging relative phase shift of the reflectance of coherent Rayleigh, Brillouin and Raman scattering of light
Millimeter‑wave (mmWave) communications is a key enabler towards realizing enhanced Mobile Broadband (eMBB) as a key promise of 5G and beyond, due to the abundance of bandwidth available at mmWave bands. An mmWave coverage map consists of blind spots due to shadowing and fading especially in dense
Time Division Duplex (TDD)-based distributed massive MIMO systems are envisioned as candidate solution for the physical layer of 6G multi-antenna systems supporting cooperative hybrid beamforming that heavily relies on the obtained uplink channel estimates for efficient coherent downlink precoding. However,
Imitation learning that mimics experts’ skills from their demonstrations has shown great success in discovering dynamic treatment regimes, i.e., the optimal decision rules to treat an individual patient based on related evolving treatment and covariate history. Existing imitation learning methods,
Dependence of EDFA gain shape on input power and input spectrum shape is modelled using a simple neural network-based architecture for amplifiers with different gains and output powers. The model can predict the gain within ±0.1 dB. Even though the model has good success predicting the performance of
Simultaneous phase and polarization sensing with span length resolution using the supervisory path is demonstrated. It is shown that by measuring polarization rotation matrix of the return paths, instead of monitoring only the state of polarization, location of the polarization disturbance can be determined
It has been demonstrated that prompt tuning is highly effective in efficiently eliciting knowledge from language models (LMs). However, the prompt tuning still lags behind fine tuning, especially when the LMs are small. P tuning v2 (Liu et al., 2021b) makes it comparable with finetuning by adding continuous
Coexistence of real-time constant-amplitude distributed acoustic sensing (DAS) and 400GbE signals is verified by field trial over metro fibers, demonstrating no QoT impact during co-propagation and supporting preemptive DAS-informed optical path switching before link failure
Polarization-based, multi-span sensing over a link with reflection-back circuits is demonstrated experimentally. By measuring rotation matrices instead of just monitoring polarization, a 35 dB extinction in localization is achieved regardless of the disturbance magnitude.
Modern applications are designed as an interacting set of microservices, and these applications are typically deployed on container orchestration platforms like Kubernetes. Several attractive features in Kubernetes make it a popular choice for deploying applications, and automatic scaling is one such
Text summarization has been a crucial problem in natural language processing (NLP) for several decades. It aims to condense lengthy documents into shorter versions while retaining the most critical information. Various methods have been proposed for text summarization, including extractive and abstractive
We report the first distributed acoustic sensing (DAS) experiment with over >1,000 km reach on a hybrid link comprising of a mixture of field and lab fibers with bi-directional inline Raman amplification after each span. We used 20× frequency-diversity chirped-pulses for the probe signal,and recovered
Various contrastive learning approaches have been proposed in recent years and have achieved significant empirical success. While effective and prevalent, contrastive learning has been less explored for time series data. A key component of contrastive learning is to select appropriate augmentations,
Source-free object detection (SFOD) aims to transfer a detector pre-trained on a label-rich source domain to an unlabeled target domain without seeing source data. While most existing SFOD methods generate pseudo labels via a source-pretrained model to guide training, these pseudo labels usually contain
We present a manhole localization method based on distributed fiber optic sensing and weakly supervised machine learning techniques. For the first time to our knowledge, ambient environment data is used for underground cable mapping with the promise of enhancing operational efficiency and reducing field
In recent years, the widespread use of drones has led to serious concerns about safety and privacy. Drone detection using microphone arrays has proven to be a promising method. However, it is challenging for microphones to serve large-scale applications due to the issues of synchronization, complexity,
Distributed Fiber Optic Sensing (DFOS) systems rely on measuring and analyzing different properties of the backscattered light of an optical pulse propagating along a fiber cable. DFOS systems can measure temperature, strain, vibrations, or acoustic excitations on the fiber cable and to their unique
Motivation: MHC Class I protein plays an important role in immunotherapy by presenting immunogenic peptides to anti-tumor immune cells. The repertoires of peptides for various MHC Class I proteins are distinct, which can be reflected by their diverse binding motifs. To characterize binding motifs for
This paper presents a framework for real-time concealed weapon detection (CWD) on 3D radar images for walk-through screening systems. The walk-through screening system aims to ensure security in crowded areas by performing CWD on walking persons, hence it requires an accurate and real-time detection
This paper presents an approach to train a unified deep network that simultaneously solves multiple human-related tasks. A multi-task framework is favorable for sharing information across tasks under restricted computational resources. However, tasks not only share information but may also compete for
Several recent studies investigate TCR-peptide/-pMHC binding prediction using machine learning or deep learning approaches. Many of these methods achieve impressive results on test sets, which include peptide sequences that are also included in the training set. In this work, we investigate how state
Devices with limited computing resources use smaller AI models to achieve low-latency inferencing. However, model accuracy is typically much lower than the accuracy of a bigger model that is trained and deployed in places where the computing resources are relatively abundant. We describe DyCo, a novel
We present a multi-sequence generalization of Variational Information Bottleneck and call the resulting model Attentive Variational Information Bottleneck (AVIB). Our AVIB model leverages multi-head self-attention to implicitly approximate a posterior distribution over latent encodings conditioned on
Although many anomaly detection approaches have been developed for multivariate time series data, limited effort has been made in federated settings in which multivariate time series data are heterogeneously distributed among different edge devices while data sharing is prohibited. In this paper, we
Graph Neural Network (GNN), as a powerful representation learning model on graph data, attracts much attention across various disciplines. However, recent studies show that GNN is vulnerable to adversarial attacks. How to make GNN more robust? What are the key vulnerabilities in GNN? How to address the
Knowledge Graphs (KGs) store information in the form of (head, predicate, tail)-triples. To augment KGs with new knowledge, researchers proposed models for KG Completion (KGC) tasks such as link prediction, i.e., answering (h, p, ?) or (?, p, t) queries. Such models are usually evaluated with averaged
Analogical reasoning is the process of discovering and mapping correspondences from a target subject to a base subject. As the most well-known computational method of analogical reasoning, Structure-Mapping Theory (SMT) abstracts both target and base subjects into relational graphs and forms the cognitive
Personalized Federated Learning (PFL) which collaboratively trains a federated model while considering local clients under privacy constraints has attracted much attention. Despite its popularity, it has been observed that existing PFL approaches result in sub-optimal solutions when the joint distribution
We review recent advances in distributed fiber optic sensing (DFOS) and their applications. The scattering mechanisms in glass, which are exploited for reflectometry-based DFOS, are Rayleigh, Brillouin, and Raman scatterings. These are sensitive to either strain and/or temperature, allowing optical fiber
Cameras are increasingly being deployed in cities, enterprises and roads world-wide to enable many applications in public safety, intelligent transportation, retail, healthcare and manufacturing. Often, after initial deployment of the cameras, the environmental conditions and the scenes around these
Serverless edge computing aims to deploy and manage applications so that developers are unaware of challenges associated with dynamic management, sharing, and maintenance of the edge infrastructure. However, this is a non-trivial task because the resource usage by various edge applications varies based
We perform the availability analysis for various reliable distributed fiber optic sensor placement schemes in the circumstances of multiple failures. The study can help the network carriers to select the optimal protection scheme for their network sensing services, considering both service availability
Distributed fiber optic sensing systems use long section of optical fiber as the sensing media. Therefore, the fiber characteristics determines the sensing capability and performance. In this presentation, various types of specialty optical fibers and their sensing applications will be introduced and
This paper presents a multi-sensor feature fusion (MSFF) neural network comprised of two inception layer-type multiple channel feature fusion (MCFF) networks for both inner-sensor and cross-sensor feature fusion in conjunction with a deep residual neural network (ResNet) for accurate grease life assessment
In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition running on remote servers critically rely on surveillance cameras to capture high-quality video streams in order to achieve high accuracy. Modern IP cameras come with a large number of camera parameters
Urban flooding is becoming a common and devastating hazard, which causes life loss and economic damage. Monitoring and understanding urban flooding in a highly localized scale is a challenging task due to the complicated urban landscape, intricate hydraulic process, and the lack of high-quality and resolution
Beam search algorithms have been proposed to align the beams from an access point to a user equipment. The process relies on sending beams from a set of scanning beams (SB) and tailoring a transmission beam (TB) using the received feedback. In this paper, we discuss a fundamental trade-off between the
Building robust and generic object detection frameworks requires scaling to larger label spaces and bigger training datasets. However, it is prohibitively costly to acquire annotations for thousands of categories at a large scale. We propose a novel method that leverages the rich semantics available
With over a billion sold each year, cameras are not only becoming ubiquitous, but are driving progress in a wide range of domains such as mixed reality, robotics, and more. However, severe concerns regarding the privacy implications of camera-based solutions currently limit the range of environments
While it is desirable to train segmentation models on an aggregation of multiple datasets, a major challenge is that the label space of each dataset may be in conflict with one another. To tackle this challenge, we propose UniSeg, an effective and model-agnostic approach to automatically train segmentation
Self-supervised vision-language pretraining from pure images and text with a contrastive loss is effective, but ignores fine-grained alignment due to a dual-stream architecture that aligns image and text representations only on a global level. Earlier, supervised, non-contrastive methods were capable
It is a common practice to think of a video as a sequence of images (frames), and re-use deep neural network models that are trained only on images for similar analytics tasks on videos. In this paper, we show that this “leap of faith” that deep learning models that work well on images will also
Group Activity Recognition detects the activity collectively performed by a group of actors, which requires compositional reasoning of actors and objects. We approach the task by modeling the video as tokens that represent the multi-scale semantic concepts in the video. We propose COMPOSER, a Multiscale
Roadside LiDAR (Light Detection and Ranging) sensors are recently being explored for intelligent transportation systems aiming at safer and faster traffic management and vehicular operations. A key challenge in such systems is to efficiently transfer massive point-cloud data from the roadside LiDAR devices
We review sensing fusion results of integrating fiber sensing with video for machine-learning-based localization and classification of impulsive acoustic event detection. Classification accuracy >97% was achieved on aerial coils, and >99% using fiber-based signal enhancers.
Anomaly Detection (AD) aims to find defective patterns or abnormal samples among data, and has been a hot research topic due to various real-world applications. While various AD methods have been proposed, most of them assume the availability of a clean (anomaly-free) training set, which, however, may
5G-LAN is an enterprise local area network (LAN) that leverages 5G technology for wireless connectivity instead of WiFi. 5G technology is unique: it uses network slicing to distinguish customers in the same traffic class using new QoS technologies in the RF domain. This unique ability is not supported
Product reviews may have complex discourse including coreference and bridging relations to a main product, competing products, and interacting products. Current approaches to aspect-based sentiment analysis (ABSA) and opinion summarization largely ignore this complexity. On the other hand, existing systems
As a key component of e-commerce computing, product representation learning (PRL) provides benefits for a variety of applications, including product matching, search, and categorization. The existing PRL approaches have poor language understanding ability due to their inability to capture contextualized
Internet of things (IoT) applications deploy massive number of sensors to monitor the system and environment. Anomaly detection on streaming sensor data is an important task for IoT maintenance and operation. However, there are two major challenges for anomaly detection in real IoT applications: (1)
Multi-source Inductive Knowledge Graph Transfer Large-scale information systems, such as knowledge graphs (KGs), enterprise system networks, often exhibit dynamic and complex activities. Recent research has shown that formalizing these information systems as graphs can effectively characterize the entities
We explore two fiber sensing methods which enables coexistence with data transmission on DWDM fiber networks. Vibration detection and localization can be achieved by extracting optical phase from modified coherent transponders. Frequency-diverse chirped-pulse DAS with all-Raman amplification can improve
A big challenge in changing a monolithic application into a performant microservices-based application is the design of efficient mechanisms for microservices to communicate with each other. Prior proposals range from custom point-to-point communication among microservices using protocols like gRPC to
The applications of Internet-of-things (IoT) deploy a massive number of sensors to monitor the system and environment. Anomaly detection on streaming sensor data is an important task for IoT maintenance and operation. In real IoT applications, many sensors report categorical values rather than numerical
A variety of efforts have been put into sensing and modeling of pavements. Such capability is commonly validated with experimental data and used as reference for damage detection and other structural changes. Finite element models (FEM) often provides a high fidelity physics-base benchmark to evaluate
Among the power transmission infrastructures, the low-voltage overhead power lines are specifically critical, due to the complicated roadside environments and the significant number of connections to the end utility users. Maintaining of such a large size grid with mostly wood poles is a challenging
The plethora of sensors in our commodity devices provides a rich substrate for sensor-fused tracking. Yet, today’s solutions are unable to deliver robust and high tracking accuracies across multiple agents in practical, everyday environments – a feature central to the future of immersive and collaborative
5G services and applications explicitly reserve compute and network resources in today’s complex and dynamic infrastructure of multi-tiered computing and cellular networking to ensure application-specific service quality metrics, and the infrastructure providers charge the 5G services for the resources
Meta learning algorithms for few-shot video recognition use complex, episodic training but they often fail to learn effective feature representations. In contrast, we propose a new and simpler few-shot video recognition method that does not use meta-learning, but its performance compares well with the
This paper presents a deep learning-based framework to detect and localize the pedestrians’ anomaly behaviors in videos captured at the grade crossing. A skeleton detection and tracking algorithm are employed to capture the key point trajectories of body movements of the pedestrians. A deep recurrent
It is critical and important to detect anomalies in event sequences, which becomes widely available in many application domains. Indeed, various efforts have been made to capture abnormal patterns from event sequences through sequential pattern analysis or event representation learning. However, existing
Promising progress has been made toward learning efficient time series representations in recent years, but the learned representations often lack interpretability and do not encode semantic meanings by the complex interactions of many latent factors. Learning representations that disentangle these latent
Predicting the interactions between T-cell receptors (TCRs) and peptides is crucial for the development of personalized medicine and targeted vaccine in immunotherapy. Current datasets for training deep learning models of this purpose remain constrained without diverse TCRs and peptides. To combat the
For the first time, we demonstrate detection and classification of rain intensity using Distributed Acoustic Sensing (DAS). An artificial neural network was applied for rain intensity classification and high precision of over 96% was achieved.
We review multiple use cases over deployed networks including co-existing sensing/data transmission, cable cut prevention and perimeter intrusion detection to realize telecom infrastructure can be sensing backbones instead of the sole function of data transmission.
We report distributed-fiber-optic-sensing results on impulsive acoustic events localization/classification over telecom networks. A deep-learning-based model was trained to classify starter-gun and fireworks signatures with high accuracy of > 99% using fiber-based-signal-enhancer and >97% using aerial
We review recent advances aimed at increasing the reach of distributed fiber optic sensing with simultaneous data transmission. We review two methods based on measurement of accumulated phase on telecom signals, and chirp-pulsed DAS with inline amplification and frequency diversity.
We propose a new method to detect acoustic signals by matching distributed acoustic sensing data with simulation. In the simulation of the dynamic strain on an optical fiber, the optical fiber layouts and the gauge length are properly incorporated. We apply the proposed method to the acoustic-source
A large number of traffic collisions occur as a result of obstructed sight lines, such that even an advanced driver assistance system would be unable to prevent the crash. Recent work has proposed the use of around-the-corner radar systems to detect vehicles, pedestrians, and other road users in these
Although progress has been made for text-to-image synthesis, previous methods fall short of generalizing to unseen or underrepresented attribute compositions in the input text. Lacking compositionality could have severe implications for robustness and fairness, e.g., inability to synthesize the face
Design of multitasking deep learning models has mostly focused on improving the accuracy of the constituent tasks, but the challenges of efficiently deploying such models in a device-edge collaborative setup (that is common in 5G deployments) has not been investigated. Towards this end, in this paper,
Multi-task learning commonly encounters competition for resources among tasks, specifically when model capacity is limited. This challenge motivates models which allow control over the relative importance of tasks and total compute cost during inference time. In this work, we propose such a controllable
Convolutional Neural Networks have achieved remarkable success in face recognition, in part due to the abundant availability of data. However, the data used for training CNNs is often imbalanced. Prior works largely focus on the long-tailed nature of face datasets in data volume per identity or focus
Test-time adaptation approaches have recently emerged as a practical solution for handling domain shift without access to the source domain data. In this paper, we propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation. We find that, directly applying existing
Humans have the ability to accumulate knowledge of new tasks in varying conditions, but deep neural networks of-ten suffer from catastrophic forgetting of previously learned knowledge after learning a new task. Many recent methods focus on preventing catastrophic forgetting under the assumption of train
Self-supervised video representation learning has been shown to effectively improve downstream tasks such as video retrieval and action recognition. In this paper, we present the Cascade Positive Retrieval (CPR) that successively mines positive examples w.r.t. the query for contrastive learning in a
We propose an end-to-end network that takes a single perspective RGB image of a complex road scene as input, to produce occlusion-reasoned layouts in perspective space as well as a parametric bird’s-eye-view (BEV) space. In contrast to prior works that require dense supervision such as semantic labels
Recently, the distributed fiber optic sensing (DFOS) techniques have advanced rapidly. There emerges various types of DFOS sensors that can monitor physical parameters such as temperature, strain, and vibration. With these DFOS sensors deployed, the telecom networks are capable of offering additional
Sound Event Early Detection (SEED) is an essential task in recognizing the acoustic environments and soundscapes. However, most of the existing methods focus on the offline sound event detection, which suffers from the over-confidence issue of early-stage event detection and usually yield unreliable
We study few-shot debugging of transformer based natural language understanding models, using recently popularized test suites to not just diagnose but correct a problem. Given a few debugging examples of a certain phenomenon, and a held-out test set of the same phenomenon, we aim to maximize accuracy
Massive MIMO and hybrid beamforming are among the key physical layer technologies for the next generation wireless systems. In the last stage of the hybrid beamforming, the goal is to generate sharp beam with maximal and preferably uniform gain. We highlight the shortcomings of uniform linear arrays
We experimentally demonstrated the real-time operation of a photonic neuron with a self-connection, a prerequisite for integrated recurrent neural networks (RNNs). After studying two applications, we propose a photonics-assisted platform for time series prediction and classification.
Learning fine-grained embeddings is essential for extending the generalizability of models pre-trained on “coarse” labels (e.g., animals). It is crucial to fields for which fine-grained labeling (e.g., breeds of animals) is expensive, but fine-grained prediction is desirable, such as medicine. The dilemma
With the growth of 5G, Internet of Things (IoT), edge computing and cloud computing technologies, the infrastructure (compute and network) available to emerging applications (AR/VR, autonomous driving, industry 4.0, etc.) has become quite complex. There are multiple tiers of computing (IoT devices, near
We propose a reinforcement learning-based approach to query object localization, for which an agent is trained to localize objects of interest specified by a small exemplary set. We learn a transferable reward signal formulated using the exemplary set by ordinal metric learning. Our proposed method enables
This paper studies zero-shot domain adaptation where each domain is indexed on a multi-dimensional array, and we only have data from a small subset of domains. Our goal is to produce predictors that perform well on unseen domains. We propose a model which consists of a domain-invariant latent representation
In-band full-duplex (FD) communication has emerged as one of the promising techniques to improve data rates in next generation wireless systems. Typical FD scenarios considered in the literature assume FD base stations (BSs) and half-duplex (HD) users activated either in uplink (UL) or downlink (DL),
In pursuance of the unused spectrum in higher frequencies, millimeter wave (mmWave) bands have a pivotal role. However, the high path-loss and poor scattering associated with mmWave communications highlight the necessity of employing effective beamforming techniques. In order to efficiently search for
A key barrier to building performant, remotely managed and self-optimizing multi-sensor, distributed stream processing edge applications is high programming complexity. We recently proposed DataX [1], a novel platform that improves programmer productivity by enabling easy exchange, transformations, and
To overcome the high pathloss and the intense shadowing in millimeterwave (mmWave) communications, effective beamforming schemes are required which incorporate narrow beams with high beamforming gains. The mm Wave channel consists of a few spatial clusters each associated with an angle of departure (AoD).
We report the first distributed acoustic sensing (DAS) results over>1,000 km on a field-lab hybrid link using chirped-pulses with correlation detection and 20× frequency-diversity, achieving a sensitivity of 100 pa/√Hz at 20-meters spatial resolution.
We demonstrate distributed acoustic sensing (DAS) over a bidirectional datacenter link which uses self-homodyne coherent detection for the data signal. Frequency multiplexing allows sharing the optoelectronic hardware, and enables DAS as an auxiliary function.
We review the distributed-fiber-sensing field trial results over deployed telecom networks. With local AI processing, real-time detection, and localization of abnormal events with cable damage threat assessment are realized for cable self-protection.
We report field test results of facility perimeter intrusion detection with distributed-fiber-sensing technology and backscattering-enhanced-fiber by using deployed telecom fiber cables as sensing backhaul. Various intrusive activities, such as walking/jumping at >100ft distance, are detected.
We demonstrate the first fiber-optic drone detection method with ultra-highly sensitive optical microphones and distributed acoustic sensor. Accurate drone localization has been achieved through acoustic field mapping and data fusion.
Road surface condition can significantly impact the interaction between vehicles and pavement structure, which may even cause high fuel consumption and safety issues of drivers and vehicles. Distributed fiber optic sensing (DFOS) technology is a useful tool to perform continuous and real-time monitoring
We demonstrate a vibration detection and localization scheme based on bidirectional transmission of telecom signals with digital coherent detection at the receivers. Optical phase is extracted from the digital signal processing blocks of the coherent receiver, from which the vibration component is extracted
Neural networks (NNs) are attractive for nonlinear impairment compensation applications in communication systems, such as optical fiber nonlinearity, nonlinearity of driving amplifiers, and nonlinearity of semiconductor optical amplifiers. Without prior knowledge of the transmission link or the hardware
We target the task of cross-lingual Machine Reading Comprehension (MRC) in the direct zero-shot setting, by incorporating syntactic features from Universal Dependencies (UD), and the key features we use are the syntactic relations within each sentence. While previous work has demonstrated effective syntax-guided
By employing distributed fiber optic sensing (DFOS) technologies, field deployed fiber cables can be utilized as not only communication media for data transmissions but also sensing media for continuously monitoring of the physical phenomenon along the entire route. The fiber can be used to monitor ambient
This work aims to assess how well a model performs under distribution shifts without using labels. While recent methods study prediction confidence, this work reports prediction dispersity is another informative cue. Confidence reflects whether the individual prediction is certain, dispersity indicates
In this manuscript, we propose and experimentally demonstrate a dispersion-controlled optoelectronic oscillator with phase only modulator at 18 GHz. The generated microwave signal has a phase noise of −108 dBc/Hz at 10 kHz offset frequency and the integrated timing jitter is calculated to be 16.2 fs
In this paper, we propose an ordered time series classification framework that is robust against missing classes in the training data, i.e., during testing we can prescribe classes that are missing during training. This framework relies on two main components: (1) our newly proposed ordinal quadruplet
In pursuance of the unused spectrum in higher frequencies, millimeter wave (mmWave) bands have a pivotal role. However, the high path loss and poor scattering associated with mmWave communications highlight the necessity of employing effective beamforming techniques. In order to efficiently search for
To overcome the high path loss and the intense shadowing in millimeter wave (mmWave) communications, effective beamforming schemes are required which incorporate narrow beams with high beamforming gains. The mmWave channel consists of a few spatial clusters each associated with an angle of departure
StyleGANs have shown impressive results on data generation and manipulation in recent years, thanks to its disentangled style latent space. A lot of efforts have been made in inverting a pretrained generator, where an encoder is trained ad hoc after the generator is trained in a two-stage fashion. In
The recent success of deep learning applications has coincided with those widely available powerful computational resources for training sophisticated machine learning models with huge datasets. Nonetheless, training large models such as convolutional neural networks using model parallelism (as opposed
Molecule optimization is a critical step in drug development to improve the desired properties of drug candidates through chemical modification. We have developed a novel deep generative model, Modof, over molecular graphs for molecule optimization. Modof modifies a given molecule through the prediction
Distributed fiber optic sensing (DFOS) is a rapidly evolving field that allows the existing optical fiber infrastructure for telecommunications to be reused for wide-area sensing. Using the backscattering mechanisms of glass—which includes Rayleigh, Brillouin, and Raman backscatter—it is possible
For the first time to our knowledge, a stationary weight hanging on an operational aerial telecommunication field fiber was detected and localized using only ambient data collected by a φ-DAS system. Although stationary weights do not create temporally varying signals, and hence cannot be observed directly
Millions of cameras at edge are being deployed to power a variety of different deep learning applications. However, the frames captured by these cameras are not always pristine – they can be distorted due to lighting issues, sensor noise, compression etc. Such distortions not only deteriorate visual
Using deep reinforcement learning (DRL) to recover expert policies via imitation has been found to be promising in a wide range of applications. However, it remains a difficult task to interpret the control policy learned by the agent. Difficulties mainly come from two aspects: 1) agents in DRL are usually
Edge computing and 5G have made it possible to perform analytics closer to the source of data and achieve super-low latency response times, which isn’t possible with centralized cloud deployment. In this paper, we present a novel fever screening system, which uses edge machine learning techniques and
InfoGCL: Information-Aware Graph Contrastive Learning Various graph contrastive learning models have been proposed to improve the performance of tasks on graph datasets in recent years. While effective and prevalent, these models are usually carefully customized. In particular, despite all recent work
Millimeter-wave (mmWave) communications is considered as a key enabler towards the realization of next-generation wireless networks, due to the abundance of available spectrum at mmWave frequencies. However, mmWave suffers from high free-space path-loss and poor scattering resulting in mostly line-of-sight
Microservices-based video analytics pipelines routinely use multiple deep convolutional neural networks. We observe that the best allocation of resources to deep learning engines (or microservices) in a pipeline, and the best configuration of parameters for each engine vary over time, often at a timescale
Applications can tailor a network slice by specifying a variety of QoS attributes related to application-specific performance, function or operation. However, some QoS attributes like guaranteed bandwidth required by the application do vary over time. For example, network bandwidth needs of video streams
Point-of-interest (POI) recommendation is an emerging area of research on location-based social networks to analyze user behaviors and contextual check-in information. For this problem, existing approaches, with shallow or deep architectures, have two major drawbacks. First, for these approaches, the
Recent multilingual pre-trained language models have achieved remarkable zero-shot performance, where the model is only finetuned on one source language and directly evaluated on target languages. In this work, we propose a self-learning framework that further utilizes unlabeled data of target languages,
Compliments and concerns in reviews are valuable for understanding users’ shopping interests and their opinions with respect to specific aspects of certain items. Existing review-based recommenders favor large and complex language encoders that can only learn latent and uninterpretable text representations.
Image captioning systems are expected to have the ability to combine individual concepts when describing scenes with concept combinations that are not observed during training. In spite of significant progress in image captioning with the help of the autoregressive generation framework, current approaches
We develop a system for the FEVEROUS fact extraction and verification task that ranks an initial set of potential evidence and then pursues missing evidence in subsequent hops by trying to generate it, with a “next hop prediction module” whose output is matched against page elements in a predicted
In many high-stakes applications of machine learning models, outputting only predictions or providing statistical confidence is usually insufficient to gain trust from end users, who often prefer a transparent reasoning paradigm. Despite the recent encouraging developments on deep networks for sequential
Detecting anomalies in dynamic graphs is a vital task, with numerous practical applications in areas such as security, finance, and social media. Existing network embedding based methods have mostly focused on learning good node representations, whereas largely ignoring the subgraph structural changes
We demonstrate, for the first time, that cyclic linear pulse coding can be bipolar for BOTDA sensors, breaking the unipolar limitation of linear coding techniques and elevating the coding gain for a given code length.
We demonstrated for the first time that motor vehicle traffic and road capacity on multiple fiber routes can be monitored by using a distributed-fiber-optics-sensing system with a photonic switch on in-service telecom fiber cables.
Non-muscle invasive bladder cancer (NMIBC) generally has a good prognosis, however, recurrence after transurethral resection (TUR), the standard primary treatment, is a major problem. Clinical management after TUR has been based on risk classification using clinicopathological factors, but these classifications
Video analytics systems critically rely on video cameras, which capture high quality video frames, to achieve high analytics accuracy. Although modern video cameras often expose tens of configurable parameter settings that can be set by end users, deployment of surveillance cameras today often uses a
Detecting abnormal activities in real-world surveillance videos is an important yet challenging task as the prior knowledge about video anomalies is usually limited or unavailable. Despite that many approaches have been developed to resolve this problem, few of them can capture the normal spatio-temporal
Conditional Generative Adversarial Networks (cGANs) extend the standard unconditional GAN framework to learning joint data-label distributions from samples, and have been established as powerful generative models capable of generating high-fidelity imagery. A challenge of training such a model lies in
Learning transferable and domain adaptive feature representations from videos is important for video-relevant tasks such as action recognition. Existing video domain adaptation methods mainly rely on adversarial feature alignment, which has been derived from the RGB image space. However, video data is
Action recognition is an important problem that requires identifying actions in video by learning complex interactions across scene actors and objects. However, modern deep-learning based networks often require significant computation and may capture scene context using various modalities that further
Recent studies have demonstrated the vulnerability of deep neural networks against adversarial examples. In-spired by the observation that adversarial examples often lie outside the natural image data manifold and the intrinsic dimension of image data is much smaller than its pixel space dimension, we
We investigate ways to leverage uncertainty in face images to improve the quality of the face clusters. We observe that popular clustering algorithms do not produce better quality clusters when clustering probabilistic face representations that implicitly model uncertainty – these algorithms predict
Distributed fiber optic sensing systems (DFOS) allow deployed fiber cables to be sensing media, not only dedicated function of data transmission. The fiber cable can monitor the ambient environment over wide area for many applications. We review recent field trial results, and show how artificial intelligence
Applications that use edge computing and 5G to improve response times consume both compute and network resources. However, 5G networks manage only network resources without considering the application’s compute requirements, and container orchestration frameworks manage only compute resources without
In optical communication systems, fibre nonlinearity is the major obstacle in increasing the transmission capacity. Typically, digital signal processing techniques and hardware are used to deal with optical communication signals, but increasing speed and computational complexity create challenges for
The exponential growth in smart sensors and rapid progress in 5G networks is creating a world awash with data streams. However, a key barrier to building performant multi-sensor, distributed stream processing applications is high programming complexity. We propose DataX, a novel platform that improves
Guided acoustic Brillouin (GAWBS) noise is measured using a novel, homodyne measurement technique for four commonly used fibers in long-distance optical transmission systems. The measurements are made with single spans and then shown to be consistent with separate multi-span long-distance measurements.
Optical fibers have a sensing function that captures environmental changes around the fiber cable. According to the recent technology evolution of optical transmission and AI, the application of the fiber sensing has expanded and visualization accuracy has improved. We have proposed to monitor the traffic
Identification of people with elevated body temperature can reduce or dramatically slow down the spread of infectious diseases like COVID-19. We present a novel fever-screening system, F 3 S, that uses edge machine learning techniques to accurately measure core body temperatures of multiple individuals
Motivated by the success of pre-trained language models such as BERT in a broad range of natural language processing (NLP) tasks, recent research efforts have been made for adapting these models for different application domains. Along this line, existing domain-oriented models have primarily followed
Discrete event sequences are ubiquitous, such as an ordered event series of process interactions in Information and Communication Technology systems. Recent years have witnessed increasing efforts in detecting anomalies with discrete event sequences. However, it remains an extremely difficult task due
Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation.
Modern natural language understanding models depend on pretrained subword embeddings, but applications may need to reason about words that were never or rarely seen during pretraining. We show that examples that depend critically on a rarer word are more challenging for natural language inference models.
We design and build SkyHaul, the first large-scale, self-organizing network of Unmanned Aerial Vehicles (UAVs) that are connected using a mm Wave wireless mesh backhaul. While the use of a mmWave backhaul paves the way for a new class of bandwidth-intensive, latency-sensitive cooperative applications
MotivationMapping distal regulatory elements, such as enhancers, is a cornerstone for elucidating how genetic variations may influence diseases. Previous enhancer-prediction methods have used either unsupervised approaches or supervised methods with limited training data. Moreover, past approaches have
Imitation learning has been proved to be effective in mimicking experts’ behaviors from their demonstrations without access to explicit reward signals. Meanwhile, complex tasks, e.g., dynamic treatment regimes for patients with comorbidities, often suggest significant variability in expert demonstrations
Narrow beams are key to wireless communications in millimeter wave frequency bands. Beam alignment (BA) allows the base station (BS) to adjust the direction and width of the beam used for communication. During BA, the BS transmits a number of scanning beams covering different angular regions. The goal
We propose an efficient approach for placing distributed fiber optic sensors (DFOS) with concurrent sensing capability. It consumes 5.7% to 9.5% fewer sensors than that using DFOS without concurrent sensing, for covering the same network.
We report the distributed-fiber-sensing field trial results over a 5G-transport-network. A standard communication fiber is used with real-time AI processing for cable self-protection, cable-cut threat assessment and road traffic monitoring in a long-term continuous test.
We present FACESEC, a framework for fine-grained robustness evaluation of face recognition systems. FACESEC evaluation is performed along four dimensions of adversarial modeling: the nature of perturbation (e.g., pixel-level or face accessories), the attacker’s system knowledge (about training data
mmWave 5G networks promise to enable a new generation of networked applications requiring a combination of high throughput and ultra-low latency. However, in practice, mmWave performance scales poorly for large numbers of users due to the significant overhead required to manage the highly-directional
Face recognition models trained under the assumption of identical training and test distributions often suffer from poor generalization when faced with unknown variations, such as a novel ethnicity or unpredictable individual make-ups during test time. In this paper, we introduce a novel cross-domain
Trajectory prediction is a safety-critical tool for autonomous vehicles to plan and execute actions. Our work addresses two key challenges in trajectory prediction, learning multimodal outputs, and better predictions by imposing constraints using driving knowledge. Recent methods have achieved strong
Learning methods for relative camera pose estimation have been developed largely in isolation from classical geometric approaches. The question of how to integrate predictions from deep neural networks (DNNs) and solutions from geometric solvers, such as the 5-point algorithm [37], has as yet remained
Anomaly detection is an important data mining task with numerous applications, such as intrusion detection, credit card fraud detection, and video surveillance. However, given a specific complicated task with complicated data, the process of building an effective deep learning-based system for anomaly
Empowered by the rapid advancement of fiber optic sensing techniques in recent years, network carriers are able to upgrade their network infrastructure beyond the basic communication services with extra sensing applications and services (e.g., monitoring traffic and road condition, leakage detection,
Measuring document similarity plays an important role in natural language processing tasks. Most existing document similarity approaches suffer from the information gap caused by context and vocabulary mismatches when comparing varying-length texts. In this paper, we propose an unsupervised concept representation
CCCE in a 60-km fiber is estimated from its GAWBS noise spectrum by comparing the TR 1m modes with the R 0m modes. The estimated CCCE value 0.73 μm is consistent with conventional measurements of 0.6–0.8 μm.
We report the field trial results of monitoring abnormal activities near deployed cable with fiber-optic-sensing technology for cable protection. Detection and position determination of abnormal events and evaluating the threat to the cable is realized.
Neural networks are attractive for nonlinear impairment compensation applications in communication systems. In this paper, several approaches to reduce computational complexity of the neural network-based algorithms are presented.
We demonstrated for the first time to our knowledge, the detection and localization of a static weight on an aerial cable by using frequency domain decomposition analysis of ambient vibrations detected by a φ-DAS system.
We demonstrate a new application of fiber-optic-sensing and machine learning techniques for vehicle run-off-road events detection to enhance roadway safety and efficiency. The proposed approach achieves high accuracy in a testbed under various experimental conditions.
In distributed acoustic sensing (DAS) on aerial fiber-optic cables, utility pole localization is a prerequisite for any subsequent event detection. Currently, localizing the utility poles on DAS traces relies on human experts who manually label the poles’ locations by examining DAS signal patterns
Distributed fiber optical systems (DFOS) allow deployed optical cables to monitor the ambient environment over wide geographic area. We review recent field trial results, and show how DFOS can be made compatible with passive optical networks (PONs).
We demonstrate vibration detection and localization based on extracting optical phase from the DSP elements of a coherent receiver in bidirectional WDM transmission of 200-Gb/s DP-16QAM over 380 km of installed field fiber.
Forget passwords—identity verification can now be accomplished with the touch of a finger or in the blink of an eye as the biometrics field expands to encompass new techniques and application areas.
Centralized cloud computing with 100+ milliseconds network latencies cannot meet the tens of milliseconds to sub-millisecond response times required for emerging 5G applications like autonomous driving, smart manufacturing, tactile internet, and augmented or virtual reality. We describe a new, dynamic
Learning disentangled representations leads to interpretable models and facilitates data generation with style transfer, which has been extensively studied on static data such as images in an unsupervised learning framework. However, only a few works have explored unsupervised disentangled sequential
This paper considers the problem of spatiotemporal object-centric reasoning in videos. Central to our approach is the notion of object permanence, i.e., the ability to reason about the location of objects as they move through the video while being occluded, contained or carried by other objects. Existing
Prognostics or early detection of incipient faults by leveraging the monitoring time series data in complex systems is valuable to automatic system management and predictive maintenance. However, this task is challenging. First, learning the multi-dimensional heterogeneous time series data with various
T-cell receptors can recognize foreign peptides bound to major histocompatibility complex (MHC) class-I proteins, and thus trigger the adaptive immune response. Therefore, identifying peptides that can bind to MHC class-I molecules plays a vital role in the design of peptide vaccines. Many computational
Outlier detection is an important data mining task with numerous applications such as intrusion detection, credit card fraud detection, and video surveillance. However, given a specific task with complex data, the process of building an effective deep learning based system for outlier detection still
Graph Neural Networks (GNNs) have shown to be powerful tools for graph analytics. The key idea is to recursively propagate and aggregate information along the edges of the given graph. Despite their success, however, the existing GNNs are usually sensitive to the quality of the input graph. Real-world
We consider the models of deep multi-task learning with recurrent architectures that exploit regularities across tasks to improve the performance of multiple sequence processing tasks jointly. Most existing architectures are painstakingly customized to learn task relationships for different problems,
Forecasting on Sparse Multivariate Time Series Forecasting on sparse multivariate time series (MTS) aims to model the predictors of future values of time series given their incomplete past, which is important for many emerging applications. However, most existing methods process MTS’s individually,
We propose a method to accurately obtain the ratio of tumor cells over an entire histological slide. We use deep fully convolutional neural network models trained to detect and classify cells on images of H&E-stained tissue sections. Pathologists’ labels consisting of exhaustive nuclei locations and
One major source of vulnerability of neural nets in classification tasks is from overparameterized fully connected layers near the end of the network. In this paper, we propose a new neighborhood preserving layer which can replace these fully connected layers to improve the network robustness. Networks
In this paper, we focus on exploring the fusion of images and point clouds for 3D object detection in view of the complementary nature of the two modalities, i.e., images possess more semantic information while point clouds specialize in distance sensing. To this end, we present a novel two-stage multi-modal
Modern video person re-identification (re-ID) machines are often trained using a metric learning approach, supervised by a triplet loss. The triplet loss used in video re-ID is usually based on so-called clip features, each aggregated from a few frame features. In this paper, we propose to model the
Despite recent progress in Graph Neural Networks (GNNs), explaining predictions made by GNNs remains a challenging open problem. The leading method independently addresses the local explanations (i.e., important subgraph structure and node features) to interpret why a GNN model makes the prediction for
Recent advances in causal analysis can accelerate incident response time, but only after a causal graph of the attack has been constructed. Unfortunately, existing causal graph generation techniques are mainly offline and may take hours or days to respond to investigator queries, creating greater opportunity
We demonstrate fiber optic sensing systems in a distributed fiber sensor network built on existing telecom infrastructure to detect temperature, acoustic effects, vehicle traffic, etc. Measurements are also demonstrated with different network topologies and simultaneously sensing four fiber routes with
We are the first to investigate a novel problem, called distributed fiber optic sensor placement, in the context of Infrastructure-as-a-Sensor. We propose an ILP-based optimal solution and a close-to-optimal heuristic solution, both of which aim at minimizing the cost of sensors.
Full attention, which generates an attention value per element of the input feature maps, has been successfully demonstrated to be beneficial in visual tasks. In this work, we propose a fully attentional network, termed channel recurrent attention network, for the task of video pedestrian retrieval.
Accurate air turbulence forecasting can help airlines avoid hazardous turbulence, guide the routes that keep passengers safe, maximize efficiency, and reduce costs. Traditional turbulence forecasting approaches heavily rely on painstakingly customized turbulence indexes, which are less effective in dynamic
Face anti-spoofing (FAS) seeks to discriminate genuine faces from fake ones arising from any type of spoofing attack. Due to the wide variety of attacks, it is implausible to obtain training data that spans all attack types. We propose to leverage physical cues to attain better generalization on unseen
Directional transmission patterns (a.k.a. narrow beams) are the key to wireless communications in millimeter wave (mmWave) frequency bands which suffer from high path loss, severe shadowing, and intense blockage. In addition, the propagation channel in mmWave frequencies incorporates only a few number
The modern Internet has witnessed the proliferation of web applications that play a crucial role in the branding process among enterprises. Web applications provide a communication channel between potential customers and business products. However, web applications are also targeted by attackers due
Deep learning systems on the cloud are increasingly targeted by attacks that attempt to steal sensitive data. Intel SGX has been proven effective to protect the confidentiality and integrity of such data during computation. However, state-of-the-art SGX systems still suffer from substantial performance
Differentially Private Federated Learning (DPFL) is an emerging field with many applications. Gradient averaging-based DPFL methods require costly communication rounds and hardly work with large capacity models due to the explicit dimension dependence in its added noise. In this work, inspired by knowledge
To the best of our knowledge, we present the first underground fiber cable position detection methods using distributed fiber optic sensing (DFOS) technology. Meter level localization accuracy is achieved in the results.
Biometric authentication is the recognition of human identity via unique anatomical features. The development of novel methods parallels widespread application by consumer devices, law enforcement, and access control. In particular, methods based on finger veins, as compared to face and fingerprints,
Anomaly detection has been widely applied in modern data-driven security applications to detect abnormal events/entities that deviate from the majority. However, less work has been done in terms of detecting suspicious event sequences/paths, which are better discriminators than single events/entities
The recent innovation of frequency-shifted (FS) backscatter allows for backscattering with commodity devices, which are inherently half-duplex. However, their reliance on oscillators for generating the frequency-shifting signal on the tag, forces them to incur the transient phase of the oscillator before
Retailers are aiming to enhance customer experience by automating the checkout process. The key impediment here is the effort to manually align the product barcode with the scanner, requiring sequential handling of items without blocking the line-of-sight of the laser beam. While recent systems such
Node classification in temporal graphs aims to predict node labels based on historical observations. In real-world applications, temporal graphs are complex with both graph topology and node attributes evolving rapidly, which poses a high overfitting risk to existing graph learning approaches. In this
Hepatocellular carcinoma (HCC) is a representative primary liver cancer caused by long-term and repetitive liver injury. Surgical resection is generally selected as the radical cure treatment. Because the early recurrence of HCC after resection is associated with low overall survival, the prediction
Deep neural networks have been widely applied for missing data imputation. However, most existing studies have been focused on imputing continuous data, while discrete data imputation is under-explored. Discrete data is common in real world, especially in research areas of bioinformatics, genetics, and
We demonstrate an anti-spoofing face recognition system that is able to differentiate real human face with 3D printed materials. Face images captured in infrared structure light are analyzed for surface materials and spatial structure.
Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video. Previous works have approached this task by processing the entire video, often more than once, to localize relevant activities. In the real world
We tackle an unsupervised domain adaptation problem for which the domain discrepancy between labeled source and unlabeled target domains is large, due to many factors of inter- and intra-domain variation. While deep domain adaptation methods have been realized by reducing the domain discrepancy, these
Classical monocular Simultaneous Localization And Mapping (SLAM) and the recently emerging convolutional neural networks (CNNs) for monocular depth prediction represent two largely disjoint approaches towards building a 3D map of the surrounding environment. In this paper, we demonstrate that the coupling
We propose a novel framework for domain adaptation in semantic segmentation with image-level weak labels in the target domain. The weak labels may be obtained based on a model prediction for unsupervised domain adaptation (UDA), or from a human oracle in a new weakly-supervised domain adaptation (WDA)
In this paper, we derive a new differential homography that can account for the scanline-varying camera poses in Rolling Shutter (RS) cameras, and demonstrate its application to carry out RS-aware image stitching and rectification at one stroke. Despite the high complexity of RS geometry, we focus in
While deep face recognition has benefited significantly from large-scale labeled data, current research is focused on leveraging unlabeled data to further boost performance, reducing the cost of human annotation. Prior work has mostly been in controlled settings, where the labeled and unlabeled data
Monocular visual odometry (VO) suffers severely from error accumulation during frame-to-frame pose estimation. In this paper, we present a self-supervised learning method for VO with special consideration for consistency over longer sequences. To this end, we model the long-term dependency in pose prediction
We propose a simple but effective multi-source domain generalization technique based on deep neural networks by incorporating optimized normalization layers that are specific to individual domains. Our approach employs multiple normalization methods while learning separate affine parameters per domain.
Given multiple datasets with different label spaces, the goal of this work is to train a single object detector predicting over the union of all the label spaces. The practical benefits of such an object detector are obvious and significant—application-relevant categories can be picked and merged form
We address the problem of domain adaptation in videos for the task of human action recognition. Inspired by image-based domain adaptation, we can perform video adaptation by aligning the features of frames or clips of source and target videos. However, equally aligning all clips is sub-optimal as not
We propose advances that address two key challenges in future trajectory prediction: (i) multimodality in both training data and predictions and (ii) constant time inference regardless of number of agents. Existing trajectory predictions are fundamentally limited by lack of diversity in training data,
A key aspect of Federated Learning (FL) is the requirement of a centralized aggregator to maintain and update the global model. However, in many cases orchestrating a centralized aggregator might be infeasible due to numerous operational constraints. In this paper, we introduce BAFFLE, an aggregator
Many residential energy consumers have installed photovoltaic (PV) panels and energy storage systems. These residential users can aggregate and participate in the energy markets. A stochastic decision making model for an aggregation of these residential units for participation in two-settlement markets
Graph representation learning serves as the core of important prediction tasks, ranging from product recommendation to fraud detection. Reallife graphs usually have complex information in the local neighborhood, where each node is described by a rich set of features and connects to dozens or even hundreds
Modern storage systems leverage flash caching to boost I/O performance, and enhancing the space efficiency and endurance of flash caching remains a critical yet challenging issue in the face of ever-growing data-intensive workloads. Deduplication and compression are promising data reduction techniques
Read Improving Face Recognition by Clustering Unlabeled Faces in the Wild (arXiv). While deep face recognition has benefited significantly from large scale labeled data, current research is focused on leveraging unlabeled data to further boost performance, reducing the cost of human annotation. Prior
RFID applications for taking inventory and processing transactions in point-of-sale (POS) systems improve operational efficiency but are not designed to provide insights about customers’ interactions with products. We bridge this gap by solving the proximity grouping problem to identify groups of RFID
Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc. Similar problems have been studied extensively for other forms of data, such as images and videos. However, the discrete nature
Directional transmission patterns (a.k.a. narrow beams) are the key to wireless communications in millimeter wave (mmWave) frequency bands which suffer from high path loss and severe shadowing. In addition, the propagation channel in mmWave frequencies incorporates only a few number of spatial clusters
We address the challenging task of occlusion-aware indoor 3D scene understanding. We represent scenes by a set of planes, where each one is defined by its normal, offset and two masks outlining (i) the extent of the visible part and (ii) the full region that consists of both visible and occluded parts
With increasing ethical and legal concerns on privacy for deep models in visual recognition, differential privacy has emerged as a mechanism to disguise membership of sensitive data in training datasets. Recent methods like Private Aggregation of Teacher Ensembles (PATE) leverage a large ensemble of
Recognizing wild faces is extremely hard as they appear with all kinds of variations. Traditional methods either train with specifically annotated variation data from target domains, or by introducing unlabeled target variation data to adapt from the training data. Instead, we propose a universal representation
In this paper, we address the problem of inferring the layout of complex road scenes from video sequences. To this end, we formulate it as a top-view road attributes prediction problem and our goal is to predict these attributes for each frame both accurately and consistently. In contrast to prior work,
Pose-tracking is an important problem that requires identifying unique human pose-instances and matching them temporally across different frames in a video. However, existing pose-tracking methods are unable to accurately model temporal relationships and require significant computation, often computing
We propose a sequential variational autoencoder to learn disentangled representations of sequential data (e.g., videos and audios) under self-supervision. Specifically, we exploit the benefits of some readily accessible supervision signals from input data itself or some off-the-shelf functional models
Efficient audio scene classification is essential for smart sensing platforms such as robots, medical monitoring, surveillance, or autonomous vehicles. We propose a retrieval-based scene classification architecture that combines recurrent neural networks and attention to compute embeddings for short
Remaining Useful Life (RUL) estimation is a key element in Predictive maintenance. System agnostic approaches which just utilize sensor and operational time series have gained popularity due to its ease of implementation. Due to the nature of measurement or degradation mechanisms, its accurate estimation
In this study, we explored surface-enhanced Raman spectroscopy (SERS) for analyzing red wine through several facile sample preparations. These approaches involved the direct analysis of red wine with Raman spectroscopy and the direct incubation of red wine with silver nanoparticles (i.e., AgNPs) and
Inductive and unsupervised graph learning is a critical technique for predictive or information retrieval tasks where label information is difficult to obtain. It is also challenging to make graph learning inductive and unsupervised at the same time, as learning processes guided by reconstruction error
Graph Convolutional Networks (GCNs) have shown to be a powerful tool for analyzing graph-structured data. Most of previous GCN methods focus on learning a good node representation by aggregating the representations of neighboring nodes, whereas largely ignoring the edge information. Although few recent
Recent developments in discovering dynamic treatment regimes (DTRs) have heightened the importance of deep reinforcement learning (DRL) which are used to recover the doctor’s treatment policies. However, existing DRL-based methods expose the following limitations: 1) supervised methods based on behavior
While backtracking analysis has been successful in assisting the investigation of complex security attacks, it faces a critical dependency explosion problem. To address this problem, security analysts currently need to tune backtracking analysis manually with different case-specific heuristics. However,
We propose a framework for answering open domain multi hop questions in which partial information is read and used to generate followup questions, to finally be answered by a pretrained single hop answer extractor. This framework makes each hop interpretable, and makes the retrieval associated with later
To subvert recent advances in perimeter and host security, the attacker community has developed and employed various attack vectors to make malware much more stealthy than before to penetrate the target system and prolong its presence. The advanced malware, or stealthy malware, impersonates or abuses
We demonstrate the experimental implementation of photonic neural network for fiber nonlinearity compensation over a 10,080 km trans-pacific transmission link. Q-factor improvement of 0.51 dB is achieved with only 0.06 dB lower than numerical simulations.
We demonstrated for the first time that geographic locations on deployed fiber cables can be determined accurately by using OTDR distances. The method involves vibration stimulation near deployed cables and distributed fiber optical sensing technology.
We propose reusing existing optical cables in metropolitan networks for distributed sensing using a bidirectional, dual-band architecture where communications and sensing signals can coexist with weak interaction on the same optical fiber.
We demonstrate a passive optical network (PON) that employs reflective semiconductor optical amplifiers (RSOAs) at optical network units (ONUs) to allow simultaneous data transmission with distributed fiber-optic sensing (DFOS) on individual distribution fibers.
We propose an active learning approach for transferring representations across domains. Our approach, active adversarial domain adaptation (AADA), explores a duality between two related problems: adversarial domain alignment and importance sampling for adapting models across domains. The former uses
We present an audio-visual multimodal approach for the task of zero-shot learning (ZSL) for classification and retrieval of videos. ZSL has been studied extensively in the recent past but has primarily been limited to visual modality and to images. We demonstrate that both audio and visual modalities
Blind video deblurring restores sharp frames from a blurry sequence without any prior. It is a challenging task because the blur due to camera shake, object movement and defocusing is heterogeneous in both temporal and spatial dimensions. Traditional methods train on datasets synthesized with a single
We address the problem of human action classification in drone videos. Due to the high cost of capturing and labeling large-scale drone videos with diverse actions, we present unsupervised and semi-supervised domain adaptation approaches that leverage both the existing fully annotated action recognition
We address the challenging task of video-based person re-identification. Recent works have shown that splitting the video sequences into clips and then aggregating clip-based similarity is appropriate for the task. We show that using a learned clip similarity aggregation function allows filtering out
Recently, recommender systems have been able to emit substantially improved recommendations by leveraging user-provided reviews. Existing methods typically merge all reviews of a given user (item) into a long document, and then process user and item documents in the same manner. In practice, however,
Multivariate time series data are becoming increasingly ubiquitous in varies real-world applications such as smart city, power plant monitoring, wearable devices, etc. Given the current time series segment, how to retrieve similar segments within the historical data in an efficient and effective manner
The problem of learning and forecasting underlying trends in time series data arises in a variety of applications, such as traffic management, energy optimization, etc. In literature, a trend in time series is characterized by the slope and duration, and its prediction is then to forecast the two values
Data privacy has emerged as an important issue as data-driven deep learning has been an essential component of modern machine learning systems. For instance, there could be a potential privacy risk of machine learning systems via the model inversion attack, whose goal is to reconstruct the input data
Click-through rate (CTR) prediction is a critical task in online advertising and marketing. For this problem, existing approaches, with shallow or deep architectures, have three major drawbacks. First, they typically lack persuasive rationales to explain the outcomes of the models. Unexplainable predictions
Question routing (QR) aims at recommending newly posted questions to the potential answerers who are most likely to answer the questions. The existing approaches that learn users’ expertise from their past question-answering activities usually suffer from challenges in two aspects: 1) multi-faceted expertise
To the best of our knowledge, we present the first field trial of distributed fiber optical sensing (DFOS) and high-speed communication, comprising a coexisting system, over an operation telecom network. Using probabilistic-shaped (PS) DP-144QAM, a 36.8 Tb/s with an 8.28-b/s/Hz spectral efficiency (SE)
Increasing adoption of solar photovoltaic (PV) presents new challenges to modern power grid due to its variable and intermittent nature. Fluctuating outputs from PV generation can cause the grid violating voltage operation limits. PV smart inverters (SIs) provide a fast-response method to regulate voltage
Modern cyber-physical systems are increasingly complex and vulnerable to attacks like false data injection aimed at destabilizing and confusing the systems. We develop and evaluate an attack-detection framework aimed at learning a dynamic invariant network, data-driven temporal causal relationships between
System monitoring has recently emerged as an effective way to analyze and counter advanced cyber attacks. The monitoring data records a series of system events and provides a global view of system behaviors in an organization. Querying such data to identify potential system risks and malicious behaviors
In this paper, we introduce a contextual grounding approach that captures the context in corresponding text entities and image regions to improve the grounding accuracy. Specifically, the proposed architecture accepts pre-trained text token embeddings and image object features from an off-the-shelf object
Cyber-physical systems (CPS) are ubiquitous in several critical infrastructure applications. Forecasting the state of CPS, is essential for better planning, resource allocation and minimizing operational costs. It is imperative to forecast the state of a CPS multiple steps into the future to afford enough
Given a network with the labels for a subset of nodes, transductive node classification targets to predict the labels for the remaining nodes in the network. This technique has been used in a variety of applications such as voxel functionality detection in brain network and group label prediction in
Existing representation learning methods based on graph neural networks and their variants rely on the aggregation of neighborhood information, which makes it sensitive to noises in the graph, e.g. erroneous links between nodes, incorrect/missing node features. In this paper, we propose Graph Denoising
Network embedding aims to learn the low-dimensional representations/embeddings of vertices which preserve the structure and inherent properties of the networks. The resultant embeddings are beneficial to downstream tasks such as vertex classification and link prediction. A vast majority of real-world
In this paper, we introduce a contextual grounding approach that captures the context in corresponding text entities and image regions to improve the grounding accuracy. Specifically, the proposed architecture accepts pre-trained text token embeddings and image object features from an off-the-shelf object
The rich and accessible labeled data fueled the revolutionary successes of deep learning in object recognition. However, recognizing objects of novel classes with limited supervision information provided, i.e., Novel Object Recognition (NOR), remains a challenging task. We identify in this paper two
Self-calibration of camera intrinsics and radial distortion has a long history of research in the computer vision community. However, it remains rare to see real applications of such techniques to modern Simultaneous Localization And Mapping (SLAM) systems, especially in driving scenarios. In this paper,
We address the problem of 3D object detection from 2D monocular images in autonomous driving scenarios. We propose to lift the 2D images to 3D representations using learned neural networks and leverage existing networks working directly on 3D data to perform 3D object detection and localization. We show
Zero-shot learning (ZSL) aims to recognize instances of unseen classes solely based on the semantic descriptions of the classes. Existing algorithms usually formulate it as a semantic-visual correspondence problem, by learning mappings from one feature space to the other. Despite being reasonable, previous
Predicting structured outputs such as semantic segmentation relies on expensive per-pixel annotations to learn supervised models like convolutional neural networks. However, models trained on one data domain may not generalize well to other domains without annotations for model finetuning. To avoid the
Traditional intrinsic image decomposition focuses on decomposing images into reflectance and shading, leaving surfaces normals and lighting entangled in shading. In this work, we propose a Global-Local Spherical Harmonics (GLoSH) lighting model to improve the lighting component, and jointly predict reflectance
2018’s 1.2 million North American charging ports will grow ten times to over 12.6 million by 2027, according to Navigant, which could overwhelm the nation’s grids. DC Fast charging requires grid upgrade to supply the new charging demand. However, since the utilization ratio of those charging station
We proposed and demonstrated a new machine learning algorithm for wavelength modulation spectroscopy to enhance the accuracy of fire detection. The result shows more than 8% of accuracy improvement by analyzing CO/CO 2 2f signals.
Ongoing smart grid activities and associated automation resulted in rich set of data. These data can be utilized for monitoring and estimation of real time photovoltaic (PV) generation. Inherent variability in PV and related impact on power systems is a challenging problem. Improving the accuracy of
Millimeter-wave (mmWave) bands have the potential to enable significantly high data rates in wireless systems. In order to overcome intense path loss and severe shadowing in these bands, it is essential to employ directional beams for data transmission. Furthermore, it is known that the mmWave channel
In-band full-duplex (FD) communications – enabled by recent advances in antenna and RF circuit design – has emerged as one of the promising techniques to improve data rates in wireless systems. One of the major roadblocks in enabling high data rates in FD systems is the inter-user interference (IUI)
The need for countering Advanced Persistent Threat (APT) attacks has led to the solutions that ubiquitously monitor system activities in each enterprise host, and perform timely attack investigation over the monitoring data for uncovering the attack sequence. However, existing general-purpose query systems
An artificial neural network (ANN) based transfer learning model is built for quality of transmission (QoT) prediction in optical systems feasible with different modulation formats. Knowledge learned from one optical system can be transferred to a similar optical system by adjusting weights in ANN hidden
Information systems have widely been the target of malware attacks. Traditional signature-based malicious program detection algorithms can only detect known malware and are prone to evasion techniques such as binary obfuscation, while behavior-based approaches highly rely on the malware training samples
Node classification in graph-structured data aims to classify the nodes where labels are only available for a subset of nodes. This problem has attracted considerable research efforts in recent years. In real-world applications, both graph topology and node attributes evolve over time. Existing techniques,
The behind the meter battery energy storage systems (BTM-BESSs) have been deployed widely by indus-trial/commercial buildings to manage electricity transaction with utilities in order to reduce customers’ electricity bills. Commercial BTM battery storages are mainly employed to cut the customers’ monthly
Developing conditional generative models for text-to-video synthesis is an extremely challenging yet an important topic of research in machine learning. In this work, we address this problem by introducing Text-Filter conditioning Generative Adversarial Network (TFGAN), a conditional GAN model with a
Traditional embedding approaches associate a real-valued embedding vector with each symbol or data point, which is equivalent to applying a linear transformation to “one-hot” encoding of discrete symbols or data objects. Despite simplicity, these methods generate storage-inefficient representations
Traffic conditions of the highway, Ya traffic volume meter CCTV Because it is observed in the spot, such as the discovery of traffic disturbances which deviates from the observation spot it may be delayed. The traffic flow has a problem from the point observations data indirectly order to be estimated,
Read Deep Supervision with Intermediate Concepts (IEEE). Recent data-driven approaches to scene interpretation predominantly pose inference as an end-to-end black-box mapping, commonly performed by a Convolutional Neural Network (CNN). However, decades of work on perceptual organization in both human
We address the challenging problem of generating facial attributes using a single image in an unconstrained pose. In contrast to prior works that largely consider generation on 2D near-frontal images, we propose a GAN-based framework to generate attributes directly on a dense 3D representation given
The higher-order spatial modes of a light beam are receiving significant interest. They can be used to further increase the data speeds of high speed optical communication, and for novel optical sensing modalities. As such, the classification of higher-order spatial modes is ubiquitous. Canonical classification
Fiber nonlinearity is one of the major limitations to the achievable capacity in long distance fiber optic transmission systems. Nonlinear impairments are determined by the signal pattern and the transmission system parameters. Deterministic algorithms based on approximating the nonlinear Schrodinger
The power systems worldwide have been embracing the rapid growth of distributed energy resources. Commonly, distributed energy resources exist in the distribution level, such as electric vehicles, rooftop photovoltaic panels, and home battery systems, which cannot be controlled by a centralized entity
A two-stage neural network model is applied on captured PS-144QAM raw data to estimate channel G-OSNR in a metro network field trial. We obtained 0.27dB RMSE with first-stage CNN classifier and second-stage ANN regressions.
In this paper, we consider the problem of developing predictive models with limited data for energy assets such as electricity loads, PV power generations, etc. We specifically investigate the cases where the amount of historical data is not sufficient to effectively train the prediction model. We first
We use the term clairvoyant to refer to networks that provide on-demand visibility for any flow at any time. Traditionally, network visibility is achieved by instrumenting and passively monitoring all flows in a network. SDN networks, by design endowed with full visibility, offer another alternative
We introduce a novel dataset for high-level 3D scene understanding of complex road scenes. Our annotations extend the existing datasets KITTI [5] and nuScenes [1] with semantically and geometrically meaningful attributes like the number of lanes or the existence of, and distance to, intersections, sidewalks
In this paper, we address the problem of inferring the layout of complex road scenes given a single camera as input. To achieve that, we first propose a novel parameterized model of road layouts in a top-view representation, which is not only intuitive for human visualization but also provides an interpretable
Despite the large volume of face recognition datasets, there is a significant portion of subjects, of which the samples are insufficient and thus under-represented. Ignoring such significant portion results in insufficient training data. Training with under-represented data leads to biased classifiers
Recent developments in deep domain adaptation have allowed knowledge transfer from a labeled source domain to an unlabeled target domain at the level of intermediate features or input pixels. We propose that advantages may be derived by combining them, in the form of different insights that lead to a
An exact method of correcting the rolling shutter (RS) effect requires recovering the underlying geometry, i.e. the scene structures and the camera motions between scanlines or between views. However, the multiple-view geometry for RS cameras is much more complicated than its global shutter (GS) counterpart,
We introduce the Neural Collaborative Subspace Clustering, a neural model that discovers clusters of data points drawn from a union of low-dimensional subspaces. In contrast to previous attempts, our model runs without the aid of spectral clustering. This makes our algorithm one of the kinds that can
In this paper, we study the generalization performance of deep neural networks in learning problems where the given task is governed by a set of rules. We consider two settings of supervised learning and rule-based learning. In supervised learning, the network is trained with pairs of inputs and the
Millimeter-wave (mmWave) bands have shown the potential to enable high data rates for next generation mobile networks. In order to cope with high path loss and severe shadowing in mmWave frequencies, it is essential to employ massive antenna arrays and generate narrow transmission patterns (beams). When
Localizing moments in untrimmed videos using language queries is a new task that requires the ability to accurately ground language into video. Existing approaches process the video, often more than once, to localize the activities and are inefficient. In this paper, we present TripNet, an end-to-end
Unsupervised domain adaptation is a promising avenue to enhance the performance of deep neural networks on a target domain, using labels only from a source domain. However, the two predominant methods, domain discrepancy reduction learning and semi-supervised learning, are not readily applicable when
Simulation is a useful tool in situations where training data for machine learning models is costly to annotate or even hard to acquire. In this work, we propose a reinforcement learning-based method for automatically adjusting the parameters of any (non-differentiable) simulator, thereby controlling
Program or process is an integral part of almost every IT/OT system. Can we trust the identity/ID (e.g., executable name) of the program? To avoid detection, malware may disguise itself using the ID of a legitimate program, and a system tool (e.g., PowerShell) used by the attackers may have the fake
Co-clustering partitions instances and features simultaneously by leveraging the duality between them, and it often yields impressive performance improvement over traditional clustering algorithms. The recent development in learning deep representations has demonstrated the advantage in extracting effective
In spite of its importance, passenger demand prediction is a highly challenging problem, because the demand is simultaneously influenced by the complex interactions among many spatial and temporal factors and other external factors such as weather. To address this problem, we propose a Spatio-TEmporal
Fiber nonlinearity is a major limitation on the achievable maximum capacity per fiber core. Digital signal processing (DSP) can be used directly to compensate nonlinear impairments, however with limited effectiveness. It is well known that fibers with higher chromatic dispersion (CD) reduce nonlinear
Neuron network (NN) is proposed to work together with perturbation-based nonlinearity compensation (NLC) algorithm by feeding with intra-channel cross-phase modulation (IXPM) and intra-channel four-wave mixing (IFWM) triplets. Without prior knowledge of the transmission link and signal pulse shaping/baudrate,
Flexible wavelength-multiplexing technique in backbone submarine networks has been deployed to accommodate the trend of variable-rate modulation formats. In this paper, we propose a new design of flexible-rate transponders in the scenario of flexible multiplexing scheme to achieve near-Shannon performance.
Setuid system calls enable critical functions such as user authentications and modular privileged components. Such operations must only be executed after careful validation. However, current systems do not perform rigorous checks, allowing exploitation of privileges through memory corruption vulnerabilities
For the first time, we demonstrate detection of vehicle speed, density, and road conditions using deployed fiber carrying high-speed data transmission, and prove carriers’ large-scale fiber infrastructures can also be used as ubiquitous sensing networks.
Asymmetric information is shown to be more accurate in characterizing the performance of quadrant folding shaped (QFS) M-QAM. The performance difference of QFS M-QAM schemes strongly depends on the FEC coding rate, and the optimum FEC coding rate is found to be around ?0.8, which is independent of QFS
Modern malware and cyber attacks depend heavily on DNS services to make their campaigns reliable and difficult to track. Monitoring network DNS activities and blocking suspicious domains have been proven an effective technique in countering such attacks. However, recent successful campaigns reveal that
Large enterprises are increasingly relying on threat detection softwares (e.g., Intrusion Detection Systems) to allow them to spot suspicious activities. These softwares generate alerts which must be investigated by cyber analysts to figure out if they are true attacks. Unfortunately, in practice, there
First responders, a critical lifeline of any society, often find themselves in precarious situations. The ability to track them in real-time in unknown indoor environments would significantly contribute to the success of their mission as well as their safety. In this work, we present the design, implementation
We propose a novel multi-parameter sensing technique based on a Brillouin optical time domain reflectometry in the elliptical-core few-mode fiber, using higher-order optical and acoustic modes. Multiple Brillouin peaks are observed for the backscattering of both the LP01 mode and LP11 mode. We characterize
Integration of renewables and energy storage, leading to rise of prosumers, has created localized bidirectional flows. As the result, the utility demand has decreased and traditional centralized controller can no longer realize the optimal performance of ever growing distribution systems. To achieve
Existing visual reasoning datasets, such as Visual Question Answering (VQA), often suffer from biases conditioned on the question, image or answer distributions. The recently proposed CLEVR dataset addresses these limitations and requires fine-grained reasoning, but the dataset is synthetic and consists
Nowadays, multivariate time series data are increasingly collected in various real-world systems, e.g., power plants, wearable devices, etc. Anomaly detection and diagnosis in multivariate time series refer to identifying abnormal status in certain time steps and pinpointing the root causes. Building
We experimentally investigate the scattering effect on an 80 Gbit/s orbital angular momentum (OAM) multiplexed optical wireless communication link. The power loss, mode purity, cross talk, and bit error rate performance are measured and analyzed for different OAM modes under scattering levels from ballistic
Attribute-conditioned face synthesis has many potential use cases, such as to aid the identification of a suspect or a missing person. Building on top of a conditional version of VAE-GAN, we augment the pathways connecting the latent space with channel-recurrent architecture, in order to provide not
We propose a novel memory-based online video representation that is efficient, accurate and predictive. This is in contrast to prior works that often rely on computationally heavy 3D convolutions, ignore motion when aligning features over time, or operate in an off-line mode to utilize future frames.
We demonstrate high spectral efficiency transmission over 549 km of field-deployed single-mode fiber using probabilistic-shaped 144QAM. We achieved 41.5 Tb/s over the C-band at a spectral efficiency of 9.02 b/s/Hz using 32-Gbaud channels at a channel spacing of 33.33 GHz, and 38.1 Tb/s at a spectral
Accurate modeling of battery capacity degradation is an important component for both battery manufacturers and energy management systems. In this paper, we develop a battery degradation model using deep learning algorithms. The model is trained with the real data collected from battery storage solutions
Machine learning tasks typically involve minimizing a loss function that measures the distance of the model output and the ground-truth. In some applications, in addition to the usual loss function, the output must also satisfy certain requirements for further processing. We call such requirements model
We introduce a new inference task – Visual Entailment (VE) – which differs from traditional Textual Entailment (TE) tasks whereby a premise is defined by an image, rather than a natural language sentence as in TE tasks. A novel dataset SNLI-VE is proposed for VE tasks based on the Stanford Natural Language
We envision a flexible, dynamic airborne LTE infrastructure built upon Unmanned Autonomous Vehicles (UAVs) that will provide on-demand, on-time, network access, anywhere. In this paper, we design, implement and evaluate SkyRAN, a self-organizing UAV-based LTE RAN (Radio Access Network) that is a key
Recent studies have demonstrated the vulnerability of deep convolutional neural networks against adversarial examples. Inspired by the observation that the intrinsic dimension of image data is much smaller than its pixel space dimension and the vulnerability of neural networks grows with the input dimension,
Subspace clustering algorithms are notorious for their scalability issues because building and processing large affinity matrices are demanding. In this paper, we introduce a method that simultaneously learns an embedding space along subspaces within it to minimize a notion of reconstruction error, thus
In order to learn object segmentation models in videos, conventional methods require a large amount of pixel-wise ground truth annotations. However, collecting such supervised data is time-consuming and labor-intensive. In this paper, we exploit existing annotations in source images and transfer such
Making predictions about what might happen in the future is important for reacting adequately in many situations. For example, observing that Man kidnaps girl may have the consequence that Man kills girl. While this is part of common sense reasoning for humans, it is not obvious how machines
Convolutional neural networks (CNNs) have recently emerged as a popular building block for natural language processing (NLP). Despite their success, most existing CNN models employed in NLP share the same learned (and static) set of filters for all input sentences. In this paper, we consider an approach
We develop a system for the FEVER fact extraction and verification challenge that uses a high precision entailment classifier based on transformer networks pretrained with language modeling, to classify a broad set of potential evidence. The precision of the entailment classifier allows us to enhance
Existing entailment datasets mainly pose problems which can be answered without attention to grammar or word order. Learning syntax requires comparing examples where different grammar and word order change the desired classification. We introduce several datasets based on synthetic transformations of
The advances in unmanned aerial vehicle (UAV) technology have empowered mobile operators to deploy LTE base stations (BSs) on UAVs, and provide on-demand, adaptive connectivity to hotspot venues as well as emergency scenarios. However, today’s evolved packet core (EPC) that orchestrates the LTE RAN faces
Commercial and industry (C& I) customers incur two types of electricity charges on their bills: one for the amount of energy usage and another one for the maximum demand during certain billing periods. The second charge type is known as Demand Charge (DC), which could account for over half of a customers’
Behavior-based Community Detection: Application to Host Assessment in Enterprise Information Networks Community detection in complex networks is a fundamental problem that attracts much attention across various disciplines. Previous studies have been mostly focusing on external connections between nodes
Given a large number of low-quality heterogeneous categorical alerts collected from an anomaly detection system, how to characterize the complex relationships between different alerts and deliver trustworthy rankings to end users? While existing techniques focus on either mining alert patterns or filtering
Node ranking in temporal networks are often impacted by heterogeneous context from node content, temporal, and structural dimensions. This paper introduces TGNet , a deep-learning framework for node ranking in heterogeneous temporal graphs. TGNet utilizes a variant of Recurrent Neural Network to adapt
Today’s enterprises are exposed to sophisticated attacks, such as Advanced Persistent Threats~(APT) attacks, which usually consist of stealthy multiple steps. To counter these attacks, enterprises often rely on causality analysis on the system activity data collected from a ubiquitous system monitoring
We claim that there is currently no satisfactory way to regularize a generative adversarial network (GAN): neither the generator nor discriminator is particularly amenable to the imposition of inductive biases derived from domain knowledge. A generator is effectively a causal model of generationone
Unsupervised domain adaptation is an attractive avenue to enhance the performance of deep neural networks in a target domain, using labels only from a source domain. However, two predominant methods along this line, namely, domain divergence reduction learning and semi-supervised learning, are not readily
The advent of LTE into the unlicensed spectrum has necessitated the understanding of its operational efficiency when sharing spectrum with different radio access technologies. Our study reveals that LTE, owing to its inherent transmission characteristics, suffers significant performance degradation in
We propose a single-ended Brillouin-based sensor in elliptical-core few-mode optical fiber for multi-parameter measurement using spontaneous Brillouin scattering. Distributed sensing of temperature and strain is demonstrated over 0.5 km elliptical-core few-mode fiber.
An ensemble learning algorithm is applied to enhance filtering tolerance of cascaded WSSs in open ROADM environment to demonstrate ~0.8dB Q-factor improvement over MLSE after transmitting over 3200km with 16 ROADMs.
A simplified, system-agnostic NLC algorithm based on a neuron network is proposed to pre-distort symbols at transmitter side to demonstrate ~0.6dB Q improvement after 2800km SMF transmission using 32Gbaud DP-16QAM.
We study digital signal processing techniques to optimize the back-to-back performance of large probabilistic shaped constellations. We cover joint optimization of LDPC and constellation shaping, CD pre-compensation, clipping and I/Q imbalance compensation.
Parametric embedding methods such as parametric t-distributed Stochastic Neighbor Embedding (pt-SNE) enables out-of-sample data visualization without further computationally expensive optimization or approximation. However, pt-SNE favors small mini-batches to train a deep neural network but large mini-batches
Interest point descriptors have fueled progress on almost every problem in computer vision. Recent advances in deep neural networks have enabled task-specific learned descriptors that outperform hand-crafted descriptors on many problems. We demonstrate that commonly used metric learning approaches do
We propose a method to forecast a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features (e.g., from LIDAR and images) embedded in an overhead map. The method learns a policy inducing a distribution over simulated trajectories that is both diverse (produces most paths
Given a single RGB image of a complex outdoor road scene in the perspective view, we address the novel problem of estimating an occlusion-reasoned semantic scene layout in the top-view. This challenging problem not only requires an accurate understanding of both the 3D geometry and the semantics of the
We introduce and tackle the problem of zero-shot object detection (ZSD), which aims to detect object classes which are not observed during training. We work with a challenging set of object classes, not restricting ourselves to similar and/or fine-grained categories as in prior works on zero-shot classification.
We present DIP, a deep learning-based framework to learn structural properties of the Internet, such as node clustering or distance between nodes. Existing embedding-based approaches use linear algorithms on a single source of data, such as latency or hop count information, to approximate the position
In recent years, many techniques have been developed to improve the performance and efficiency of data center networks. While these techniques provide high accuracy, they are often designed using heuristics that leverage domain-specific properties of the workload or hardware.In this vision paper, we
Multivariate time series data are becoming increasingly common in numerous real-world applications, e.g., power plant monitoring, health care, wearable devices, automobiles, etc. As a result, multivariate time series retrieval, i.e., given the current multivariate time series segment, how to obtain its
The problem of network representation learning, also known as network embedding, arises in many machine learning tasks assuming that there exist a small number of variabilities in the vertex representations which can capture the “semantics” of the original network structure. Most existing network embedding
Massive and dynamic networks arise in many practical applications such as social media, security and public health. Given an evolutionary network, it is crucial to detect structural anomalies, such as vertices and edges whose “behaviors” deviate from underlying majority of the network, in a real-time
The latent behavior of an information system that can exhibit extreme events, such as system faults or cyber-attacks, is complex. Recently, the invariant network has shown to be a powerful way of characterizing complex system behaviors. Structures and evolutions of the invariance network, in particular,
Recently, advanced cyber attacks, which consist of a sequence of steps that involve many vulnerabilities and hosts, compromise the security of many well-protected businesses. This has led to solutions that ubiquitously monitor system activities in each host (big data) as a series of events and search
Vast geographical distances in Africa are a leading cause for the so-called digital divide due to the high cost of installing fiber. Free-space optical (FSO) communications offer a convenient and higher bandwidth alternative to point-to-point radio microwave links, with the possibility of repurposing
Large monthly demand charge of commercial and industrial entities is a major problem for their economical business. Utilizing a battery by behind-the-meter Energy Management Systems (EMS) has been seen as a solution to demand charge reduction. In state-of-the-art approaches, the EMS maintains sufficient
Multi-dimensional Hawkes processes (MHP) has been widely used for modeling temporal events. However, when MHP was used for modeling events with spatio-temporal characteristics, the spatial information was often ignored despite its importance. In this paper, we introduce a framework to exploit MHP for
Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper,
The need for countering Advanced Persistent Threat (APT) attacks has led to solutions that ubiquitously monitor system activities in each host and perform timely attack investigation over the monitoring data for analyzing attack provenance. However, existing query systems based on relational databases
Conventional embedding methods directly associate each symbol with a continuous embedding vector, which is equivalent to applying a linear transformation based on a one-hot encoding of the discrete symbols. Despite its simplicity, such approach yields the number of parameters that grows linearly
Administrators of most user-facing systems depend on periodic log data to get an idea of the health and status of production applications. Logs report information, which is crucial to diagnose the root cause of complex problems. In this paper, we present a real-time log analysis system called LogLens
By adjusting single shaping factor in the distribution matcher, probabilistic-shaped M-QAM is reviewed to provide both flex-rate and near-Shannon performance at the given flex-grid bandwidth and filling ratio.
Human actions often involve complex interactions across several inter-related objects in the scene. However, existing approaches to fine-grained video understanding or visual relationship detection often rely on single object representation or pairwise object relationships. Furthermore, learning interactions
Online video object segmentation is a challenging task as it entails to process the image sequence timely and accurately. To segment a target object through the video, numerous CNN-based methods have been developed by heavily finetuning on the object mask in the first frame, which is time-consuming for
Convolutional neural network-based approaches for semantic segmentation rely on supervision with pixel-level ground truth, but may not generalize well to unseen image domains. As the labeling process is tedious and labor intensive, developing algorithms that can adapt source ground truth labels to the
An adaptive control system for battery integrated PV generation is designed to reduce the fluctuating in PV power production. The core component of the system is a four-layer power control system (PCS) for Battery Energy Storage (BES). BES responds to the power dispatch commands from PCS and charges/discharges
Unsupervised anomaly detection on multi- or high-dimensional data is of great importance in both fundamental machine learning research and industrial applications, for which density estimation lies at the core. Although previous approaches based on dimensionality reduction followed by density estimation
Network embedding aims to learn a low-dimensional vector representation for each node in the social and information networks, with the constraint to preserve network structures. Most existing methods focus on single network embedding, ignoring the relationship between multiple networks. In many real-world
This paper presents a method to determine optimal energy and power capacity of distributed Energy Storage Systems (ESS) in behind-the-meter applications to maximize local Photovoltaic (PV) utilization or minimize Demand Charge (DC) cost. The problem is solved as a multi-objective optimization model to
This paper proposes a novel memory-based online video representation that is efficient, accurate and predictive. This is in contrast to prior works that often rely on computationally heavy 3D convolutions, ignore actual motion when aligning features over time, or operate in an off-line mode to utilize
Real-world face recognition datasets exhibit long-tail characteristics, which results in biased classifiers in conventionally-trained deep neural networks, or insufficient data when long-tail classes are ignored. In this paper, we propose to handle long-tail classes in the training of a face recognition
Despite recent successes in synthesizing faces and bedrooms, existing generative models struggle to capture more complex image types (Figure 1), potentially due to the oversimplification of their latent space constructions. To tackle this issue, building on Variational Autoencoders (VAEs), we integrate
41.5-Tb/s over 549 km of deployed SSMF in Verizon’s network is achieved using probabilistic-shaped 144QAM to optimize throughput at ultra-fine granularity. This is the highest C-band only capacity and spectral efficiency in metro field environment.
Quality of transmission prediction for real-time mixed line-rate systems is realized using artificial neural network based transfer learning with SDN orchestrating. 0.42 dB accuracy is achieved with a 1000 to 20 reduction in training samples.
We report on the evolution of the longest segment of FASTER cable at 11,017 km, with 8QAM transponders at 4b/s/Hz spectral efficiency (SE) in service. With offline testing, 6 b/s/Hz is further demonstrated using probabilistically shaped 64QAM, and a novel, low complexity nonlinearity compensation technique
A novel algorithm to design geometric shaped 32QAM to work with probabilistic shaping is proposed to approach the Shannon limit within ~0.2 dB in SNR. The experimental results show ~0.2 dB SNR advantage over 64Gbaud PAS-64QAM, and flex-rate transmission demonstrates > 500 km reach improvement over 32QAM.
We propose universal distribution matchers applicable to any two-dimensional signal constellation. We experimentally demonstrate that the performance of 32-ary QAM, based on hybrid probabilistic-geometric shaping, is superior to probabilistically shaped 32QAM and regular 32QAM.
Light-field cameras have recently emerged as a powerful tool for one-shot passive 3D shape capture. However, obtaining the shape of glossy objects like metals or plastics remains challenging, since standard Lambertian cues like photo-consistency cannot be easily applied. In this paper, we derive a spatially-varying
The increasingly sophisticated Advanced Persistent Threat (APT) attacks have become a serious challenge for enterprise IT security. Attack causality analysis, which tracks multi-hop causal relationships between files and processes to diagnose attack provenances and consequences, is the first step towards
A systematic study, including theory, simulation and experiments, is carried out to review the generalized pairwise optimization algorithm for designing optimized constellation. In order to verify its effectiveness, the algorithm is applied in three testing cases: 2-dimensional 8 quadrature amplitude
Recent developments in deep domain adaptation have allowed knowledge transfer from a labeled source domain to an unlabeled target domain at the level of intermediate features or input pixels. We propose that advantages may be derived by combining them, in the form of different insights that lead to a
Previous models for video captioning often use the output from a specific layer of a Convolutional Neural Network (CNN) as video features. However, the variable context-dependent semantics in the video may make it more appropriate to adaptively select features from the multiple CNN layers. We propose
Generating videos from text has proven to be a significant challenge for existing generative models. We tackle this problem by training a conditional generative model to extract both static and dynamic information from text. This is manifested in a hybrid framework, employing a Variational Autoencoder
Adaptive Memory Networks We present Adaptive Memory Networks (AMN) that processes input-question pairs to dynamically construct a network architecture optimized for lower inference times for Question Answering (QA) tasks. AMN processes the input story to extract entities and stores them in memory banks.
Large-scale training for semantic segmentation is challenging due to the expense of obtaining training data for this task relative to other vision tasks. We propose a novel training approach to address this difficulty. Given cheaply-obtained sparse image labelings, we propagate the sparse labels to produce
Un-manned aerial vehicle (UAVs) have the potential to change the landscape of wide-area wireless connectivity by bringing them to areas where connectivity was sparing or non-existent (e.g. rural areas) or has been compromised due to disasters. While Google’s Project Loon and Facebook’s Project Aquila
Cyber-physical systems (CPSs) are today ubiquitous in urban environments. Such systems now serve as the backbone to numerous critical infrastructure applications, from smart grids to IoT installations. Scalable and seamless operation of such CPSs requires sophisticated tools for monitoring the time series
Generalized mutual information (GMI) has been comprehensively studied in multidimensional constellation and probabilistic-shaped (PS) constellation together with different forward error correction (FEC) coding schemes. The simulation results confirm that GMI is an efficient and accurate tool to compare
We introduce a new light-field dataset of materials and take advantage of the recent success of deep learning to perform material recognition on the 4D light field. Our dataset contains 12 material categories, each with 100 images taken with a Lytro Illum, from which we extract about 30,000 patches in
We present a physically interpretable 3D model for handling occlusions with applications to road scene understanding. Given object detection and SFM point tracks, our unified model probabilistically assigns point tracks to objects and reasons about object detection scores and bounding boxes. It uniformly
Our WarpNet matches images of objects in fine-grained datasets without using part annotations. It aligns an object in one image with a different object in another by exploiting a fine-grained dataset to create artificial data for training a Siamese network with an unsupervised discriminative learning
We propose a novel framework for monocular traffic scene recognition, relying on a decomposition into high-order and atomic scenes to meet those challenges. High-order scenes carry semantic meaning useful for AWS applications, while atomic scenes are easy to learn and represent elemental behaviors based
This paper investigates a novel problem of generating images from visual attributes. We model the image as a composite of foreground and background and develop a layered generative model with disentangled latent variables that can be learned end-to-end using a variational auto-encoder. We experiment