Shaobo Han

Senior Researcher

Optical Networking & Sensing

Posts

Text-guided Device-realistic Sound Generation for Fiber-based Sound Event Classification

Recent advancements in unique acoustic sensing devices and large-scale audio recognition models have unlocked new possibilities for environmental sound monitoring and detection. However, applying pretrained models to non-conventional acoustic sensors results in performance degradation due to domain shifts caused by differences in frequency response and noise characteristics from the original training data. In this study, we introduce a text-guided framework for efficiently generating new datasets to retrain models for these non-conventional sensors. Our approach integrates text-conditional audio generative models with two additional steps: (1) selecting audio samples based on text input to match the desired sounds, and (2) applying domain transfer techniques using recorded impulse responses and background noise to simulate the characteristics of the sensors. We demonstrate this process by generating emulated signals for fiber-optic Distributed Acoustic Sensors (DAS), creating datasets similar to the recorded ESC-50 dataset. The generated signals are then used to train a classifier, which outperforms few-shot learning approaches in environmental sound classification.
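
As a minimal sketch of the domain-transfer step described above (not the paper's exact pipeline), the snippet below convolves a generated clean waveform with a recorded sensor impulse response and mixes in recorded background noise at a target SNR. The file names, mono-waveform assumption, and SNR value are illustrative, not taken from the study.

```python
# Sketch: emulate a DAS-like recording from a generated clean waveform.
# Assumes mono waveforms at a common sample rate; file names are hypothetical.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

def emulate_das(clean, impulse_response, noise, snr_db=10.0):
    """Convolve clean audio with a sensor impulse response and add noise at a target SNR."""
    # Simulate the sensor's frequency response via convolution.
    sensed = fftconvolve(clean, impulse_response, mode="full")[: len(clean)]
    # Scale recorded background noise to reach the desired signal-to-noise ratio.
    noise = np.resize(noise, len(sensed))
    sig_pow = np.mean(sensed ** 2)
    noise_pow = np.mean(noise ** 2) + 1e-12
    scale = np.sqrt(sig_pow / (noise_pow * 10 ** (snr_db / 10.0)))
    out = sensed + scale * noise
    return out / (np.max(np.abs(out)) + 1e-12)  # normalize to avoid clipping

clean, sr = sf.read("generated_dog_bark.wav")     # hypothetical generated sample
ir, _ = sf.read("das_impulse_response.wav")       # hypothetical recorded impulse response
noise, _ = sf.read("das_background_noise.wav")    # hypothetical recorded background noise
sf.write("emulated_das_dog_bark.wav", emulate_das(clean, ir, noise), sr)
```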

CLAP-S: Support Set Based Adaptation for Downstream Fiber-optic Acoustic Recognition

Contrastive Language-Audio Pretraining (CLAP) models have demonstrated unprecedented performance in various acoustic signal recognition tasks. Fiber-optic-based acoustic recognition is one of the most important downstream tasks and plays a significant role in environmental sensing. Adapting CLAP for fiber-optic acoustic recognition has become an active research area. Because fiber-optic sensors are non-conventional acoustic sensors, fiber-optic acoustic recognition presents a challenging, domain-specific, low-shot deployment environment with significant domain shifts due to unique frequency response and noise characteristics. To address these challenges, we propose a support-based adaptation method, CLAP-S, which linearly interpolates a CLAP Adapter with the Support Set, leveraging both implicit knowledge through fine-tuning and explicit knowledge retrieved from memory for cross-domain generalization. Experimental results show that our method delivers competitive performance on both laboratory-recorded fiber-optic ESC-50 datasets and a real-world fiber-optic gunshot-firework dataset. Our research also provides valuable insights for other downstream acoustic recognition tasks.
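
To make the interpolation idea concrete, here is a minimal sketch (assuming CLAP audio embeddings are already extracted) that blends logits from a fine-tuned adapter with logits retrieved from a few-shot support-set cache. The blend weight `alpha`, sharpness `beta`, and tensor shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def clap_s_logits(query_emb, adapter, support_embs, support_onehot, alpha=0.5, beta=5.0):
    """Blend adapter predictions with support-set retrieval for a batch of CLAP embeddings."""
    adapter_logits = adapter(query_emb)                       # implicit knowledge (fine-tuned adapter)
    q = F.normalize(query_emb, dim=-1)
    s = F.normalize(support_embs, dim=-1)
    affinity = torch.exp(-beta * (1.0 - q @ s.T))             # similarity to cached support samples
    support_logits = affinity @ support_onehot                # explicit knowledge (retrieved from memory)
    return alpha * adapter_logits + (1.0 - alpha) * support_logits

# Example with random tensors standing in for real CLAP embeddings.
d, n_classes, n_shots = 512, 50, 16
adapter = torch.nn.Linear(d, n_classes)
query = torch.randn(8, d)
support = torch.randn(n_classes * n_shots, d)
onehot = F.one_hot(torch.arange(n_classes).repeat_interleave(n_shots), n_classes).float()
print(clap_s_logits(query, adapter, support, onehot).shape)  # torch.Size([8, 50])
```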

Scalable Machine Learning Models for Optical Transmission System Management

Optical transmission systems require accurate modeling and performance estimation for autonomous adaptation and reconfiguration. We present efficient and scalable machine learning (ML) methods for modeling optical networks at the component and network level with minimized data collection.

Field Trials of Manhole Localization and Condition Diagnostics by Using Ambient Noise and Temperature Data with AI in a Real-Time Integrated Fiber Sensing System

Field trials of ambient noise-based automated methods for manhole localization and condition diagnostics using a real-time DAS/DTS integrated system were conducted. Cross-referencing multiple sensing data resulted in a 94.7% detection rate and enhanced anomaly identification.

Field Tests of AI-Driven Road Deformation Detection Leveraging Ambient Noise over Deployed Fiber Networks

This study demonstrates an AI-driven method for detecting road deformations using Distributed Acoustic Sensing (DAS) over existing telecom fiber networks. Utilizing ambient traffic noise, it enables real-time, long-term, and scalable monitoring for road safety.

Dual Privacy Protection for Distributed Fiber Sensing with Disaggregated Inference and Fine-tuning of Memory-Augmented Networks

We propose a memory-augmented model architecture with disaggregated computation infrastructure for fiber sensing event recognition. By leveraging geo-distributed computing resources in optical networks, this approach empowers end-users to customize models while ensuring dual privacy protection.

NEC Labs America Attends the 39th Annual AAAI Conference on Artificial Intelligence #AAAI25

Our NEC Labs America team attended the Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25) at the Pennsylvania Convention Center in Philadelphia, Pennsylvania, from February 25 to March 4, 2025. The AAAI conference series promotes research in Artificial Intelligence (AI) and fosters scientific exchange among researchers, practitioners, scientists, students, and engineers across the entirety of AI and its affiliated disciplines. Our team presented technical papers, led special tracks, delivered talks on key topics, participated in workshops, conducted tutorials, and showcased research in poster sessions. The team greeted visitors at Booth #208 from Thursday through Saturday.

Multi-span optical power spectrum prediction using cascaded learning with one-shot end-to-end measurement

Scalable methods for optical transmission performance prediction using machine learning (ML) are studied in metro reconfigurable optical add-drop multiplexer (ROADM) networks. A cascaded learning framework is introduced to encompass the use of cascaded component models for end-to-end (E2E) optical path prediction augmented with different combinations of E2E performance data and models. Additional E2E optical path data and models are used to reduce the prediction error accumulation in the cascade. Off-line training (pre-trained prior to deployment) and transfer learning are used for component-level erbium-doped fiber amplifier (EDFA) gain models to ensure scalability. Considering channel power prediction, we show that the data collection process of the pre-trained EDFA model can be reduced to only 5% of the original training set using transfer learning. We evaluate the proposed method under three different topologies with field-deployed fibers and achieve a mean absolute error of 0.16 dB with a single (one-shot) E2E measurement on the deployed 6-span system with 12 EDFAs.
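
As an illustration of the cascaded-prediction idea (with placeholder models, not the trained EDFA models from the study), the sketch below propagates per-channel powers through alternating span-loss and EDFA-gain stages, then applies a simple per-channel offset fitted from a single end-to-end measurement to absorb accumulated error. All models, values, and the correction scheme are assumptions for illustration.

```python
import numpy as np

def cascade_predict(launch_powers_dbm, edfa_models, span_losses_db):
    """Propagate per-channel powers (dBm) through alternating spans and EDFA gain models."""
    p = np.asarray(launch_powers_dbm, dtype=float)
    for gain_model, loss_db in zip(edfa_models, span_losses_db):
        p = p - loss_db          # fiber span attenuation (dB)
        p = p + gain_model(p)    # component-level EDFA gain prediction (dB)
    return p

# Placeholder EDFA gain models: flat 20 dB gain with a small per-channel tilt.
n_ch = 4
edfa_models = [lambda p: 20.0 + 0.01 * np.arange(len(p)) for _ in range(12)]
span_losses = [20.0] * 12
pred = cascade_predict(np.zeros(n_ch), edfa_models, span_losses)

# One-shot E2E correction: a per-channel offset fitted from a single measurement.
measured = pred + np.random.normal(0, 0.2, n_ch)   # stand-in for the one-shot E2E measurement
offset = measured - pred
corrected = cascade_predict(np.zeros(n_ch), edfa_models, span_losses) + offset
```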

VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks

As the adoption of large language models increases and the need for per-user or per-task model customization grows, parameter-efficient fine-tuning (PEFT) methods, such as low-rank adaptation (LoRA) and its variants, incur substantial storage and transmission costs. To further reduce stored parameters, we introduce a “divide-and-share” paradigm that breaks the barriers of low-rank decomposition across matrix dimensions, modules, and layers by sharing parameters globally via a vector bank. As an instantiation of the paradigm to LoRA, our proposed VB-LoRA composes all the low-rank matrices of LoRA from a shared vector bank with a differentiable top-k admixture module. VB-LoRA achieves extreme parameter efficiency while maintaining comparable or better performance than state-of-the-art PEFT methods. Extensive experiments demonstrate the effectiveness of VB-LoRA on natural language understanding, natural language generation, instruction tuning, and mathematical reasoning tasks. When fine-tuning the Llama2-13B model, VB-LoRA uses only 0.4% of LoRA’s stored parameters, yet achieves superior results. Our source code is available at https://github.com/leo-yangli/VB-LoRA. This method has been merged into the Hugging Face PEFT package.
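
A minimal sketch of the vector-bank composition is shown below: each sub-vector of a low-rank factor is formed as a differentiable top-k softmax mixture over a globally shared vector bank. The dimensions, initialization, and reshape into a LoRA factor are simplified assumptions; the official implementation is in the linked repository and the Hugging Face PEFT package.

```python
import torch
import torch.nn.functional as F

class VectorBankMixture(torch.nn.Module):
    def __init__(self, num_vectors=256, vector_dim=64, num_subvectors=48, top_k=2):
        super().__init__()
        # Vector bank shared globally across modules and layers.
        self.bank = torch.nn.Parameter(torch.randn(num_vectors, vector_dim) * 0.02)
        # Per-module selection logits, one row per sub-vector.
        self.logits = torch.nn.Parameter(torch.zeros(num_subvectors, num_vectors))
        self.top_k = top_k

    def forward(self):
        # Keep only the top-k logits per sub-vector and renormalize with softmax
        # (differentiable top-k admixture).
        topv, topi = self.logits.topk(self.top_k, dim=-1)
        weights = F.softmax(topv, dim=-1)                       # (num_subvectors, k)
        selected = self.bank[topi]                              # (num_subvectors, k, vector_dim)
        subvectors = (weights.unsqueeze(-1) * selected).sum(1)  # (num_subvectors, vector_dim)
        return subvectors.reshape(-1)                           # flatten, then reshape into LoRA factors

mix = VectorBankMixture()
flat = mix()                             # parameters used to build a low-rank A or B matrix
A = flat.reshape(48 * 64 // 768, 768)    # e.g. a rank-4 factor for a 768-dimensional layer
print(A.shape)                           # torch.Size([4, 768])
```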