Xujiang Zhao NEC Labs America

Xujiang Zhao is a researcher in the Data Science & System Security department at NEC Laboratories America, based in Princeton, New Jersey. He holds a B.S. in Civil Engineering from Chongqing University, an M.S. in Computer Science from the University of Science and Technology of China, and a PhD in Computer Science from the University of Texas at Dallas. This training gave him a strong foundation in both theoretical and applied computing, which continues to shape his contributions at NEC.

At NEC Labs, Zhao’s research focuses on aligning large language models (LLMs) with human intent through techniques that enhance explainability, factual consistency, uncertainty estimation, and robustness. He develops methods that make LLMs more transparent and reliable, ensuring that they can be applied in sensitive, high-stakes environments. A key area of his work is building collaborative agent systems that integrate LLMs with domain-specific expertise and human feedback loops, enabling AI to work more effectively as a partner in decision-making.

Beyond language alignment, Zhao explores applications in image–text retrieval, synthetic media detection, and multi-agent reasoning, areas that are increasingly critical for enterprise knowledge management, misinformation defense, and the verification of AI-generated content. By combining fundamental advances in machine learning with applied research, his work pushes forward the responsible and practical use of foundation models across industries.

Posts

SEED: Sound Event Early Detection via Evidential Uncertainty

Sound Event Early Detection (SEED) is an essential task in recognizing acoustic environments and soundscapes. However, most existing methods focus on offline sound event detection, which suffers from the over-confidence issue in early-stage event detection and usually yields unreliable results. To solve this problem, we propose a novel Polyphonic Evidential Neural Network (PENet) that models the evidential uncertainty of the class probability. Specifically, we use a Beta distribution to model the distribution of class probabilities, and the evidential uncertainty enriches the uncertainty representation with evidence information, which plays a central role in reliable prediction. To further improve event detection performance, we design a backtrack inference method that utilizes both the forward and backward audio features of an ongoing event. Experiments on the DESED database show that the proposed method simultaneously improves time delay by 13.0% and detection F1 score by 3.8% compared to state-of-the-art methods.
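As a rough illustration of modeling a class probability with a Beta distribution, the sketch below uses the standard subjective-logic mapping from evidence to Beta parameters. It is not taken from the PENet paper: the names `BetaOpinion` and `beta_opinion`, the uniform prior (adding 1 to each evidence count), and the vacuity formula 2 / (alpha + beta) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class BetaOpinion:
    """Summary of a Beta(alpha, beta) belief about one sound class (hypothetical type)."""
    expected_prob: float  # mean of the Beta distribution
    vacuity: float        # uncertainty caused by lack of evidence, in (0, 1]


def beta_opinion(pos_evidence: float, neg_evidence: float) -> BetaOpinion:
    """Map non-negative evidence for and against a class to a Beta opinion.

    Assumption: a uniform Beta(1, 1) prior, so alpha = pos + 1 and beta = neg + 1.
    """
    if pos_evidence < 0 or neg_evidence < 0:
        raise ValueError("evidence must be non-negative")
    alpha = pos_evidence + 1.0
    beta = neg_evidence + 1.0
    strength = alpha + beta
    return BetaOpinion(
        expected_prob=alpha / strength,  # E[p] under Beta(alpha, beta)
        vacuity=2.0 / strength,          # shrinks as total evidence grows
    )


if __name__ == "__main__":
    # Early in an event there is little evidence, so vacuity stays high.
    print(beta_opinion(0.0, 0.0))
    # Later, accumulated positive evidence lowers the vacuity.
    print(beta_opinion(8.0, 0.0))
```

With no evidence the expected probability is 0.5 and the vacuity is 1.0. With 8 units of positive evidence and none negative, the expected probability is 0.9 and the vacuity is 0.2. This shows how an evidence-based model can report low confidence in the first frames of an event instead of an over-confident score.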

Boosting Cross-Lingual Transfer via Self-Learning with Uncertainty Estimation

Recent multilingual pre-trained language models have achieved remarkable zero-shot performance, where the model is fine-tuned on only one source language and directly evaluated on target languages. In this work, we propose a self-learning framework that further utilizes unlabeled data from target languages, combined with uncertainty estimation to select high-quality silver labels. Three different uncertainties are adapted and analyzed specifically for cross-lingual transfer: Language Heteroscedastic/Homoscedastic Uncertainty (LEU/LOU) and Evidential Uncertainty (EVI). We evaluate our framework with these uncertainties on two cross-lingual tasks, Named Entity Recognition (NER) and Natural Language Inference (NLI), covering 40 languages in total. The framework outperforms the baselines significantly, by 10 F1 on average for NER and 2.5 accuracy points for NLI.
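The selection step in such a self-learning loop can be sketched as follows. This is a minimal illustration, not the paper's procedure: the `Prediction` type, the `select_silver_labels` function, and the rule of keeping a fixed fraction of the least uncertain predictions are all assumptions, and the uncertainty score is treated as already computed by the model.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    """A model prediction on an unlabeled target-language example (hypothetical type)."""
    text: str
    label: str
    uncertainty: float  # lower means the model is more certain


def select_silver_labels(preds: List[Prediction], keep_fraction: float) -> List[Prediction]:
    """Keep the least uncertain predictions to use as silver labels.

    Assumption: a fixed fraction is kept each round; a threshold would work similarly.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not preds:
        return []
    ranked = sorted(preds, key=lambda p: p.uncertainty)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]


if __name__ == "__main__":
    candidates = [
        Prediction("example a", "PER", 0.9),
        Prediction("example b", "LOC", 0.1),
        Prediction("example c", "ORG", 0.5),
        Prediction("example d", "PER", 0.3),
    ]
    # Keeps the two most certain predictions: "example b" and "example d".
    for p in select_silver_labels(candidates, keep_fraction=0.5):
        print(p.text, p.label, p.uncertainty)
```

In a full loop, the selected examples would be added to the training set with their predicted labels, the model would be fine-tuned again, and the process would repeat on the remaining unlabeled data.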