Our team will be attending CVPR 2024 (the IEEE/CVF Conference on Computer Vision and Pattern Recognition) from June 17-21! See you there at the NEC Labs America Booth 1716! Stay tuned for more information about our participation.
Weakly-Supervised Temporal Action Localization with Multi-Modal Plateau Transformers Poster
Workshop: 3rd Workshop on Learning with Limited Labelled Data for Image and Video Understanding
Authors: Xin Hu, Tulane University; Kai Li, NEC Labs America; Deep Patel, NEC Labs America (Presenting); Erik Kruus, NEC Labs America; Martin Renqiang Min, NEC Labs America; and Zhengming Ding, Tulane University
Poster timing: Tuesday, June 18, 2024
Deep Patel, NEC Labs America (Presenting)
Seeing the Vibration from Fiber-Optic Cable: Rain Intensity Monitoring using Deep Frequency Filtering Poster
Workshop: 20th Workshop on Perception Beyond the Visible Spectrum
Authors: Zhuocheng Jiang (Presenter), Yangmin Ding, Junhui Zhao, Yue Tian, Shaobo Han, Sarper Ozharar, Ting Wang and James M. Moore
Poster timing: Tuesday, June 18, 2024
Zhuocheng Jiang, NEC Labs America (Presenter)
Data-Driven Autonomous Driving Simulation (DDASD) Workshop
Abstract: Real-world on-road testing of autonomous vehicles can be expensive or dangerous, making simulation a crucial tool to accelerate the development of safe autonomous driving (AD), a technology with enormous real-world impact. However, to minimise the sim-to-real gap, good agent behaviour models and sensor/perception imitation are paramount. A recent surge in published papers in this fast-growing field has led to a lot of progress, but several fundamental questions remain unanswered, for example regarding the fidelity and diversity of generative behaviour and perception models, generation of realistic controllable scenes at scale and the safety assessment of the simulation toolchain. In this workshop, our goal is to bring together practitioners and researchers from all areas of AD simulation and to discuss pressing challenges, recent breakthroughs and future directions.
Presenters: Our intern Shanlin Sun will give an oral presentation at the workshop.
More Information: https://agents4ad.github.io/
Date: Tuesday, June 18, 2024
Shanlin Sun (Presenter)
S’More: Instantaneous Perception of Moving Objects in 3D
Presenters: Bingbing Zhuang, Intern Di Liu
Date: Friday, June 21, 2024 from 10:30am-12pm
Bingbing Zhuang, NEC Labs America (Presenter)
LidaRF: Delving into Lidar for Neural Radiance Field on Street Scenes
Presenters: Bingbing Zhuang, Ziyu Jiang, Intern Shanlin Sun
Date: Friday, June 21, 2024 from 10:30am-12pm
Bingbing Zhuang, NEC Labs America (Presenter)
Ziyu Jiang, NEC Labs America (Presenter)
Presented Papers
Taming Self-Training for Open-Vocabulary Object Detection
Recent studies have shown promising performance in open-vocabulary object detection (OVD) by utilizing pseudo labels (PLs) from pretrained vision and language models (VLMs). However, teacher-student self-training, a powerful and widely used paradigm to leverage PLs, is rarely explored for OVD.
AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving
Autonomous vehicle (AV) systems rely on robust perception models as a cornerstone of safety assurance. However, objects encountered on the road exhibit a long-tailed distribution, with rare or unseen categories posing challenges to a deployed perception model. This necessitates an expensive process of
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Visual program synthesis is a promising approach to exploit the reasoning abilities of large language models for compositional computer vision tasks. Previous work has used few-shot prompting with frozen LLMs to synthesize visual programs. Training an LLM to write better visual programs is an attractive
Instantaneous Perception of Moving Objects in 3D
The perception of 3D motion of surrounding traffic participants is crucial for driving safety. While existing works primarily focus on general large motions, we contend that the instantaneous detection and quantification of subtle motions is equally important as they indicate the nuances in driving behavior
LidaRF: Delving into Lidar for Neural Radiance Field on Street Scenes
Photorealistic simulation plays a crucial role in applications such as autonomous driving, where advances in neural radiance fields (NeRFs) may allow better scalability through the automatic creation of digital 3D assets. However, reconstruction quality suffers on street scenes due to largely collinear
Generating Enhanced Negatives for Training Language-Based Object Detectors
The recent progress in language-based open-vocabulary object detection can be largely attributed to finding better ways of leveraging large-scale data with free-form text annotations. Training such models with a discriminative objective function has proven successful, but requires good positive and negative
Long-HOT: A Modular Hierarchical Approach for Long-Horizon Object Transport
We aim to address key challenges in long-horizon embodied exploration and navigation by proposing a long-horizon object transport task called Long-HOT and a novel modular framework for temporally extended navigation. Agents in Long-HOT need to efficiently find and pick up target objects that are scattered
Our Media Analytics Research
We solve fundamental challenges in computer vision, with a focus on understanding and interaction in 3D scenes, foundational vision-language models, robust learning across domains and responsible AI. Our technological breakthroughs contribute to socially relevant solutions that address key enterprise needs in mobility, safety and smart spaces. Our work aims to democratize autonomous driving, develop agentic LLMs that solve user workflows and enable interactive robots, with a commitment to computer vision that promotes privacy, fairness and sustainability. Our award-winning research is regularly featured at top-tier venues such as CVPR, ICCV, ECCV and NeurIPS.