NEC Labs America, Author at NEC Labs America

About NEC Labs America

This author has not written his bio yet.
But we are proud to say that NEC Labs America contributed 681 entries already.

Entries by NEC Labs America

CAMTUNER: Adaptive Video Analytics Pipelines via Real-time Automated Camera Parameter Tuning

March 31, 2025/in Publications/by NEC Labs America

In Video Analytics Pipelines (VAP), Analytics Units (AUs) such as object detection and face recognition operating on remote servers rely heavily on surveillance cameras to capture high-quality video streams to achieve high accuracy. Modern network cameras offer an array of parameters that directly influence video quality. While a few such parameters, e.g., exposure, focus and white balance, are automatically adjusted by the camera internally, the others are not. We denote such camera parameters as non-automated (NAUTO) parameters. In this work, we first show that in a typical surveillance camera deployment, changes in environmental conditions can have a significant adverse effect on the accuracy of insights from the AUs, but that this adverse impact can potentially be mitigated by dynamically adjusting NAUTO camera parameters in response to those changes. Second, since most end-users lack the skill or understanding to appropriately configure these parameters and typically use a fixed parameter setting, we present CAMTUNER, to our knowledge the first framework that dynamically adapts NAUTO camera parameters to optimize the accuracy of AUs in a VAP in response to adverse changes in environmental conditions. CAMTUNER is based on SARSA reinforcement learning and it incorporates two novel components: a light-weight analytics quality estimator and a virtual camera that drastically speed up offline RL training. Our controlled experiments and real-world VAP deployment show that compared to a VAP using the default camera setting, CAMTUNER enhances VAP accuracy by detecting 15.9% additional persons and 2.6%-4.2% additional cars (without any false positives) in a large enterprise parking lot. CAMTUNER opens up new avenues for elevating video analytics accuracy, transcending mere incremental enhancements achieved through refining deep-learning models.
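The abstract names SARSA as the underlying reinforcement learning method. As a rough illustration only, the textbook tabular SARSA update can be sketched as follows. This is not CAMTUNER's implementation: the state label, the action names, the reward value, and the hyperparameters `alpha` and `gamma` are all hypothetical placeholders chosen for the example.

```python
from collections import defaultdict


def sarsa_update(q, state, action, reward, next_state, next_action,
                 alpha=0.1, gamma=0.9):
    """Apply one generic SARSA step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)).

    SARSA is on-policy: the target uses the action actually chosen
    in the next state, not the greedy maximum over actions.
    """
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Q-table with a default value of 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)

# Hypothetical transition: a coarse lighting state and a camera-setting
# change as the action; the reward stands in for an analytics-quality score.
first = sarsa_update(q_table, "dusk", "raise_brightness",
                     reward=1.0, next_state="dusk", next_action="keep")

# Repeating the same transition moves the estimate further toward the target.
second = sarsa_update(q_table, "dusk", "raise_brightness",
                      reward=1.0, next_state="dusk", next_action="keep")
```

In a setting like the one the abstract describes, the reward would presumably come from a quality estimate of the analytics output rather than a fixed constant, but how the paper defines states, actions, and rewards is not stated here.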

Optimal Single-User Interactive Beam Alignment with Feedback Delay

March 25, 2025/in Publications/by NEC Labs America

Communication in the millimeter-wave (mmWave) band relies on narrow beams due to directionality, high path loss, and shadowing. One can use beam alignment (BA) techniques to find and adjust the direction of these narrow beams. In this paper, BA at the base station (BS) is considered, where the BS sends a set of BA packets to scan different angular regions while the user listens to the channel and sends feedback to the BS for each received packet. It is assumed that the packets and feedback received at the user and BS, respectively, can be correctly decoded. Motivated by practical constraints such as propagation delay, a feedback delay for each BA packet is considered. At the end of the BA, the BS allocates a narrow beam to the user, including its angle of departure, for data transmission, and the objective is to maximize the resulting expected beamforming gain. A general framework for studying this problem is proposed, based on which a lower bound on the optimal performance as well as an optimality-achieving scheme are obtained. Simulation results reveal significant performance improvements over state-of-the-art BA methods in the presence of feedback delay.

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

March 25, 2025/in Publications/by NEC Labs America

Visual reasoning (VR), which is crucial in many fields for enabling human-like visual understanding, remains highly challenging. Recently, compositional visual reasoning approaches, which leverage the reasoning abilities of large language models (LLMs) with integrated tools to solve problems, have shown promise as more effective strategies than end-to-end VR methods. However, these approaches face limitations, as frozen LLMs lack tool awareness in VR, leading to performance bottlenecks. While leveraging LLMs for reasoning is widely used in other domains, such approaches are not directly applicable to VR due to limited training data, imperfect tools that introduce errors and reduce data collection efficiency in VR, and challenges in fine-tuning on noisy workflows. To address these challenges, we propose DWIM: i) Discrepancy-aware training Workflow generation, which assesses tool usage and extracts more viable workflows for training; and ii) Instruct-Masking fine-tuning, which guides the model to only clone effective actions, enabling the generation of more practical solutions. Our experiments demonstrate that DWIM achieves state-of-the-art performance across various VR tasks, exhibiting strong generalization on multiple widely-used datasets.

NEC Labs America Attends OFC 2025 in San Francisco

March 21, 2025/in Events/by NEC Labs America

The NEC Labs America Optical Networking and Sensing team is attending the 2025 Optical Fiber Communications Conference and Exhibition (OFC), the premier global event for optical networking and communications. Bringing together over 13,500 attendees from 83+ countries, more than 670 exhibitors, and hundreds of sessions featuring industry leaders, OFC 2025 serves as the central hub for innovation and collaboration in the field. At this year’s conference, NEC Labs America will showcase its cutting-edge research and advancements through multiple presentations, demonstrations, and workshops.

Free-Space Optical Sensing Using Vector Beam Spectra

March 21, 2025/in Publications/by NEC Labs America

Vector beams are spatial modes that have spatially inhomogeneous states of polarization. Any light beam is a linear combination of vector beams, the coefficients of which comprise a vector beam spectrum. In this work, through numerical calculations, a novel method of free-space optical sensing is demonstrated using vector beam spectra, which are shown to be experimentally measurable via Stokes polarimetry. As proof of concept, vector beam spectra are numerically calculated for various beams and beam obstructions.
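The abstract says vector beam spectra are measurable via Stokes polarimetry. As background only, the sketch below computes the four Stokes parameters at a single point from assumed-known complex field components. It is not the paper's method: a real measurement would infer these values from intensity readings, a vector beam would need them evaluated at every transverse position, and the sign convention for `s3` varies between texts (the one used here is an assumption, as is the function name).

```python
def stokes_parameters(ex, ey):
    """Return (s0, s1, s2, s3) for complex field components ex and ey.

    Convention assumed here: s2 = 2*Re(conj(ex)*ey), s3 = 2*Im(conj(ex)*ey).
    Other references flip the sign of s3.
    """
    ex, ey = complex(ex), complex(ey)
    cross = ex.conjugate() * ey
    s0 = abs(ex) ** 2 + abs(ey) ** 2   # total intensity
    s1 = abs(ex) ** 2 - abs(ey) ** 2   # horizontal vs. vertical
    s2 = 2 * cross.real                # diagonal vs. anti-diagonal
    s3 = 2 * cross.imag                # circular handedness (sign is convention-dependent)
    return s0, s1, s2, s3


r = 2 ** -0.5  # normalizes two equal-amplitude components to unit intensity

horizontal = stokes_parameters(1, 0)
diagonal = stokes_parameters(r, r)
circular = stokes_parameters(r, r * 1j)
```

For fully polarized light, as in these three examples, the parameters satisfy s0² = s1² + s2² + s3², which gives a quick sanity check on any implementation.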

400-Gb/s mode division multiplexing-based bidirectional free space optical communication in real-time with commercial transponders

March 21, 2025/in Publications/by NEC Labs America

In this work, for the first time, we experimentally demonstrate mode division multiplexing-based bidirectional free space optical communication in real-time using commercial transponders. As proof of concept, via bidirectional pairs of Hermite-Gaussian modes (HG00, HG10, and HG01), using a Telecom Infra Project Phoenix compliant commercial 400G transponder, 400-Gb/s data signals (56-Gbaud, DP-16QAM) are bidirectionally transmitted error-free, i.e., with less than 1e-2 pre-FEC BERs, over approximately 1 m of free space.
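The signal parameters quoted above can be related with a little arithmetic. The sketch below assumes the quoted 56 Gbaud is the nominal symbol rate and treats everything above the 400 Gb/s payload as overhead (for example FEC and framing); the actual overhead breakdown of the transponder is not given in the abstract, and the function name is illustrative.

```python
def raw_line_rate_gbps(symbol_rate_gbaud, bits_per_symbol, polarizations):
    """Raw bit rate = symbol rate x bits per symbol x number of polarizations."""
    return symbol_rate_gbaud * bits_per_symbol * polarizations


# 16QAM carries log2(16) = 4 bits per symbol; "DP" means two polarizations.
raw_rate = raw_line_rate_gbps(56, 4, 2)

# Share of the raw rate not used by the 400 Gb/s payload.
payload_gbps = 400
overhead_fraction = (raw_rate - payload_gbps) / raw_rate
```

Under these assumptions the raw rate is 448 Gb/s, leaving 48 Gb/s (roughly 10.7% of the raw rate) for overhead.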

EdgeSync: Efficient Edge-Assisted Video Analytics via Network Contention-Aware Scheduling

March 17, 2025/in Publications/by NEC Labs America

With the advancement of 5G, edge-assisted video analytics has become increasingly popular, driven by the technology's ability to support low-latency, high-bandwidth applications. However, in scenarios where multiple clients compete for network resources, network contention poses a significant challenge. In this paper, we propose a novel scheduling algorithm that intelligently batches and aligns the offloading of multiple video analytics clients to optimize both network and edge server resource utilization while meeting the Service Level Objective (SLO). Experiments with a cellular network testbed show that our approach successfully processes 93% or more of inference requests from 7 different clients to the edge server while meeting the SLOs, whereas other approaches achieve a lower success rate, ranging from 65% to 85% under the same conditions.

Attribute-Centric Compositional Text-to-Image Generation

March 13, 2025/in Publications/by NEC Labs America

Despite the recent impressive breakthroughs in text-to-image generation, generative models have difficulty in capturing the data distribution of underrepresented attribute compositions while over-memorizing overrepresented attribute compositions, which raises public concerns about their robustness and fairness. To tackle this challenge, we propose ACTIG, an attribute-centric compositional text-to-image generation framework. We present an attribute-centric feature augmentation and a novel image-free training scheme, which greatly improve the model's ability to generate images with underrepresented attributes. We further propose an attribute-centric contrastive loss to avoid overfitting to overrepresented attribute compositions. We validate our framework on the CelebA-HQ and CUB datasets. Extensive experiments show that the compositional generalization of ACTIG is outstanding, and our framework outperforms previous works in terms of image quality and text-image consistency.

G-Litter Marine Litter Dataset Augmentation with Diffusion Models and Large Language Models on GPU Acceleration

March 12, 2025/in Publications/by NEC Labs America

Marine litter detection is crucial for environmental monitoring, yet the imbalance in existing datasets limits model performance in identifying various types of waste accurately. This paper presents an efficient data augmentation pipeline that combines generative diffusion models (e.g., Stable Diffusion) and Large Language Models (LLMs) to expand the G-Litter dataset, a marine litter dataset designed for autonomous detection in heterogeneous environments. Leveraging scalable diffusion models for image generation and Alpaca LLMs for diverse prompt generation, our approach augments underrepresented classes by generating over 200 additional images per class, significantly improving the dataset's balance. Training YOLOv8 for object detection on the augmented G-Litter dataset increased detection performance, improving recall by 7.82% and mAP50 by 3.87% compared with baseline results. This study emphasizes the potential for combining generative AI with HPC resources to automate data augmentation on large-scale, unstructured datasets, particularly in edge computing contexts for real-time marine monitoring. The models were tested on real videos captured during simulated missions, demonstrating a superior ability to detect submerged objects in dynamic scenarios. These results highlight the potential of generative AI techniques to improve dataset quality and detection model performance, laying the foundation for further expansion in real-time marine monitoring.

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection

March 4, 2025/in Publications/by NEC Labs America

Action detection aims to detect (recognize and localize) human actions spatially and temporally in videos. Existing approaches focus on the closed-set setting where an action detector is trained and tested on videos from a fixed set of action categories. However, this constrained setting is not viable in an open world where test videos inevitably contain actions beyond the trained categories. In this paper, we address the practical yet challenging Open-Vocabulary Action Detection (OVAD) problem. It aims to detect any action in test videos while training a model on a fixed set of action categories. To achieve such an open-vocabulary capability, we propose a novel method, OpenMixer, that exploits the inherent semantics and localizability of large vision-language models (VLM) within the family of query-based detection transformers (DETR). Specifically, OpenMixer is built from spatial and temporal OpenMixer blocks (S-OMB and T-OMB) and a dynamically fused alignment (DFA) module. The three components collectively enjoy the merits of strong generalization from pre-trained VLMs and end-to-end learning from the DETR design. Moreover, we established OVAD benchmarks under various settings, and the experimental results show that OpenMixer performs the best over baselines for detecting seen and unseen actions.
