Next-Generation Computing Finally Sees Light

Moore’s law is dead: we have squeezed nearly all the innovation out of silicon. Fiber optics is the solution to meet the computing needs of tomorrow. Already, the light traveling inside fiber optic cables can serve as a sensor that measures vibration, sound, temperature, light, and pressure changes. We are now developing the means to take this to the next level: photonic computing at the speed of light, providing faster reaction times, lower energy consumption, and improved battery range.

NEC Labs America Heads to Stanford University’s SystemX Alliance Annual Fall Conference

NEC Labs America’s (NECLA) President Christopher White is attending Stanford University’s SystemX Alliance 2022 Fall Conference this week, where he is meeting with Ph.D. students, industry-leading researchers, and business leaders presenting on a wide range of research topics. The annual conference highlights exciting research in areas including advanced materials, data analytics, energy and power management, 3D nanoprinting, and photonic and quantum computing, to name but a few!

Availability Analysis for Reliable Distributed Fiber Optic Sensors Placement

We perform an availability analysis of various reliable distributed fiber optic sensor placement schemes under multiple-failure conditions. The study can help network carriers select the optimal protection scheme for their network sensing services, considering both service availability and hardware cost.
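
The kind of trade-off described above can be illustrated with a minimal availability model. This is a hedged sketch under standard series/parallel reliability assumptions, not the paper's actual analysis; the per-component availability figures are hypothetical.

```python
# Illustrative availability model for fiber sensing protection schemes.
# Assumptions (not from the paper): independent failures, steady-state
# availabilities, and hypothetical per-component values.

def series_availability(component_avails):
    """A chain of components is up only if every component is up."""
    a = 1.0
    for x in component_avails:
        a *= x
    return a

def parallel_availability(route_avails):
    """1+1-style protection: the service is up if any disjoint route is up."""
    down = 1.0
    for a in route_avails:
        down *= (1.0 - a)
    return 1.0 - down

# Unprotected route: interrogator, fiber span, and remote unit in series.
unprotected = series_availability([0.9999, 0.998, 0.9995])

# 1+1 protection: two disjoint routes of similar availability.
protected = parallel_availability([unprotected, unprotected])
```

Comparing `protected` against `unprotected` for each candidate placement, weighted by the hardware cost of the extra route, is the essence of the selection problem the abstract describes.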

Distributed Optical Fiber Sensing Using Specialty Optical Fibers

Distributed fiber optic sensing systems use a long section of optical fiber as the sensing medium. The fiber's characteristics therefore determine the sensing capability and performance. In this presentation, various types of specialty optical fibers and their sensing applications will be introduced and discussed.

A Multi-sensor Feature Fusion Network Model for Bearings Grease Life Assessment in Accelerated Experiments

This paper presents a multi-sensor feature fusion (MSFF) neural network comprising two inception-layer-type multiple-channel feature fusion (MCFF) networks for both inner-sensor and cross-sensor feature fusion, in conjunction with a deep residual neural network (ResNet) for accurate grease life assessment and bearing health monitoring. Each MCFF network is designed for low-level feature extraction and fusion of either vibration or acoustic emission signals at multiple scales. The concatenation of the MCFF networks serves as a cross-sensor feature fusion layer that combines the extracted features from both vibration and acoustic emission sources. A ResNet is developed for high-level feature extraction from the fused feature maps and for prediction. In addition, to handle the large volume of collected data, the original time-series data are transformed into the frequency domain with different sampling intervals and targeted ranges. The proposed MSFF network outperforms other models based on different fusion methods, fully connected network predictors, and/or a single sensor source.
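
The frequency-domain preprocessing step mentioned above can be sketched in a few lines. This is an illustrative example only; the window length, sample rate, and band limits are assumptions, not the paper's settings.

```python
import numpy as np

# Hedged sketch: transform a raw vibration (or acoustic emission) time series
# into a band-limited magnitude spectrum, the kind of frequency-domain
# feature the abstract describes. All signal parameters are hypothetical.

def to_spectrum(signal, sample_rate, band=(0.0, None)):
    """Magnitude spectrum of a 1-D signal, optionally cropped to a band in Hz."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    lo, hi = band
    hi = freqs[-1] if hi is None else hi
    mask = (freqs >= lo) & (freqs <= hi)
    return freqs[mask], spectrum[mask]

# Example: a 50 Hz tone sampled at 1 kHz for one second peaks near 50 Hz.
t = np.arange(0, 1.0, 1.0 / 1000.0)
vibration = np.sin(2 * np.pi * 50.0 * t)
freqs, mag = to_spectrum(vibration, 1000.0, band=(0.0, 200.0))
peak_hz = freqs[np.argmax(mag)]
```

In a multi-sensor setup, spectra like these from each channel would feed the MCFF branches before cross-sensor fusion.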

Enhancing Video Analytics Accuracy via Real-time Automated Camera Parameter Tuning

In Video Analytics Pipelines (VAPs), Analytics Units (AUs) such as object detection and face recognition running on remote servers critically rely on surveillance cameras to capture high-quality video streams in order to achieve high accuracy. Modern IP cameras come with a large number of camera parameters that directly affect the quality of the captured video stream. While a few of these parameters (e.g., exposure, focus, and white balance) are adjusted automatically by the camera, the remaining ones are not. We denote such camera parameters as non-automated (NAUTO) parameters. In this paper, we first show that changes in environmental conditions can have a significant adverse effect on the accuracy of insights from the AUs, but that such adverse impact can potentially be mitigated by dynamically adjusting NAUTO camera parameters in response to those changes. We then present CamTuner, to our knowledge the first framework that dynamically adapts NAUTO camera parameters to optimize the accuracy of AUs in a VAP in response to adverse changes in environmental conditions. CamTuner is based on SARSA reinforcement learning and incorporates two novel components, a light-weight analytics quality estimator and a virtual camera, that drastically speed up offline RL training. Our controlled experiments and real-world VAP deployment show that, compared to a VAP using the default camera settings, CamTuner enhances VAP accuracy by detecting 15.9% additional persons and 2.6%–4.2% additional cars (without any false positives) in a large enterprise parking lot, and 9.7% additional cars in a 5G smart traffic intersection scenario, which enables a new use case of accurate and reliable automatic vehicle collision prediction (AVCP). CamTuner opens the door to new ways of significantly enhancing video analytics accuracy beyond the incremental improvements obtained by refining deep-learning models.
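
The SARSA core that CamTuner builds on can be sketched compactly. This is a generic textbook SARSA loop on a toy problem, not CamTuner's implementation: the state (a discretized camera setting), the actions, and the reward are all hypothetical stand-ins for what the analytics quality estimator would provide.

```python
import random

# Hedged SARSA sketch. State: a hypothetical NAUTO parameter level (0..10).
# Actions: nudge the parameter down (-1) or up (+1). Reward: closeness to a
# hypothetical "ideal" setting, standing in for the analytics quality score.

def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    """One SARSA step: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    q = Q.get((s, a), 0.0)
    target = r + gamma * Q.get((s2, a2), 0.0)
    Q[(s, a)] = q + alpha * (target - q)
    return Q[(s, a)]

random.seed(0)
Q = {}
ideal = 5            # hypothetical best parameter value
state, action = 0, 1
for _ in range(2000):
    next_state = max(0, min(10, state + action))
    reward = -abs(next_state - ideal)
    # Epsilon-greedy choice of the next action (on-policy, as SARSA requires).
    if random.random() < 0.1:
        next_action = random.choice([-1, 1])
    else:
        next_action = max([-1, 1], key=lambda a2: Q.get((next_state, a2), 0.0))
    sarsa_update(Q, state, action, reward, next_state, next_action)
    state, action = next_state, next_action
```

In the real system, the virtual camera lets many such training steps run offline, far faster than interacting with a physical camera.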

Semi-supervised Identification and Mapping of Water Accumulation Extent using Street-level Monitoring Videos

Urban flooding is becoming a common and devastating hazard that causes loss of life and economic damage. Monitoring and understanding urban flooding at a highly localized scale is a challenging task due to the complicated urban landscape, the intricate hydraulic process, and the lack of high-quality, high-resolution data. Emerging smart city technology such as monitoring cameras provides an unprecedented opportunity to address the data issue. However, estimating water ponding extents on land surfaces from monitoring footage is unreliable with traditional segmentation techniques, because the boundary of the water ponding, under the influence of varying weather, background, and illumination, is usually too fuzzy to identify, and the oblique angle and image distortion in the video monitoring data prevent georeferencing and object-based measurements. This paper presents a novel semi-supervised segmentation scheme for surface water extent recognition from the footage of an oblique monitoring camera. The semi-supervised segmentation algorithm was found suitable for determining the water boundary, and the monoplotting method was successfully applied to georeference the pixels of the monitoring video for the virtual quantification of the local drainage process. The correlation and mechanism-based analyses demonstrate the value of the proposed method in advancing the understanding of local drainage hydraulics. The workflow and methods created in this study have great potential for studying other street-level and earth-surface processes.
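
The georeferencing idea behind monoplotting can be illustrated with a planar simplification: for a (near-)flat ground surface, oblique-camera pixels map to world coordinates through a homography estimated from a few surveyed control points. This is a hedged sketch, not the paper's monoplotting implementation, and the control points below are hypothetical.

```python
import numpy as np

# Hedged sketch: pixel-to-ground mapping via a homography fit with the
# direct linear transform (DLT). Assumes a planar ground patch; the real
# monoplotting method handles terrain more generally.

def fit_homography(pixels, world):
    """DLT from >= 4 pixel/world correspondences; returns a 3x3 homography."""
    rows = []
    for (u, v), (x, y) in zip(pixels, world):
        rows.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        rows.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def pixel_to_world(H, u, v):
    """Apply the homography and dehomogenize to get ground coordinates."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

# Hypothetical surveyed control points: image corners of a ground feature
# and their positions (metres) in a local ground frame.
pixels = [(100, 400), (500, 420), (520, 200), (120, 190)]
world = [(0.0, 0.0), (4.0, 0.0), (4.0, 6.0), (0.0, 6.0)]
H = fit_homography(pixels, world)
```

Once every pixel of the segmented water mask is projected this way, ponded area and its evolution over time can be quantified in real-world units.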

The Trade-off between Scanning Beam Penetration and Transmission Beam Gain in mmWave Beam Alignment

Beam search algorithms have been proposed to align the beams from an access point to a user equipment. The process relies on sending beams from a set of scanning beams (SBs) and tailoring a transmission beam (TB) using the received feedback. In this paper, we discuss a fundamental trade-off between the gains of SBs and TBs: the higher the gain of an SB, the better its penetration, while the higher the gain of the TB, the better the communication link performance. However, the TB depends on the set of SBs: by increasing the coverage of each SB, and in turn reducing its penetration, there is more opportunity to find a sharper TB and thus increase its beamforming gain. We define a quantitative measure of this trade-off in the form of a trade-off curve. We introduce an SB set design, the Tulip design, and formally prove that it achieves this fundamental trade-off curve for channels with a single dominant path. We also find closed-form solutions for the trade-off curve in special cases, and we provide an algorithm, with performance evaluation results, for finding the trade-off curve, revealing the need for further optimization of the SB sets used in state-of-the-art beam search algorithms.
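
A toy numeric example conveys the flavor of this trade-off. It is an assumption-laden sketch, not the Tulip design: beam gain is modeled simply as sector size divided by beam coverage, and the sector size and beam budget are hypothetical.

```python
# Hedged sketch of the SB/TB trade-off. With a fixed budget of n scanning
# beams over a sector of S candidate directions, a disjoint partition uses
# narrow, high-gain (high-penetration) SBs but can only localize the path to
# one SB's width, while binary-coded SBs each cover half the sector (low
# gain, poor penetration) yet their n feedback bits jointly pinpoint one of
# 2**n directions, allowing a much sharper TB.

SECTOR = 16   # candidate directions (hypothetical)
N_BEAMS = 4   # scanning-beam budget (hypothetical)

def gain(coverage, sector=SECTOR):
    """Normalized beamforming gain: narrower coverage -> higher gain."""
    return sector / coverage

# Design 1: disjoint partition. Each SB covers SECTOR / N_BEAMS directions.
partition_sb_gain = gain(SECTOR // N_BEAMS)   # high SB gain / penetration
partition_tb_gain = gain(SECTOR // N_BEAMS)   # TB no finer than one SB

# Design 2: binary-coded SBs. Each SB covers half the sector; the combined
# feedback identifies a single direction.
coded_sb_gain = gain(SECTOR // 2)             # low SB gain / penetration
coded_tb_gain = gain(SECTOR // 2 ** N_BEAMS)  # very sharp TB
```

Sweeping designs between these two extremes traces out a curve of achievable (SB gain, TB gain) pairs, which is the kind of trade-off curve the paper characterizes.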

Single-Stream Multi-level Alignment for Vision-Language Pretraining

Self-supervised vision-language pretraining from pure images and text with a contrastive loss is effective, but it ignores fine-grained alignment due to a dual-stream architecture that aligns image and text representations only at a global level. Earlier supervised, non-contrastive methods were capable of finer-grained alignment but required dense annotations that were not scalable. We propose a single-stream architecture that aligns images and language at multiple levels: global, fine-grained patch-token, and conceptual/semantic, using two novel tasks: symmetric cross-modality reconstruction (XMM) and pseudo-labeled keyword prediction (PSL). In XMM, we mask input tokens from one modality and use cross-modal information to reconstruct the masked tokens, thus improving fine-grained alignment between the two modalities. In PSL, we use attention to select keywords in a caption, use a momentum encoder to recommend other important keywords that are missing from the caption but represented in the image, and then train the visual encoder to predict the presence of those keywords, helping it learn semantic concepts that are essential for grounding a textual token to an image region. We demonstrate competitive performance and improved data efficiency on image-text retrieval, grounding, and visual question answering/reasoning against larger models and models trained on more data. Code and models are available at zaidkhan.me/SIMLA.
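
The keyword-selection step in PSL can be sketched in plain Python. This is an illustrative toy, not the paper's method: the attention scores, vocabulary, and stopword list are hypothetical, and the real pipeline uses learned attention plus a momentum encoder rather than this hand-built scorer.

```python
# Hedged sketch: pick caption words with the highest attention as pseudo-label
# keywords, skipping function words. All inputs below are hypothetical.

def select_keywords(tokens, attention, k=2,
                    stopwords=frozenset({"a", "an", "the", "on", "of"})):
    """Return the k highest-attention tokens that are not stopwords."""
    scored = [(w, s) for w, s in zip(tokens, attention) if w not in stopwords]
    scored.sort(key=lambda ws: ws[1], reverse=True)
    return {w for w, _ in scored[:k]}

caption = ["a", "dog", "on", "the", "beach"]
attn = [0.05, 0.40, 0.05, 0.05, 0.45]
keywords = select_keywords(caption, attn)
```

In the full method, a momentum encoder would then suggest additional visually grounded words absent from the caption, and the visual encoder is trained to predict the presence of the resulting keyword set as a multi-label target.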

Learning Semantic Segmentation from Multiple Datasets with Label Shifts

While it is desirable to train segmentation models on an aggregation of multiple datasets, a major challenge is that the label space of each dataset may conflict with the others. To tackle this challenge, we propose UniSeg, an effective and model-agnostic approach to automatically train segmentation models across multiple datasets with heterogeneous label spaces, without requiring any manual relabeling effort. Specifically, we introduce two new ideas that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains. First, we identify a gradient conflict in training incurred by mismatched label spaces and propose a class-independent binary cross-entropy loss to alleviate such label conflicts. Second, we propose a loss function that considers class relationships across datasets for a better multi-dataset training scheme. Extensive quantitative and qualitative analyses on road-scene datasets show that UniSeg improves over multi-dataset baselines, especially on unseen datasets, e.g., achieving a gain of more than 8 percentage points in IoU on KITTI. Furthermore, UniSeg achieves 39.4% IoU on the WildDash2 public benchmark, making it one of the strongest submissions in the zero-shot setting. Our project page is available at https://www.nec-labs.com/~mas/UniSeg.
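
The idea of a class-independent loss with per-dataset label masking can be sketched numerically. This is a hedged illustration of the general technique, not UniSeg's exact loss: each class gets an independent sigmoid, and classes outside a sample's source label space are masked out so datasets with conflicting label spaces do not penalize each other.

```python
import numpy as np

# Hedged sketch: per-class binary cross-entropy where "valid" masks out
# classes that the sample's source dataset does not define. Shown per pixel
# for simplicity; a segmentation loss would average this over all pixels.

def masked_bce(logits, targets, valid):
    """logits/targets/valid: shape (num_classes,). valid is 1 for classes
    defined in the sample's source dataset, 0 otherwise."""
    p = 1.0 / (1.0 + np.exp(-logits))          # independent sigmoid per class
    eps = 1e-12                                # numerical safety for log
    per_class = -(targets * np.log(p + eps)
                  + (1 - targets) * np.log(1 - p + eps))
    return float((per_class * valid).sum() / max(valid.sum(), 1))

logits = np.array([3.0, -3.0, 0.0])
targets = np.array([1.0, 0.0, 1.0])   # third class labeled elsewhere
valid = np.array([1.0, 1.0, 0.0])     # third class absent from this dataset
loss = masked_bce(logits, targets, valid)
```

Because each class is scored independently, a confident prediction for a class the source dataset never annotates produces no gradient, which is one way to avoid the cross-dataset label conflict the abstract describes.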