Efficient Controllable Multi-Task Architectures

We aim to train a multi-task model such that users can adjust the desired compute budget and the relative importance of task performances after deployment, without retraining. This enables optimizing performance for dynamically varying user needs, without the heavy computational overhead of training and saving models for various scenarios. To this end, we propose a multi-task model consisting of a shared encoder and task-specific decoders, where both encoder and decoder channel widths are slimmable. Our key idea is to control task importance by varying the capacities of the task-specific decoders, while controlling the total computational cost by jointly adjusting the encoder capacity. This improves overall accuracy by allowing a stronger encoder for a given budget, increases control over computational cost, and delivers high-quality slimmed sub-architectures based on the user’s constraints. Our training strategy involves a novel “Configuration-Invariant Knowledge Distillation” loss that enforces backbone representations to be invariant under different runtime width configurations to enhance accuracy. Further, we present a simple but effective search algorithm that translates user constraints into runtime width configurations of both the shared encoder and the task decoders, for sampling the sub-architectures. The key rule of the search algorithm is to give a larger computational budget to the decoders of more highly preferred tasks, while searching for a shared encoder configuration that enhances overall MTL performance. Experiments on three multi-task benchmarks (PASCAL-Context, NYUD-v2, and CIFAR100-MTL) with diverse backbone architectures demonstrate the advantage of our approach. For example, our method achieves 33.5% higher controllability on the NYUD-v2 dataset than prior methods, while incurring much lower compute cost.
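
The search rule described above can be illustrated with a toy sketch. This is not the authors' algorithm: the candidate width list, the quadratic cost model, the preference-to-width mapping, and all function names are assumptions made only for illustration. It gives wider decoders to more preferred tasks, then picks the widest shared encoder that still fits the budget.

```python
# Hypothetical slimmable width multipliers (assumption, not from the paper).
WIDTHS = [0.25, 0.5, 0.75, 1.0]


def decoder_widths(preferences):
    """Map positive task preferences to decoder widths: higher preference, wider decoder."""
    top = max(preferences.values())
    widths = {}
    for task, pref in preferences.items():
        share = pref / top  # in (0, 1]; the most preferred task gets 1.0
        # Smallest candidate width that covers this task's share.
        widths[task] = next(w for w in WIDTHS if w >= share)
    return widths


def cost(encoder_width, dec_widths, enc_flops, dec_flops):
    """Toy cost model: assume FLOPs scale quadratically with the width multiplier."""
    total = enc_flops * encoder_width ** 2
    for task, width in dec_widths.items():
        total += dec_flops[task] * width ** 2
    return total


def search(preferences, budget, enc_flops, dec_flops):
    """Return (encoder_width, decoder_widths) within budget, or None if nothing fits."""
    dec = decoder_widths(preferences)
    # Try the widest encoder first so the shared backbone is as strong as possible.
    for enc_w in sorted(WIDTHS, reverse=True):
        if cost(enc_w, dec, enc_flops, dec_flops) <= budget:
            return enc_w, dec
    return None


if __name__ == "__main__":
    config = search({"seg": 3, "depth": 1}, budget=8.0,
                    enc_flops=10.0, dec_flops={"seg": 4.0, "depth": 4.0})
    print(config)
```

A real search would also need to shrink decoders when no encoder width fits the budget; this sketch simply reports that no configuration was found.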

Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters

Overfitting to the source domain is a common issue in gradient-based training of deep neural networks. To compensate for over-parameterized models, numerous regularization techniques have been introduced, such as those based on dropout. While these methods achieve significant improvements on classical benchmarks such as ImageNet, their performance diminishes when domain shift is introduced in the test set, i.e., when the unseen data comes from a significantly different distribution. In this paper, we move away from the classical approach of constructing dropout masks by Bernoulli sampling and propose to base the selection on the gradient signal-to-noise ratio (GSNR) of the network’s parameters. Specifically, at each training step, parameters with high GSNR are discarded. Furthermore, we alleviate the burden of manually searching for the optimal dropout ratio by leveraging a meta-learning approach. We evaluate our method on standard domain generalization benchmarks and achieve competitive results on classification and face anti-spoofing problems.
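
As a rough illustration of the idea, the sketch below computes a per-parameter GSNR using the common definition (squared mean of the per-sample gradients divided by their variance) and builds a mask that discards the highest-GSNR parameters. The deterministic top-ratio selection, the fixed `drop_ratio` argument, and the function names are assumptions for illustration; the abstract does not specify the exact selection procedure, and the paper learns the ratio via meta-learning instead of fixing it.

```python
from statistics import fmean, pvariance


def gsnr(per_sample_grads, eps=1e-12):
    """Per-parameter GSNR: (mean gradient)^2 / (gradient variance + eps).

    per_sample_grads: list of samples, each a list with one gradient per parameter.
    """
    n_params = len(per_sample_grads[0])
    scores = []
    for j in range(n_params):
        column = [sample[j] for sample in per_sample_grads]
        mean = fmean(column)
        var = pvariance(column)  # population variance across samples
        scores.append(mean * mean / (var + eps))
    return scores


def gsnr_dropout_mask(scores, drop_ratio):
    """Return a 0/1 mask that zeroes the `drop_ratio` fraction with the highest GSNR."""
    n_drop = int(round(drop_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    dropped = set(ranked[:n_drop])
    return [0 if j in dropped else 1 for j in range(len(scores))]


if __name__ == "__main__":
    # Three samples, three parameters (toy gradients, not real data).
    grads = [[1.0, 1.0, -1.0], [1.0, 3.0, 1.0], [1.0, -1.0, 0.0]]
    scores = gsnr(grads)
    print(gsnr_dropout_mask(scores, drop_ratio=0.34))
```

In this toy input the first parameter has identical gradients across samples (zero variance), so it has the highest GSNR and is the one masked out.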

Personalized Semantics Excitation for Federated Image Classification

Federated learning enables privacy-protected collaboration among distributed local clients to attain a more generic global model. However, significant distribution shift in the input/label space across clients makes it challenging to generalize well to all clients, which motivates personalized federated learning (PFL). Existing PFL methods typically customize the local model by fine-tuning with limited local supervision and a global model regularizer, which secures local specificity but risks ruining the global discriminative knowledge. In this paper, we propose a novel Personalized Semantics Excitation (PSE) mechanism to break through this limitation by exciting and fusing personalized semantics from the global model during local model customization. Specifically, PSE explores channel-wise gradient differentiation across the global and local models to identify important low-level semantics, mostly from convolutional layers, which are then embedded into client-specific training. In addition, PSE uses the collaboration of the global and local models to enrich high-level feature representations and improve the robustness of the client classifier through a cross-model attention module. Extensive experiments and analysis on various image classification benchmarks demonstrate the effectiveness and advantage of our method over state-of-the-art PFL methods.

MSI: Maximize Support-Set Information for Few-Shot Segmentation

Few-shot segmentation (FSS) aims to segment a target class using a small number of labeled images (the support set). To extract information relevant to the target class, a dominant approach in the best-performing FSS methods removes background features using a support mask. We observe that this feature excision through a limiting support mask introduces an information bottleneck in several challenging FSS cases, e.g., for small targets and/or inaccurate target boundaries. To this end, we present a novel method (MSI), which maximizes the support-set information by exploiting two complementary sources of features to generate super correlation maps. We validate the effectiveness of our approach by instantiating it in three recent and strong FSS methods. Experimental results on several publicly available FSS benchmarks show that our proposed method consistently improves performance by visible margins and leads to faster convergence.

Few-Shot Video Classification via Representation Fusion and Promotion Learning

Recent few-shot video classification (FSVC) works achieve promising performance by capturing similarity across support and query samples with different temporal alignment strategies, or by learning discriminative features via a Transformer block within each episode. However, they ignore two important issues: a) it is difficult to capture rich intrinsic action semantics from a limited number of support instances within each task; b) redundant or irrelevant frames in videos easily weaken the positive influence of discriminative frames. To address these two issues, this paper proposes a novel Representation Fusion and Promotion Learning (RFPL) mechanism with two sub-modules: meta-action learning (MAL) and reinforced image representation (RIR). Concretely, during the training stage, we perform online learning to seek a task-shared meta-action bank that enriches task-specific action representations by injecting global knowledge. In addition, we exploit reinforcement learning to obtain the importance of each frame and refine the representation. This operation maximizes the contribution of discriminative frames to further capture the similarity of support and query samples from the same category. Our RFPL framework is flexible enough to be integrated with many existing FSVC methods. Extensive experiments show that RFPL significantly enhances the performance of existing FSVC models when integrated with them.

Degradation-Resistant Unfolding Network for Heterogeneous Image Fusion

Heterogeneous image fusion (HIF) aims to enhance image quality by merging complementary information of images captured by different sensors. Early model-based approaches have strong interpretability while being limited by non-adaptive feature extractors with poor generalizability.

Real-time Intrusion Detection and Impulsive Acoustic Event Classification with Fiber Optic Sensing and Deep Learning Technologies over Telecom Networks

We review various use cases of distributed-fiber-optic-sensing and machine-learning technologies that offer advantages to telecom fiber networks on existing fiber infrastructures. By leveraging an edge-AI platform, perimeter intrusion detection and impulsive acoustic event classification can be performed locally on-the-fly, ensuring real-time detection with low latency.

First Field Demonstration of Automatic WDM Optical Path Provisioning over Alien Access Links for Data Center Exchange

We demonstrated automatic provisioning of optical paths in under six minutes over field-deployed alien access links and WDM carrier links using commercial-grade ROADMs, whitebox mux-ponders, and multi-vendor transceivers. With channel probing, transfer learning, and a Gaussian noise model, we achieved an estimation error (Q-factor) below 0.7 dB.

Chris White Interviewed By Mike Vizard on Techstrong.AI

In this excellent Techstrong.ai videocast, Michael Vizard interviews our Christopher White, President of NEC Labs America, about AI and its future. They discuss generative AI, its current hype, and its potential impact on content creation and the augmentation of human abilities. Chris emphasizes that generative AI systems are not “thinking machines” but tools to enhance human capabilities.

Temporal Graph-Based Incident Analysis System for Internet of Things (ECML)

Internet-of-Things (IoT) systems deploy a massive number of sensors to monitor the system and environment. Anomaly detection on sensor data is an important task for IoT maintenance and operation. In real applications, a system-level incident usually involves hundreds of abnormal sensors, making manual verification impractical. Users require an efficient and effective tool to conduct incident analysis and provide critical information, such as (1) identifying the parts that suffered the most damage and (2) finding the ones that caused the incident. Unfortunately, existing methods are inadequate to fulfill these requirements because of the complex sensor relationships and latent anomaly influences in IoT systems. To bridge the gap, we design and develop a Temporal Graph based Incident Analysis System (TGIAS) to help users diagnose and react to reported anomalies. TGIAS trains a temporal graph to represent the anomaly relationships and computes a severity ranking and causality score for each sensor. TGIAS outputs a list of the top-k most serious sensors and root causes and illustrates the evidence in a graphical view. The system does not need any incident data for training and delivers highly accurate analysis results online. TGIAS is equipped with a user-friendly interface, making it an effective tool for a broad range of IoT systems.