MSI: Maximize Support-Set Information for Few-Shot Segmentation

Few-shot segmentation (FSS) aims to segment a target class using a small number of labeled images (the support set). To extract information relevant to the target class, a dominant approach in the best-performing FSS methods removes background features using a support mask. We observe that this feature excision through a limiting support mask introduces an information bottleneck in several challenging FSS cases, e.g., for small targets and/or inaccurate target boundaries. To this end, we present a novel method (MSI), which maximizes the support-set information by exploiting two complementary sources of features to generate super correlation maps. We validate the effectiveness of our approach by instantiating it into three recent and strong FSS methods. Experimental results on several publicly available FSS benchmarks show that our proposed method consistently improves performance by visible margins and leads to faster convergence.
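As a rough illustration of the masking bottleneck described above, the following minimal sketch builds two correlation maps between query and support features, one from mask-filtered support features and one from the full support features, and stacks them. It assumes that the "two complementary sources" are the masked and unmasked support features and that correlation is clipped cosine similarity; the abstract confirms neither, and all function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def correlation_map(query_feats, support_feats):
    """Rows index query locations, columns index support locations."""
    return [[max(cosine(q, s), 0.0) for s in support_feats] for q in query_feats]

def mask_support(support_feats, mask):
    """Zero out support features at background locations (mask value 0)."""
    return [[x * m for x in f] for f, m in zip(support_feats, mask)]

def stacked_correlation(query_feats, support_feats, mask):
    """Assumed combination: keep both the masked and the full correlation map."""
    masked = correlation_map(query_feats, mask_support(support_feats, mask))
    full = correlation_map(query_feats, support_feats)
    return [masked, full]

# Toy example: two query locations, two support locations, second is background.
query = [[1.0, 0.0], [0.0, 1.0]]
support = [[1.0, 0.0], [0.0, 1.0]]
maps = stacked_correlation(query, support, mask=[1, 0])
```

In this toy case the masked map loses the match between the second query location and the second support location, while the full map retains it, which is the kind of information the mask discards.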

Personalized Semantics Excitation for Federated Image Classification

Federated learning casts a light on the collaboration of distributed local clients with privacy protected to attain a more generic global model. However, significant distribution shift in the input/label space across different clients makes it challenging to generalize well to all clients, which motivates personalized federated learning (PFL). Existing PFL methods typically customize the local model by fine-tuning with limited local supervision and a global model regularizer, which secures local specificity but risks ruining the global discriminative knowledge. In this paper, we propose a novel Personalized Semantics Excitation (PSE) mechanism to break through this limitation by exciting and fusing personalized semantics from the global model during local model customization. Specifically, PSE explores channel-wise gradient differentiation across global and local models to identify important low-level semantics, mostly from convolutional layers, which are embedded into the client-specific training. In addition, PSE deploys the collaboration of global and local models to enrich high-level feature representations and facilitate the robustness of the client classifier through a cross-model attention module. Extensive experiments and analysis on various image classification benchmarks demonstrate the effectiveness and advantage of our method over state-of-the-art PFL methods.

Domain Generalization Guided by Gradient Signal to Noise Ratio of Parameters

Overfitting to the source domain is a common issue in gradient-based training of deep neural networks. To compensate for over-parameterized models, numerous regularization techniques have been introduced, such as those based on dropout. While these methods achieve significant improvements on classical benchmarks such as ImageNet, their performance diminishes with the introduction of domain shift in the test set, i.e., when the unseen data comes from a significantly different distribution. In this paper, we move away from the classical approach of Bernoulli-sampled dropout mask construction and propose to base the selection on the gradient-signal-to-noise ratio (GSNR) of the network’s parameters. Specifically, at each training step, parameters with high GSNR will be discarded. Furthermore, we alleviate the burden of manually searching for the optimal dropout ratio by leveraging a meta-learning approach. We evaluate our method on standard domain generalization benchmarks and achieve competitive results on classification and face anti-spoofing problems.
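The GSNR-based selection described above can be sketched as follows. This is a minimal illustration, assuming the common definition of GSNR as the squared mean of a parameter's per-sample gradients divided by their variance, and assuming a deterministic top-k rule with a fixed `drop_ratio`; the abstract does not specify the exact selection rule (and states that the ratio is meta-learned), so the function names and the top-k choice are hypothetical.

```python
def gsnr(per_sample_grads, eps=1e-12):
    """Per-parameter GSNR: (mean gradient)^2 / (gradient variance + eps).

    per_sample_grads[i][j] is the gradient of sample i w.r.t. parameter j.
    """
    n = len(per_sample_grads)
    num_params = len(per_sample_grads[0])
    scores = []
    for j in range(num_params):
        column = [g[j] for g in per_sample_grads]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n
        scores.append(mean * mean / (var + eps))
    return scores

def high_gsnr_drop_mask(scores, drop_ratio):
    """Return a 0/1 mask that discards the top `drop_ratio` fraction by GSNR.

    Assumption: deterministic top-k selection, used here only for illustration.
    """
    k = int(len(scores) * drop_ratio)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    dropped = set(ranked[:k])
    return [0 if j in dropped else 1 for j in range(len(scores))]

# Toy example: two samples, three parameters.
grads = [[1.0, 1.0, 0.1],
         [1.0, -1.0, 0.3]]
scores = gsnr(grads)
mask = high_gsnr_drop_mask(scores, drop_ratio=0.34)
```

Here the first parameter has identical gradients across samples (very high GSNR) and is the one discarded, while the second, whose gradients cancel out, has a GSNR of zero and is kept.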

Efficient Controllable Multi-Task Architectures

We aim to train a multi-task model such that users can adjust the desired compute budget and relative importance of task performances after deployment, without retraining. This enables optimizing performance for dynamically varying user needs, without heavy computational overhead to train and save models for various scenarios. To this end, we propose a multi-task model consisting of a shared encoder and task-specific decoders where both encoder and decoder channel widths are slimmable. Our key idea is to control the task importance by varying the capacities of task-specific decoders, while controlling the total computational cost by jointly adjusting the encoder capacity. This improves overall accuracy by allowing a stronger encoder for a given budget, increases control over computational cost, and delivers high-quality slimmed sub-architectures based on the user’s constraints. Our training strategy involves a novel ‘Configuration-Invariant Knowledge Distillation’ loss that enforces backbone representations to be invariant under different runtime width configurations to enhance accuracy. Further, we present a simple but effective search algorithm that translates user constraints to runtime width configurations of both the shared encoder and task decoders, for sampling the sub-architectures. The key rule for the search algorithm is to provide a larger computational budget to the decoder of the more preferred task, while searching for a shared encoder configuration that enhances the overall MTL performance. Various experiments on three multi-task benchmarks (PASCAL-Context, NYUD-v2, and CIFAR100-MTL) with diverse backbone architectures demonstrate the advantage of our approach. For example, our method shows 33.5% higher controllability on the NYUD-v2 dataset than prior methods, while incurring much less compute cost.

LDP-Feat: Image Features with Local Differential Privacy

Modern computer vision services often require users to share raw feature descriptors with an untrusted server. This presents an inherent privacy risk, as raw descriptors may be used to recover the source images from which they were extracted. To address this issue, researchers recently proposed privatizing image features by embedding them within an affine subspace containing the original feature as well as adversarial feature samples. In this paper, we propose two novel inversion attacks to show that it is possible to (approximately) recover the original image features from these embeddings, allowing us to recover privacy-critical image content. In light of such successes and the lack of theoretical privacy guarantees afforded by existing visual privacy methods, we further propose the first method to privatize image features via local differential privacy, which, unlike prior approaches, provides a guaranteed bound for privacy leakage regardless of the strength of the attacks. In addition, our method yields strong performance in visual localization as a downstream task while enjoying the privacy guarantee.
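To make the idea of a guaranteed leakage bound concrete, the sketch below shows k-ary randomized response, a standard textbook mechanism that satisfies ε-local differential privacy over a finite set of k values. It is not the mechanism proposed in the paper, which the abstract does not describe; the assumption that a descriptor has first been quantized to an index into a k-entry dictionary, and all function names, are hypothetical.

```python
import math
import random

def rr_probabilities(k, epsilon):
    """Probabilities of reporting the true index vs. any one other index."""
    e = math.exp(epsilon)
    p_keep = e / (e + k - 1)
    p_other = 1.0 / (e + k - 1)
    return p_keep, p_other

def randomized_response(true_index, k, epsilon, rng):
    """Report true_index with probability p_keep, else a uniform other index."""
    p_keep, _ = rr_probabilities(k, epsilon)
    if rng.random() < p_keep:
        return true_index
    # Draw uniformly from the k - 1 indices different from true_index.
    other = rng.randrange(k - 1)
    return other if other < true_index else other + 1

# Toy example: privatize dictionary index 2 out of k = 4 with epsilon = 1.0.
rng = random.Random(0)
reports = [randomized_response(2, 4, 1.0, rng) for _ in range(1000)]
```

For any two possible inputs, the probability of observing a given report differs by at most a factor of e^ε, and this bound holds no matter what inversion attack is applied to the reported value.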

OmniLabel: A Challenging Benchmark for Language-Based Object Detection

Language-based object detection is a promising direction towards building a natural interface to describe objects in images that goes far beyond plain category names. While recent methods show great progress in that direction, proper evaluation is lacking. With OmniLabel, we propose a novel task definition, dataset, and evaluation metric. The task subsumes standard and open-vocabulary detection as well as referring expressions. With more than 30K unique object descriptions on over 25K images, OmniLabel provides a challenging benchmark with diverse and complex object descriptions in a naturally open-vocabulary setting. Moreover, a key differentiation from existing benchmarks is that our object descriptions can refer to one, multiple, or even no objects, hence providing negative examples in free-form text. The proposed evaluation handles the large label space and judges performance via a modified average precision metric, which we validate by evaluating strong language-based baselines. OmniLabel indeed provides a challenging test bed for future research on language-based detection.

Long Reach Fibre Optic Distributed Acoustic Sensing using Enhanced Backscatter Fibre

We report significant noise reduction in a distributed acoustic sensing (DAS) link using enhanced-scatter fibre (ESF). The longest reach, a 195 km DAS link without inline amplification, is also demonstrated. We further present a demonstration of simultaneous fibre-optic sensing and 400 Gb/s data transmission over 195 km of fibre using ESF.

Field Trial of Coexistence and Simultaneous Switching of Real-Time Fiber Sensing and Coherent 400 GbE in a Dense Urban Environment

Recent advances in optical fiber sensing have enabled telecom network operators to monitor their fiber infrastructure while generating new revenue in various application scenarios including data center interconnect, public safety, smart cities, and seismic monitoring. However, given the high utilization of fiber networks for data transmission, it is undesirable to allocate dedicated fiber strands solely for sensing purposes. Therefore, it is crucial to ensure the reliable coexistence of fiber sensing and communication signals that co-propagate on the same fiber. In this paper, we conduct field trials in a reconfigurable optical add-drop multiplexer (ROADM) network enabled by the PAWR COSMOS testbed, utilizing metro area fibers in Manhattan, New York City. We verify the coexistence of real-time constant-amplitude distributed acoustic sensing (DAS), coherent 400 GbE, and analog radio-over-fiber (ARoF) signals. Measurement results obtained from the field trial demonstrate that the quality of transmission (QoT) of the coherent 400 GbE signal remains unaffected during co-propagation with DAS and ARoF signals in adjacent dense wavelength-division multiplexing (DWDM) channels. In addition, we present a use case of this coexistence system supporting preemptive DAS-informed optical path switching before link failure.

First Field Demonstration of Automatic WDM Optical Path Provisioning over Alien Access Links for Data Center Exchange

We demonstrated automatic provisioning, in under six minutes, of optical paths over field-deployed alien access links and WDM carrier links using commercial-grade ROADMs, whitebox muxponders, and multi-vendor transceivers. With channel probing, transfer learning, and a Gaussian noise model, we achieved an estimation error (Q-factor) below 0.7 dB.

Real-time Intrusion Detection and Impulsive Acoustic Event Classification with Fiber Optic Sensing and Deep Learning Technologies over Telecom Networks

We review various use cases of distributed-fiber-optic-sensing and machine-learning technologies that offer advantages to telecom fiber networks on existing fiber infrastructures. By leveraging an edge-AI platform, perimeter intrusion detection and impulsive acoustic event classification can be performed locally on-the-fly, ensuring real-time detection with low latency.