Sparsh Garg NEC Labs America

Senior Associate Researcher

Media Analytics

Posts

Learning Semantic Segmentation from Multiple Datasets with Label Shifts

While it is desirable to train segmentation models on an aggregation of multiple datasets, a major challenge is that the label space of each dataset may conflict with the others. To tackle this challenge, we propose UniSeg, an effective and model-agnostic approach that automatically trains segmentation models across multiple datasets with heterogeneous label spaces, without requiring any manual relabeling effort. Specifically, we introduce two new ideas that account for conflicting and co-occurring labels to achieve better generalization performance in unseen domains. First, we identify a gradient conflict in training incurred by mismatched label spaces and propose a class-independent binary cross-entropy loss to alleviate such label conflicts. Second, we propose a loss function that considers class relationships across datasets for a better multi-dataset training scheme. Extensive quantitative and qualitative analyses on road-scene datasets show that UniSeg improves over multi-dataset baselines, especially on unseen datasets, e.g., achieving a gain of more than 8 percentage points in IoU on KITTI. Furthermore, UniSeg achieves 39.4% IoU on the WildDash2 public benchmark, making it one of the strongest submissions in the zero-shot setting. Our project page is available at https://www.nec-labs.com/~mas/UniSeg.
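The class-independent loss described above can be sketched as follows. This is a minimal per-pixel illustration, not the paper's implementation: the function name, the sigmoid over unified-label-space logits, and the restriction of the loss to the classes annotated in the current dataset are assumptions made for the example. The key property it shows is that each class contributes an independent binary term, so classes absent from a dataset's label space produce no conflicting gradient.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def class_independent_bce(logits, target, valid_classes):
    """Per-class binary cross-entropy for one pixel (illustrative sketch).

    logits:        (C,) raw scores over the C classes of a unified label space
    target:        int, ground-truth class index for this pixel
    valid_classes: class indices annotated in the current dataset; classes
                   outside this set are skipped, so they contribute no gradient
    """
    p = sigmoid(logits)
    loss = 0.0
    for c in valid_classes:
        y = 1.0 if c == target else 0.0  # independent binary target per class
        loss += -(y * np.log(p[c]) + (1.0 - y) * np.log(1.0 - p[c]))
    return loss / len(valid_classes)
```

In contrast to a softmax cross-entropy, which couples all classes and penalizes any class missing from a dataset's label space, each binary term here depends only on its own logit.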

MM-TTA: Multi-Modal Test-Time Adaptation for 3D Semantic Segmentation

Test-time adaptation approaches have recently emerged as a practical solution for handling domain shift without access to the source domain data. In this paper, we propose and explore a new multi-modal extension of test-time adaptation for 3D semantic segmentation. We find that directly applying existing methods usually results in performance instability at test time, because the multi-modal input is not considered jointly. To design a framework that takes full advantage of multi-modality, where each modality provides regularized self-supervisory signals to the other modalities, we propose two complementary modules within and across the modalities. First, Intra-modal Pseudo-label Generation (Intra-PG) obtains reliable pseudo labels within each modality by aggregating information from two models that are both pre-trained on source data but updated with target data at different paces. Second, Inter-modal Pseudo-label Refinement (Inter-PR) adaptively selects the more reliable pseudo labels from the different modalities based on a proposed consistency scheme. Experiments demonstrate that our regularized pseudo labels produce stable self-learning signals in numerous multi-modal test-time adaptation scenarios for 3D semantic segmentation. Our project website is available at https://www.nec-labs.com/~mas/MM-TTA.
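The interplay of the two modules can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's method: the function name is hypothetical, consistency is measured here as the dot product of the fast- and slow-updated models' class distributions (the paper's exact consistency measure may differ), and intra-modal aggregation is approximated by averaging the two models' probabilities. Points where neither modality is sufficiently consistent are marked as ignore.

```python
import numpy as np

def select_pseudo_labels(p2d_fast, p2d_slow, p3d_fast, p3d_slow, threshold=0.5):
    """Pick, per 3D point, a pseudo label from the more consistent modality.

    Each input is an (N, C) array of class probabilities for N points:
    a fast-updated and a slow-updated model per modality (2D and 3D).
    Returns (N,) pseudo labels, with -1 marking unreliable points.
    """
    # Consistency per point: agreement between fast and slow predictions
    cons_2d = np.sum(p2d_fast * p2d_slow, axis=1)
    cons_3d = np.sum(p3d_fast * p3d_slow, axis=1)

    # Intra-modal aggregation (sketched as a simple average of the two models)
    probs_2d = 0.5 * (p2d_fast + p2d_slow)
    probs_3d = 0.5 * (p3d_fast + p3d_slow)

    # Inter-modal refinement: take the label from the more consistent modality
    use_3d = cons_3d > cons_2d
    labels = np.where(use_3d, probs_3d.argmax(axis=1), probs_2d.argmax(axis=1))

    # Ignore points where even the better modality is inconsistent
    labels[np.maximum(cons_2d, cons_3d) < threshold] = -1
    return labels
```

The intent captured here is that each modality's pseudo labels are first stabilized within the modality (two models updated at different paces), and the cross-modal step then acts as a per-point selector rather than a blind fusion.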
