Research

We conduct research in computer vision and machine learning, with a focus on sustaining excellence in three main directions: (1) scene understanding; (2) visual recognition and representation learning; and (3) adaptation, fairness and privacy. Key applications of our research include visual surveillance and autonomous driving. We tackle fundamental problems in computer vision, such as object detection, semantic segmentation, face recognition, 3D reconstruction and behavior prediction. We develop and leverage breakthroughs in deep learning, particularly in weak supervision, metric learning and domain adaptation.


Adaptation, Fairness and Privacy

We develop AI techniques and policies that benefit society by disrupting traditional utility-cost trade-offs. Our solutions achieve better accuracy and fairness of service across geographic, social or economic boundaries, while ensuring lower costs and higher guarantees of privacy.

CVPR 2020 | Private-kNN: Practical Differential Privacy for Computer Vision
Yuqing Zhu, Xiang Yu, Manmohan Chandraker, Yu-Xiang Wang

The Private Aggregation of Teacher Ensembles (PATE) approach requires the training sets for the teachers to be disjoint; as such, achieving desirable privacy bounds requires an often impractical amount of labeled data. We propose a data-efficient scheme that avoids splitting the training dataset altogether. Our approach allows privacy amplification by subsampling and iterative refinement of the kNN feature embedding. Compared to PATE, we achieve comparable or better utility while reducing the privacy cost by more than 90%, thereby providing the “most practical method to-date” in computer vision.
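
As a minimal sketch of the labeling step, assuming a fixed feature extractor upstream and illustrative parameter values (`k`, `subsample`, `sigma` are placeholders, not the paper's settings): each public query touches only a random subsample of the private set, takes a kNN vote in feature space, and releases only the noisy winner.

```python
import numpy as np

def private_knn_label(query_feat, priv_feats, priv_labels, n_classes,
                      k=50, subsample=0.2, sigma=4.0, rng=None):
    """Label one public query with a noisy kNN vote over a random
    subsample of the private set (sketch of the Private-kNN idea)."""
    rng = rng or np.random.default_rng()
    # Privacy amplification: each query only touches a random subset.
    mask = rng.random(len(priv_feats)) < subsample
    feats, labels = priv_feats[mask], priv_labels[mask]
    # k nearest neighbors in the (iteratively refined) feature space.
    dists = np.linalg.norm(feats - query_feat, axis=1)
    votes = labels[np.argsort(dists)[:k]]
    hist = np.bincount(votes, minlength=n_classes).astype(float)
    # Gaussian-noised vote histogram; only the argmax is released.
    hist += rng.normal(0.0, sigma, size=n_classes)
    return int(np.argmax(hist))
```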

PDF
AAAI 2020 | Adversarial Learning of Privacy-Preserving and Task-Oriented Representations
Taihong Xiao, Yi-Hsuan Tsai, Kihyuk Sohn, Manmohan Chandraker, Ming-Hsuan Yang

Our aim is to learn privacy-preserving and task-oriented representations that defend against model inversion attacks. To achieve this aim, we propose an adversarial reconstruction-based framework for learning latent representations that cannot be decoded to recover the original input images. By simulating the expected behavior of the adversary, our framework is realized by minimizing the negative pixel reconstruction loss or the negative feature reconstruction (i.e., perceptual distance) loss.
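
The following PyTorch sketch shows one alternating update of this minimax game, with hypothetical `encoder`, `task_head`, and `decoder` modules: the decoder (the simulated adversary) learns to invert the representation, while the encoder optimizes the task loss minus the adversary's reconstruction loss.

```python
import torch
import torch.nn.functional as F

def train_step(encoder, task_head, decoder, opt_enc, opt_dec, x, y, lam=1.0):
    """One alternating update (sketch; module names are placeholders)."""
    # 1) Adversary step: the decoder learns to invert the representation.
    z = encoder(x).detach()
    dec_loss = F.mse_loss(decoder(z), x)   # pixel reconstruction loss
    opt_dec.zero_grad(); dec_loss.backward(); opt_dec.step()

    # 2) Encoder step: solve the task but make inversion hard
    #    (i.e., minimize the *negative* reconstruction loss).
    #    opt_enc updates only encoder/task_head parameters.
    z = encoder(x)
    enc_loss = F.cross_entropy(task_head(z), y) - lam * F.mse_loss(decoder(z), x)
    opt_enc.zero_grad(); enc_loss.backward(); opt_enc.step()
    return dec_loss.item(), enc_loss.item()
```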

PDF
WACV 2020 | Active Adversarial Domain Adaptation
Jong-Chyi Su, Yi-Hsuan Tsai, Kihyuk Sohn, Buyu Liu, Subhransu Maji, Manmohan Chandraker

We propose an active learning approach for transferring representations across domains. Our approach, active adversarial domain adaptation (AADA), explores a duality between two related problems: adversarial domain alignment and importance sampling for adapting models across domains. The former uses a domain discriminative model to align domains, while the latter utilizes it to weight samples to account for distribution shifts. Specifically, our importance weight promotes samples with large classification uncertainty and diversity from labeled examples, and thus serves as a sample selection scheme for active learning.
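
A small sketch of the resulting acquisition score, under the assumption that `d_src_prob` is the domain discriminator's probability that a target sample looks like source data and `class_probs` are the task model's softmax outputs (names and the exact form here are illustrative):

```python
import numpy as np

def aada_scores(d_src_prob, class_probs, eps=1e-8):
    """AADA-style acquisition score for target samples (sketch).

    d_src_prob:  (N,)   discriminator probability that a target sample
                        looks like *source* data
    class_probs: (N, C) task model softmax outputs
    """
    # Importance weight from the domain discriminator: samples that look
    # least like the source domain receive the largest weight.
    w = (1.0 - d_src_prob) / (d_src_prob + eps)
    # Predictive entropy: classification uncertainty of the task model.
    h = -np.sum(class_probs * np.log(class_probs + eps), axis=1)
    return w * h

# With an annotation budget b, label the top-scoring target samples:
# chosen = np.argsort(-aada_scores(d_src_prob, class_probs))[:b]
```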

PDF
WACV 2020 | Unsupervised and Semi-Supervised Domain Adaptation for Action Recognition from Drones
Jinwoo Choi, Gaurav Sharma, Manmohan Chandraker, Jia-Bin Huang

We address the problem of human action classification in drone videos. Due to the high cost of capturing and labeling large-scale drone videos with diverse actions, we present unsupervised and semi-supervised domain adaptation approaches that leverage both existing fully annotated action recognition datasets and unannotated (or sparsely annotated) videos from drones. To study the emerging problem of drone-based action recognition, we create a new dataset, NEC-DRONE, containing 5,250 videos to evaluate the task.

PDF | Project Site | Dataset
CVPR 2019 | Gotta Adapt ’Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild
Luan Tran, Kihyuk Sohn, Xiang Yu, Xiaoming Liu, Manmohan Chandraker

We provide a solution that allows knowledge transfer from fully annotated source images to unlabeled target images, often captured under different conditions. We adapt at multiple semantic levels, from feature to pixel, with complementary insights for each type. Using this approach, we achieve better recognition accuracy on car images in unlabeled surveillance domains by adapting knowledge from labeled car images on the web.
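
The sketch below illustrates how pixel- and feature-level adaptation can be combined in one objective; `translate`, `backbone`, `classifier`, and `feat_disc` are hypothetical modules, and the exact losses in the paper differ:

```python
import torch
import torch.nn.functional as F

def joint_da_losses(translate, backbone, classifier, feat_disc,
                    x_src, y_src, x_tgt, lam=0.1):
    """Joint pixel- and feature-level adaptation losses (illustrative)."""
    # Pixel level: render labeled source images in the target style
    # before computing the supervised task loss.
    f_src = backbone(translate(x_src))
    task_loss = F.cross_entropy(classifier(f_src), y_src)

    # Feature level: a discriminator separates source/target features...
    f_tgt = backbone(x_tgt)
    d_src = feat_disc(f_src.detach())          # detached: disc update only
    d_tgt = feat_disc(f_tgt.detach())
    disc_loss = (F.binary_cross_entropy_with_logits(d_src, torch.ones_like(d_src))
                 + F.binary_cross_entropy_with_logits(d_tgt, torch.zeros_like(d_tgt)))

    # ...while the backbone is trained to fool it on target features.
    d_tgt_g = feat_disc(f_tgt)
    align_loss = F.binary_cross_entropy_with_logits(d_tgt_g, torch.ones_like(d_tgt_g))
    return task_loss + lam * align_loss, disc_loss
```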

PDF
ICLR 2019 | Unsupervised Domain Adaptation for Distance Metric Learning
Kihyuk Sohn, Wenling Shang, Xiang Yu, Manmohan Chandraker

We propose the Feature Transfer Network, a novel deep neural network for image-based face verification and identification that can adapt to biases such as ethnicity, gender or age in a target set. Unlike existing methods, our network can even handle novel identities in the target domain. Our framework excels at both within-domain and cross-domain utility tasks, retaining discriminative power through adaptation.
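
As a rough illustration of the underlying idea (not the full Feature Transfer Network, which adds a dedicated feature-transfer module and handles disjoint identity sets), one can pair a verification-style metric loss on labeled source data with a domain-alignment term on the target; `embed` and `domain_disc` are hypothetical modules:

```python
import torch
import torch.nn.functional as F

def metric_da_losses(embed, domain_disc, x_src, y_src, x_tgt, margin=0.5):
    """Metric learning on source + domain alignment on target (sketch)."""
    z_src = F.normalize(embed(x_src), dim=1)
    z_tgt = F.normalize(embed(x_tgt), dim=1)

    # Verification-style loss: same-identity pairs should be close,
    # different-identity pairs at least `margin` apart in cosine similarity.
    sim = z_src @ z_src.t()
    same = (y_src[:, None] == y_src[None, :]).float()
    metric_loss = (same * (1 - sim) +
                   (1 - same) * F.relu(sim - margin)).mean()

    # Adversarial term: the embedding tries to make target features
    # indistinguishable from source features.
    d_tgt = domain_disc(z_tgt)
    align_loss = F.binary_cross_entropy_with_logits(
        d_tgt, torch.ones_like(d_tgt))
    return metric_loss, align_loss
```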

PDF
ICCV 2019 | Domain Adaptation for Structured Output via Discriminative Patch Representations
Yi-Hsuan Tsai, Kihyuk Sohn, Samuel Schulter, Manmohan Chandraker

We tackle domain-adaptive semantic segmentation by learning discriminative feature representations of patches in the source domain, discovering multiple modes of the patch-wise output distribution through the construction of a clustered space. With such guidance, we use an adversarial learning scheme to push the feature representations of target patches in the clustered space closer to the distributions of source patches. We show that our framework is complementary to existing domain adaptation techniques.
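
A sketch of constructing the clustered patch space from source annotations, with illustrative `patch` size and cluster count `k` (scikit-learn's K-means is used for brevity): each patch's normalized category histogram is clustered, and the cluster index serves as a discriminative patch-level label.

```python
import numpy as np
from sklearn.cluster import KMeans

def patch_modes(label_maps, n_classes, patch=32, k=50):
    """Cluster patch-wise category histograms from source annotations
    (sketch; assumes labels lie in [0, n_classes))."""
    hists = []
    for lm in label_maps:  # each lm: (H, W) integer label map
        H, W = lm.shape
        for i in range(0, H - patch + 1, patch):
            for j in range(0, W - patch + 1, patch):
                p = lm[i:i + patch, j:j + patch].ravel()
                h = np.bincount(p, minlength=n_classes).astype(float)
                hists.append(h / h.sum())  # normalized category histogram
    hists = np.stack(hists)
    # Each cluster is one "mode" of the patch-wise output distribution;
    # km.labels_ gives the discriminative patch-level label.
    km = KMeans(n_clusters=k, n_init=10).fit(hists)
    return km, km.labels_
```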

PDF | Supplementary | Project Site | Dataset
ICCV 2017 | Reconstruction-Based Disentanglement for Pose-invariant Face Recognition
Xi Peng, Xiang Yu, Kihyuk Sohn, Dimitris N. Metaxas, Manmohan Chandraker

Generic data-driven deep face features might confound images of the same identity under large poses with other identities. We propose feature reconstruction metric learning to disentangle identity and pose information in the latent feature space. The disentangled feature space encourages identity features of the same subject to cluster together despite pose variation. Experiments on both controlled and in-the-wild face datasets show that our method consistently outperforms the state of the art, especially on images with large head pose variations.
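
One plausible form of such a disentanglement objective is sketched below for a same-identity pair; `split` and `recon` are hypothetical modules, and the paper's actual loss differs in detail:

```python
import torch
import torch.nn.functional as F

def disentangle_losses(backbone, split, recon, x_frontal, x_posed):
    """Feature-reconstruction disentanglement for a same-identity pair
    (sketch; `split` and `recon` are placeholder modules)."""
    f_a, f_b = backbone(x_frontal), backbone(x_posed)
    id_a, np_a = split(f_a)   # identity part / non-identity (pose) part
    id_b, np_b = split(f_b)

    # Identity features of the same subject should coincide
    # regardless of pose.
    id_loss = F.mse_loss(id_a, id_b)

    # Swapping identity parts should still reconstruct each feature,
    # pushing pose information into the non-identity branch.
    rec_a = recon(torch.cat([id_b, np_a], dim=1))
    rec_b = recon(torch.cat([id_a, np_b], dim=1))
    rec_loss = F.mse_loss(rec_a, f_a) + F.mse_loss(rec_b, f_b)
    return id_loss, rec_loss
```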

PDF