Media Analytics

Our team overcomes fundamental challenges in computer vision and addresses key needs in mobility, security, safety and socially relevant AI.

We solve fundamental challenges in computer vision, with a focus on understanding and interaction in 3D scenes, representation learning in visual and multimodal data, learning across domains and tasks, and responsible AI. Our technological breakthroughs contribute to socially relevant solutions that address key enterprise needs in mobility, safety and smart spaces. Meet our team of experts, check out our blog, or read our latest publications.

Interested in joining us? We are seeking the next generation of thought leaders in computer vision and machine learning for researcher positions. Outstanding applicants pursuing a career in any area of computer vision are encouraged to apply for positions here.

We are hiring for the 2023 internship season. Applicants can apply here.

Featured Research Projects

Understanding Road Layout From Videos as a Whole

CVPR 2020 | We address the problem of inferring the layout of complex road scenes from video sequences. To this end, we formulate it as a top-view road attribute prediction problem, and our goal is to predict these attributes for each frame both accurately and consistently. In contrast to prior work, we exploit three novel aspects: (i) leveraging camera motion in videos, (ii) including context cues, and (iii) incorporating long-term video information. Specifically, we introduce a model that aims to enforce prediction consistency in videos.
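To make "prediction consistency" concrete, here is a minimal sketch of one simple way to measure it: the mean squared change of each predicted attribute between consecutive frames. This is an illustrative metric chosen for this example, not the model or loss used in the paper; the function name `temporal_inconsistency` and the list-of-lists input format are assumptions.

```python
from typing import Sequence


def temporal_inconsistency(frame_predictions: Sequence[Sequence[float]]) -> float:
    """Mean squared change of each attribute between consecutive frames.

    `frame_predictions[t][j]` is the (hypothetical) predicted value of road
    attribute j at frame t. A result of 0.0 means the predictions never change.
    """
    if len(frame_predictions) < 2:
        return 0.0  # a single frame cannot be inconsistent with itself

    total = 0.0
    count = 0
    # Compare every frame with the frame that follows it.
    for prev, curr in zip(frame_predictions, frame_predictions[1:]):
        for a, b in zip(prev, curr):
            total += (a - b) ** 2
            count += 1
    return total / count if count else 0.0


# Example: number of lanes and sidewalk probability over three frames.
stable = [[2.0, 0.9], [2.0, 0.9], [2.0, 0.9]]
flickering = [[2.0, 0.9], [3.0, 0.9], [2.0, 0.9]]
print(temporal_inconsistency(stable), temporal_inconsistency(flickering))
```

A score like this could serve as a diagnostic or as a penalty term during training, although a plain smoothness penalty would also discourage genuine changes in the scene, which is one reason a learned model is attractive.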


Object Detection With a Unified Label Space From Multiple Datasets

ECCV 2020 | Given multiple datasets with different label spaces, the goal of this work is to train a single object detector that predicts over the union of all the label spaces. The practical benefit of such a detector is significant: application-relevant categories can be picked and merged from arbitrary existing datasets. However, naïve merging of datasets is not possible in this case because object annotations are inconsistent across datasets. To address this challenge, we design a framework that works with such partial annotations, and we exploit a pseudo-labeling approach that we adapt for our specific case.
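The partial-annotation problem can be illustrated with a small sketch. It builds the union label space and records which unified categories each dataset actually annotates, so that an unmatched detection of a category the dataset never labels is flagged as ambiguous rather than treated as background. The dataset names, category lists, and helper names are hypothetical, and this shows only the bookkeeping, not the detector or the pseudo-labeling scheme from the paper.

```python
from typing import Dict, List, Set, Tuple


def build_unified_label_space(
    datasets: Dict[str, List[str]],
) -> Tuple[List[str], Dict[str, Set[int]]]:
    """Return the sorted union of categories and, per dataset, the unified
    indices of the categories that dataset annotates."""
    unified = sorted({c for cats in datasets.values() for c in cats})
    index = {c: i for i, c in enumerate(unified)}
    annotated = {name: {index[c] for c in cats} for name, cats in datasets.items()}
    return unified, annotated


def is_ambiguous(dataset: str, predicted_class: int,
                 annotated: Dict[str, Set[int]]) -> bool:
    """True if `dataset` does not annotate `predicted_class`, so a missing
    ground-truth box says nothing about whether the object is present."""
    return predicted_class not in annotated[dataset]


# Hypothetical datasets with overlapping but different label spaces.
datasets = {
    "dataset_a": ["car", "person"],
    "dataset_b": ["person", "traffic light"],
}
unified, annotated = build_unified_label_space(datasets)
print(unified)  # ['car', 'person', 'traffic light']

# A "traffic light" detection on an image from dataset_a cannot be counted
# as a false positive, because dataset_a never labels traffic lights.
print(is_ambiguous("dataset_a", unified.index("traffic light"), annotated))  # True
```

In a setup like this, the ambiguous detections are the natural candidates for pseudo-labels, for example from a detector trained on the dataset that does annotate the category.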


Private-kNN: Practical Differential Privacy for Computer Vision

CVPR 2020 | The Private Aggregation of Teacher Ensembles (PATE) approach requires the training sets for the teachers to be disjoint. As such, achieving desirable privacy bounds often requires an impractical amount of labeled data. We propose a data-efficient scheme that avoids splitting the training dataset altogether. Our approach allows the use of privacy amplification by subsampling and iterative refinement of the kNN feature embedding. Compared to PATE, we achieve comparable or better utility while reducing the privacy cost by more than 90%, thereby providing the “most practical method to date” in computer vision.
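The core idea of labeling a query from noisy votes of its nearest private neighbors can be sketched as follows. This is a simplified illustration under stated assumptions: Poisson subsampling of the private set, squared Euclidean distance on fixed feature vectors, and Gaussian noise on the vote counts. The function name and all parameters are hypothetical, and the sketch performs no privacy accounting and no iterative refinement of the embedding, so it should not be read as the paper's algorithm.

```python
import random
from typing import List, Optional, Sequence


def noisy_knn_label(
    query: Sequence[float],
    features: List[Sequence[float]],
    labels: List[int],
    num_classes: int,
    k: int = 3,
    sample_rate: float = 0.5,
    noise_std: float = 1.0,
    rng: Optional[random.Random] = None,
) -> int:
    """Label `query` by a noisy majority vote of its k nearest neighbors
    within a random subsample of the private data."""
    rng = rng or random.Random()

    # Poisson subsampling: keep each private record independently.
    kept = [i for i in range(len(features)) if rng.random() < sample_rate]

    # Sort the kept records by squared Euclidean distance to the query.
    kept.sort(key=lambda i: sum((a - b) ** 2 for a, b in zip(features[i], query)))

    # Each of the k nearest kept records casts one vote for its label.
    votes = [0.0] * num_classes
    for i in kept[:k]:
        votes[labels[i]] += 1.0

    # Add Gaussian noise to every count before releasing the winning label.
    noisy = [v + rng.gauss(0.0, noise_std) for v in votes]
    return max(range(num_classes), key=noisy.__getitem__)


# Toy private data: two well-separated clusters.
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
labels = [0, 0, 0, 1, 1]
print(noisy_knn_label((0.2, 0.2), features, labels, num_classes=2,
                      rng=random.Random(0)))
```

Because every query votes over a fresh subsample of the whole private set, no partition into disjoint teacher datasets is needed, which is the property the abstract highlights. Choosing `sample_rate` and `noise_std` to meet a target privacy budget requires a proper accountant, which is out of scope here.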