Media Analytics
Our team overcomes fundamental challenges in computer vision and addresses key needs in mobility, security, safety and socially relevant AI.
We solve fundamental challenges in computer vision, with a focus on understanding and interaction in 3D scenes, foundational vision-language models, robust learning across domains and responsible AI. Our technological breakthroughs contribute to socially-relevant solutions that address key enterprise needs in mobility, safety and smart spaces.
Computer vision is one of the fastest-growing fields today, with interdisciplinary connections to machine learning, language and robotics. Our work aims to democratize autonomous driving, develop agentic LLMs that solve user workflows and enable interactive robots, with a commitment to computer vision that promotes privacy, fairness and sustainability. Our award-winning research regularly appears at top-tier venues such as CVPR, ICCV, ECCV and NeurIPS.
Meet our team of experts, check out our blog, or read our latest publications.
We seek the next generation of thought leaders in computer vision and machine learning for researcher positions. Outstanding applicants pursuing a career in any area of computer vision are encouraged to visit our careers page to learn more and apply for our available positions.
Our internship opportunities for Summer 2025 are now open. We are looking for students pursuing advanced degrees in Computer Science and Electrical Engineering. Internships typically run for three months during the summer. Working with us offers the opportunity to quickly become part of a project team applying cutting-edge technology to industry-leading concepts. Apply to one of our internships below today!
To learn more about our Summer Internships, visit our Internship page.
We work on the problem of multi-dataset semantic segmentation, where each dataset has a different label space. We show that naively combining all the datasets and training a single model leads to a gradient conflict issue whenever labels conflict in the unified label space. Such conflicts also hurt the model at test time, when it receives an image from an unseen dataset. For example, the rider in the right image could fall under either the “rider” or the “motorcyclist” category in the unified space. It is therefore important to develop a method that accounts for such label conflicts during training.
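As a toy illustration (not our training code), the PyTorch sketch below shows how two datasets that map the same kind of pixel to different classes in a naively unified label space push the model in opposite directions. The class names and logits are made up for the example.

```python
# Minimal sketch of the gradient conflict caused by label conflicts in a
# naively unified label space. All names and values are hypothetical.
import torch
import torch.nn.functional as F

UNIFIED = ["road", "person", "rider", "motorcyclist"]
RIDER, MOTORCYCLIST = UNIFIED.index("rider"), UNIFIED.index("motorcyclist")

# One shared logit vector standing in for the model's prediction on the
# same kind of pixel (a person on a motorcycle) seen in two datasets.
logits = torch.zeros(1, len(UNIFIED), requires_grad=True)

# Dataset A annotates the pixel as "rider", dataset B as "motorcyclist".
loss_a = F.cross_entropy(logits, torch.tensor([RIDER]))
loss_b = F.cross_entropy(logits, torch.tensor([MOTORCYCLIST]))

grad_a = torch.autograd.grad(loss_a, logits, retain_graph=True)[0]
grad_b = torch.autograd.grad(loss_b, logits)[0]

# The two gradients push the "rider" and "motorcyclist" logits in
# opposite directions, so jointly minimizing both losses stalls.
print("grad from dataset A:", grad_a.squeeze().tolist())
print("grad from dataset B:", grad_b.squeeze().tolist())
print("cosine similarity  :", F.cosine_similarity(
    grad_a.flatten(), grad_b.flatten(), dim=0).item())
```

Running the sketch yields a negative cosine similarity between the two gradients, which is the conflict the naive combination suffers from.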
We propose a novel framework for domain adaptation in semantic segmentation with image-level weak labels in the target domain. The weak labels may be obtained from model predictions in unsupervised domain adaptation (UDA), or from a human annotator in a new weakly supervised domain adaptation (WDA) paradigm for semantic segmentation. Using weak labels is both practical and useful, since (i) collecting image-level target annotations is comparatively cheap in WDA and incurs no cost in UDA, and (ii) it opens up the opportunity for category-wise domain alignment.
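As a simplified illustration of how image-level weak labels can supervise a segmentation network on the target domain, the sketch below pools per-pixel class scores into image-level predictions and applies a multi-label classification loss. The average pooling and BCE loss are illustrative stand-ins, not necessarily the exact formulation in the paper.

```python
# Simplified sketch: image-level weak labels supervising dense predictions.
import torch
import torch.nn.functional as F

def weak_label_loss(seg_logits: torch.Tensor, weak_labels: torch.Tensor) -> torch.Tensor:
    """seg_logits: (B, C, H, W) per-pixel class scores.
    weak_labels: (B, C) binary image-level labels (1 if the class is present)."""
    # Global pooling turns dense predictions into image-level class scores.
    # Average pooling is used here for simplicity; attention-based or max
    # pooling are common alternatives.
    image_logits = seg_logits.mean(dim=(2, 3))          # (B, C)
    return F.binary_cross_entropy_with_logits(image_logits, weak_labels.float())

# Toy usage: 2 target images, 5 categories, 32x32 predictions.
seg_logits = torch.randn(2, 5, 32, 32, requires_grad=True)
weak_labels = torch.tensor([[1, 0, 1, 0, 0],
                            [0, 1, 0, 0, 1]])
loss = weak_label_loss(seg_logits, weak_labels)
loss.backward()   # gradients reach every pixel through the pooled scores
```

In UDA the binary `weak_labels` would come from the model's own predictions, while in WDA a human provides them; either way, the image-level signal identifies which categories are present and enables category-wise alignment.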
We derive a new differential homography that can account for the scanline-varying camera poses of rolling shutter (RS) cameras, and demonstrate its application to carry out RS-aware image stitching and rectification at one stroke. Despite the high complexity of RS geometry, we focus in this paper on a special yet common input: two consecutive frames from a video stream, where the inter-frame motion is restricted from being arbitrarily large.
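The toy NumPy snippet below is not our derivation; it only illustrates the underlying idea that each scanline is exposed at a slightly different time and therefore sees a slightly different camera pose, so the induced homography varies from row to row. The intrinsics, velocities, plane parameters and first-order rotation are all made up for the sketch, using the standard plane-induced homography H = K (R - t n^T / d) K^{-1}.

```python
# Toy illustration of scanline-varying homographies in a rolling-shutter frame.
import numpy as np

H_IMG = 480                                  # number of scanlines
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])          # toy intrinsics
omega = np.array([0.0, 0.02, 0.0])           # angular velocity (rad per frame)
v = np.array([0.05, 0.0, 0.0])               # translational velocity per frame
n, d = np.array([0.0, 0.0, 1.0]), 10.0       # scene plane normal and depth

def small_rotation(w):
    """First-order rotation matrix from a small angle-axis vector."""
    wx = np.array([[0.0, -w[2], w[1]],
                   [w[2], 0.0, -w[0]],
                   [-w[1], w[0], 0.0]])
    return np.eye(3) + wx

def scanline_homography(y):
    """Plane-induced homography for row y, using the pose at that row's exposure time."""
    t = y / H_IMG                            # normalized exposure time of row y
    R = small_rotation(t * omega)            # pose interpolated under constant velocity
    trans = t * v
    return K @ (R - np.outer(trans, n) / d) @ np.linalg.inv(K)

# The warp for the first scanline differs from the one for the last scanline,
# which is exactly what a single global homography cannot capture.
print(np.round(scanline_homography(0), 3))
print(np.round(scanline_homography(H_IMG - 1), 3))
```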