Media Analytics
Our team overcomes fundamental challenges in computer vision and addresses key needs in mobility, security, safety and socially relevant AI.
We solve fundamental challenges in computer vision, with a focus on understanding and interaction in 3D scenes, foundational vision-language models, robust learning across domains and responsible AI. Our technological breakthroughs contribute to socially-relevant solutions that address key enterprise needs in mobility, safety and smart spaces.
Computer vision is one of the fastest-growing fields today, with interdisciplinary connections to machine learning, language and robotics. Our work aims to democratize autonomous driving, develop agentic LLMs that solve user workflows and enable interactive robots, with a commitment to computer vision that promotes privacy, fairness and sustainability. Our award-winning research regularly appears at top-tier venues such as CVPR, ICCV, ECCV and NeurIPS.
Meet our team of experts, check out our blog, or read our latest publications.
We seek the next generation of thought leaders in computer vision and machine learning for researcher positions. Outstanding applicants pursuing a career in any area of computer vision are encouraged to visit our careers page to learn more and apply for our available positions.
Our internship opportunities for Summer 2025 are now open. We are looking for students pursuing advanced degrees in Computer Science and Electrical Engineering. Internships typically run for three months during the summer. Working with us offers the opportunity to quickly become part of a project team applying cutting-edge technology to industry-leading concepts. Apply to one of our internships below today!
To learn more about our Summer Internships, visit our Internship page.
We work on the problem of multi-dataset semantic segmentation, where each dataset has a different label space. We show that naively combining all the datasets and training a single model leads to a gradient conflict issue whenever labels conflict in the unified label space. Such conflicts also hurt the model at test time, when it receives an image from an unseen dataset. For example, the rider in the right image could fall under either the “rider” or the “motorcyclist” category in the unified space. It is therefore important to develop a method that accounts for such label conflicts during training.
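As a toy illustration (not our training code), the PyTorch sketch below shows how two datasets that map the same kind of pixel to different classes in a naively unified label space push the model in opposite directions. The class names and logits are made up for the example.

```python
# Minimal sketch of the gradient conflict caused by label conflicts in a
# naively unified label space. All names and values are hypothetical.
import torch
import torch.nn.functional as F

UNIFIED = ["road", "person", "rider", "motorcyclist"]
RIDER, MOTORCYCLIST = UNIFIED.index("rider"), UNIFIED.index("motorcyclist")

# One shared logit vector standing in for the model's prediction on the
# same kind of pixel (a person on a motorcycle) seen in two datasets.
logits = torch.zeros(1, len(UNIFIED), requires_grad=True)

# Dataset A annotates the pixel as "rider", dataset B as "motorcyclist".
loss_a = F.cross_entropy(logits, torch.tensor([RIDER]))
loss_b = F.cross_entropy(logits, torch.tensor([MOTORCYCLIST]))

grad_a = torch.autograd.grad(loss_a, logits, retain_graph=True)[0]
grad_b = torch.autograd.grad(loss_b, logits)[0]

# The two gradients push the "rider" and "motorcyclist" logits in
# opposite directions, so jointly minimizing both losses stalls.
print("grad from dataset A:", grad_a.squeeze().tolist())
print("grad from dataset B:", grad_b.squeeze().tolist())
print("cosine similarity  :", F.cosine_similarity(
    grad_a.flatten(), grad_b.flatten(), dim=0).item())
```

Running the sketch yields a negative cosine similarity between the two gradients, which is the conflict the naive combination suffers from.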
We propose a novel framework for domain adaptation in semantic segmentation with image-level weak labels in the target domain. The weak labels may be obtained from model predictions in unsupervised domain adaptation (UDA), or from a human annotator in a new weakly supervised domain adaptation (WDA) paradigm for semantic segmentation. Using weak labels is both practical and useful, since (i) collecting image-level target annotations is comparatively cheap in WDA and incurs no cost in UDA, and (ii) it opens up the opportunity for category-wise domain alignment.
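As a simplified illustration of how image-level weak labels can supervise a segmentation network on the target domain, the sketch below pools per-pixel class scores into image-level predictions and applies a multi-label classification loss. The average pooling and BCE loss are illustrative stand-ins, not necessarily the exact formulation in the paper.

```python
# Simplified sketch: image-level weak labels supervising dense predictions.
import torch
import torch.nn.functional as F

def weak_label_loss(seg_logits: torch.Tensor, weak_labels: torch.Tensor) -> torch.Tensor:
    """seg_logits: (B, C, H, W) per-pixel class scores.
    weak_labels: (B, C) binary image-level labels (1 if the class is present)."""
    # Global pooling turns dense predictions into image-level class scores.
    # Average pooling is used here for simplicity; attention-based or max
    # pooling are common alternatives.
    image_logits = seg_logits.mean(dim=(2, 3))          # (B, C)
    return F.binary_cross_entropy_with_logits(image_logits, weak_labels.float())

# Toy usage: 2 target images, 5 categories, 32x32 predictions.
seg_logits = torch.randn(2, 5, 32, 32, requires_grad=True)
weak_labels = torch.tensor([[1, 0, 1, 0, 0],
                            [0, 1, 0, 0, 1]])
loss = weak_label_loss(seg_logits, weak_labels)
loss.backward()   # gradients reach every pixel through the pooled scores
```

In UDA the binary `weak_labels` would come from the model's own predictions, while in WDA a human provides them; either way, the image-level signal identifies which categories are present and enables category-wise alignment.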
We derive a new differential homography that can account for the scanline-varying camera poses of rolling shutter (RS) cameras, and demonstrate its application to carry out RS-aware image stitching and rectification at one stroke. Despite the high complexity of RS geometry, we focus in this paper on a special yet common input: two consecutive frames from a video stream, where the inter-frame motion is restricted from being arbitrarily large.
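The toy NumPy snippet below is not our derivation; it only illustrates the underlying idea that each scanline is exposed at a slightly different time and therefore sees a slightly different camera pose, so the induced homography varies from row to row. The intrinsics, velocities, plane parameters and first-order rotation are all made up for the sketch, using the standard plane-induced homography H = K (R - t n^T / d) K^{-1}.

```python
# Toy illustration of scanline-varying homographies in a rolling-shutter frame.
import numpy as np

H_IMG = 480                                  # number of scanlines
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])          # toy intrinsics
omega = np.array([0.0, 0.02, 0.0])           # angular velocity (rad per frame)
v = np.array([0.05, 0.0, 0.0])               # translational velocity per frame
n, d = np.array([0.0, 0.0, 1.0]), 10.0       # scene plane normal and depth

def small_rotation(w):
    """First-order rotation matrix from a small angle-axis vector."""
    wx = np.array([[0.0, -w[2], w[1]],
                   [w[2], 0.0, -w[0]],
                   [-w[1], w[0], 0.0]])
    return np.eye(3) + wx

def scanline_homography(y):
    """Plane-induced homography for row y, using the pose at that row's exposure time."""
    t = y / H_IMG                            # normalized exposure time of row y
    R = small_rotation(t * omega)            # pose interpolated under constant velocity
    trans = t * v
    return K @ (R - np.outer(trans, n) / d) @ np.linalg.inv(K)

# The warp for the first scanline differs from the one for the last scanline,
# which is exactly what a single global homography cannot capture.
print(np.round(scanline_homography(0), 3))
print(np.round(scanline_homography(H_IMG - 1), 3))
```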