Overview: We develop novel computational cameras that allow computer vision analysis even in sensitive environments like hospitals or smart homes. Our key innovation is a camera that removes private information.
Overview: Our simulation framework builds on advances in neural rendering, diffusion models, and large language models to automatically transform driving data into a full 3D sensor-simulation testbed with unmatched photorealism.
Overview: Our AI DevOps pipeline builds a high-fidelity digital twin of sensor data, which enables self-improvement of deployed models. We leverage our foundational vision-language models to automatically identify issues in currently deployed AI, pseudo-label or simulate training data, update models via continual learning, and apply LLM-based verification across diverse scenarios.
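The pseudo-labeling stage of such a pipeline can be sketched as a simple confidence filter: only predictions the deployed model is sure about are kept as training labels for the next iteration. This is a minimal illustrative sketch; the threshold value and function names are assumptions, and the full pipeline adds LLM-based verification on top of this filter.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    """Keep high-confidence predictions as pseudo-labels for retraining.

    probs: (N, C) array of per-class probabilities from the deployed model.
    Returns the retained labels and the indices of the retained samples.
    The 0.9 threshold is illustrative, not taken from the original system.
    """
    conf = probs.max(axis=1)            # confidence = top class probability
    labels = probs.argmax(axis=1)       # candidate pseudo-label per sample
    keep = conf >= threshold            # drop uncertain predictions
    return labels[keep], np.flatnonzero(keep)
```

In practice the retained samples would be fed back into a continual-learning update, while the rejected ones are candidates for simulation or human review.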
Overview: We have pioneered the development of learned bird's-eye-view representations for road scenes, which form a basis for image-based 3D perception in applications like autonomous driving.
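The geometric core of any bird's-eye-view method is the mapping between ground-plane grid cells and image pixels under a pinhole camera. The sketch below shows that fixed geometric projection, assuming a flat ground plane; learned BEV approaches replace this hand-crafted lift with learned feature projection, but rely on the same camera model. All parameter values here are illustrative.

```python
import numpy as np

def bev_to_image(points_ground, K, R, t):
    """Project ground-plane BEV grid points into image pixel coordinates.

    points_ground: (N, 2) array of (x, y) world coordinates on the z=0 plane.
    K: (3, 3) camera intrinsics; R, t: world-to-camera rotation and translation.
    Returns (N, 2) pixel coordinates (u, v).
    """
    # Lift 2D ground points to 3D by appending z = 0 (flat-ground assumption).
    pts = np.concatenate([points_ground, np.zeros((len(points_ground), 1))], axis=1)
    cam = R @ pts.T + t.reshape(3, 1)   # world frame -> camera frame
    uvw = K @ cam                       # camera frame -> homogeneous pixels
    uv = uvw[:2] / uvw[2:3]             # perspective divide
    return uv.T
```

Sampling image features at these projected locations and scattering them back into the BEV grid is the basic "inverse perspective mapping" baseline that learned methods improve upon.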
Overview: We propose a Multi-Modal Test-Time Adaptation (MM-TTA) framework that enables a model to be quickly adapted to multi-modal test data without access to the source domain training data.
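One ingredient of multi-modal test-time adaptation is generating reliable pseudo-labels from the agreement between modalities on unlabeled test data, with no access to source training data. The sketch below is a simplified stand-in for that idea (keep a fused prediction only where the 2D and 3D branches agree); it is not the exact MM-TTA selection mechanism, and all names are illustrative.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_modal_pseudo_labels(logits_2d, logits_3d):
    """Fuse per-modality predictions and keep pseudo-labels only where
    the two modalities agree on the predicted class.

    logits_2d, logits_3d: (N, C) raw scores from the 2D and 3D branches.
    Returns fused labels (N,) and a boolean mask of agreeing samples.
    """
    p2, p3 = softmax(logits_2d), softmax(logits_3d)
    fused = (p2 + p3) / 2.0
    labels = fused.argmax(axis=1)
    mask = p2.argmax(axis=1) == p3.argmax(axis=1)   # cross-modal consensus
    return labels, mask
```

The masked pseudo-labels would then drive a self-training update of the model at test time, which is what makes the adaptation source-free.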
Overview: We derive a new differential homography that can account for the scanline-varying camera poses in rolling shutter (RS) cameras, and demonstrate its application to carry out RS-aware image stitching and rectification at one stroke.
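The key property of rolling shutter is that each image row is exposed at a slightly different time, so under camera motion each scanline sees a different pose and hence a different homography. The sketch below illustrates that scanline dependence under a constant-velocity, small-rotation model on a planar scene; it is a simplified illustration, not the paper's differential homography, and all parameter names are assumptions.

```python
import numpy as np

def scanline_homography(K, omega, v, n, d, row, img_height):
    """Planar homography for one image row under a constant-velocity
    rolling-shutter model.

    K: (3, 3) intrinsics; omega, v: angular and linear velocity over one
    frame readout; n, d: plane normal and distance; row: scanline index.
    Uses a first-order (small-angle) rotation approximation.
    """
    tau = row / img_height               # normalized readout time of this row
    w = tau * np.asarray(omega)          # rotation accumulated by this row
    R = np.eye(3) + np.array([[0.0, -w[2], w[1]],
                              [w[2], 0.0, -w[0]],
                              [-w[1], w[0], 0.0]])
    t = tau * np.asarray(v)              # translation accumulated by this row
    H = K @ (R + np.outer(t, n) / d) @ np.linalg.inv(K)
    return H / H[2, 2]
```

Row 0 is exposed at the reference time and maps by the identity; later rows warp progressively more, which is exactly the scanline-varying effect the derived homography accounts for.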
Overview: We make a theoretical contribution by proving that RS two-view geometry is degenerate in the case of pure translational camera motion. Given the complexity of RS geometry, we then propose a convolutional neural network-based method that learns the underlying geometry (camera motion and scene structure) from just a single RS image and performs RS image correction.