Deep Patel Archives

Deep Patel is a Senior Associate Researcher in the Machine Learning Department at NEC Laboratories America in Princeton, NJ. He earned his Bachelor of Science (BS) in Computer Science from Towson University.

At NEC, Deep contributes to platforms for intelligent visual analytics, visual search, and vision-language interaction, helping to develop video-based reasoning models that operate in real time across multi-camera systems.

His work includes optimizing neural architectures for embedded systems and designing scalable inference pipelines for video AI applications. He plays a key role in bringing NEC’s media analytics solutions from lab prototypes to production-ready systems used in smart cities and enterprise monitoring.

Posts

Source-Free Video Domain Adaptation with Spatial-Temporal-Historical Consistency Learning

June 18, 2023/in Publications/by NEC Labs America

Source-free domain adaptation (SFDA) is an emerging research topic that studies how to adapt a pretrained source model using unlabeled target data. It is derived from unsupervised domain adaptation but has the advantage of not requiring labeled source data to learn adaptive models. This makes it particularly useful in real-world applications where access to source data is restricted. While there has been some SFDA work for images, little attention has been paid to videos. Naively extending image-based methods to videos without considering the unique properties of videos often leads to unsatisfactory results. In this paper, we propose a simple and highly flexible method for Source-Free Video Domain Adaptation (SFVDA), which extensively exploits consistency learning for videos from spatial, temporal, and historical perspectives. Our method is based on the assumption that videos of the same action category are drawn from the same low-dimensional space, regardless of the spatio-temporal variations in the high-dimensional space that cause domain shifts. To overcome domain shifts, we simulate spatio-temporal variations by applying spatial and temporal augmentations on target videos, and encourage the model to make consistent predictions from a video and its augmented versions. Due to the simple design, our method can be applied to various SFVDA settings, and experiments show that our method achieves state-of-the-art performance for all the settings.
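
The consistency idea in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: `toy_model`, `spatial_flip`, `temporal_subsample`, and the KL-based `consistency_loss` are hypothetical stand-ins chosen for illustration, and the historical component is left out because the abstract does not describe how it is computed. The sketch only shows how predictions on a target video and on its spatially and temporally augmented versions might be compared.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def spatial_flip(video):
    """Spatial augmentation (assumed): mirror every frame horizontally."""
    return [[row[::-1] for row in frame] for frame in video]


def temporal_subsample(video, stride=2):
    """Temporal augmentation (assumed): keep every `stride`-th frame."""
    return video[::stride]


def toy_model(video):
    """Hypothetical stand-in for a pretrained source model.

    A video is a list of frames, a frame is a list of rows of pixel values.
    Returns three logits built from simple intensity statistics.
    """
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    half_w, half_t = width // 2, frames // 2
    total = left_minus_right = early_minus_late = 0.0
    for t, frame in enumerate(video):
        for row in frame:
            for x, pixel in enumerate(row):
                total += pixel
                left_minus_right += pixel if x < half_w else -pixel
                early_minus_late += pixel if t < half_t else -pixel
    n = frames * height * width
    return [total / n, left_minus_right / n, early_minus_late / n]


def consistency_loss(model, video, augmentations):
    """Average divergence between the prediction on the original video
    and the predictions on each augmented version."""
    target = softmax(model(video))
    losses = [
        kl_divergence(target, softmax(model(augment(video))))
        for augment in augmentations
    ]
    return sum(losses) / len(losses)


random.seed(0)
# 4 frames of 3x4 random "pixels" standing in for an unlabeled target video.
video = [[[random.random() for _ in range(4)] for _ in range(3)] for _ in range(4)]
loss = consistency_loss(toy_model, video, [spatial_flip, temporal_subsample])
print(f"consistency loss: {loss:.6f}")
```

In an actual adaptation loop, a loss of this kind would be minimized over the model parameters using only unlabeled target videos; the toy model here has no parameters, so the sketch stops at computing the loss.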

Learning Higher-order Object Interactions for Keypoint-based Video Understanding

October 11, 2021/in Publications/by NEC Labs America

Action recognition is an important problem that requires identifying actions in video by learning complex interactions across scene actors and objects. However, modern deep-learning-based networks often require significant computation and may capture scene context using various modalities, which further increases compute costs. Efficient methods, such as those used for AR/VR, often use only human-keypoint information but suffer from a loss of scene context that hurts accuracy. In this paper, we describe an action-localization method, KeyNet, that uses only keypoint data for tracking and action recognition. Specifically, KeyNet introduces the use of object-based keypoint information to capture context in the scene. Our method illustrates how to build a structured intermediate representation that allows modeling higher-order interactions in the scene from object and human keypoints without using any RGB information. We find that KeyNet is able to track and classify human actions at just 5 FPS. More importantly, we demonstrate on the AVA action and Kinetics datasets that object keypoints can be modeled to recover any loss in context that results from using only keypoint information.
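
As a rough illustration of what a keypoint-only intermediate representation might look like, the sketch below flattens human and object keypoints into one token list with no RGB input. Everything in it is an assumption made for illustration and not KeyNet's actual design: the `KeypointToken` fields, the `build_tokens` helper, and in particular the choice to derive object keypoints from the corners and center of a bounding box.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


@dataclass(frozen=True)
class KeypointToken:
    """One keypoint with the metadata a sequence model could attend over."""
    x: float        # normalized to [0, 1]
    y: float        # normalized to [0, 1]
    frame: int      # index of the frame the keypoint came from
    instance: int   # which person or object the keypoint belongs to
    kind: str       # "human" or "object"


def box_to_keypoints(box: Box) -> List[Point]:
    """Assumed object keypoints: four box corners plus the box center."""
    x0, y0, x1, y1 = box
    return [(x0, y0), (x1, y0), (x1, y1), (x0, y1),
            ((x0 + x1) / 2, (y0 + y1) / 2)]


def build_tokens(frame_idx: int,
                 human_poses: List[List[Point]],
                 object_boxes: List[Box],
                 width: int,
                 height: int) -> List[KeypointToken]:
    """Flatten all human and object keypoints of one frame into tokens."""
    tokens: List[KeypointToken] = []
    instance = 0
    for pose in human_poses:
        for x, y in pose:
            tokens.append(
                KeypointToken(x / width, y / height, frame_idx, instance, "human"))
        instance += 1
    for box in object_boxes:
        for x, y in box_to_keypoints(box):
            tokens.append(
                KeypointToken(x / width, y / height, frame_idx, instance, "object"))
        instance += 1
    return tokens


# One person with two joints and one object box in a 320x240 frame.
tokens = build_tokens(
    frame_idx=0,
    human_poses=[[(100.0, 50.0), (110.0, 80.0)]],
    object_boxes=[(200.0, 100.0, 240.0, 160.0)],
    width=320,
    height=240,
)
print(len(tokens), tokens[0].kind, tokens[-1].kind)
```

Keeping the instance index and the human/object label on every token is one way to let a downstream model relate keypoints across actors and objects; other encodings are equally plausible.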
