AutoScape: Geometry-Consistent Long-Horizon Scene Generation

This paper proposes AutoScape, a long-horizon driving scene generation framework. At its core is a novel RGB-D diffusion model that iteratively generates sparse, geometrically consistent keyframes, serving as reliable anchors for the scene's appearance and geometry. To maintain long-range geometric consistency, the model 1) jointly handles image and depth in a shared latent space, 2) explicitly conditions on the existing scene geometry (i.e., rendered point clouds) from previously generated keyframes, and 3) steers the sampling process with warp-consistent guidance. Given high-quality RGB-D keyframes, a video diffusion model then interpolates between them to produce dense and coherent video frames. AutoScape generates realistic and geometrically consistent driving videos of over 20 seconds, improving the long-horizon FID and FVD scores over the prior state-of-the-art by 48.6% and 43.0%, respectively.
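The warp-consistency idea can be illustrated with a minimal sketch: given an RGB-D keyframe, back-project its pixels to 3D with the depth map, transform them by the relative camera pose, reproject into the other view, and measure how much the colors disagree. The function below is a simplified illustration under assumed pinhole-camera conventions (intrinsics `K`, 4x4 relative pose `T_ab`, nearest-neighbor lookup); it is not the paper's actual guidance term.

```python
import numpy as np

def warp_consistency_error(rgb_a, depth_a, rgb_b, K, T_ab):
    """Warp view A into view B using A's depth and the relative pose T_ab,
    then measure RGB disagreement at the warped pixel locations.
    Illustrative sketch only, not the paper's guidance objective."""
    h, w = depth_a.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).astype(float)
    # Back-project pixels of view A into 3D camera coordinates.
    pts = (np.linalg.inv(K) @ pix.T) * depth_a.reshape(1, -1)
    # Move the points into view B's camera frame.
    pts_b = T_ab[:3, :3] @ pts + T_ab[:3, 3:4]
    # Project into view B's image plane (nearest-neighbor sampling).
    proj = K @ pts_b
    uv = (proj[:2] / np.clip(proj[2:], 1e-6, None)).T
    u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
    v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
    valid = pts_b[2] > 0  # keep only points in front of camera B
    err = np.abs(rgb_a.reshape(-1, 3)[valid] - rgb_b[v[valid], u[valid]])
    return err.mean()
```

In a guidance setting, a term like this would be differentiated with respect to the generated image and depth to steer sampling toward geometrically consistent keyframes.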

Murugan Sankaradas presents TalentScout: Multimodal AI-Driven Expert Finding in Organizations at PICom2025 on October 21st

Murugan Sankaradas (presenting virtually) will present “TalentScout: Multimodal AI-Driven Expert Finding in Organizations” at the IEEE International Conference on Pervasive Intelligence and Computing (PICom2025) on Tuesday, October 21 (10:30am–12pm JST) | Monday, October 20 (9:30pm–11pm ET) in Hokkaido, Japan.

Abhishek Aich is Organizing the Anomaly Detection with Foundation Models Workshop, held in conjunction with ICCV 2025

We are proud to share that Abhishek Aich is serving as one of the organizers of the Anomaly Detection with Foundation Models Workshop, held in conjunction with the International Conference on Computer Vision, October 20, 2025, 08:55 AM – 12:15 PM HST in Room 314 at the Hawaii Convention Center, Honolulu, HI.

Kunal Rao presents SlideCraft: Context-Aware Slides Generation Agent at PICom 2025 on October 21st

Kunal Rao (presenting virtually) will present “SlideCraft: Context-Aware Slides Generation Agent” at the IEEE International Conference on Pervasive Intelligence and Computing (PICom2025) on Tuesday, Oct 21 (10:30am–12pm JST) | Monday, Oct 20 (9:30pm–11pm ET) in Hokkaido, Japan. SlideCraft uses AI to automatically generate presentation slides from research content, making technical communication faster and context-aware for scientists and professionals.

Sparsh Garg Presents Mapillary Vistas Validation for Fine-Grained Traffic Signs at DataCV 2025

Sparsh Garg, a Senior Associate Researcher in the Media Analytics Department, will present “Mapillary Vistas Validation for Fine-Grained Traffic Signs: A Benchmark Revealing Vision-Language Model Limitations” at the Data Computer Vision (DataCV) 2025 workshop as part of ICCV 2025 in Honolulu, Hawai’i, on Sunday, October 19th, from 11:15 am – 11:25 am.

THAT: Token-wise High-frequency Augmentation Transformer for Hyperspectral Pansharpening

Transformer-based methods have demonstrated strong potential in hyperspectral pansharpening by modeling long-range dependencies. However, their effectiveness is often limited by redundant token representations and a lack of multiscale feature modeling. Hyperspectral images exhibit intrinsic spectral priors (e.g., abundance sparsity) and spatial priors (e.g., non-local similarity), which are critical for accurate reconstruction. From a spectral–spatial perspective, Vision Transformers (ViTs) face two major limitations: they struggle to preserve high-frequency components, such as material edges and texture transitions, and they suffer from attention dispersion across redundant tokens. These issues stem from the global self-attention mechanism, which tends to dilute high-frequency signals and overlook localized details. To address these challenges, we propose the Token-wise High-frequency Augmentation Transformer (THAT), a novel framework designed to enhance hyperspectral pansharpening through improved high-frequency feature representation and token selection. Specifically, THAT introduces: (1) Pivotal Token Selective Attention (PTSA) to prioritize informative tokens and suppress redundancy; (2) a Multi-level Variance-aware Feed-forward Network (MVFN) to enhance high-frequency detail learning. Experiments on standard benchmarks show that THAT achieves state-of-the-art performance with improved reconstruction quality and efficiency.
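The general idea behind selective attention over pivotal tokens can be sketched in a few lines: score the key tokens, keep only the top-k, and run softmax attention over the survivors so redundant tokens cannot dilute the result. The snippet below is a generic single-head illustration with an assumed mean-logit scoring rule; the names and scoring are our assumptions, not the PTSA mechanism from the paper.

```python
import numpy as np

def selective_attention(x, wq, wk, wv, k_keep):
    """Single-head attention restricted to the k_keep highest-scoring
    ("pivotal") key tokens. Illustrative sketch, not the paper's PTSA."""
    q, k, v = x @ wq, x @ wk, x @ wv
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)                  # (n_tokens, n_tokens)
    # Score each key token by its mean attention logit; keep the top-k.
    keep = np.argsort(logits.mean(axis=0))[-k_keep:]
    sel = logits[:, keep]
    # Softmax over the surviving tokens only, suppressing the rest.
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v[keep]
```

Restricting the softmax to a small pivotal subset concentrates attention mass where it matters, which is the intuition behind suppressing token redundancy.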

Utilizing Distributed Acoustic Sensing with Telecom Fibers for Entomological Observations

The 2021 emergence of Brood X cicadas was monitored in situ in our testbed using a DAS system connected to an outdoor telecom fiber over a 16-day period. The spectral and energy characteristics of the cicada calling signal have been measured and analyzed.

Optical Network Tomography over Live Production Network in Multi-Domain Environment

We report the first trial of network tomography over a live network in a multi-domain environment. We visualize end-to-end optical powers along multiple routes across multiple domains solely from a commercial 800G transponder, enabling performance bottleneck localization, power and routing optimization, and lightpath provisioning.

Observing the Worst- and Best-Case Line-System Transmission Conditions in a C-Band Variable Spectral Load Scenario

We experimentally investigated variable spectral loading in an OMS, identifying performance under best and worst transmission conditions. Metrics and data visualization allowed correlation between channel configurations and OSNR variations, enabling the derivation of a simple spectrum allocation rule.

Energy-based Generative Models for Distributed Acoustic Sensing Event Classification in Telecom Networks

Distributed fiber-optic sensing combined with machine learning enables continuous monitoring of telecom infrastructure. We employ generative modeling for event classification, supporting semi-supervised learning, uncertainty calibration, and noise resilience. Our approach offers a scalable, data-efficient solution for real-world deployment in complex environments.
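The energy-based classification recipe can be sketched concretely: fit one energy function per event class, assign a new sensing segment to the class with the lowest energy, and treat a segment whose best energy is still high as an unknown or anomalous event. The code below is a minimal illustration of that general recipe (the energy functions and threshold are placeholders), not the paper's specific model.

```python
import numpy as np

def classify_by_energy(x, energy_fns, threshold):
    """Assign x to the class whose energy model scores it lowest;
    return -1 (unknown/anomalous) if even the best energy is high.
    Generic energy-based sketch, not the paper's specific model."""
    energies = np.array([e(x) for e in energy_fns])
    best = int(np.argmin(energies))
    if energies[best] > threshold:
        return -1, energies  # no class explains the event well
    return best, energies

# Toy usage with squared-distance "energies" around two class prototypes.
means = [np.zeros(2), np.full(2, 5.0)]
fns = [lambda x, m=m: float(np.sum((x - m) ** 2)) for m in means]
label, _ = classify_by_energy(np.array([4.8, 5.1]), fns, threshold=10.0)
# label == 1
```

The rejection branch is what makes the approach attractive for noisy field data: events that no trained class explains well are flagged instead of being forced into a label, which supports the semi-supervised and uncertainty-calibration uses mentioned above.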