The University of Southern California (USC), founded in 1880, is a private, comprehensive research university in Los Angeles. It provides undergraduates with an extraordinary range of academic programs and encourages study and research across disciplines, making significant contributions as a major research institution. NEC Labs America teams up with USC to explore cutting-edge topics in human-machine interaction, gesture recognition, and privacy-conscious visual analytics. Our joint publications contribute to the development of secure, socially responsible AI systems. Please read about our latest news and collaborative publications with the University of Southern California.

Posts

THAT: Token-wise High-frequency Augmentation Transformer for Hyperspectral Pansharpening

Transformer-based methods have demonstrated strong potential in hyperspectral pansharpening by modeling long-range dependencies. However, their effectiveness is often limited by redundant token representations and a lack of multiscale feature modeling. Hyperspectral images exhibit intrinsic spectral priors (e.g., abundance sparsity) and spatial priors (e.g., non-local similarity), which are critical for accurate reconstruction. From a spectral–spatial perspective, Vision Transformers (ViTs) face two major limitations: they struggle to preserve high-frequency components, such as material edges and texture transitions, and they suffer from attention dispersion across redundant tokens. These issues stem from the global self-attention mechanism, which tends to dilute high-frequency signals and overlook localized details. To address these challenges, we propose the Token-wise High-frequency Augmentation Transformer (THAT), a novel framework designed to enhance hyperspectral pansharpening through improved high-frequency feature representation and token selection. Specifically, THAT introduces (1) Pivotal Token Selective Attention (PTSA) to prioritize informative tokens and suppress redundancy, and (2) a Multi-level Variance-aware Feed-forward Network (MVFN) to enhance high-frequency detail learning. Experiments on standard benchmarks show that THAT achieves state-of-the-art performance with improved reconstruction quality and efficiency.
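To make the idea of attention dispersion across redundant tokens concrete, here is a minimal sketch in plain Python. It is not the PTSA module described above, whose details are not given here. It uses generic top-k sparsification of dot-product attention as a stand-in, and the function names (`softmax`, `attend`), the toy keys and values, and the `top_k` parameter are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values, top_k=None):
    """Scaled dot-product attention for one query over scalar values.

    If top_k is given, only the top_k highest-scoring tokens are kept
    before the softmax (a generic sparsification, assumed for illustration).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    order = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    kept = order if top_k is None else order[:top_k]
    weights = softmax([scores[i] for i in kept])
    return sum(w * values[i] for w, i in zip(weights, kept))


# Toy setup: one "edge" token carrying a high-frequency value of 1.0,
# surrounded by six near-identical "flat" tokens with value 0.0.
query = [2.0, 0.0]
keys = [[2.0, 0.0]] + [[0.5, 0.0]] * 6
values = [1.0] + [0.0] * 6

dense = attend(query, keys, values)            # attention over all tokens
sparse = attend(query, keys, values, top_k=2)  # attention over the 2 best tokens
print(dense, sparse)
```

In this toy case the six redundant tokens jointly absorb a large share of the softmax weight, so the dense output sits well below the edge value, while restricting attention to the highest-scoring tokens keeps the output closer to it.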

Hierarchical Gaussian Mixture based Task Generative Model for Robust Meta-Learning

Meta-learning enables quick adaptation of machine learning models to new tasks with limited data. While tasks could come from varying distributions in reality, most existing meta-learning methods assume that training and testing tasks come from the same uni-component distribution, overlooking two critical needs of a practical solution: (1) the various sources of tasks may compose a multi-component mixture distribution, and (2) novel tasks may come from a distribution that is unseen during meta-training. In this paper, we demonstrate that these two challenges can be solved jointly by modeling the density of task instances. We develop a meta-training framework built on a novel Hierarchical Gaussian Mixture based Task Generative Model (HTGM). HTGM extends the widely used empirical process of sampling tasks to a theoretical model, which learns task embeddings, fits the mixture distribution of tasks, and enables density-based scoring of novel tasks. The framework is agnostic to the encoder and scales well with large backbone networks. The model parameters are learned end-to-end by maximum likelihood estimation via an Expectation-Maximization (EM) algorithm. Extensive experiments on benchmark datasets indicate the effectiveness of our method for both sample classification and novel task detection.
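The combination of mixture fitting and density-based scoring can be illustrated with a much simpler model than HTGM. The sketch below fits a flat, one-dimensional Gaussian mixture with EM and scores new points by their log-density. It is not hierarchical, has no encoder, and is not the authors' implementation. The toy "task embeddings", the function names, and the initial means are assumptions chosen for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def fit_gmm(xs, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture by EM, starting from init_means."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            logs = [math.log(weights[j]) + log_gauss(x, means[j], variances[j])
                    for j in range(k)]
            norm = log_sum_exp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(xs)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, min_var)
    return weights, means, variances


def log_density(x, weights, means, variances):
    """Log-density of x under the fitted mixture; low values suggest novelty."""
    return log_sum_exp([math.log(w) + log_gauss(x, m, v)
                        for w, m, v in zip(weights, means, variances)])


# Toy 1-D "task embeddings" drawn from two sources (hypothetical data).
xs = [-0.5, -0.2, 0.0, 0.1, 0.4, 9.6, 9.9, 10.0, 10.2, 10.5]
weights, means, variances = fit_gmm(xs, init_means=[1.0, 9.0])

seen_score = log_density(0.2, weights, means, variances)   # near a known cluster
novel_score = log_density(5.0, weights, means, variances)  # far from both clusters
print(seen_score, novel_score)
```

A point lying between the two fitted clusters receives a much lower log-density than a point near either one, which is the basic signal a density-based novelty detector thresholds on.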

Pose-variant 3D Facial Attribute Generation

We address the challenging problem of generating facial attributes using a single image in an unconstrained pose. In contrast to prior works that largely consider generation on 2D near-frontal images, we propose a GAN-based framework to generate attributes directly on a dense 3D representation given by UV texture and position maps, resulting in photorealistic, geometrically consistent, and identity-preserving outputs. Starting from a self-occluded UV texture map obtained by applying an off-the-shelf 3D reconstruction method, we propose two novel components. First, a texture completion generative adversarial network (TC-GAN) completes the partial UV texture map. Second, a 3D attribute generation GAN (3DA-GAN) synthesizes the target attribute while obtaining an appearance consistent with 3D face geometry and preserving identity. Extensive experiments on CelebA, LFW, and IJB-A show that our method achieves consistently better attribute generation accuracy than prior methods, yields a higher degree of qualitative photorealism, and preserves face identity information.

Coherent optical wireless communication link employing orbital angular momentum multiplexing in a ballistic and diffusive scattering medium

We experimentally investigate the scattering effect on an 80 Gbit/s orbital angular momentum (OAM) multiplexed optical wireless communication link. The power loss, mode purity, cross talk, and bit error rate performance are measured and analyzed for different OAM modes under scattering levels ranging from the ballistic to the diffusive region. Results show that (i) in the ballistic scattering region, power loss is the main impairment, while the mode purities of different OAM modes are not significantly affected; (ii) in the diffusive scattering region, however, the performance of an OAM-multiplexed link further suffers from the increased cross talk between the different OAM modes.