disentangled representation Archives

Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation

March 4, 2025/in Publications/by NEC Labs America

We consider the conditional generation of 3D drug-like molecules with explicit control over molecular properties such as drug-like properties (e.g., Quantitative Estimate of Druglikenessor Synthetic Accessibility score) and effectively binding to specific protein sites. To tackle this problem, we propose an E(3)-equivariant Wasserstein autoencoder and factorize thelatent space of our generative model into two disentangled aspects: molecular properties and the remaining structural context of 3D molecules. Our model ensures explicit control over these molecular attributes while maintaining equivariance of coordinate representation and invariance of data likelihood. Furthermore, we introduce a novel alignment-based coordinate loss to adapt equivariant networks for auto-regressive denovo 3D molecule generation from scratch. Extensive experiments validate our models effectiveness on property-guidedand context-guided molecule generation, both for de-novo 3D molecule design and structure-based drug discovery against protein targets.

Disentangled Wasserstein Autoencoder for T-Cell Receptor Engineering

December 10, 2023/in Publications/by NEC Labs America

In protein biophysics, the separation between the functionally important residues (forming the active site or binding surface) and those that create the overall structure (the fold) is a well-established and fundamental concept. Identifying and modifying those functional sites is critical for protein engineering but computationally nontrivial, and requires significant domain knowledge. To automate this process from a data-driven perspective, we propose a disentangled Wasserstein autoencoder with an auxiliary classifier, which isolates the function-related patterns from the rest with theoretical guarantees. This enables one-pass protein sequence editing and improves the understanding of the resulting sequences and editing actionsinvolved. To demonstrate its effectiveness, we apply it to T-cell receptors (TCRs), a well-studied structure-function case. We show that our method can be used to alterthe function of TCRs without changing the structural backbone, outperforming several competing methods in generation quality and efficiency, and requiring only 10% of the running time needed by baseline models. To our knowledge, this is the first approach that utilizes disentangled representations for TCR engineering.

AE-StyleGAN: Improved Training of Style-Based Auto-Encoders

January 4, 2022/in Publications/by NEC Labs America

StyleGANs have shown impressive results on data generation and manipulation in recent years, thanks to its disentangled style latent space. A lot of efforts have been made in inverting a pretrained generator, where an encoder is trained ad hoc after the generator is trained in a two-stage fashion. In this paper, we focus on style-based generators asking a scientific question: Does forcing such a generator to reconstruct real data lead to more disentangled latent space and make the inversion process from image to latent space easy? We describe a new methodology to train a style-based autoencoder where the encoder and generator are optimized end-to-end. We show that our proposed model consistently outperforms baselines in terms of image inversion and generation quality. Supplementary, code, and pretrained models are available on the project website.

Disentangled Recurrent Wasserstein Auto-Encoder

May 4, 2021/in Publications/by NEC Labs America

Learning disentangled representations leads to interpretable models and facilitates data generation with style transfer, which has been extensively studied on static data such as images in an unsupervised learning framework. However, only a few works have explored unsupervised disentangled sequential representation learning due to challenges of generating sequential data. In this paper, we propose recurrent Wasserstein Autoencoder (R-WAE), a new framework for generative modeling of sequential data. R-WAE disentangles the representation of an input sequence into static and dynamic factors (i.e., time-invariant and time-varying parts). Our theoretical analysis shows that, R-WAE minimizes an upper bound of a penalized form of the Wasserstein distance between model distribution and sequential data distribution, and simultaneously maximizes the mutual information between input data and different disentangled latent factors, respectively. This is superior to (recurrent) VAE which does not explicitly enforce mutual information maximization between input data and disentangled latent representations. When the number of actions in sequential data is available as weak supervision information, R-WAE is extended to learn a categorical latent representation of actions to improve its disentanglement. Experiments on a variety of datasets show that our models outperform other baselines with the same settings in terms of disentanglement and unconditional video generation both quantitatively and qualitatively.

Improving Disentangled Text Representation Learning with Information Theoretical Guidance

July 5, 2020/in Publications/by NEC Labs America

Learning disentangled representations of natural language is essential for many NLP tasks, e.g., conditional text generation, style transfer, personalized dialogue systems, etc. Similar problems have been studied extensively for other forms of data, such as images and videos. However, the discrete nature of natural language makes the disentangling of textual representations more challenging (e.g., the manipulation over the data space cannot be easily achieved). Inspired by information theory, we propose a novel method that effectively manifests disentangled representations of text, without any supervision on semantics. A new mutual information upper bound is derived and leveraged to measure dependence between style and content. By minimizing this upper bound, the proposed method induces style and content embeddings into two independent low-dimensional spaces. Experiments on both conditional text generation and text-style transfer demonstrate the high quality of our disentangled representation in terms of content and style preservation.

Posts

Learning Disentangled Equivariant Representation for Explicitly Controllable 3D Molecule Generation

Disentangled Wasserstein Autoencoder for T-Cell Receptor Engineering

AE-StyleGAN: Improved Training of Style-Based Auto-Encoders

Disentangled Recurrent Wasserstein Auto-Encoder

Improving Disentangled Text Representation Learning with Information Theoretical Guidance

Contact Us

About Us

Our Pages

Read Our Blog Posts

Tag Archive for: disentangled representation

Posts

Contact Us

About Us

Our Pages

Read Our Blog Posts