Rethinking Molecular Drug Design: From Generation to Control

Designing a single viable drug candidate can take years and billions of dollars, largely due to the difficulty of optimizing multiple molecular properties at once.

Today, the real bottleneck lies in how precisely researchers can shape molecular properties without disrupting critical structures.

As AI-driven molecular drug design continues to mature, the ability to guide generation with intent, rather than rely on trial and error, is becoming essential.

In the paper “Disentangled Autoencoding Equivariant Diffusion Model for Controlled Generation of 3D Molecules,” published in Nature Communications, NEC Laboratories America researchers Tianxiao Li and Martin Renqiang Min led the work alongside Haoran Liu, PhD, of Texas A&M University and an intern at NEC Laboratories America; Hongyu Guo of the National Research Council Canada; and Mark Gerstein of Molecular Biophysics & Biochemistry at Yale University. The research introduces a significant step forward: a framework that enables fine-grained, multi-objective control over the generation of 3D molecules.

At the core of this advance is MolDiffdAE, a disentangled autoencoding equivariant diffusion model designed to both generate and manipulate 3D molecular structures. By combining diffusion models with a structured latent representation, the framework enables researchers to independently control molecular composition, geometry, and physicochemical properties within a single, unified system.

A New Perspective: From Random Generation to Semantic Control

Traditional diffusion models excel at generating realistic molecules, but they struggle with control. They lack an explicit structure for manipulating molecular properties, forcing researchers to rely on external guidance or retraining for each new objective.

“Molecular design is not just about generating valid structures; it is about navigating trade-offs between multiple properties while preserving what already works. Our approach reframes generation as a controllable process by learning a semantic space that captures these relationships.” — Tianxiao Li

This work introduces a shift: instead of treating molecule generation as a black box, it becomes a navigable space of semantic meaning, where properties can be adjusted directly and efficiently.

For example, a researcher could start with a molecule that binds effectively to a target protein but lacks stability. Using MolDiffdAE, they can adjust stability-related properties while preserving binding geometry—without restarting the design process.

What Is Semantic-Guided Diffusion?

At the core of this research is a semantic embedding, which is a learned representation that captures the full meaning of a molecule, including its structure, shape, and properties. MolDiffdAE works by encoding molecules into this space and then using it to guide generation. Key components include:

  • Semantic embedding: A compact representation that captures molecular composition, geometry, and properties
  • Diffusion decoder: Generates 3D molecules guided by the embedding
  • Disentanglement mechanism: Separates different properties to enable independent control

This approach transforms molecular generation into a latent space optimization problem, where modifying the embedding directly steers the output.
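The latent-space editing idea above can be sketched in a few lines. The following is a minimal, purely illustrative toy (not the authors' code or architecture): the "encoder" and "diffusion decoder" are stand-in linear maps, the embedding size is arbitrary, and the assumption that one particular latent dimension tracks a stability-related property is hypothetical.

```python
# Illustrative sketch (not the paper's model): editing one disentangled
# latent dimension while holding the rest fixed, with toy linear maps
# standing in for the equivariant encoder and diffusion decoder.
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 8     # size of the semantic embedding (hypothetical)
FEATURE_DIM = 32   # stand-in for a 3D molecular representation

# Toy stand-ins for the learned networks.
W_enc = rng.normal(size=(LATENT_DIM, FEATURE_DIM))
W_dec = rng.normal(size=(FEATURE_DIM, LATENT_DIM))

def encode(molecule_features):
    """Map a molecule's features to its semantic embedding."""
    return W_enc @ molecule_features

def decode(embedding):
    """Stand-in for the diffusion decoder: embedding -> molecule."""
    return W_dec @ embedding

# Start from an existing candidate molecule.
mol = rng.normal(size=FEATURE_DIM)
z = encode(mol)

# Suppose (hypothetically) dimension 3 of the disentangled embedding
# tracks a stability-related property. Shift only that coordinate.
z_edited = z.copy()
z_edited[3] += 1.5

new_mol = decode(z_edited)

# Every other semantic coordinate is untouched, so properties tied to
# those coordinates are preserved by construction.
unchanged = np.delete(z_edited, 3) == np.delete(z, 3)
print(unchanged.all())  # True: only the targeted dimension moved
```

The point of the sketch is the workflow, not the toy math: because the embedding is disentangled, "improve stability" becomes a local move along one coordinate rather than a full redesign.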

Real-World Implications

This represents a shift from trial-and-error design to guided, data-efficient molecular engineering. The ability to control molecule generation at this level has an immediate and far-reaching impact:

  • Drug discovery acceleration: Researchers can optimize multiple properties simultaneously, such as solubility, synthesis feasibility, and binding affinity
  • Reduced reliance on labeled data: The model learns in an unsupervised way, enabling efficient use of limited experimental datasets
  • Improved candidate quality: Generated molecules maintain structural integrity while achieving targeted improvements
  • Faster design iteration: Instead of retraining models, scientists can directly manipulate embeddings to explore design alternatives
  • Enhanced exploration of chemical space: Retrieval-augmented generation allows the model to leverage known molecules to guide new designs
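The multi-objective point above can also be sketched as a search over the semantic embedding rather than a retraining loop. Everything in this toy is an assumption for illustration: the two linear property scorers, the embedding size, and the unit-ball constraint stand in for the learned property predictors and decoder the paper would actually use.

```python
# Illustrative sketch (not the authors' method): exploring trade-offs
# between two hypothetical property scores by sweeping the weighting
# of a scalarized objective over the semantic embedding.
import numpy as np

rng = np.random.default_rng(1)

DIM = 8
# Hypothetical linear property scorers defined on the embedding.
w_solubility = rng.normal(size=DIM)
w_affinity = rng.normal(size=DIM)

def best_embedding(alpha, radius=1.0):
    """Maximize alpha*solubility + (1-alpha)*affinity over a ball of
    embeddings; for linear scorers the optimum lies along the gradient."""
    direction = alpha * w_solubility + (1 - alpha) * w_affinity
    return radius * direction / np.linalg.norm(direction)

# Sweep the trade-off weight to trace out candidate designs.
for alpha in (0.0, 0.5, 1.0):
    z = best_embedding(alpha)
    print(f"alpha={alpha:.1f}  solubility={w_solubility @ z:+.2f}  "
          f"affinity={w_affinity @ z:+.2f}")
```

Each weight setting yields a different candidate embedding, so trade-off exploration is a sweep over one scalar instead of a new training run per objective.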

Why This Matters Now

The pharmaceutical and materials industries are increasingly relying on AI to shorten development cycles and reduce costs. However, most generative models still operate with limited controllability, especially when multiple objectives must be balanced.

This work arrives at a critical moment:

  • AI models are scaling, but interpretability and control remain bottlenecks
  • Drug discovery requires multi-objective optimization, not single-property tuning
  • Data scarcity continues to limit traditional supervised approaches

Unlike many generative models that require retraining for each objective, MolDiffdAE enables direct manipulation within a unified latent space, reducing the number of design iterations and making it more viable for real-world R&D pipelines.

It also raises important practical questions:

  • How can we systematically explore trade-offs between competing molecular properties?
  • Can we design molecules that meet real-world constraints without iterative retraining?
  • What does it mean to “navigate” chemical space rather than sample from it?

Final Thoughts

This research demonstrates that the future of molecular design lies not just in generating candidates, but in controlling them with precision. By introducing a disentangled semantic space for 3D molecules, MolDiffdAE enables a new level of flexibility, efficiency, and insight.

NEC Laboratories America continues to push the boundaries of applied AI, bridging advanced machine learning with real-world scientific challenges. This work highlights how foundational research can directly reshape workflows in drug discovery and beyond. As generative models evolve, the ability to steer them intelligently will define their impact. This approach offers a compelling blueprint for that future.
