Beyond Explainability: How We Are Redefining Interpretability in AI
As artificial intelligence becomes more deeply embedded in scientific discovery and healthcare, one challenge continues to stand out: understanding what models actually know.
A new paper takes a major step toward answering that question.
Jonathan Warrell from our Machine Learning team is the lead author of “Interpretability and Implicit Model Semantics in Biomedicine and Deep Learning,” written in collaboration with Michael Gancz, Hussein Mohsen, Prashant Emani, and Mark Gerstein of Yale University. The paper, published in Nature Machine Intelligence, introduces a new framework for thinking about AI systems.
A New Perspective: Interpretability Is Not Enough
Together, the team introduces a broader framework for understanding AI, shifting the conversation beyond explainability toward a more fundamental question: what do models actually represent?
Much of today’s AI discussion centers on interpretability.
Can we explain a model’s decisions? Can we visualize what it has learned?
Warrell says the research challenges a widely held assumption in artificial intelligence: that explaining a model’s outputs is the same as understanding what it represents.
“Interpretability is only one aspect of a model’s semantics. Models do more than generate outputs. They encode relationships and structure about the world, often in ways that are not directly accessible to human understanding. If we focus only on interpretability, we risk overlooking the deeper scientific meaning captured within these systems.”
What Are Model Semantics?
Borrowing from the philosophy of science, the paper defines model semantics as the way a model represents real-world phenomena. This includes:
- What patterns the model captures
- How internal representations relate to real-world variables
- Whether those representations align with scientific reality
Importantly, these semantics can be implicit. Deep learning systems often learn features that are highly predictive but difficult to interpret directly. As the authors explain, the goal is not just to make models explainable, but to understand what they actually represent and whether those representations are meaningful.
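To make the idea concrete, here is a minimal sketch (our illustration, not code from the paper) of one common way to test whether an internal representation encodes a real-world variable: a linear probe. All data and names below are synthetic and hypothetical; the example assumes numpy and scikit-learn.

```python
# Illustrative sketch, not from the paper: a linear probe testing whether a
# hidden representation encodes a known real-world variable.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for biomedical data: a single latent variable
# (e.g., pathway activity) drives all observed features.
n, d = 1000, 20
latent = rng.normal(size=n)
X = np.outer(latent, rng.normal(size=d)) + 0.5 * rng.normal(size=(n, d))

# Stand-in for a trained network's hidden layer: a fixed nonlinear
# projection of the inputs (scaled so tanh does not saturate).
W = rng.normal(size=(d, 8)) / np.sqrt(d)
hidden = np.tanh(X @ W)

# Probe: can a simple linear readout of the hidden units recover the
# latent variable? A high held-out R^2 suggests the representation
# encodes it, even if no single hidden unit is interpretable on its own.
h_train, h_test, z_train, z_test = train_test_split(hidden, latent, random_state=0)
probe = LinearRegression().fit(h_train, z_train)
print(f"held-out R^2 of linear probe: {probe.score(h_test, z_test):.3f}")
```

Probing is only one established technique for this kind of check; the paper’s framework asks the broader question of whether such learned representations, taken together, align with scientific reality.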
Real-World Implications
This shift from interpretability to semantics has major implications, especially in biomedicine.
- Trusting High-Performance Models
In healthcare, accuracy can be life-saving, but interpretability is often required for trust. This research suggests a more nuanced approach: a model may be hard to interpret yet still scientifically valid. If its learned semantics align with real biological processes, it can be trustworthy even without full transparency.
- Advancing Drug Discovery and Genomics
Deep learning models are increasingly used to:
- Predict protein structures
- Model gene expression
- Identify disease mechanisms
These systems often uncover patterns that humans do not already understand. By focusing on semantics, researchers can evaluate whether those patterns correspond to real biological phenomena, opening the door to new discoveries rather than just explanations.
- Bridging AI and Scientific Theory
One of the most powerful ideas in the paper is that AI models can function similarly to scientific theories. Instead of simply fitting data, they can:
- Encode hypotheses about the world
- Capture latent structures in complex systems
- Provide new ways of understanding phenomena
The framework provides a way to formally analyze this connection, helping scientists move from “black box” skepticism to structured validation of AI knowledge, as the sketch below illustrates in miniature.
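A hedged sketch of that idea: treating a fitted model’s learned structure as a set of candidate hypotheses to validate. Everything here is synthetic and hypothetical, not from the paper; it assumes numpy and scikit-learn and uses a sparse linear model purely for illustration.

```python
# Illustrative sketch, not from the paper: reading candidate hypotheses
# out of a fitted model. A sparse linear readout over synthetic "gene"
# features flags which variables the model treats as relevant.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, g = 500, 50
X = rng.normal(size=(n, g))              # stand-in gene-expression matrix
true_drivers = [3, 17, 42]               # ground truth, unknown in practice
y = X[:, true_drivers].sum(axis=1) + 0.1 * rng.normal(size=n)

model = Lasso(alpha=0.05).fit(X, y)

# The nonzero coefficients are the model's implicit "hypothesis" about
# which variables matter; each is a candidate for experimental validation.
candidates = np.flatnonzero(model.coef_)
print("candidate drivers to validate:", candidates)
```

In practice the fitted model would be far more complex, but the workflow is the same: extract the structure the model has committed to, then test it against the world.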
Why This Matters Now
As AI systems become more complex and widely deployed, the limits of traditional interpretability are becoming increasingly clear. Models are growing more powerful yet less transparent, real-world applications demand both high performance and trust, and scientific domains require deeper understanding, not just accurate predictions. Our work with Yale offers a path forward by reframing the problem. Instead of asking whether we can explain a model, the more important questions are what the model actually represents and whether those representations correspond to reality.
Final Thoughts
This publication marks an important shift in how we think about AI. By expanding the conversation beyond interpretability to include implicit model semantics, the authors provide a more complete framework for evaluating modern machine learning systems. For industries like healthcare, where accuracy, trust, and discovery intersect, this perspective could prove transformative. And as Jonathan Warrell and his co-authors make clear, the future of AI understanding will not be defined by how well we can explain models, but by how well we can connect them to the real world.
About Jonathan Warrell
Jonathan Warrell is a Researcher in the Machine Learning Department at NEC Laboratories America. He received his BA in Music from the University of Cambridge, a Master’s and PhD in Music Theory and Analysis from King’s College London, and an MS in Computer Science (Distinction) from University College London, before moving to postdoctoral work in computational genomics and neuroscience in Yale’s Department of Molecular Biophysics and Biochemistry.
His research focuses particularly on computational biology, spatial genomics, and optimization theory. At NEC, Dr. Warrell contributes to projects involving molecular design, large-scale genomic reasoning, compositionality of diffusion models, biomarker discovery, reinforcement learning, and variational methods. His work leverages hybrid approaches that combine symbolic and neural methods, particularly for solving discrete optimization problems in genomics.
He collaborates closely with NEC Bio, NEC OncoImmunity, and NEC’s Biometric Research Laboratories on developing interpretable and efficient methods for solving biological and medical problems. He has published widely in high-impact journals such as Science, Cell, Nature Genetics, and Nature Machine Intelligence.