AI is Here, #PharmaGeeks. How Ready Are We, Really?
By now, we’ve all heard the siren song: AI is the transformative technology that will solve the many operational woes of drug development.
Forget about “good old-fashioned” machine learning algorithms that can “just” streamline risk analyses and regulatory document generation. Wherever you look, industry thought leaders can’t stop glowing about the ways AI will inevitably reshape drug development—from turning clumsy document-driven processes into powerful agentic workflows to transforming vast troves of unstructured data into fuel for supercharged CMC productivity.
So, unleash the agents and watch new modalities come online overnight, right? All we need’s the right prompt?
Reality check, straight from the 2025 Digital CMC Summit: These technologies have immense potential, but not even the shiniest silver bullet flies by itself. Not without the right foundational knowledge to aim it.
But before we peel off that wet blanket, let’s take a closer look at what it truly takes to support and deploy these technologies at scale, and ensure the results they deliver will pass muster where it matters: down in Silver Spring.
Balancing the Promise and the Probabilistic Realities
Now, I’ll be the last to say that possibilities aren’t tantalizing.
Imagine: You’re preparing for a raw material change, so you prompt your GenAI-enabled CMC system to evaluate the predicted impact on your processes. While you sip your coffee/tea/matcha/Celsius, your GenAI autopilot goes to work:
- Your system is powered by a bleeding-edge LLM that has already ingested your latest qTPP, so it knows exactly which CQAs and CPPs may be impacted.
- The system tags in an agent to simulate how the change will impact leachables and oxidation CQAs, another to run a risk analysis to assess the criticality of that impact, and one more to cross-check how the changes will impact your clinical CDMO’s version of your CPPs.
- It balks not for a second when it finds historical and contextual data buried in unstructured sources. After all, LLMs eat unstructured data three meals a day, effortlessly extracting insights from PDFs, emails, and spreadsheets alike.
Boom: Assessment complete. The results read out in a perfectly templated risk analysis. One prompt, clear results, zero meetings. Or at least that’s the version on the box.
Here’s a more likely reality:
- There are 16 versions of your qTPP in your database, 4 of which are marked FINAL. The LLM has to guess which is actually the FINAL-FINAL (V3, of course). It might decide to reference several different versions.
- Past CQA and CPP assessments also conflict, and the most up-to-date ones were never written down by your last process engineer before they left for another company. So the agents make their best guess (or make up data to fill the gaps).
- The last agent runs headlong into your CDMO’s paper-based records of its version of your process, so it defaults back to your in-house data.
- None of these pitfalls are flagged in the very confident outputs your system returns. So there’s no way to catch the errors or adjust the intake and analysis process.
It might just take a few meetings to get to the bottom of that.
Here’s the ice in that cold cup of coffee: Models don’t inherently “understand” the data presented to them. They identify patterns, make predictions, and draw inferences based on probabilistic calculations. True understanding, even if it’s “just” convincingly simulated, requires something more: data that’s connected, interpreted, and contextualized.
In other words, not data: knowledge.
Data vs. knowledge: Is that a flat black surface or I-95?
Next time you hop into a Waymo, ask yourself: Do you ever stop to think about whether it knows what a stop sign is?
Probably not. You expect it to know and understand the rules of the road, the signs and signals that guide traffic, and whether that’s a detour marker or your neighbor’s “In this house we believe” sign.
Any intelligent AI system for CMC will need its own clear, detailed, contextual understanding of the world in which it operates. But in the world of technical development, a model doesn’t just need to know “this red octagon means ‘stop.’” It needs to understand how certain material characteristics will impact the solubility and tolerable oxidative stress of a monoclonal antibody, whether either can be changed and stay within a given set of CPPs, and many more complex, multi-dimensional facts.
Just like the Waymo, though, any gaps or uncertainties in that understanding can raise serious risks. Your self-driving taxi might, oh, confuse a pedestrian with a speed bump. Or, in a technical development context, a model might very easily:
- Hallucinate incorrect information: No one wants their Waymo to see “100” and make its own call about whether that means mph or km/h. Similarly, without detailed knowledge of which parameters impact which attributes, an AI model may suggest a combination of excipients that could destabilize a compound.
- Misinterpret regulatory context: Imagine your Waymo took the long way because it misread a detour as a road closure. An AI model without detailed regulatory knowledge could misunderstand updated guidelines, leading to costly delays in approvals.
- Produce untraceable routes: The same way you want to know where your Waymo is taking you, AI for CMC needs clear traceability. Without a defined knowledge model that clearly links outputs to process trails and causal chains, you’ll never be able to show the directions you took to reach your destination. (A minimal sketch of what that trail looks like follows this list.)
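To make that last point concrete, here’s a minimal sketch in Python of what a traceable output could carry with it. Every class and field name here is an illustrative assumption, not a real schema; the point is simply that each conclusion travels with its sources and its causal chain:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class SourceRef:
    """A pointer to one governed, versioned source record."""
    document_id: str   # e.g., the document-management system ID
    version: str       # the approved version -- not "whichever looked FINAL"
    location: str      # the section, table, or row the claim came from

@dataclass
class TraceableOutput:
    """An AI-generated conclusion plus the trail that justifies it."""
    conclusion: str
    sources: list[SourceRef]      # every input the model relied on
    reasoning_steps: list[str]    # the causal chain, step by step
    model_id: str                 # which model and version produced this
    generated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def is_auditable(self) -> bool:
        """No sources or no reasoning chain means no audit trail."""
        return bool(self.sources) and bool(self.reasoning_steps)
```

An output that fails its `is_auditable()` check is exactly the “very confident, unverifiable” answer described above: plausible prose with no directions home.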
So how does any model (general, automotive, or CMC) learn the rules of the “road” it will operate on? Typically, the answer you hear is “data”—lots, and lots, and lots, and LOTS of it. And that IS true… but it’s only part of the story. Just like the colors green, red, and yellow are only part of understanding how traffic works.
To understand, interpret, and—eventually—design drug development processes, GenAI models will need a much more sophisticated, dimensional, and longitudinal understanding of some equally complex subjects and workflows. So how do we get from “APX0013_qTPP_REVISED_032724_051725 UPDATE_FINAL_FINALL_REV3.xls” to that?
Hint: Not by feeding vast amounts of unstructured data to an LLM and hoping for the best. Here’s what we truly need.
“Knowledge-first” AI: Building ground truth for generative capabilities in CMC
For technical development applications of GenAI, success has to mean much more than swift, natural-sounding, reasonably credible responses to a prompt. It has to mean something far more demanding: accurate, consistent, traceable outputs, based on ALCOA++-compliant training data, with regulatory-ready audit paths.
To deliver that, GenAI models will really, really need to know what they’re talking about. And “know” in the sense of contextual awareness, historical perspective, and embedded compliance: why things are the way they are, how they got there, how and why they might change, and what might happen when they do.
Knowledge is data made meaningful. Creating that meaning takes several purposefully implemented layers:
- Data: Like the map in a Waymo’s brain, data is the raw material demonstrating the objective performance, results, and variation of any process. And like any raw material, it needs to be leveraged in a well-defined and thoughtfully designed way.
- Information: To be truly understood, data then needs to be organized into a semantic layer that defines the connections between datapoints, establishes correlations, and defines dependencies. In Driver’s Ed, we all learned what red, yellow, and green mean—in this specific context.
- Knowledge: Finally, context truly is king… and queen, and jester, and the whole royal court. Knowledge comes from understanding how connections create mutually influential relationships, how contingencies differ from causal factors, and which changes can simply be allowed to happen versus which create risk. This is the layer where what, when, where, how, and why all connect, creating a full understanding of a process. (See the sketch after this list.)
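Here’s a deliberately tiny Python sketch of how those three layers stack. The entities, relationship, and limit below are all invented for illustration; a real knowledge model would be far richer, but the progression is the same:

```python
from dataclasses import dataclass

# Layer 1 -- Data: raw, uninterpreted observations.
raw_data = {"raw_material_moisture_pct": 4.5, "aggregation_pct": 0.9}

# Layer 2 -- Information: a semantic layer naming entities and the
# connections between them (which parameter influences which attribute).
@dataclass(frozen=True)
class Relationship:
    source: str     # e.g., a process parameter
    relation: str
    target: str     # e.g., a quality attribute

semantic_layer = [
    Relationship("raw_material_moisture_pct", "influences", "aggregation_pct"),
]

# Layer 3 -- Knowledge: context that says what the connection *means* --
# limits, direction, and consequences -- so a proposed change can be assessed.
MOISTURE_LIMIT_PCT = 4.0  # hypothetical characterized limit, for illustration

def assess_change(new_moisture_pct: float) -> str:
    if new_moisture_pct > MOISTURE_LIMIT_PCT:
        return "Outside the characterized range: trigger a formal risk assessment."
    return "Within the characterized range: document the change and proceed."

print(assess_change(raw_data["raw_material_moisture_pct"]))
# -> "Outside the characterized range: trigger a formal risk assessment."
```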
For AI systems to deliver accurate, trustworthy results, each of these layers needs to be fully established and integrated into a robust knowledge model that can guide the AI’s evaluative and decision-making processes. Just like a Waymo needs to understand that red, yellow, and green can mean both “stop, prepare to stop, go” and “stop, yield, this way to Sausalito.”
Establishing those layers, of course, takes rules, processes, and technologies of its own. As our CEO Yash mapped out at the Digital CMC Summit, CMC knowledge models are built through detailed SOPs, a dedicated platform designed to facilitate knowledge management, and a commitment to fostering an organizational culture where curating CMC knowledge is second nature. Together, they’re the transformation behind the transformation: turning raw data into structured information, and information into a foundation of contextual knowledge that supports consistently accurate and traceable inference.
Every layer is essential to the success of GenAI—meaning applications that can be trusted to return regulatory-ready outputs based on a detailed understanding and correct interpretation of CMC knowledge. But don’t take my word for it: listen to what they’re saying in Silver Spring.
The FDA’s stance: Black box AI need not apply
You don’t need a bat’s ears to hear what US regulators have to say about AI — not when they’re blasting “all in on AI” loud enough to bring John Entwistle back from the dead. Listen no further than their timeline for “Aggressive Agency-Wide AI Rollout”, an announcement that effectively put laggards and skeptics on blast across the drug development industry.
The message is loud and clear: The FDA is firmly in the AI driver’s seat, with its foot on the floor, and the call to adopt this technology is not a suggestion.
But before you rush to deploy the latest Anthropic model, don’t forget the other thing the FDA is shouting into the mic: When it’s applied to CMC, GenAI will need to clear a much, much higher bar for transparency, control, qualification, and continuous validation. Sound familiar?
It’s all there in the FDA’s recent initial guidance on AI use. When AI outputs are submitted for review, they’ll be held to an already familiar standard: auditable, explainable, and guided by demonstrable understanding of how the outputs came to be, how they can be replicated, and how they can be controlled.
In other words, a paid ChatGPT plan won’t cut it. Whether it’s the FDA, the EMA, or any other authority, don’t expect approval for models that make unexplainable or unverifiable claims.
So what will they be looking for? Here’s what we can already see coming in the FDA’s initial guidance.
Rules for an evolving road: What to expect when the FDA reviews your CMC AI model
While the FDA’s newest guidance is high-level and directional, and will undoubtedly be fleshed out over time, it establishes clear parallels between how the Agency evaluates processes today and how it aims to evaluate models. Silver Spring has made it plain that it plans to hold AI models to the same standards as manufacturing processes:
- Deep visibility: Regulators will likely want to see as deeply into AI model development as they do into process development, including how those models are documented, controlled, and continually validated. As with your PFD, so with your model training, testing, and maintenance processes.
- Detailed explainability: The FDA is taking a page out of middle school algebra and asking drug developers to show their work when they leverage GenAI outputs. Applicants are already expected to demonstrate what raw materials they use, how they define on-spec results, and how they control their processes. The same will be true of AI’s raw material: the data it’s trained and tested on.
- Continuous validation: Just like manufacturing processes, the FDA clearly expects to see how applicants plan to maintain the level of quality their AI models demonstrate at submission. Those models will need their own form of CPV: a purposeful, risk-based approach to maintaining accuracy, preventing drift, and staying aligned with current best practices. (A minimal sketch of one such drift check follows this list.)
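As one hedged illustration of what “CPV for models” could include: a periodic statistical check that the inputs a deployed model sees still match the distribution it was qualified on, with any significant shift routed to human review. The data and threshold below are synthetic and purely illustrative:

```python
import numpy as np
from scipy.stats import ks_2samp

def input_drift_detected(baseline: np.ndarray,
                         recent: np.ndarray,
                         alpha: float = 0.01) -> bool:
    """Two-sample Kolmogorov-Smirnov test: flags when recent production
    inputs no longer match the qualification baseline. A real program
    would set alpha through a documented, risk-based rationale."""
    _statistic, p_value = ks_2samp(baseline, recent)
    return p_value < alpha  # True -> distributions differ: investigate

# Synthetic example: live inputs have quietly shifted upward.
rng = np.random.default_rng(seed=7)
baseline = rng.normal(loc=5.0, scale=0.5, size=1000)  # qualification data
recent = rng.normal(loc=5.4, scale=0.5, size=200)     # recent live inputs

if input_drift_detected(baseline, recent):
    print("Input drift detected: route the model for revalidation review.")
```

The statistics are the easy part; the regulatory substance is the documented procedure around them: who reviews the flag, what happens to outputs generated in the meantime, and how revalidation is recorded.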
TL;DR: If you can’t prove how your model is designed, trained, tested, and maintained, don’t expect approval. Full stop. But to show the work the FDA expects to see, that work needs to be organized, historicized, and contextualized. Raw data alone won’t do, no matter how robust, and neither will the outputs of even the most advanced LLM chewing through unstructured data lakes.
CMC programs need to know exactly what “correct” looks like—and how to arrive at that conclusion—before they can trust any model to find that answer for itself.
DGMW: Modern AI models hold great potential, for CMC programs and far beyond. But to be the magic bullets they’re supposed to be, they need purposeful aim, clear sights, and a well-defined target. We’ll always have unstructured data, the “messy paper trail of being human,” and the need for efficient ways to process and extract value from it. But to do that in a technical development workflow, we’ll need special kinds of models, trained, tested, and validated in ways as special as the molecules they support.
And to create those models, we’ll need to know—really, demonstrably, controllably, traceably know—how to guide the engines we hope will power tomorrow’s drug development.
The path forward starts with a foundation of knowledge
AI is here, and yes—you can be ready to capitalize on its applications for CMC. But without a robust, structured knowledge foundation, even the most sophisticated AI models can stumble. And with new and evolving regulatory guidance, CMC leaders can’t afford to miss out.
Building and maintaining the foundation needed to power AI models isn’t a one-time exercise: it’s a continued commitment to quality, accuracy, and programmatic change. The organizations that invest in that structured foundation today will be the ones with the competitive edge, leading tomorrow’s AI-powered CMC innovation.
And if you’re ready to put your program’s foundation in place, we’d be happy to help you get started.
GET IN TOUCH
Ready for an expert partner on your AI journey?
Tag us in any time. We’re happy to help you explore responsible and effective ways to harness AI’s power and potential for CMC.