When AI Meets the FDA: Building a Regulatory Layer for AI Development

The hardest software you'll ever ship is the software where getting it wrong has real consequences.


I've spent time in IVD (in vitro diagnostics): the instruments that screen blood supplies and run the PCR tests that became household names during COVID. It's the kind of software where a misclassified sample can mean contaminated blood reaches a patient.

That experience rewired how I think about software development. Not because the code is harder - it's not - but because the consequences are real, the regulators are watching, and "move fast and break things" will get your product pulled from the market and your company sued into oblivion.

So when we built our third multi-agent architecture, we didn't just ask "how do you build software with AI agents?" We asked: "how do you build software that the FDA will clear or approve?"

The answer wasn't to start from scratch. It was to build a regulatory layer on top of what we already had.

Why Regulated Software Is a Different Animal

If you've only built commercial software, here's the gap: in regulated development, the process is the product. The FDA doesn't just care whether your software works. It cares whether you can prove it works, prove you thought about the risks, and prove that every requirement traces from user need to verified test result.

This isn't bureaucracy for its own sake. Medical device failures can lead to misdiagnosis, patient harm, and in the worst cases, death. When something goes wrong, the first question is "can we trace back through the process to find the root cause?" The goal isn't blame. It's making sure it never happens again. That's why documentation and traceability matter so much: without them, you can't find the root cause, and you can't fix it.

The key regulations:

Miss any of these and your 510(k) submission goes in the trash.

The Key Insight: Don't Rebuild, Layer

Our first instinct was to design a completely separate pipeline for regulated software. That's how most companies think about it too: "regulated development is different, so we need different tools."

But that's wrong. The development isn't different. The oversight is different.

A developer writing code for a medical device uses the same languages, the same patterns, the same testing approaches as a developer building a SaaS app. What changes is that every decision needs regulatory traceability, every risk needs documentation, and every test needs to map back to a requirement.

So instead of building a 10-agent monolith, we built AgentMedReg as a regulatory layer that wraps around our existing AgentForge development pipeline.

The Architecture: A Regulatory Layer + AgentForge

AgentMedReg adds four specialized regulatory agents that sit around AgentForge's development team:

             ┌─────────────────────────────┐
             │    Regulatory Strategist    │ ← Runs FIRST and LAST
             │  (classification, pathway,  │   (the "bookend")
             │   submission readiness)     │
             └──────────────┬──────────────┘
                            │
          ┌─────────────────┼─────────────────┐
          │                 │                 │
          ▼                 ▼                 ▼
  ┌────────────────┐ ┌─────────────┐ ┌─────────────────┐
  │  Risk Manager  │ │    Human    │ │ Design Controls │
  │  (ISO 14971)   │ │   Factors   │ │      Lead       │
  │                │ │ (IEC 62366) │ │      (RTM)      │
  └───────┬────────┘ └──────┬──────┘ └────────┬────────┘
          │                 │                 │
          └─────────────────┼─────────────────┘
                            │
                            ▼
             ┌─────────────────────────────┐
             │         AgentForge          │
             │   (existing dev pipeline)   │
             │                             │
             │  Orchestrator → Strategist  │
             │  → Analyst → Architect      │
             │  → Developer → QA           │
             │  → DevOps → Monitor         │
             └─────────────────────────────┘

The regulatory layer sets constraints before development starts and validates compliance after development finishes. AgentForge does what it already does: build software. It just does it inside a regulatory box.

The Four Regulatory Agents

Regulatory Strategist - Runs first and last. Classifies the device, determines the submission pathway (510(k), De Novo, PMA), maps applicable standards, and at the end, does a submission readiness assessment. This agent is the "bookend" - it frames everything and validates the result.

Risk Manager - Implements ISO 14971 end-to-end. Hazard identification, FMEA, risk estimation, risk controls, residual risk evaluation. This agent doesn't just list risks - it traces every risk control to a design requirement and every residual risk to an acceptance rationale.

Human Factors Engineer - IEC 62366-1 usability engineering. Use specifications, task analysis, critical task identification, formative and summative usability evaluations. In medical devices, the user interface IS a safety feature. If a clinician can misread a result because of bad UI, that's a design defect, not user error.

Design Controls Lead - Owns the requirements traceability matrix (RTM). Every user need traces to a design input, every input to an output, every output to a verification test, every test to a validation result. Gaps in the RTM are gaps in your submission.
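To make the traceability chain concrete, here is a minimal sketch of an RTM gap check. It is an illustration under stated assumptions, not AgentMedReg's actual implementation: the `RtmRow` class, the `find_gaps` function, and the IDs (`UN-1`, `DI-1`, and so on) are all hypothetical. Each row follows one user need through the chain, and the check reports the first broken link.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RtmRow:
    """One row of a (hypothetical) traceability matrix; None marks a missing link."""
    user_need: str
    design_input: Optional[str] = None
    design_output: Optional[str] = None
    verification_test: Optional[str] = None
    validation_result: Optional[str] = None


# Order of the chain: need -> input -> output -> verification -> validation.
CHAIN = [
    "user_need",
    "design_input",
    "design_output",
    "verification_test",
    "validation_result",
]


def find_gaps(rows: List[RtmRow]) -> List[Tuple[str, str]]:
    """Return (user_need, broken_link) for the first missing link in each row."""
    gaps = []
    for row in rows:
        for upstream, downstream in zip(CHAIN, CHAIN[1:]):
            if getattr(row, downstream) is None:
                gaps.append((row.user_need, f"{upstream} -> {downstream}"))
                break  # everything after the first gap is untraceable anyway
    return gaps


# Illustrative data only.
rows = [
    RtmRow("UN-1", "DI-1", "DO-1", "VT-1", "VAL-1"),  # fully traced
    RtmRow("UN-2", "DI-2", "DO-2"),                   # never verified
    RtmRow("UN-3"),                                   # never turned into a design input
]

for need, link in find_gaps(rows):
    print(f"{need}: missing {link}")
```

A row with no gaps produces no output, so an empty result is the signal that the matrix is complete.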

How the Layers Talk

The regulatory layer doesn't micromanage AgentForge. It communicates through documents, the same way AgentForge's internal agents do.

Before development starts, the regulatory agents produce:

These documents feed into AgentForge as constraints. The Strategist and Analyst inside AgentForge incorporate them into requirements. The Developer builds to those requirements. QA tests against them.

After development finishes, the regulatory layer runs again:

If anything fails, it loops back. Not to rewrite the regulatory strategy, but to send specific issues back into AgentForge for resolution.
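The loop-back can be sketched roughly as follows. This is a simplified illustration, not the real orchestration code: `run_regulated_build`, the `build` and `validate` callables, the dictionary-shaped documents, and the `max_rounds` limit are all assumptions made for the example. The point it shows is that the constraint documents stay fixed across rounds, and only the list of specific issues is fed back into the development pipeline.

```python
from typing import Callable, Dict, List, Tuple

Document = Dict[str, object]
Issue = str


def run_regulated_build(
    constraints: List[Document],
    build: Callable[[List[Document], List[Issue]], Document],
    validate: Callable[[List[Document], Document], List[Issue]],
    max_rounds: int = 3,
) -> Tuple[Document, int]:
    """Build, validate, and loop specific issues back until the build is clean."""
    issues: List[Issue] = []
    for round_number in range(1, max_rounds + 1):
        artifact = build(constraints, issues)      # dev pipeline (stand-in for AgentForge)
        issues = validate(constraints, artifact)   # regulatory layer runs again
        if not issues:
            return artifact, round_number
    raise RuntimeError(f"unresolved after {max_rounds} rounds: {issues}")


# Toy stand-ins so the sketch runs end to end.
def toy_build(constraints: List[Document], issues: List[Issue]) -> Document:
    # First pass covers only REQ-1; later passes also address whatever was flagged.
    covered = {"REQ-1"} | {issue.split()[0] for issue in issues}
    return {"tested_requirements": covered}


def toy_validate(constraints: List[Document], artifact: Document) -> List[Issue]:
    required = {req for doc in constraints for req in doc["requirements"]}
    missing = sorted(required - artifact["tested_requirements"])
    return [f"{req} has no passing test" for req in missing]


constraints = [{"name": "risk controls", "requirements": ["REQ-1", "REQ-2"]}]
artifact, rounds = run_regulated_build(constraints, toy_build, toy_validate)
print(rounds, sorted(artifact["tested_requirements"]))
```

Capping the number of rounds is a design choice for the sketch: a loop that cannot converge should surface to a human rather than spin.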

The "Bookend" Pattern

The biggest architectural insight from building AgentMedReg is what we call the Bookend Pattern: start and end your pipeline with a constraints agent.

The Regulatory Strategist runs first to set classification, applicable standards, and submission strategy. Then the entire pipeline runs. Then the Regulatory Strategist runs again to verify submission readiness - checking that what was built actually satisfies the regulatory framework that was defined at the start.

This pattern is so powerful that we're retrofitting it to our other architectures:

The insight: every pipeline should start and end with a constraints agent. Define the box, then build in the box, then verify you're still in the box.
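As a rough sketch of the pattern, the wrapper below takes any pipeline and bookends it with one agent that runs first and last. Everything here is hypothetical and chosen for illustration: the `ConstraintsAgent` protocol, the `bookend` function, the `StandardsAgent` class, and the shape of the "box" and result dictionaries. Only the standards named in the example come from the article.

```python
from typing import Any, Callable, Dict, List, Protocol, Tuple


class ConstraintsAgent(Protocol):
    """Anything that can define a box up front and verify against it afterwards."""

    def define(self, brief: str) -> Dict[str, Any]: ...

    def verify(self, box: Dict[str, Any], result: Dict[str, Any]) -> List[str]: ...


Pipeline = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def bookend(
    agent: ConstraintsAgent, pipeline: Pipeline
) -> Callable[[str], Tuple[Dict[str, Any], List[str]]]:
    """Wrap a pipeline so the same agent runs before and after it."""

    def wrapped(brief: str) -> Tuple[Dict[str, Any], List[str]]:
        box = agent.define(brief)                # 1. define the box
        result = pipeline(brief, box)            # 2. build in the box
        violations = agent.verify(box, result)   # 3. verify you're still in the box
        return result, violations

    return wrapped


class StandardsAgent:
    """Toy constraints agent: the box is a list of standards needing evidence."""

    def __init__(self, standards: List[str]) -> None:
        self.standards = list(standards)

    def define(self, brief: str) -> Dict[str, Any]:
        return {"brief": brief, "standards": self.standards}

    def verify(self, box: Dict[str, Any], result: Dict[str, Any]) -> List[str]:
        evidence = result.get("evidence", {})
        return [f"no evidence for {s}" for s in box["standards"] if s not in evidence]


def toy_pipeline(brief: str, box: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a real pipeline; it only produces risk management evidence.
    return {"evidence": {"ISO 14971": "risk management file"}}


run = bookend(StandardsAgent(["ISO 14971", "IEC 62366-1"]), toy_pipeline)
result, violations = run("blood screening analyzer software")
print(violations)
```

Because `bookend` only depends on the two-method protocol, the same wrapper could be applied to a pipeline that has nothing to do with regulation.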

Why Composability Beats Monoliths

The old approach would have been to build a single, massive regulated development pipeline from scratch. That's how most compliance-heavy organizations think: specialized tools for specialized work.

The composable approach is better for three reasons:

  1. You don't duplicate effort. AgentForge already knows how to build software. Why rebuild that capability with regulatory-specific agents that are worse at coding?
  2. Improvements propagate. When we make AgentForge's QA agent smarter, AgentMedReg gets that improvement for free. When we add a Code Reviewer agent to AgentForge, every pipeline that uses it benefits.
  3. Domain layers are reusable. The regulatory layer we built for FDA could be adapted for other regulated domains: fintech (SOX, PCI-DSS), automotive (ISO 26262), aerospace (DO-178C). Different regulations, same pattern: domain experts wrap a development team.
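The third point can be illustrated with a small sketch in which one shared development pipeline is reused under several domain layers. The names `DomainLayer`, `with_layer`, and `shared_dev_pipeline` are hypothetical, and the pipeline body is a placeholder; the standards listed are the ones mentioned above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DomainLayer:
    """A (hypothetical) domain layer: a name plus the standards it brings into scope."""
    name: str
    standards: List[str]


DevPipeline = Callable[[str], Dict[str, str]]


def shared_dev_pipeline(spec: str) -> Dict[str, str]:
    # Placeholder for the one development pipeline every layer reuses.
    return {"spec": spec, "build": "ok"}


def with_layer(layer: DomainLayer, pipeline: DevPipeline) -> DevPipeline:
    """Wrap the shared pipeline with a domain layer without modifying the pipeline."""

    def run(spec: str) -> Dict[str, str]:
        result = dict(pipeline(spec))
        result["domain"] = layer.name
        result["standards_in_scope"] = ", ".join(layer.standards)
        return result

    return run


LAYERS = [
    DomainLayer("medical", ["ISO 14971", "IEC 62366-1"]),
    DomainLayer("automotive", ["ISO 26262"]),
    DomainLayer("aerospace", ["DO-178C"]),
]

# One pipeline, many layers: improving shared_dev_pipeline improves every runner.
runners = {layer.name: with_layer(layer, shared_dev_pipeline) for layer in LAYERS}

print(runners["automotive"]("brake controller firmware"))
```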

This is how real organizations work. You don't hire a completely separate engineering team for every regulated product. You have a dev team, and you add compliance specialists who guide that team's work.

What Building This Taught Us

Creating a regulated layer forced a level of rigor that exposed weaknesses in our other architectures:

For AgentForge:

For AgentMinds:

Universal lesson: Building the most regulated pipeline first would have made all our architectures better from the start. The FDA doesn't add unnecessary steps. Every requirement exists because something went wrong when it was missing.

The Composable Vision

Here's where this is heading. Three architectures, each composable:

They're not three separate products. They're layers that compose. A medical device company might use all three: AgentMinds to analyze the market opportunity, AgentMedReg + AgentForge to build the regulated software. A startup might just use AgentForge. A consulting firm might only need AgentMinds.

That's the real power of multi-agent architecture. Not building bigger pipelines, but building composable teams that snap together based on what the problem requires.

The Bottom Line

Three architectures, three lessons:

  1. AgentMinds taught us that AI agents need process discipline, not just knowledge
  2. AgentForge taught us that pipeline architecture matters - who talks to whom, in what order
  3. AgentMedReg taught us that domain expertise should layer on top of development, not replace it

If you're building AI agents for a regulated domain, don't start over. Build a regulatory layer around a development pipeline that already works. The constraints make the architecture better, and the composability makes it reusable.

And if you're in medical devices wondering whether AI can work in your regulatory environment - yes. But only if the AI understands the regulations as deeply as it understands the code.


This is part 3 of a series on multi-agent AI architecture. Part 1: How We Built an AI Consulting Team covers strategic analysis. Part 2: One Agent Per Layer covers software development. All three architectures are developed by Nananami.