Why We Need Frameworks
A Greek philosopher, a Terminator, and TheDude walk into a trust-building workshop. The philosopher wants to contemplate the nature of reliability. The Terminator wants to optimize trust metrics. TheDude just wants to know: "Man, can I count on you when it matters?"
Most medical AI critique falls into two camps: pure hype ("this will revolutionize healthcare!") or pure fear ("AI will kill us all!"). Neither is useful.
We need frameworks that ground our analysis in reality—biological reality, clinical reality, and liability reality. Not theoretical concerns. Not abstract ethics. Real-world conditions that determine whether AI helps or harms.
These six frameworks guide every case analysis on this site:
The Velociraptor Test
What It Means
Evolution is the ultimate debugger. Natural selection refined threat detection, pattern recognition, and contextual judgment over 3.8 billion years. Every successful adaptation was tested against survival pressure. Every failed approach was eliminated from the gene pool.
AI trained on text and images for 18 months missed some edge cases.
Why It Matters
When a patient walks into a clinic pale, sweaty, and clutching their chest, a human physician doesn't need an algorithm to know something's wrong. The velociraptor brain—that ancient threat detection system debugged over millions of years of "get it wrong and you die"—already knows.
AI sees: text describing symptoms, maybe an image, statistical correlations in training data.
Humans sense: actual environmental threats that required immediate, correct response or you didn't survive to reproduce.
Application to Medical AI
- Pattern recognition without survival pressure is just sophisticated guessing
- Training data doesn't include "die if you're wrong" feedback
- Context matters, and context comes from environmental sensing
- Confidence without consequence creates dangerous overreach
Example: MedGemma MRI Case
The AI confidently diagnosed from a single MRI slice because it faced no consequences for being wrong. A human radiologist knows: recommend brain surgery based on inadequate imaging and someone's skull gets opened. That's selection pressure. That's why humans say "I need more views" when they need more views.
The 10 Billion Sensors Principle
What It Means
Intelligence isn't just processing power. It's environmental awareness. And environmental awareness requires sensing.
Humans have:
- ~126 million photoreceptors (vision)
- ~16,000 hair cells (hearing)
- ~10 million olfactory receptors (smell)
- ~2-4 million mechanoreceptors (touch, proprioception)
- ~10,000 taste receptors
- Billions of nociceptors (pain detection)
All constantly sampling the environment, integrating information, detecting threats, sensing context.
AI has: whatever pixels or text you give it.
Why It Matters
Clinical medicine depends on sensing. A surgeon can feel tissue tension. A cardiologist can hear subtle murmurs. An emergency physician can smell ketoacidosis before lab results confirm it. A pediatrician can see when a mother's concern goes beyond typical parental worry.
✅ What Humans Detect
- 👀 Diaphoresis (sweating)
- 👃 Ketoacidosis odor
- 👂 Voice tremor
- 🤚 Skin temperature
- 🧠 Patient fear/confusion
- ⏰ How fast things change
❌ What AI Detects
- 📊 Text patterns
- 📊 Image pixels
- 📊 Statistical correlations
No smell. No touch. No hearing. No environmental context. No temporal awareness.
Application to Medical AI
- The sensing gap is unbridgeable with current technology
- AI cannot detect what it cannot sense
- Clinical decision-making requires multi-modal sensory integration
- Pattern recognition without sensing is incomplete data processing
The Malpractice Insurance Reality Check
What It Means
I pay malpractice insurance. I've been paying it for 20 years. I pay it because when I make a mistake, someone gets hurt, and I'm accountable for that harm.
Google doesn't pay malpractice insurance. OpenAI doesn't pay malpractice insurance. They release systems with disclaimers: "Not for clinical use. No warranty. Use at your own risk."
But here's the thing: when those systems are used clinically (and they will be), who faces consequences?
Why It Matters
Accountability structures shape behavior. When you face meaningful consequences for failures, you build systems differently. You test more carefully. You validate more thoroughly. You acknowledge limitations honestly.
When you face zero consequences, you optimize for different metrics. Speed. Capability. Impressive demos. Market share.
The Accountability Asymmetry
❌ AI Companies
Liability: Zero (disclaimer protected)
Malpractice Insurance: $0
Consequences: None when system fails
Incentives: Capability, speed, market adoption
✅ Physicians
Liability: Complete
Malpractice Insurance: $50K-200K+ annually
Consequences: Lawsuits, license loss, career ending
Incentives: Patient safety, accuracy, validation
Application to Medical AI
- Until developers face consequences, they're not motivated to prevent failures
- Physicians bear 100% of liability for trusting AI systems
- This asymmetry creates perverse incentives for rapid deployment
- Real accountability requires meaningful consequences for failures
Intelligent Humility
What It Means
Intelligent Humility is a design principle: build systems that know what they don't know.
Not as a guardrail added later. Not as a disclaimer in the terms of service. As an architectural feature from the ground up.
When an AI system encounters a query outside its validated knowledge domain, it should say: "I don't have reliable information on this" rather than generating confident-sounding nonsense.
Why It Matters
Most AI failures in medicine come from confident confabulation—retrieving tangentially related content and presenting it as if it answers the question.
The most dangerous medical statement isn't "I don't know." It's "I'm confident" when you shouldn't be.
What Intelligent Humility Looks Like:
Query: "Should I increase the patient's dosage?"
System Without Humility: Retrieves information about the medication, generates confident-sounding dosing recommendations based on pattern-matching, presents with citations.
System With Humility: "I don't have access to this patient's complete medical record, current medications, lab values, or contraindications. Dosing decisions require comprehensive clinical context I don't possess. This requires physician judgment."
Application to Medical AI
- Build systems that can't generate responses outside validated domains
- Constrain knowledge sources to curated, validated content
- Eliminate hallucination through architecture, not filtering
- Make "I don't know" a first-class output, not a failure state
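The last point can be made concrete in code. Here is a minimal sketch of "I don't know" as a first-class, typed output rather than an error path. Everything in it is illustrative: the `VALIDATED_DOMAINS` set, the `Answer` type, and the `lookup_validated_content` placeholder are hypothetical names standing in for a real domain classifier and retrieval layer, not any production system described on this site.

```python
# Sketch: refusal as a normal result type, not a failure state.
# All names here (VALIDATED_DOMAINS, Answer, lookup_validated_content)
# are illustrative stand-ins, not a real system's API.

from dataclasses import dataclass

# Domains for which curated, validated content exists (hypothetical).
VALIDATED_DOMAINS = {"cardiology", "dermatology"}


@dataclass
class Answer:
    text: str
    grounded: bool  # True only when the text came from validated content


def answer_query(query: str, domain: str) -> Answer:
    if domain not in VALIDATED_DOMAINS:
        # Outside the validated domain: return an explicit, typed refusal.
        # Callers must handle this case; it is not an exception or an error.
        return Answer(
            text="I don't have reliable information on this.",
            grounded=False,
        )
    return Answer(text=lookup_validated_content(query, domain), grounded=True)


def lookup_validated_content(query: str, domain: str) -> str:
    # Placeholder for retrieval restricted to a curated corpus.
    return f"[validated {domain} content matching: {query}]"
```

Because the refusal is part of the return type, downstream code cannot quietly ignore it the way a disclaimer buried in terms of service can be ignored.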
Content-Controlled Intelligence
What It Means
Most medical AI systems are trained on everything: peer-reviewed journals, Reddit threads, blog posts, that one article about essential oils curing cancer, and approximately 47 million pages of SEO-optimized garbage.
When they retrieve information, they're pulling from all of that, with no way to distinguish reliable from unreliable sources.
Content-Controlled Intelligence flips this: constrain the AI's knowledge to validated, curated sources. If it's not in the verified corpus, it doesn't exist for the AI.
Why It Matters
Hallucinations happen when AI systems try to fill knowledge gaps by generating plausible-sounding content. The solution isn't better hallucination detection—it's preventing hallucination through architectural constraint.
Case Study: EdAI Systems
We generate medical board certification questions using Claude Sonnet constrained to StatPearls© content only. Zero hallucinations over 18 months. How? The AI literally cannot access information outside the curated medical corpus.
Result: 100% customer retention across four medical specialty boards, producing 20% of actual certification exams for two boards.
The Architecture
- Curate validated knowledge sources (peer-reviewed, board-approved)
- Constrain AI access to only these sources
- When query is outside validated domain → "I don't know"
- No retrieval from general internet
- No pattern-matching from unreliable training data
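The architecture above can be sketched in a few lines. This is a toy illustration of the constraint principle, not the actual EdAI implementation: the corpus contents, the word-overlap scoring, and the `min_overlap` threshold are all assumptions made for the example. The point is structural: the only possible outputs are curated passages or an explicit refusal.

```python
# Sketch of content-controlled retrieval: answers can only come from a
# curated corpus; anything outside it yields "I don't know."
# Corpus contents and the overlap scoring are illustrative assumptions.

CURATED_CORPUS = {
    "doc-001": "Beta-blockers reduce myocardial oxygen demand.",
    "doc-002": "Metformin is first-line therapy for type 2 diabetes.",
}


def retrieve(query: str, min_overlap: int = 2) -> str:
    """Return the best-matching curated passage, or an explicit refusal."""
    q_terms = set(query.lower().split())
    best_id, best_score = None, 0
    for doc_id, text in CURATED_CORPUS.items():
        score = len(q_terms & set(text.lower().split()))
        if score > best_score:
            best_id, best_score = doc_id, score
    if best_id is None or best_score < min_overlap:
        # Query falls outside the validated corpus: refuse rather than
        # generate plausible-sounding text to fill the gap.
        return "I don't know."
    return f"[{best_id}] {CURATED_CORPUS[best_id]}"
```

Nothing in this design detects hallucinations after the fact; the retrieval path simply has no route to unvalidated content, which is the "prevention through constraint" the list describes.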
Application to Medical AI
- Garbage in, garbage out—so control what goes in
- Prevention through constraint beats detection through filtering
- Validated sources eliminate need for hallucination detection
- Specialization over generalization for high-stakes domains
Evolution as Debugger
What It Means
Evolution is the most rigorous testing framework ever devised. Every organism alive today represents a lineage that survived countless challenges: predators, disease, environmental change, resource competition, mate selection.
Every adaptation was field-tested under survival pressure. Failed approaches were eliminated. Successful strategies were refined over millions of generations.
The result: biological systems with exquisite sensing, contextual judgment, uncertainty tolerance, and threat detection capabilities.
Why It Matters
We're trying to replicate human intelligence using algorithms trained on text. But human intelligence isn't separable from:
- Embodied sensing (10 billion sensors)
- Evolutionary selection pressure (3.8 billion years)
- Environmental context (real-time threats and opportunities)
- Consequence awareness (survival depends on accuracy)
AI has none of these. It's pattern-matching without the debugging that makes pattern-matching reliable.
What Evolution Debugged
- Threat Detection: False negatives killed you; humans evolved to be slightly paranoid
- Uncertainty Tolerance: Overconfidence killed you; humans evolved appropriate caution
- Contextual Sensing: Missing context killed you; humans evolved multi-modal integration
- Rapid Assessment: Slow decisions killed you; humans evolved fast pattern recognition
- Social Cues: Misreading others killed you (or prevented reproduction); humans evolved sophisticated empathy
Application to Medical AI
- Don't assume AI can replicate evolved capabilities without similar selection pressure
- Trust biological intuitions that survived millions of years of testing
- Maternal instinct (threat detection) > algorithmic reassurance
- Physician pattern recognition (refined through consequences) > statistical correlation
- Evolution's false positive bias (caution) > AI's confidence bias
How We Apply These Frameworks
Every case in our archive is analyzed through all six lenses:
- Velociraptor Test: What survival pressure would have prevented this failure?
- 10 Billion Sensors: What did the AI fail to sense that humans would detect?
- Malpractice Insurance: Who faces consequences when this fails?
- Intelligent Humility: Should the system have said "I don't know"?
- Content Control: Would constraining knowledge sources have prevented this?
- Evolution as Debugger: What evolutionary wisdom was ignored?
This isn't anti-AI critique. It's pro-patient-safety analysis grounded in biological and clinical reality.
We're not asking "can AI do this?" We're asking "should AI do this, and if so, how do we build systems that don't kill people?"