Multi-model consensus looks like verification. Three models agree, so it must be right. But when the models share training data, architecture patterns, and cultural biases, agreement is just synchronized error.
Agreement ≠ truth. Sometimes it's just correlated mistakes.
Why Consensus Fails
Correlated Training Data
Most large language models train on similar web corpora. The same myths, the same biases, the same overrepresented perspectives. When they agree, they often agree on the same errors.
Architectural Similarity
Transformer architectures share fundamental patterns. Attention mechanisms have similar blindspots. Agreement may reflect shared architectural limitations, not convergent truth.
Cultural Homogeneity
Models trained predominantly on English text, Western perspectives, recent decades. They share cultural assumptions. Consensus may just mean "this is what the English-speaking internet believes."
How The Trap Springs
The Confidence Illusion
One model says something confidently. You're skeptical. You check with a second model - same answer. A third - same again. Your skepticism fades. But all three may have learned the same confident error.
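This effect can be made concrete with a toy simulation (an illustration, not a measurement of any real model). Each of three checkers is wrong 20% of the time; with probability rho they all copy one shared draw, standing in for shared training data. When errors are independent, a majority vote genuinely reduces error; when they are correlated, the second and third confirmations add almost nothing.

```python
import random

def majority_error_rate(p_err: float, rho: float, n_models: int = 3,
                        trials: int = 100_000, seed: int = 0) -> float:
    """Estimate how often a majority of models is wrong.

    Each model errs with probability p_err. With probability rho, all
    models copy a single shared draw (fully correlated errors, a toy
    stand-in for shared training data); otherwise they err independently.
    """
    rng = random.Random(seed)
    wrong_majorities = 0
    for _ in range(trials):
        if rng.random() < rho:
            shared = rng.random() < p_err
            errors = [shared] * n_models
        else:
            errors = [rng.random() < p_err for _ in range(n_models)]
        if sum(errors) > n_models // 2:
            wrong_majorities += 1
    return wrong_majorities / trials

# Independent checkers: three opinions beat one (analytically ~0.104).
print(majority_error_rate(p_err=0.2, rho=0.0))
# Highly correlated checkers: consensus error stays near the single-model
# rate (analytically ~0.190) - the extra opinions are nearly the same vote.
print(majority_error_rate(p_err=0.2, rho=0.9))
```

The gap between the two numbers is the whole trap: the pipeline looks identical from the outside, but only the independent version actually verifies anything.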
The Verification Theater
You've built a multi-model pipeline for safety. Model A generates, Model B checks, Model C validates. It feels robust. But if they share the same blindspots, the check is theater.
The Minority Report Problem
What if one model dissents? You're trained to trust the majority. But the dissenting model might be the only one that escaped the shared error. Consensus silences the signal.
Where The Trap Bites
Medical Misinformation
Multiple models confidently repeat the same health myth because it appeared in many training sources. Consensus makes the myth harder to question. If the Wikipedia entry was wrong, every model that learned from it inherited the same error.

Legal Research
Models cite the same non-existent cases because hallucinated citations, once published on the web, feed back into everyone's training data. Cross-model verification just surfaces the same fake citations again.
Financial Analysis
Models share consensus views about markets that reflect common biases. When they agree that something is safe, they may all be wrong in the same direction. The 2008 risk models agreed too.
Beyond Consensus Verification
Diverse Training Origins
If you must use multiple models, choose ones with genuinely different training data: a model trained primarily on academic papers, one trained on code, one trained on non-English web text. Diversity reduces correlation.
Structural Diversity
Different architectures have different blindspots. A retrieval-augmented model vs. a pure transformer vs. a symbolic system. Agreement across structural diversity means more.
Human Orthogonality
Humans make different errors than models. A human checker adds uncorrelated perspective. Don't replace human judgment with multi-model consensus - that's trading one correlation for another.
Cryptographic Verification
zkML and proof systems let you verify computation without trusting judgment. Prove the math is correct rather than asking multiple models if the answer seems right. Verification by proof, not poll.
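The "proof, not poll" idea has a simple everyday analogue (a toy sketch, not zkML): for many claims, you can check a property of the answer directly instead of asking more models whether it seems right. Here a claimed prime factorization is verified by recomputation; the function name and example values are illustrative.

```python
def verify_factorization(n: int, factors: list[int]) -> bool:
    """Check a claimed prime factorization by recomputation.

    Instead of polling several models on whether the factorization
    'looks right', test primality of each factor and multiply them
    back out. The check depends on arithmetic, not on judgment.
    """
    if not factors:
        return False
    product = 1
    for f in factors:
        # Reject any claimed factor that is not prime.
        if f < 2 or any(f % d == 0 for d in range(2, int(f ** 0.5) + 1)):
            return False
        product *= f
    return product == n

print(verify_factorization(84, [2, 2, 3, 7]))  # True
print(verify_factorization(84, [4, 21]))       # False: 4 and 21 are not prime
```

No amount of model consensus on `[4, 21]` would make it correct, and no dissent is needed to accept `[2, 2, 3, 7]`: the verifier is independent of every model by construction.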
Ground Truth Anchors
Connect to external reality. Can you verify against primary sources, physical measurements, or formal proofs? Ground truth doesn't share model biases.
Are You In The Trap?
The Correlation Question
Ask: "Do my verification models share training data, architecture, or cultural context?" If yes, their agreement proves less than you think.
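The correlation question can also be asked empirically. Given a labeled eval set, log which items each model gets wrong and measure agreement between the error indicators, for example with Cohen's kappa. The error logs below are hypothetical; kappa near 0 means the models fail on different inputs (a second opinion helps), kappa near 1 means they fail on the same inputs (the second opinion is the same vote again).

```python
def cohens_kappa(a: list[bool], b: list[bool]) -> float:
    """Cohen's kappa for two models' error indicators (True = wrong).

    Measures how much the models' errors co-occur beyond what chance
    agreement would predict from their individual error rates.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    pa, pb = sum(a) / n, sum(b) / n
    # Chance agreement: both wrong or both right, if independent.
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical per-item error logs from a labeled eval set.
model_a_wrong = [True, True, False, False, True, False, False, True]
model_b_wrong = [True, True, False, False, True, False, True,  True]
print(round(cohens_kappa(model_a_wrong, model_b_wrong), 2))  # 0.75
```

A kappa of 0.75 here says these two checkers mostly fail together; running both buys far less safety than their individual accuracies suggest.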
The Dissent Signal
When one model disagrees, do you investigate or override? If you consistently trust the majority, you're optimizing for consensus, not truth.
The Surprise Check
When was the last time your verification system caught something surprising? If it only ever confirms, it's not verifying - it's ratifying.
The verification trap is seductive because consensus feels like truth. When everyone agrees, doubt seems unreasonable. But correlated systems produce correlated errors. Agreement between things that learned together proves only that they learned together.
Real verification requires independence. Independent data. Independent architecture. Independent perspective. When those align, you might be approaching truth. When similar systems agree, you might just be counting the same vote multiple times.
Don't count votes. Verify independence. Then count votes.
