Redteam AI

How it works

Redteam AI runs your submission through a four-stage adversarial loop. Four different AI models — each from a different US-owned company — evaluate your decision from opposing perspectives. No model runs two consecutive stages. The Prosecutor and Refiner are randomly selected on each run to prevent epistemic lock-in.

The four-stage loop

Stage 0: Clarifier · Grok (xAI) · Fixed

Grok reads your submission and determines whether it is specific enough for adversarial analysis. If a critical field is too vague to produce specific output, it asks one clarifying question before the loop continues. This stage is fixed — Grok runs it every time.

Output — Proceeds to Stage 1, or returns a single clarifying question.
Stage 1: Advocate · Claude (Anthropic) · Fixed

Claude constructs the strongest honest version of your idea as it currently stands. It does not fix weaknesses — it presents the best case. It identifies the real customer, the mechanism of value, and the three critical assumptions the entire idea depends on. This stage is fixed — Claude runs it every time.

Output — The strongest case for your idea, with critical assumptions made explicit.
Stage 2: Prosecutor · GPT-5 or Grok · Rotates

A randomly selected model — GPT-5 or Grok — attacks the idea at its load-bearing points. It reads the Advocate's strongest case and identifies what would actually cause the idea to fail. Every risk must reference specific mechanisms, precedents, or real-world behavior; generic risks are not permitted.

Output — Three risks ordered by severity (Fatal, Serious, Manageable), plus the hidden assumption most likely to be false and the uncomfortable truth.
Stage 3: Refiner · Gemini or GPT-5 · Rotates

A different randomly selected model — Gemini or GPT-5 — synthesizes the Advocate's case and the Prosecutor's risks into a meaningfully improved version. It starts with the Fatal risk and does not avoid it. It resolves what can be resolved, names what cannot, and produces a diff table showing exactly what changed and why.

Output — A revised strategy, risk resolutions for each identified risk, and a before/after diff of what changed.
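The four stages above can be sketched as a small pipeline. This is a hypothetical illustration only: each stage is modeled as a plain function that consumes the previous stage's output, and the function names and signatures are assumptions, not Redteam AI's actual interface.

```python
from typing import Callable, Optional

def run_loop(
    submission: str,
    clarify: Callable[[str], Optional[str]],   # Stage 0: Clarifier
    advocate: Callable[[str], str],            # Stage 1: Advocate
    prosecute: Callable[[str], str],           # Stage 2: Prosecutor
    refine: Callable[[str, str], str],         # Stage 3: Refiner
) -> str:
    """Hypothetical sketch of the four-stage adversarial loop."""
    question = clarify(submission)
    if question is not None:
        # Submission too vague for adversarial analysis:
        # return a single clarifying question instead of proceeding.
        return question
    case = advocate(submission)    # strongest honest case, assumptions explicit
    risks = prosecute(case)        # attacks the case at its load-bearing points
    return refine(case, risks)     # synthesizes a meaningfully improved version
```

Note the early return: a clarifying question short-circuits the run, matching the Clarifier's role of gating entry into Stages 1 through 3.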

Why the Prosecutor and Refiner rotate

Different AI models have different epistemic tendencies, training-data emphases, and failure modes. A model that was always the Prosecutor would develop predictable attack patterns. By randomly selecting the Prosecutor and Refiner on each run, subject to the constraint that no model runs two consecutive stages and no model fills both rotating slots, each analysis reflects a fresh, independent perspective.
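The selection rule reduces to a few lines of code. This is a sketch under stated assumptions: the pool contents come from the stage descriptions above, while the function name, the return shape, and the exact mechanics of enforcement are hypothetical.

```python
import random

# Fixed stages and rotation pools, per the stage descriptions above.
FIXED = {0: "Grok", 1: "Claude"}       # Clarifier (xAI), Advocate (Anthropic)
PROSECUTOR_POOL = ("GPT-5", "Grok")    # Stage 2 candidates
REFINER_POOL = ("Gemini", "GPT-5")     # Stage 3 candidates

def select_chain(rng=random) -> dict:
    """Hypothetical sketch: pick the model chain for one run.

    Excluding the Prosecutor's model from the Refiner pool guarantees
    that no model runs two consecutive stages and no model fills both
    rotating slots.
    """
    prosecutor = rng.choice(PROSECUTOR_POOL)
    refiner = rng.choice([m for m in REFINER_POOL if m != prosecutor])
    return {**FIXED, 2: prosecutor, 3: refiner}
```

Under this rule only three chains are possible: a GPT-5 Prosecutor forces a Gemini Refiner, while a Grok Prosecutor leaves both Refiner candidates open.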

The model chain used for your specific run is logged and shown in your results and your PDF, so you know exactly which models evaluated your decision.

Start red team →