PromptLeak — Prompt-Injection Catalog (OWASP LLM01)

What it is

The shape behind LLM red-teaming + AI safety teams (Lakera, Robust Intelligence, Anthropic / OpenAI internal red-teams). 28 production-relevant prompt injections with the payload, the impact, the 4-layer defense stack.

What’s in it

28 payloads across 8 classes:
- Direct — DAN, refusal-bypass via fiction, authority impersonation, multi-turn escalation, tool-call injection, XPIA stored-injection (calendar event titles)
- Indirect (RAG) — poisoned wiki doc, email-body injection, web-search content injection, PDF hidden-text injection
- System-prompt exfil — “repeat the words above”, paraphrase requests, training-data extraction via repetition (the 2023 “poem poem” attack)
- Jailbreaks — grandma exploit, hypothetical, timeshift
- Encoded — base64, foreign-language, unicode confusables / zero-width
- Multimodal — image-OCR hidden text, adversarial pixel perturbation, audio injection
- Tool / plugin abuse — SQL injection, SSRF via URL-fetcher (AWS metadata exfil), email-tool exfiltration (Gemini/ChatGPT pattern)
- Context overflow — long-input distraction, in-context learning poisoning
4-layer defense per payload:
1. System prompt — anchor role, mark untrusted data
2. Input filter — pattern detection, encoding normalization
3. Output filter — domain classifier, PII/role-swap detector
4. Model behavior — adversarial fine-tuning, RLHF coverage, evals
Frameworks per payload — OWASP LLM01-08, NIST AI RMF MEASURE.2.7, EU AI Act Art 15, MITRE ATLAS AML.T0051.

Why this shape

OWASP LLM Top-10 ranks prompt injection at #1. Every LLM-app team needs a working catalog. The pure-payload sites (Promptbench, AdvBench) leave out the mitigation stack. The pure-mitigation lists leave out the actual payloads. PromptLeak ships both — payload + impact + 4-layer defense + framework citations.

How it ships

Single HTML file, ~24KB. Zero dependencies. 28 payloads × 8 classes × 4-layer mitigation matrix in 230 lines of vanilla JavaScript.

Open the tool →