PromptLeak — Prompt-Injection Catalog (OWASP LLM01)
28 prompt-injection payloads across 8 classes: direct (DAN, fiction-framing, authority impersonation), indirect via RAG (poisoned docs, hidden HTML, web-search injection), system-prompt exfiltration, training-data extraction, jailbreaks, encoded payloads, multimodal (image-OCR, audio-injection), tool/agent abuse (SSRF, XPIA). Each payload includes 4-layer mitigations.
What it is
The shape behind LLM red-teaming + AI safety teams (Lakera, Robust Intelligence, Anthropic / OpenAI internal red-teams). 28 production-relevant prompt injections with the payload, the impact, the 4-layer defense stack.
What’s in it
- 28 payloads across 8 classes:
- Direct — DAN, refusal-bypass via fiction, authority impersonation, multi-turn escalation, tool-call injection, XPIA stored-injection (calendar event titles)
- Indirect (RAG) — poisoned wiki doc, email-body injection, web-search content injection, PDF hidden-text injection
- System-prompt exfil — “repeat the words above”, paraphrase requests, training-data extraction via repetition (the 2023 “poem poem” attack)
- Jailbreaks — grandma exploit, hypothetical, timeshift
- Encoded — base64, foreign-language, unicode confusables / zero-width
- Multimodal — image-OCR hidden text, adversarial pixel perturbation, audio injection
- Tool / plugin abuse — SQL injection, SSRF via URL-fetcher (AWS metadata exfil), email-tool exfiltration (Gemini/ChatGPT pattern)
- Context overflow — long-input distraction, in-context learning poisoning
- 4-layer defense per payload:
- System prompt — anchor role, mark untrusted data
- Input filter — pattern detection, encoding normalization
- Output filter — domain classifier, PII/role-swap detector
- Model behavior — adversarial fine-tuning, RLHF coverage, evals
- Frameworks per payload — OWASP LLM01-08, NIST AI RMF MEASURE.2.7, EU AI Act Art 15, MITRE ATLAS AML.T0051.
Why this shape
OWASP LLM Top-10 ranks prompt injection at #1. Every LLM-app team needs a working catalog. The pure-payload sites (Promptbench, AdvBench) leave out the mitigation stack. The pure-mitigation lists leave out the actual payloads. PromptLeak ships both — payload + impact + 4-layer defense + framework citations.
How it ships
Single HTML file, ~24KB. Zero dependencies. 28 payloads × 8 classes × 4-layer mitigation matrix in 230 lines of vanilla JavaScript.