Red teamingLLM appsCI-friendly

Ship AI that can take a punch.

SENTINEL is a modular AI security + quality testing suite for LLM apps — built to measure prompt-injection resistance, detect hallucinations, probe data leakage, and produce reports your team can actually act on.

Modules
6
Attack patterns
85+
Outputs
PDF
Targets
API + local
Run locally. Works with hosted models and local LLMs. Bring your own adapters.
Test Run #1 — Live Console
LIVE
14:32:01Initializing test run #1…
14:32:02Target: GPT-4o (OpenAI API)
14:32:03Loading attack library (86 patterns)
14:32:04▸ Starting module: Prompt Injection
14:32:08 ├─ Indirect Injection: 41/50 ✓
14:32:22 │ ✖ CRITICAL: Persona override at depth 7
14:32:25Module complete → Score: 87/100
14:32:26▸ Starting module: Hallucination Detection
14:32:40 │ Processing… ████████░░ 80%
App preview

See it in action.

A full desktop dashboard for configuring runs, watching results stream live, and generating reports.

SENTINEL Security Dashboard
Coverage

Built for real-world failure modes.

SENTINEL focuses on the stuff that breaks production systems: context injection, tool abuse, leakage, regression drift, and “soft deflections” that hide partial compliance.

Module
Prompt Injection
87

Direct + indirect injection, multi-turn escalation, encoding tricks, persona drift.

Module
Hallucinations
72

Known-answer QA, citation checks, self-consistency and calibration.

Module
Data Leakage
94

PII recall probes, data extraction attempts, credential leakage, cross-session bleed.

Module
Adversarial
78

Jailbreak fuzzing, semantic adversarials, tool-use abuse, chained exploits.

Module
Poisoning
91

Backdoor trigger probes, bias injection scanning, anomaly detection.

Module
Compliance
96

Safety policy adherence, regulatory mapping, custom policy validation.

Workflow

Run → Score → Report

  1. 1) Add a target — paste your API key and select a model in Settings.
  2. 2) Create a test run — pick modules and configure depth.
  3. 3) Watch results stream live via WebSocket in the dashboard.
  4. 4) Generate a PDF report with actionable scoring from the results page.
Why SENTINEL

A suite, not a script.

Most tools stop at “does it jailbreak?” SENTINEL is built to capture the nuanced stuff: partial compliance, soft refusals, tool-use abuse, and regressions that only show up after multiple turns.

Attack library + generators
Curated patterns + fuzzing for novel variants.
Opinionated scoring
Severity, exploitability, and reproducibility signals.
CI-ready
Run locally, in pipelines, or on schedules.
Start now

Red team your AI app before the internet does.

Grab the repo, run the starter suite, and evolve your test library over time.