Watch agents go rogue.
Watch Reins stop them.

Spin up fictional AI agents, watch spend accumulate in real time, and see Reins policies catch violations before dollars leave your pocket.

$0.008
Per token (GPT-4o)
14ms
Policy eval latency
6
Agent templates
4
Policy controls

Pick an agent. Watch spend climb.

Click any preset to spawn it in the sandbox. Adjust policies mid-simulation to see enforcement in real time.

Agent Templates

📄 RAG Bot
Retrieval + generation loop
$0.12/min · 15 calls/min
🔁 Eval Loop
Generates & scores test cases
$2.80/min · 3 calls/min
🔍 Code Reviewer
Sequential PR file reviews
$0.40/min · 8 calls/min
🎧 Customer Support
API search + draft responses
$0.20/min · 20 calls/min
⚙️ Data Pipeline
Batch embeddings + DB writes
$0.06/min · 40 calls/min
🧠 Recursive Planner
Decomposes tasks exponentially
$5.00/min · 1 call/min
Fleet Activity

No agents in sandbox

Pick templates from the left to spawn

Fleet Spend

Total Spend
$0.00
0 agents active

Policy Controls

Max $/min cap
Halt when fleet exceeds
Vendor whitelist
Allow only listed vendors
Time window block
Business hours only
Total spend cap
Kill all at ceiling
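
One way to picture the four controls is as a single policy object that is checked before each call. The sketch below is illustrative only: `Policy`, `FleetState`, `check_call`, and every field name are hypothetical and are not the Reins SDK, and the 09:00 to 17:00 window in the usage note is an assumed reading of "business hours".

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional, Set, Tuple


@dataclass
class Policy:
    # Hypothetical stand-ins for the four controls listed above.
    max_usd_per_min: Optional[float] = None              # Max $/min cap
    allowed_vendors: Optional[Set[str]] = None           # Vendor whitelist
    business_hours: Optional[Tuple[time, time]] = None   # Time window block
    total_cap_usd: Optional[float] = None                # Total spend cap


@dataclass
class FleetState:
    spend_last_min_usd: float = 0.0
    total_spend_usd: float = 0.0


def check_call(policy, state, vendor, cost_usd, now):
    """Return the name of the first violated control, or None if the call may run."""
    if policy.allowed_vendors is not None and vendor not in policy.allowed_vendors:
        return "vendor_whitelist"
    if policy.business_hours is not None:
        start, end = policy.business_hours
        if not (start <= now < end):
            return "time_window"
    if (policy.max_usd_per_min is not None
            and state.spend_last_min_usd + cost_usd > policy.max_usd_per_min):
        return "max_per_min"
    if (policy.total_cap_usd is not None
            and state.total_spend_usd + cost_usd > policy.total_cap_usd):
        return "total_cap"
    return None
```

For example, with `Policy(max_usd_per_min=2.0, business_hours=(time(9), time(17)))`, a call that would push the last minute's spend past $2.00 comes back as `"max_per_min"`, and the caller can then halt the fleet.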

Scenarios that burned real money.

Click any card to expand, then replay the exact sequence in the sandbox.

🔥 $4,200 lost
RAG Infinite Loop
A corrupted vector DB causes the same 200 docs to be re-retrieved on every query. The system retries about 50,000 times before an operator notices.
Trigger: Vector DB corruption
Calls: ~50,000
Cost/min: $12.50/min
Detection: Ops noticed 3h later
Reins catch: Vendor whitelist + call frequency
⚠️ $890 spent
Eval Grid Blowup
The eval loop generates the cross-product of 20 test suites × 50 scenarios. Each eval call costs $0.14. The cost cap policy fires at the $2/min threshold.
Trigger: Combinatorial test generation
Calls: ~6,200
Cost/min: $3.20/min
Detection: Billing alert at $500
Reins catch: Max $/min cap + throttle
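
The arithmetic behind this card can be checked in a few lines. This is a back-of-the-envelope sketch using only the card's numbers, with amounts kept in integer cents to avoid rounding noise. One full pass over the grid is 1,000 calls (about $140), the roughly 6,200 calls come to about $868 (close to the $890 headline), and a $2/min cap leaves room for 14 calls per minute.

```python
from itertools import product

COST_PER_CALL_CENTS = 14   # $0.14 per eval call, from the card
CAP_CENTS_PER_MIN = 200    # $2/min cap threshold, from the card

# Every (suite, scenario) pair becomes one eval call.
grid = list(product(range(20), range(50)))
calls_per_pass = len(grid)
cost_per_pass_usd = calls_per_pass * COST_PER_CALL_CENTS / 100

# Roughly 6,200 calls were made before the billing alert fired.
incident_cost_usd = 6200 * COST_PER_CALL_CENTS / 100

# Whole calls that fit under the cap in any one minute.
calls_allowed_per_min = CAP_CENTS_PER_MIN // COST_PER_CALL_CENTS
```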
💀 $11,000 lost
Recursive Planner
The planning agent subdivides tasks exponentially: 1 → 3 → 9 → 27 sub-tasks. Every level adds cost, and no termination condition is set.
Trigger: Exponential task decomposition
Cost growth: 5× per level
Cost/min: $5.00/min → $25/min → $125/min
Detection: Credit card decline 6h later
Reins catch: Total cap + vendor whitelist
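
The growth on this card is geometric, so a total spend cap trips within a few levels. The sketch below uses the card's figures (3× sub-tasks per level, 5× cost per level, $5.00/min base). The one-minute-per-level pacing and the `level_where_cap_trips` helper are assumptions made for illustration.

```python
def tasks_at(level: int) -> int:
    # 1 -> 3 -> 9 -> 27 sub-tasks
    return 3 ** level


def cost_per_min_at(level: int, base: float = 5.0, growth: float = 5.0) -> float:
    # $5.00/min -> $25/min -> $125/min
    return base * growth ** level


def level_where_cap_trips(total_cap_usd: float, minutes_per_level: float = 1.0):
    """Return (level, cumulative spend) at the point a total cap is first exceeded.

    Assumes each planning level runs for `minutes_per_level` before the next starts.
    """
    spent = 0.0
    level = 0
    while True:
        spent += cost_per_min_at(level) * minutes_per_level
        if spent > total_cap_usd:
            return level, spent
        level += 1
```

Under these assumptions a hypothetical $100 total cap is exceeded during level 2, when cumulative spend reaches $155.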

Anatomy of a Catch

Every violation follows the same four-step lifecycle — all within milliseconds.

👁️
1
Detect
Every agent call is intercepted and evaluated against active policies before execution. No call slips through.
⚖️
2
Decide
The policy engine evaluates rule conditions in under 14ms. Rules match on vendor, agent, time window, cost thresholds, and frequency limits.
🔇
3
Act
Violations trigger a throttle (delay the call by N seconds) or a cap (reject the call). A rejected call incurs no cost. Operators are notified via Slack or email in real time.
📋
4
Audit
A full trace is logged: which rule fired, which call was blocked, and what the agent tried to do. Exportable for compliance.
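
The four steps can be sketched as a wrapper that sits between the agent and the vendor call. Everything here is hypothetical (`Call`, `Rule`, `guarded_execute`, the audit record fields) and is meant only to show the order of operations, not how Reins implements it.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Call:
    agent: str
    vendor: str
    est_cost_usd: float


# A rule returns ("cap", 0.0) or ("throttle", delay_seconds) on a match, else None.
Rule = Callable[[Call], Optional[Tuple[str, float]]]


def guarded_execute(call, rules, execute, audit_log, sleep=time.sleep):
    """Run `execute(call)` only if no rule caps it; record every rule that fires."""
    # 1. Detect: the call arrives here before it reaches the vendor.
    for name, rule in rules:
        # 2. Decide: evaluate the rule against this call.
        decision = rule(call)
        if decision is None:
            continue
        action, delay_s = decision
        # 4. Audit: recorded first so that rejected calls are logged too.
        audit_log.append({"rule": name, "action": action,
                          "agent": call.agent, "vendor": call.vendor})
        # 3. Act: a cap rejects the call, a throttle delays it.
        if action == "cap":
            return None  # execute() never runs, so no cost is incurred
        if action == "throttle":
            sleep(delay_s)
    return execute(call)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.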

Without Reins vs. With Reins

Two identical fleets running in parallel. One protected, one not. Watch the divergence.

Fleet A — No Reins
4 agents running wild. No policies. No circuit breakers.
Spend: $0.00
Total calls: 0
Blocked calls: 0
Final cost: $0.00
Policy violations: 0
Fleet B — With Reins
Same agents. Max $2/min cap + vendor whitelist active.
Spend: $0.00
Total calls: 0
Blocked calls: 0
Final cost: $0.00
Policy violations: 0
60
seconds remaining
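
A rough model of this comparison fits in a short script. It makes several assumptions: the page does not say which four agents run, so four templates are picked from the Agent Templates list; calls are spaced evenly; and because the run lasts exactly one minute, the $2/min cap is treated as a cap on the whole run. The vendor whitelist is left out.

```python
from dataclasses import dataclass

# (name, dollars per minute, calls per minute) from the Agent Templates list.
# The choice of these four agents is an assumption.
FLEET = [
    ("RAG Bot", 0.12, 15),
    ("Eval Loop", 2.80, 3),
    ("Code Reviewer", 0.40, 8),
    ("Customer Support", 0.20, 20),
]


@dataclass
class Result:
    spend: float = 0.0
    calls: int = 0      # attempted calls, including blocked ones
    blocked: int = 0


def simulate(cap_per_min=None, seconds=60):
    # Build a timeline of evenly spaced calls for every agent.
    events = []
    for name, usd_per_min, calls_per_min in FLEET:
        cost = usd_per_min / calls_per_min
        interval = 60.0 / calls_per_min
        k = 0
        while k * interval < seconds:
            events.append((k * interval, name, cost))
            k += 1
    events.sort()

    result = Result()
    for _t, _name, cost in events:
        result.calls += 1
        # Reject the call before it runs if it would exceed the cap.
        if cap_per_min is not None and result.spend + cost > cap_per_min:
            result.blocked += 1
            continue
        result.spend += cost
    return result


fleet_a = simulate()                 # no policies
fleet_b = simulate(cap_per_min=2.0)  # Max $2/min cap
```

In this model Fleet A spends the full $3.52 for the minute, while Fleet B stays under $2.00 because the second and third Eval Loop calls are rejected.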

Common questions

How accurate is the simulator?
The simulator models realistic cost behaviors for each agent type based on actual API pricing from OpenAI, Anthropic, and Google. Cost per call, call frequency, and failure patterns are derived from production traffic data. The simulation is deterministic — the same replay produces identical results.
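The page does not say how the simulator achieves determinism. One common approach is a seeded pseudo-random generator, sketched here with hypothetical names and parameters (`replay`, `base_cost`, `jitter`):

```python
import random


def replay(seed, ticks=60, base_cost=0.008, jitter=0.25):
    """Return cumulative spend per tick; the same seed gives the same trace."""
    rng = random.Random(seed)  # private generator, unaffected by global state
    spend = 0.0
    trace = []
    for _ in range(ticks):
        spend += base_cost * (1 + rng.uniform(-jitter, jitter))
        trace.append(round(spend, 6))
    return trace
```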
Does this run actual AI agents?
No. The sandbox is a mathematical simulation — it models cost accumulation and policy enforcement without making real API calls. It runs entirely in the browser at sub-millisecond latency.
What happens if I don't have any policies enabled?
Agents will accumulate spend freely. A mid-sized fleet of 4+ agents at elevated rates can hit $200–$500/hour without intervention. The simulator lets you observe this in real time — and then demonstrates how Reins stops it.
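The hourly range can be sanity-checked against the per-minute rates in the Agent Templates list. This sketch treats each rate as constant, which understates the Recursive Planner, and `hourly_spend` is a hypothetical helper.

```python
# Dollars per minute, copied from the Agent Templates list.
TEMPLATES_USD_PER_MIN = {
    "RAG Bot": 0.12,
    "Eval Loop": 2.80,
    "Code Reviewer": 0.40,
    "Customer Support": 0.20,
    "Data Pipeline": 0.06,
    "Recursive Planner": 5.00,
}


def hourly_spend(agent_names):
    """Sum the per-minute rates of the named agents and scale to one hour."""
    return sum(TEMPLATES_USD_PER_MIN[name] for name in agent_names) * 60


four_agents = hourly_spend(["RAG Bot", "Eval Loop", "Code Reviewer", "Customer Support"])
all_six = hourly_spend(TEMPLATES_USD_PER_MIN)
```

Four agents (RAG Bot, Eval Loop, Code Reviewer, Customer Support) come to about $211/hour, and all six templates together to about $515/hour.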
How does Reins catch violations faster than billing alerts?
Reins evaluates every request against active policies before it executes — not after. Policies like vendor whitelist and call frequency limits fire synchronously, preventing wasted spend before the first dollar leaves. Billing alerts notify after the fact.
Can I export simulation data?
The simulator is designed for demos and internal training. You can screenshot any phase of the comparison view. Reins' actual audit log exports complete transaction traces as CSV or JSON for compliance reporting.
How do I deploy this in production?
Add the Reins SDK to your agent codebase — a single import and 3 lines of config. Agents route through Reins automatically; policies are managed in the dashboard. No code changes to your agent logic required. Sign up at reins-rh6x.polsia.app/dashboard.
Stop watching. Start governing.
Deploy Reins in minutes. No agent rewrites. No infrastructure changes.