Live Interactive Demo

Watch agents go rogue.
Watch Reins stop them.

Spin up fictional AI agents, watch spend accumulate in real time, and see Reins policies catch violations before dollars leave your pocket.

$0.008

Per 1K tokens (GPT-4o)

14ms

Policy eval latency

6

Agent templates

4

Policy controls

Live Sandbox

Pick an agent. Watch spend climb.

Click any preset to spawn it in the sandbox. Adjust policies mid-simulation to see enforcement in real time.

Agent Templates

📄 RAG Bot

Retrieval + generation loop

$0.12/min · 15 calls/min

🔁 Eval Loop

Generates & scores test cases

$2.80/min · 3 calls/min

🔍 Code Reviewer

Sequential PR file reviews

$0.40/min · 8 calls/min

🎧 Customer Support

API search + draft responses

$0.20/min · 20 calls/min

⚙️ Data Pipeline

Batch embeddings + DB writes

$0.06/min · 40 calls/min

🧠 Recursive Planner

Decomposes tasks exponentially

$5.00/min · 1 call/min
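The per-call cost implied by each template follows from dividing its per-minute spend by its call rate. A minimal sketch in Python, using only the rates listed above; the `TEMPLATES` table and both helper names are illustrative and not part of Reins:

```python
# Template rates as listed above: (dollars per minute, calls per minute).
TEMPLATES = {
    "RAG Bot": (0.12, 15),
    "Eval Loop": (2.80, 3),
    "Code Reviewer": (0.40, 8),
    "Customer Support": (0.20, 20),
    "Data Pipeline": (0.06, 40),
    "Recursive Planner": (5.00, 1),
}


def cost_per_call(name):
    """Implied cost of a single call for a template, in dollars."""
    dollars_per_min, calls_per_min = TEMPLATES[name]
    return dollars_per_min / calls_per_min


def fleet_rate_per_hour(names):
    """Combined hourly burn for a set of templates at their listed rates."""
    return sum(TEMPLATES[n][0] for n in names) * 60


for name in TEMPLATES:
    print(f"{name}: ${cost_per_call(name):.4f} per call")
```

Read this way, the cheap, chatty templates (Data Pipeline, RAG Bot) cost fractions of a cent per call, while the Recursive Planner spends its full $5.00 on a single call.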

Fleet Activity

No agents in sandbox

Pick templates from the left to spawn

Fleet Spend

Total Spend

$0.00

0 agents active

Policy Controls

Max $/min cap

Halt when fleet exceeds

Vendor whitelist

Allow only listed vendors

Time window block

Business hours only

Total spend cap

Kill all at ceiling
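The four controls above can be pictured as a small set of optional limits checked before each call. The following is a rough sketch under stated assumptions: the `Policies` fields, the `first_violation` function, and the rule names are hypothetical and do not reflect the actual Reins SDK or dashboard schema.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional, Set, Tuple


@dataclass
class Policies:
    """Hypothetical mirror of the four controls; None means 'disabled'."""
    max_per_min: Optional[float] = None                 # Max $/min cap
    allowed_vendors: Optional[Set[str]] = None          # Vendor whitelist
    business_hours: Optional[Tuple[time, time]] = None  # Time window block
    total_cap: Optional[float] = None                   # Total spend cap


def first_violation(p, *, vendor, call_cost, now, spend_last_min, total_spend):
    """Return the name of the first rule the call would break, or None."""
    if p.allowed_vendors is not None and vendor not in p.allowed_vendors:
        return "vendor_whitelist"
    if p.business_hours is not None:
        start, end = p.business_hours
        if not (start <= now < end):
            return "time_window"
    if p.max_per_min is not None and spend_last_min + call_cost > p.max_per_min:
        return "max_per_min"
    if p.total_cap is not None and total_spend + call_cost > p.total_cap:
        return "total_cap"
    return None


policies = Policies(
    max_per_min=2.0,
    allowed_vendors={"openai"},
    business_hours=(time(9), time(17)),
    total_cap=100.0,
)
print(first_violation(policies, vendor="openai", call_cost=0.14,
                      now=time(10), spend_last_min=1.95, total_spend=10.0))
```

The check adds the cost of the pending call before comparing against each limit, which is what lets a cap reject the call that would cross the line rather than the one after it.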

Real Incident Replays

Scenarios that burned real money.

Click any card to expand, then replay the exact sequence in the sandbox.

🔥 $4,200 lost

RAG Infinite Loop

Corrupted vector DB causes re-retrieval of same 200 docs on every query. System retries 50k times before operator notices.

Trigger: Vector DB corruption

Calls: ~50,000

Cost/min: $12.50/min

Detection: Ops noticed 3h later

Reins catch: Vendor whitelist + call frequency

⚠️ $890 spent

Eval Grid Blowup

Eval loop generates cross-product of 20 test suites × 50 scenarios. Each eval call costs $0.14. Cost cap policy fires at $2/min threshold.

Trigger: Combinatorial test generation

Calls: ~6,200

Cost/min: $3.20/min

Detection: Billing alert at $500

Reins catch: Max $/min cap + throttle
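The card's figures can be cross-checked with a few lines of arithmetic. This sketch uses only the numbers stated above; the variable names are illustrative. At $0.14 per call, roughly 6,200 calls come to about $868, close to the $890 shown, and at $3.20/min a $500 billing alert would take roughly 156 minutes to trip.

```python
# Figures taken from the Eval Grid Blowup card above.
SUITES, SCENARIOS = 20, 50
COST_PER_CALL = 0.14       # dollars per eval call
CALLS = 6_200              # approximate call count
BURN_RATE = 3.20           # dollars per minute
BILLING_ALERT_AT = 500.0   # dollars

grid_size = SUITES * SCENARIOS                   # suite/scenario pairs
total_cost = CALLS * COST_PER_CALL               # dollars spent on eval calls
minutes_to_alert = BILLING_ALERT_AT / BURN_RATE  # time before the alert fires

print(f"grid size: {grid_size}")
print(f"total cost: ${total_cost:.2f}")
print(f"minutes until billing alert: {minutes_to_alert:.0f}")
```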

💀 $11,000 lost

Recursive Planner

Planning agent subdivides tasks exponentially. 1 → 3 → 9 → 27 sub-tasks. Each level spawns cost. No termination condition set.

Trigger: Exponential task decomposition

Cost growth: 5× per level

Cost/min: $5.00/min → $25/min → $125/min

Detection: Credit card decline 6h later

Reins catch: Total cap + vendor whitelist
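The 5× per level growth above is a geometric progression, which is why a total spend cap bounds it so quickly. A small sketch, assuming each level runs for one minute and using a hypothetical $100 total cap; neither figure comes from the incident itself:

```python
def rate_at_level(level, base_rate=5.0, growth=5.0):
    """Burn rate in $/min at a given depth: $5, $25, $125, ..."""
    return base_rate * growth ** level


def level_crossing_cap(total_cap, minutes_per_level=1.0):
    """Depth at which cumulative spend would first reach the cap.

    Assumes (for illustration only) that each level runs for a fixed time.
    """
    spent = 0.0
    level = 0
    while True:
        spent += rate_at_level(level) * minutes_per_level
        if spent >= total_cap:
            return level
        level += 1


print([rate_at_level(n) for n in range(3)])
print(level_crossing_cap(100.0))
```

Under these assumptions the cumulative spend is $5, then $30, then $155, so a $100 ceiling would be reached during the third level.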

How Reins Works

Anatomy of a Catch

Every violation follows the same four-step lifecycle — all within milliseconds.

👁️

Detect

Every agent call is intercepted and evaluated against active policies before execution. No call slips through.

⚖️

Decide

The policy engine evaluates rule conditions in under 14 ms. Rules match on vendor, agent, time window, cost thresholds, and frequency limits.

🔇

Act

Violations trigger a throttle (delay the call by N seconds) or a cap (reject the call). A blocked call incurs no cost. Operators are notified via Slack or email in real time.

📋

Audit

The full trace is logged: which rule fired, which call was blocked, and what the agent tried to do. Exportable for compliance.
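The four steps above can be sketched as a wrapper around each outgoing call. This is an illustration only: `Decision`, `guarded_call`, `example_decide`, and the in-memory `audit_log` are hypothetical names, not the Reins SDK, and the Slack/email notification is left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    action: str                 # "allow", "throttle", or "reject"
    rule: Optional[str] = None  # name of the rule that fired, if any
    delay_s: float = 0.0        # throttle delay in seconds


audit_log = []  # in-memory stand-in for an audit trail


def guarded_call(agent: str, vendor: str, est_cost: float,
                 decide: Callable[[str, str, float], Decision],
                 execute: Callable[[], object],
                 sleep: Callable[[float], None] = time.sleep):
    """Run execute() only if the policy decision permits it."""
    # Detect: the call is intercepted here, before anything is sent.
    # Decide: a caller-supplied function maps the call to a Decision.
    decision = decide(agent, vendor, est_cost)
    # Audit: record every non-allow decision and what was attempted.
    if decision.action != "allow":
        audit_log.append({"agent": agent, "vendor": vendor,
                          "est_cost": est_cost, "rule": decision.rule,
                          "action": decision.action})
    # Act: a rejected call never runs, so it incurs no cost.
    if decision.action == "reject":
        return None
    if decision.action == "throttle":
        sleep(decision.delay_s)
    return execute()


def example_decide(agent, vendor, est_cost):
    """Toy rule set: whitelist one vendor, slow down expensive calls."""
    if vendor != "openai":
        return Decision("reject", rule="vendor_whitelist")
    if est_cost > 1.0:
        return Decision("throttle", rule="cost_threshold", delay_s=2.0)
    return Decision("allow")


result = guarded_call("rag-bot", "unlisted-vendor", 0.008,
                      example_decide, lambda: "response")
print(result, audit_log)
```

Passing `sleep` as a parameter keeps the throttle path testable without real delays.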

60-Second Stress Test

Without Reins vs. With Reins

Two identical fleets running in parallel. One protected, one not. Watch the divergence.

Fleet A — No Reins

4 agents running wild. No policies. No circuit breakers.

Spend: $0.00

Total calls: 0

Blocked calls: 0

Final cost: $0.00

Policy violations: 0

Fleet B — With Reins

Same agents. Max $2/min cap + vendor whitelist active.

Spend: $0.00

Total calls: 0

Blocked calls: 0

Final cost: $0.00

Policy violations: 0
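The divergence between the two fleets can be approximated with a short deterministic replay. This is a sketch, not the simulator's actual code. It assumes a fleet of four of the templates listed earlier, evenly spaced calls, and only the Max $2/min cap; the vendor whitelist is omitted because the templates do not name vendors. `FLEET` and `run_fleet` are illustrative names.

```python
# Assumed fleet: four of the templates as (name, $/min, calls/min).
FLEET = [
    ("RAG Bot", 0.12, 15),
    ("Eval Loop", 2.80, 3),
    ("Code Reviewer", 0.40, 8),
    ("Recursive Planner", 5.00, 1),
]


def run_fleet(cap_per_min=None):
    """Replay 60 seconds of evenly spaced calls; return the final counters."""
    events = []
    for idx, (_, dollars_per_min, calls_per_min) in enumerate(FLEET):
        cost = dollars_per_min / calls_per_min
        for i in range(calls_per_min):
            events.append((i * 60 / calls_per_min, idx, cost))
    events.sort()  # by time, then agent order, so every replay is identical

    spend, calls, blocked = 0.0, 0, 0
    for _, _, cost in events:
        calls += 1
        # The run lasts exactly one minute, so spend so far is the $/min window.
        if cap_per_min is not None and spend + cost > cap_per_min:
            blocked += 1  # rejected before execution, so no cost accrues
            continue
        spend += cost
    return {"spend": round(spend, 2), "calls": calls, "blocked": blocked}


fleet_a = run_fleet()                 # Fleet A: no policies
fleet_b = run_fleet(cap_per_min=2.0)  # Fleet B: Max $2/min cap
print("Fleet A:", fleet_a)
print("Fleet B:", fleet_b)
```

Under these assumptions Fleet A finishes the minute at $8.32 with nothing blocked, while Fleet B ends at $1.45 with three calls blocked: the single $5.00 planner call and the second and third eval calls.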


FAQ

Common questions

How accurate is the simulator?

The simulator models realistic cost behaviors for each agent type based on actual API pricing from OpenAI, Anthropic, and Google. Cost per call, call frequency, and failure patterns are derived from production traffic data. The simulation is deterministic — the same replay produces identical results.

Does this run actual AI agents?

No. The sandbox is a mathematical simulation — it models cost accumulation and policy enforcement without making real API calls. It runs entirely in the browser at sub-millisecond latency.

What happens if I don't have any policies enabled?

Agents will accumulate spend freely. A mid-sized fleet of 4+ agents at elevated rates can hit $200–$500/hour without intervention. The simulator lets you observe this in real time — and then demonstrates how Reins stops it.

How does Reins catch violations faster than billing alerts?

Reins evaluates every request against active policies before it executes — not after. Policies like vendor whitelist and call frequency limits fire synchronously, preventing wasted spend before the first dollar leaves. Billing alerts notify after the fact.

Can I export simulation data?