The production harness for AI agents.
Cost control, security, observability, memory, evaluation, and governance — Archon wraps around your LLM calls, not the other way around.
from archon import Agent, tool, Budget

@tool
def search(query: str) -> str:
    """Search the web."""
    return web_search(query)

agent = Agent(
    name="researcher",
    tools=[search],
    model="auto",
    budget=Budget(max_per_run=0.50),
)

result = await agent.run("Why did SVB fail?")
print(f"${result.cost:.4f} / {result.step_count} steps")

The 80% nobody wants to build twice.
Every agent framework solves the same 20% — call the LLM, run a tool, return an answer. Archon ships the rest.
Cost Control
Hard per-run, per-day, per-month budgets. Auto-route to the cheapest model that can handle the task. No surprise bills.
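In code, the hero example's max_per_run extends naturally to longer windows. A minimal sketch; max_per_day and max_per_month are illustrative parameter names, not confirmed API:

from archon import Budget

budget = Budget(
    max_per_run=0.50,      # hard cap per agent run, in USD (from the hero example)
    max_per_day=25.00,     # illustrative name for the daily cap
    max_per_month=400.00,  # illustrative name for the monthly cap
)
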
Security
Default-deny policy engine. Subprocess sandbox for tool execution. Seven-category output sanitizer blocks prompt injection.
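What default-deny means in practice: anything not explicitly listed is blocked before it runs. A plain-dict sketch of the shape such a policy could take; the field names are illustrative, not Archon's schema:

# Illustrative only: the shape of a default-deny policy. Unlisted tools
# are rejected outright; listed tools carry per-argument bounds.
policy = {
    "default": "deny",
    "allow": {
        "search": {"query": {"max_len": 512}},  # tool name -> argument bounds
    },
}
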
Observability
Built-in trace store and dashboard. Every run returns cost, step count, and a trace URL. No 'add observability later' step.
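Continuing the hero example: cost and step_count appear there verbatim; trace_url is an assumed attribute name for the trace link each run returns.

result = await agent.run("Why did SVB fail?")

print(result.cost)        # dollars spent on this run (from the hero example)
print(result.step_count)  # LLM and tool steps taken (from the hero example)
print(result.trace_url)   # assumed attribute name for the dashboard trace link
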
Memory
Four tiers — working, episodic, semantic, procedural. Temporal decay with configurable half-life. Auto-consolidation.
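Temporal decay with a half-life usually means a memory's retrieval weight halves every fixed interval. A framework-agnostic sketch of that scoring, not Archon's implementation:

import math, time

def decayed_score(base_score: float, created_at: float,
                  half_life_hours: float = 72.0) -> float:
    """Weight a memory by recency: the score halves every half_life_hours."""
    age_hours = (time.time() - created_at) / 3600
    return base_score * math.exp(-math.log(2) * age_hours / half_life_hours)
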
Evaluation
Inline schema checks, async quality scoring, regression detection. Shadow deployments to validate before promoting.
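A shadow deployment runs the candidate agent on live traffic but only scores its answers; users keep getting the promoted agent. A framework-agnostic sketch, with the score function and comparisons list as stand-ins:

async def shadow_run(prod_agent, candidate_agent, prompt, score, comparisons):
    prod = await prod_agent.run(prompt)
    cand = await candidate_agent.run(prompt)
    comparisons.append((score(prod), score(cand)))  # scored offline, never served
    return prod                                     # users only ever see prod's answer
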
Governance
Event sourcing for every action. RBAC for tool permissions. GDPR-compliant data export and right-to-erasure.
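Event sourcing keeps every action as an immutable, append-only record keyed by the user it concerns, which is what makes per-user export and erasure tractable later. A framework-agnostic sketch:

import json, time

def append_event(log_path: str, user_id: str, action: str, payload: dict) -> None:
    """Append one immutable event to a JSON-lines audit log."""
    event = {"ts": time.time(), "user_id": user_id, "action": action, "payload": payload}
    with open(log_path, "a") as f:
        f.write(json.dumps(event) + "\n")
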
Every LLM call passes five gates.
A request comes in. It leaves with an answer, a cost, and a trace. What happens in between is deterministic, observable, and safe.
Policy Check
Is this tool allowed? Are the args within bounds? The default-deny engine evaluates every invocation before it runs.
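A minimal sketch of a default-deny gate, not Archon's engine: unknown tools and out-of-bounds arguments are rejected before anything executes.

def check_invocation(policy: dict, tool_name: str, args: dict) -> bool:
    rule = policy.get("allow", {}).get(tool_name)
    if rule is None:
        return False                                 # unknown tool: deny
    for arg_name, value in args.items():
        bounds = rule.get(arg_name)
        if bounds is None:
            return False                             # unknown argument: deny
        if "max_len" in bounds and isinstance(value, str) and len(value) > bounds["max_len"]:
            return False                             # out of bounds: deny
    return True                                      # explicitly allowed and in bounds
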
Model Routing
Classify complexity from 27 lexical signals. Pick the cheapest model for the tier. Downgrade when budget tightens.
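An illustrative sketch of the idea only: a couple of stand-in signals in place of the real 27, a made-up price tier table, and a downgrade rule when the remaining budget gets thin.

# Stand-in signals and model names; the real router uses 27 lexical signals.
TIERS = {"simple": "small-model", "standard": "mid-model", "hard": "frontier-model"}

def route(prompt: str, budget_left: float) -> str:
    signals = prompt.count("?")                                    # multi-question prompts
    signals += sum(w in prompt.lower() for w in ("prove", "compare", "analyze"))
    tier = "hard" if signals >= 3 else "standard" if signals >= 1 else "simple"
    if budget_left < 0.10 and tier == "hard":                      # budget tightening
        tier = "standard"                                          # downgrade one tier
    return TIERS[tier]
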
Execute
Call the LLM via LiteLLM (140+ models). Tool calls run in a sandboxed subprocess with a hard timeout.
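A framework-agnostic sketch of the sandbox idea: run the tool in a separate process with a hard wall-clock timeout, so a hung or hostile tool can't stall the agent.

import subprocess, sys

def run_tool_sandboxed(code: str, timeout_s: float = 10.0) -> str:
    """Run tool code in a child Python process; raises TimeoutExpired on overrun."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout_s,
    )
    return proc.stdout
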
Validate Output
Schema check on structured output. Strip injection patterns across seven categories. Detect loops.
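An illustrative sketch with two of the seven categories standing in for the full set, plus a trivial loop check on repeated identical steps:

import re

INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),  # instruction override
    re.compile(r"\bBEGIN SYSTEM PROMPT\b", re.I),              # role/prompt spoofing
]

def sanitize(text: str) -> str:
    for pattern in INJECTION_PATTERNS:
        text = pattern.sub("[removed]", text)
    return text

def is_looping(recent_steps: list[str], window: int = 3) -> bool:
    """Flag a loop when the last `window` steps are identical."""
    return len(recent_steps) >= window and len(set(recent_steps[-window:])) == 1
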
Log Trace
Record model, tokens, cost, latency, tool calls, routing decision. Append to immutable audit log.
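A sketch of the fields each trace carries, written to an append-only JSON-lines file; the class name and on-disk format are illustrative, not Archon's schema.

from dataclasses import dataclass, asdict
import json

@dataclass
class TraceRecord:
    model: str
    tokens_in: int
    tokens_out: int
    cost_usd: float
    latency_ms: float
    tool_calls: list
    routing_decision: str

def log_trace(path: str, record: TraceRecord) -> None:
    with open(path, "a") as f:                 # append-only audit log
        f.write(json.dumps(asdict(record)) + "\n")
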
One registry. Every model.
Pricing for every major model, used live by the router for cost-aware selection. Pin one, filter by budget, or let it pick.
A typical 60/25/15 routing split saves 60–70% versus sending everything to a frontier model.
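Back-of-envelope math with made-up per-million-token prices, purely to show where the 60–70% figure comes from:

# Illustrative prices, not real pricing: 60% of traffic to a cheap model,
# 25% to a mid-tier model, 15% to a frontier model.
cheap, mid, frontier = 1.50, 4.00, 10.00        # USD per 1M tokens, made up
blended = 0.60 * cheap + 0.25 * mid + 0.15 * frontier
savings = 1 - blended / frontier
print(f"blended ${blended:.2f}/M vs ${frontier:.2f}/M all-frontier -> {savings:.0%} saved")
# -> blended $3.40/M vs $10.00/M all-frontier -> 66% saved
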
Stop rebuilding the 80%.
Install Archon, wrap your LLM calls, ship with budgets, observability, and governance from day one.