Core Concepts
Understanding AgentCost's data model and how cost tracking works.
Traces
A trace is a single LLM API call. Every time your application calls OpenAI, Anthropic, or any other provider through an AgentCost-wrapped client, a trace is recorded with:
| Field | Description |
|---|---|
| trace_id | Unique identifier |
| project | Logical grouping (e.g., "customer-support") |
| model | Model name (e.g., "gpt-4o") |
| provider | Provider name (e.g., "openai") |
| input_tokens | Tokens sent to the model |
| output_tokens | Tokens received from the model |
| cost | Computed cost in USD |
| latency_ms | Round-trip time in milliseconds |
| status | success or error |
| agent_id | Optional: which agent made the call |
| session_id | Optional: session grouping |
| timestamp | When the call was made |
Projects
A project is a logical namespace for traces. Use projects to separate different applications, environments, or teams:
```python
# Different projects for different use cases
support_client = trace(OpenAI(), project="customer-support")
pipeline_client = trace(OpenAI(), project="data-pipeline")
research_client = trace(OpenAI(), project="research")
```
Agents & Sessions
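Agents are identifiers for specific AI components within a project. Each trace can carry an optional agent_id that records which agent made the call.
Sessions group multiple calls into a conversation or workflow. Traces that share a session_id belong to the same session.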
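To illustrate how agent_id and session_id partition traces, here is a minimal sketch that works on plain dictionaries rather than on AgentCost itself. The record values and the helper total_cost_by are hypothetical; only the field names come from the trace table above.

```python
from collections import defaultdict

# Hypothetical trace records; field names follow the trace table,
# the values are made up for illustration.
traces = [
    {"project": "customer-support", "agent_id": "triage", "session_id": "s1", "cost": 0.002},
    {"project": "customer-support", "agent_id": "triage", "session_id": "s1", "cost": 0.003},
    {"project": "customer-support", "agent_id": "resolver", "session_id": "s1", "cost": 0.010},
    {"project": "customer-support", "agent_id": "triage", "session_id": "s2", "cost": 0.001},
]


def total_cost_by(records, field):
    """Sum the cost of trace records grouped by one field (e.g. agent_id)."""
    totals = defaultdict(float)
    for record in records:
        # Optional fields may be absent; those records are bucketed under None.
        totals[record.get(field)] += record["cost"]
    return dict(totals)


cost_by_agent = total_cost_by(traces, "agent_id")
cost_by_session = total_cost_by(traces, "session_id")
```

The same records answer two different questions: which component is expensive (group by agent_id) and which conversation is expensive (group by session_id).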
Cost Calculation
AgentCost calculates costs using a vendored pricing database of 2,610+ models from 40+ providers. The data is sourced from LiteLLM's community-maintained dataset and synced weekly.
The vendored data lives in agentcost/cost/model_prices.json and is the single source of truth. Custom pricing overrides can be added via overrides.json or at runtime with register_model(). Cache-aware pricing is supported for Anthropic prompt caching and OpenAI cached tokens.
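The arithmetic behind a trace's cost field can be sketched as follows. This is not AgentCost's implementation: it assumes the vendored file keeps LiteLLM's per-token key names (input_cost_per_token, output_cost_per_token, cache_read_input_token_cost), that cached tokens are counted inside input_tokens, and it uses a made-up model entry with illustrative prices. The function compute_cost is hypothetical.

```python
# Hypothetical pricing entry in LiteLLM's per-token format.
# The prices are illustrative and are not read from model_prices.json.
PRICES = {
    "example-model": {
        "input_cost_per_token": 2.5e-06,
        "output_cost_per_token": 1.0e-05,
        "cache_read_input_token_cost": 1.25e-06,
    }
}


def compute_cost(model, input_tokens, output_tokens, cached_tokens=0, prices=PRICES):
    """Return the USD cost of one call; cached input tokens use the cache-read rate."""
    entry = prices[model]
    # Fall back to the normal input rate when no cache-read price is listed.
    cached_rate = entry.get("cache_read_input_token_cost", entry["input_cost_per_token"])
    # Assumption: cached_tokens is a subset of input_tokens.
    uncached_tokens = input_tokens - cached_tokens
    return (
        uncached_tokens * entry["input_cost_per_token"]
        + cached_tokens * cached_rate
        + output_tokens * entry["output_cost_per_token"]
    )


cost = compute_cost("example-model", input_tokens=1000, output_tokens=500)
cached_cost = compute_cost("example-model", 1000, 500, cached_tokens=400)
```

With these illustrative prices, serving 400 of the 1,000 input tokens from cache lowers the call's cost, which is the effect cache-aware pricing is meant to capture.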
Cost Tiers
Every model is automatically classified into a cost tier based on input pricing:
| Tier | Price Range (per 1M input tokens) | Examples |
|---|---|---|
| Economy | > $0.00 and < $0.50 | gpt-4o-mini, Claude 3 Haiku |
| Standard | $0.50 – $5.00 | gpt-4o, Claude Sonnet |
| Premium | > $5.00 | o1, Claude Opus |
| Free | $0.00 | Ollama/local models |
Tiers integrate with the policy engine (restrict agents to specific tiers), budget gates (block premium when budget is low), and the complexity router.
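The tier boundaries in the table can be expressed as a small function. This is a sketch, not AgentCost's classifier: the name classify_tier is hypothetical, the Free tier is checked first because a price of $0.00 would otherwise also fall under Economy, and the handling of prices exactly at $0.50 and $5.00 follows a literal reading of the table.

```python
def classify_tier(input_price_per_million):
    """Map a model's input price (USD per 1M tokens) to a cost tier."""
    if input_price_per_million < 0:
        raise ValueError("price must be non-negative")
    if input_price_per_million == 0:
        return "free"
    if input_price_per_million < 0.50:
        return "economy"
    # Assumption: both ends of the $0.50 to $5.00 range count as standard.
    if input_price_per_million <= 5.00:
        return "standard"
    return "premium"
```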
Complexity Router
The complexity router auto-classifies each prompt and routes it to the appropriate cost tier:
| Level | Routes To | Triggers |
|---|---|---|
| SIMPLE | Economy | Short factual questions, yes/no, lookups |
| MEDIUM | Standard | Summarization, moderate generation |
| COMPLEX | Standard | Code review, architecture design, analysis |
| REASONING | Premium | Mathematical proofs, chain-of-thought, logic |
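A toy version of this routing is sketched below. Only the level-to-tier mapping comes from the table; the keyword lists and the functions classify_prompt and route are hypothetical stand-ins, since the signals the real classifier uses are not described here.

```python
# Level-to-tier mapping taken from the table above.
LEVEL_TO_TIER = {
    "SIMPLE": "economy",
    "MEDIUM": "standard",
    "COMPLEX": "standard",
    "REASONING": "premium",
}

# Hypothetical keyword hints, chosen only to make the sketch runnable.
REASONING_HINTS = ("prove", "step by step", "derive")
COMPLEX_HINTS = ("code review", "architecture", "analyze")
MEDIUM_HINTS = ("summarize", "rewrite", "draft")


def classify_prompt(prompt):
    """Toy classifier: check keyword hints from most to least demanding."""
    text = prompt.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return "REASONING"
    if any(hint in text for hint in COMPLEX_HINTS):
        return "COMPLEX"
    if any(hint in text for hint in MEDIUM_HINTS):
        return "MEDIUM"
    return "SIMPLE"


def route(prompt):
    """Return (level, tier) for a prompt."""
    level = classify_prompt(prompt)
    return level, LEVEL_TO_TIER[level]
```

For example, route("Summarize this report") yields the MEDIUM level and the standard tier under these assumed hints. Checking the most demanding level first means a prompt that matches several hint lists is routed to the highest applicable tier.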
Budget Gates
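Budget gates run a pre-execution budget check at each workflow step and return one of four decisions, based on the percentage of the budget already used: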
- ALLOW — Budget is healthy, proceed
- WARN (80%) — Budget warning, proceed but emit alert
- DOWNGRADE (90%) — Auto-switch to cheaper model (e.g., gpt-4o → gpt-4o-mini)
- BLOCK (100%) — Budget exhausted, deny the call
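The four decisions can be sketched as a threshold check. This assumes the percentages refer to spend divided by budget and that each threshold is inclusive; the function name budget_gate and the treatment of a non-positive budget are illustrative choices, not AgentCost's API.

```python
def budget_gate(spent, budget):
    """Return the gate decision for the current spend against a budget (both in USD)."""
    if budget <= 0:
        # Assumption: a zero or negative budget allows nothing.
        return "BLOCK"
    used = spent / budget
    if used >= 1.0:
        return "BLOCK"
    if used >= 0.9:
        return "DOWNGRADE"
    if used >= 0.8:
        return "WARN"
    return "ALLOW"
```

Checking from the strictest threshold down ensures a spend of 95% returns DOWNGRADE rather than WARN.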
Cost Intelligence
AgentCost provides five intelligence modules on top of raw trace data:
Forecasting — Predicts future costs using linear regression, exponential moving average (EMA), and ensemble methods. Includes budget exhaustion prediction.
Optimizer — Analyzes your usage patterns and recommends cheaper models that could handle the same workloads. Shows estimated savings.
Analytics — Breakdowns by model, project, agent, and time. Token efficiency metrics and chargeback reports.
Estimator — Pre-call cost estimation. Before making an expensive LLM call, estimate what it will cost across 2,610+ models.
Token Analyzer — Context efficiency scoring (0–100). Detects wasteful patterns: excessive system prompts, under-utilized context windows, and low output ratios.
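As an illustration of the techniques the Forecasting module names, the sketch below fits a least-squares line to daily costs, computes an exponential moving average, averages the two as a simple ensemble, and estimates how many days remain before a budget runs out. All function names, the smoothing factor, and the unweighted ensemble are assumptions made for this example; the module's actual methods and weights may differ.

```python
def linear_forecast(daily_costs, days_ahead=1):
    """Least-squares line through (day index, cost), extrapolated days_ahead past the last day."""
    n = len(daily_costs)
    if n < 2:
        raise ValueError("need at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_costs) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_costs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + days_ahead)


def ema(daily_costs, alpha=0.5):
    """Exponential moving average; more recent days carry more weight."""
    value = daily_costs[0]
    for cost in daily_costs[1:]:
        value = alpha * cost + (1 - alpha) * value
    return value


def ensemble_forecast(daily_costs):
    """Unweighted mean of the linear and EMA estimates for the next day."""
    return (linear_forecast(daily_costs) + ema(daily_costs)) / 2


def days_until_exhausted(spent, budget, daily_rate):
    """Whole days of spending at daily_rate that the remaining budget covers, or None."""
    if daily_rate <= 0:
        return None
    return int(max(budget - spent, 0) // daily_rate)


history = [1.0, 2.0, 3.0, 4.0]
next_day = ensemble_forecast(history)
```

On a steadily rising series the linear fit extrapolates the trend while the EMA lags behind it, so the ensemble lands between the two.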
Plugin Architecture
AgentCost uses an 8-slot plugin system. Every integration point is swappable:
| Slot | Plugin Class | Purpose |
|---|---|---|
| 1. Notifier | NotifierPlugin | Alerts (Slack, email, webhook, PagerDuty) |
| 2. Policy | PolicyPlugin | Custom policy evaluation rules |
| 3. Exporter | ExporterPlugin | Export traces (S3, Snowflake, Datadog) |
| 4. Provider | ProviderPlugin | Cost calculation for custom LLM providers |
| 5. Tracker | TrackerPlugin | Cost tracking backends (in-memory, DB, Langfuse) |
| 6. Reactor | ReactorPlugin | Custom reaction action handlers |
| 7. Runtime | RuntimePlugin | Model routing, rate limiting, feature flags |
| 8. Agent | AgentPlugin | Agent lifecycle management, workspace config |
Built-in plugins ship out of the box: 4 notifiers, InMemoryTracker, AgentLifecycle, PagerDutyReactor.
Data Storage
Community edition: SQLite database (zero configuration, file-based).
Enterprise edition: PostgreSQL with connection pooling for production workloads.
The database stores trace_events and benchmark_runs tables. Enterprise adds orgs, users, cost_centers, policies, approval_requests, and more.
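To show what querying a file-based trace store looks like, here is a sketch using Python's built-in sqlite3 module with an in-memory database. The column list is an assumption that mirrors the trace fields listed under Traces; the real trace_events schema may differ, and the rows are made-up sample data.

```python
import sqlite3

# Assumed schema: columns mirror the documented trace fields,
# not necessarily the actual trace_events table.
SCHEMA = """
CREATE TABLE trace_events (
    trace_id      TEXT PRIMARY KEY,
    project       TEXT NOT NULL,
    model         TEXT NOT NULL,
    provider      TEXT NOT NULL,
    input_tokens  INTEGER NOT NULL,
    output_tokens INTEGER NOT NULL,
    cost          REAL NOT NULL,
    latency_ms    REAL,
    status        TEXT NOT NULL,
    agent_id      TEXT,
    session_id    TEXT,
    timestamp     TEXT NOT NULL
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Illustrative rows; the costs are sample values, not real prices.
rows = [
    ("t1", "customer-support", "gpt-4o", "openai", 1000, 500, 0.0075,
     820.0, "success", "triage", "s1", "2025-01-01T10:00:00Z"),
    ("t2", "customer-support", "gpt-4o-mini", "openai", 2000, 300, 0.0030,
     410.0, "success", "triage", "s1", "2025-01-01T10:00:05Z"),
    ("t3", "research", "gpt-4o", "openai", 4000, 1000, 0.0200,
     1500.0, "success", None, None, "2025-01-01T11:00:00Z"),
]
conn.executemany(
    "INSERT INTO trace_events VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
)

# Total cost per project.
cost_by_project = dict(
    conn.execute(
        "SELECT project, SUM(cost) FROM trace_events GROUP BY project"
    ).fetchall()
)
```

Because SQLite needs no server, the same query works against a database file by passing its path to sqlite3.connect instead of ":memory:".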
Editions
| Feature | Community (MIT) | Enterprise (BSL 1.1) |
|---|---|---|
| Tracing SDK | ✅ | ✅ |
| Dashboard + Models Explorer | ✅ | ✅ |
| 2,610+ Model Pricing | ✅ | ✅ |
| Cost Tiers & Complexity Router | ✅ | ✅ |
| Budget Gates | ✅ | ✅ |
| Token Analyzer | ✅ | ✅ |
| Forecasting | ✅ | ✅ |
| Optimizer | ✅ | ✅ |
| Analytics | ✅ | ✅ |
| Estimator | ✅ | ✅ |
| 8-Slot Plugin System | ✅ | ✅ |
| Reactions Engine (YAML) | ✅ | ✅ |
| CLI | ✅ | ✅ |
| OTel/Prometheus | ✅ | ✅ |
| SSO (any OIDC/SAML provider) | — | ✅ |
| Organizations | — | ✅ |
| Budget Enforcement | — | ✅ |
| Policy Engine | — | ✅ |
| Approval Workflows | — | ✅ |
| Notifications | — | ✅ |
| Anomaly Detection | — | ✅ |
| AI Gateway | — | ✅ |