# Record & Replay

Capture real agentic coding sessions and replay them against any endpoint.
## Why Record & Replay?
Synthetic benchmarks are useful, but nothing beats measuring with your actual coding sessions. Record a real session with Claude Code, Cursor, or any LLM-powered coding agent, then replay it against any endpoint to get apples-to-apples comparisons.
## `acb record` - Capture a Session
Starts a recording proxy between your coding agent and your LLM endpoint. Every request/response pair is saved as a JSONL line.
### Record with an OpenAI-compatible upstream

```bash
acb record \
  -e http://your-gpu-server:8000 \
  -m your-model
```

### Record with Anthropic (auto-detected from URL)

```bash
acb record \
  -e https://api.anthropic.com \
  -m claude-sonnet-4-20250514 \
  -k $ANTHROPIC_API_KEY \
  --api-key-header x-api-key \
  -o my-session.jsonl
```

### Custom output file and port

```bash
acb record \
  -e http://your-gpu-server:8000 \
  -m your-model \
  -o my-session.jsonl \
  -P 9000
```

### Point Your Agent at the Proxy
Once the recording proxy is running, point your coding agent at it:
```bash
ANTHROPIC_BASE_URL=http://localhost:19000 claude
```

Stop recording with Ctrl+C when done.
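Because each turn lands as a single JSON object on its own line, ordinary text tools give a quick sanity check on a recorded session. The snippet below uses a stand-in file with a hypothetical two-field schema purely to illustrate the one-object-per-line layout; the actual record fields may differ.

```bash
# Stand-in session file: two records with a hypothetical schema,
# just to show the one-JSON-object-per-line layout.
printf '%s\n' \
  '{"request":{},"response":{}}' \
  '{"request":{},"response":{}}' > my-session.jsonl

# One line per request/response pair, so the line count is the
# number of recorded turns.
wc -l < my-session.jsonl   # prints 2
```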
## Upstream Modes
The recorder supports two upstream modes:
### OpenAI-compatible (default)
Translates Anthropic Messages API → OpenAI format before forwarding.
### Anthropic passthrough
Forwards requests natively to Anthropic's API - no translation, full fidelity. Auto-detected when the endpoint is `api.anthropic.com`, or set explicitly with `--upstream-api anthropic`.
Both modes save the workload in OpenAI format for replay.
## `acb replay` - Replay Against Any Endpoint
Take a recorded workload and replay it against a different endpoint, hardware, or configuration.
### Replay against a new endpoint

```bash
acb replay \
  -e http://new-server:8000 \
  -m my-model \
  -w my-session.jsonl
```

### Generate a full report

```bash
acb replay \
  -e http://new-server:8000 \
  -m my-model \
  -w my-session.jsonl \
  -o report.md
```

### Preview without sending requests

```bash
acb replay -e URL -m MODEL -w session.jsonl --dry-run
```

## Slicing Workloads
Real sessions grow from small contexts to large ones. `--slice-tokens N` replays requests from the start until cumulative prompt tokens reach N - preserving the natural context growth while capping how much you send through the endpoint.
```bash
acb replay -e URL -m MODEL -w session.jsonl --slice-tokens 1000000
```

Useful for targeting specific model context limits or keeping replay costs down.
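To choose a sensible `N`, it helps to total the prompt tokens already present in a workload. This is only a sketch: it assumes each record carries an OpenAI-style `"usage":{"prompt_tokens":...}` field, which may not match the actual recorded schema, and it uses a stand-in file in place of a real session.

```bash
# Stand-in workload; the usage-field placement is an assumption.
printf '%s\n' \
  '{"usage":{"prompt_tokens":1200}}' \
  '{"usage":{"prompt_tokens":4800}}' > sample.jsonl

# Rough cumulative prompt-token total, to size --slice-tokens.
grep -o '"prompt_tokens":[0-9]*' sample.jsonl \
  | cut -d: -f2 \
  | awk '{sum += $1} END {print sum}'   # prints 6000
```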
## Record CLI Flags
| Flag | Description |
|---|---|
| `-e, --endpoint` | Upstream LLM endpoint URL |
| `-m, --model` | Model name |
| `-k, --api-key` | API key for the upstream endpoint |
| `--api-key-header` | Custom API key header name |
| `-o, --output` | Output JSONL file path |
| `-P, --port` | Proxy listen port (default: `19000`) |
| `--upstream-api` | Force upstream API type (`openai` or `anthropic`) |
## Replay CLI Flags
| Flag | Description |
|---|---|
| `-e, --endpoint` | Target endpoint URL |
| `-m, --model` | Model name |
| `-w, --workload` | JSONL workload file path |
| `-o, --output` | Report output path |
| `--dry-run` | Preview without sending requests |
| `--slice-tokens` | Stop replaying after N cumulative prompt tokens |