# Reports

Understanding reports, verdicts, metrics, and comparisons.
## Overview
Reports are designed to answer one question fast: is this endpoint good enough for agentic coding?
## Verdicts

Every report produces a verdict based on TTFT (time to first token) and throughput at each context size:

- **GOOD**: responsive editing experience. TTFT < 3s at 40K, tok/s > 30.
- **MARGINAL**: usable but with noticeable lag. TTFT 3–6s at 40K, tok/s 15–30.
- **POOR**: sluggish and frustrating. TTFT > 6s at 40K, tok/s < 15.
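The thresholds above can be sketched as a small classifier. Note that the docs state thresholds per dimension but not how the two are combined when they disagree, so the rule below (the worse dimension wins) is an assumption, not acb's documented logic:

```python
def verdict(ttft_s: float, tok_s: float) -> str:
    """Classify a run at 40K context using the documented thresholds.

    The combining rule (worst dimension wins when TTFT and tok/s
    disagree) is an assumption, not acb's documented behavior.
    """
    if ttft_s < 3 and tok_s > 30:
        return "GOOD"       # both dimensions responsive
    if ttft_s > 6 or tok_s < 15:
        return "POOR"       # either dimension is sluggish
    return "MARGINAL"       # in between: usable, noticeable lag

# verdict(2.0, 40.0) → "GOOD"; verdict(4.5, 20.0) → "MARGINAL"
```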
## Report Contents
Every report includes:
| Section | Description |
|---|---|
| Verdict | GOOD / MARGINAL / POOR for agentic coding, with the key numbers |
| Key Findings | Auto-generated insights: TTFT scaling, throughput range, concurrency efficiency |
| Summary Table | TTFT, tok/s, ITL at each concurrency and context size, color-coded |
| UX Mapping | Maps raw metrics to user experience ("instant response", "smooth streaming") |
| Context Scaling | ASCII chart showing how TTFT and tok/s change as context grows |
| Concurrency Scaling | Efficiency percentages at each concurrency level with grades |
| Per-profile Breakdown | Detailed numbers per context size |
| Reasoning Analysis | Thinking overhead when using reasoning models |
| Methodology | What was measured, how, and what the grade thresholds mean |
## Generating Reports
Add `-o` to any command to generate a report file:

```bash
acb speed -e URL -m MODEL --suite full -o report.md
acb speed -e URL -m MODEL --format json -o results.json
```

The CLI also prints a final verdict line after every benchmark:
CLI output:

```
Verdict: GOOD for agentic coding at medium context
```

## Comparing Two Runs
The `compare` command produces a head-to-head table, an ASCII bar chart, and a winner summary:

```bash
acb compare --baseline a.json --candidate b.json -o comparison.md
```

## Rules of Thumb
Reference ranges for agentic coding (your numbers will vary with hardware, model, and serving stack):

- TTFT < 3s at 40K context → responsive editing
- Tok/s > 30 per user → code streams smoothly
- TTFT < 10s at 100K context → acceptable for deep sessions
- Aggregate tok/s scales sub-linearly with users; expect roughly 60–70% efficiency at 8× concurrency
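The last rule of thumb is easy to check by hand: efficiency is aggregate throughput divided by what perfect linear scaling would give (single-user tok/s times the user count). A minimal sketch, with illustrative numbers:

```python
def concurrency_efficiency(agg_tok_s: float,
                           single_user_tok_s: float,
                           users: int) -> float:
    """Fraction of ideal linear scaling achieved; 1.0 means aggregate
    throughput grew perfectly with the number of concurrent users."""
    return agg_tok_s / (single_user_tok_s * users)

# e.g. 40 tok/s for one user, 210 tok/s aggregate for 8 users:
# concurrency_efficiency(210.0, 40.0, 8) → 0.65625, i.e. ~66%,
# inside the 60–70% range quoted above.
```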
## JSON Output

Use `--format json` for CI/CD pipelines. The JSON contains all metrics, per-request data, and the computed verdict:

```bash
acb speed -e URL -m MODEL --format json -o results.json
```
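One common CI pattern is to gate a pipeline on the computed verdict. A minimal sketch, assuming the JSON exposes the verdict under a top-level `"verdict"` key (the actual schema is not shown here, so adjust the key to match your `results.json`):

```python
import json  # used in the hypothetical CI snippet below

def gate(report: dict, allowed: tuple = ("GOOD",)) -> bool:
    """Return True only for acceptable verdicts.

    The top-level "verdict" key is an assumption about the JSON
    layout, not a documented field name.
    """
    return report.get("verdict") in allowed

# Usage in CI (hypothetical):
#   with open("results.json") as f:
#       raise SystemExit(0 if gate(json.load(f)) else 1)
```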