Reports

Understanding reports, verdicts, metrics, and comparisons.

Overview

Reports are designed to answer one question fast: is this endpoint good enough for agentic coding?

Verdicts

Every report produces a verdict based on time to first token (TTFT) and throughput in tokens per second (tok/s) at each context size:

GOOD - Responsive editing experience. TTFT < 3s at 40K, tok/s > 30.
MARGINAL - Usable but noticeable lag. TTFT 3–6s at 40K, tok/s 15–30.
POOR - Sluggish, frustrating. TTFT > 6s at 40K, tok/s < 15.
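
The thresholds above can be read as a small classification function. The following is a minimal sketch, not the tool's actual implementation: the function name `classify` is hypothetical, and because the thresholds do not say what happens when TTFT and tok/s fall into different bands, the sketch assumes the worse of the two metrics decides the verdict.

```python
def classify(ttft_s: float, tok_per_s: float) -> str:
    """Return GOOD, MARGINAL, or POOR from TTFT at 40K context and per-user tok/s."""
    order = ["GOOD", "MARGINAL", "POOR"]
    # Grade each metric separately against the documented thresholds.
    ttft_grade = 0 if ttft_s < 3 else (1 if ttft_s <= 6 else 2)
    tps_grade = 0 if tok_per_s > 30 else (1 if tok_per_s >= 15 else 2)
    # Assumption (not stated in the docs): the worse metric decides the verdict.
    return order[max(ttft_grade, tps_grade)]


print(classify(2.1, 42.0))  # GOOD
print(classify(4.5, 35.0))  # MARGINAL
print(classify(7.0, 12.0))  # POOR
```

Under this assumption, an endpoint with fast TTFT but slow streaming is still graded down, which matches the intent of a verdict about the overall editing experience.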

Report Contents

Every report includes:

Section                Description
Verdict                GOOD / MARGINAL / POOR for agentic coding, with the key numbers
Key Findings           Auto-generated insights: TTFT scaling, throughput range, concurrency efficiency
Summary Table          TTFT, tok/s, ITL at each concurrency and context size, color-coded
UX Mapping             Maps raw metrics to user experience ("instant response", "smooth streaming")
Context Scaling        ASCII chart showing how TTFT and tok/s change as context grows
Concurrency Scaling    Efficiency percentages at each concurrency level with grades
Per-profile Breakdown  Detailed numbers per context size
Reasoning Analysis     Thinking overhead when using reasoning models
Methodology            What was measured, how, and what the grade thresholds mean

Generating Reports

Add -o to any command to generate a report file:

acb speed -e URL -m MODEL --suite full -o report.md
acb speed -e URL -m MODEL --format json -o results.json

The CLI also prints a final verdict line after every benchmark:

CLI output
  Verdict: GOOD for agentic coding at medium context

Comparing Two Runs

The compare command produces a head-to-head table, ASCII bar chart, and winner summary:

acb compare --baseline a.json --candidate b.json -o comparison.md

Rules of Thumb

Reference ranges for agentic coding (your numbers will vary by hardware, model, and serving stack):

TTFT < 3s at 40K context → responsive editing
Tok/s > 30/user → code streams smoothly
TTFT < 10s at 100K context → acceptable for deep sessions
Aggregate tok/s scales sub-linearly with users: roughly 60–70% efficiency at 8× concurrency
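
To make the efficiency figure concrete, here is a minimal sketch of one common way to compute it: aggregate throughput divided by what perfect linear scaling would deliver. The function name and the sample numbers are illustrative assumptions; the tool's own formula may differ.

```python
def concurrency_efficiency(agg_tok_s: float, single_user_tok_s: float, users: int) -> float:
    """Fraction of ideal linear scaling achieved at a given concurrency."""
    if users < 1 or single_user_tok_s <= 0:
        raise ValueError("users must be >= 1 and single_user_tok_s must be positive")
    # Ideal scaling would deliver single_user_tok_s * users in aggregate.
    return agg_tok_s / (single_user_tok_s * users)


# Made-up numbers: 40 tok/s for one user, 208 tok/s aggregate across 8 users.
eff = concurrency_efficiency(208.0, 40.0, 8)
print(f"{eff:.0%}")  # 65%
```

In this example each of the 8 users sees about 26 tok/s, so aggregate throughput grew 5.2× rather than 8×.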

JSON Output

Use --format json for CI/CD pipelines. The JSON contains all metrics, per-request data, and the computed verdict:

acb speed -e URL -m MODEL --format json -o results.json
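
A CI job could then gate on the computed verdict. The sketch below is an assumption-heavy illustration: the JSON schema is not documented here, so the top-level key `"verdict"` and its plain `GOOD` / `MARGINAL` / `POOR` values are hypothetical, as are the function names. Check the actual structure of results.json before relying on it.

```python
import json


def exit_code(results: dict, allowed=("GOOD",)) -> int:
    """Return 0 if the verdict is acceptable, 1 otherwise."""
    # "verdict" is a hypothetical key name; adjust to the real schema.
    verdict = results.get("verdict")
    return 0 if verdict in allowed else 1


def main(path: str) -> int:
    """Load a results file and translate its verdict into a process exit code."""
    with open(path, encoding="utf-8") as f:
        return exit_code(json.load(f))


print(exit_code({"verdict": "GOOD"}))      # 0
print(exit_code({"verdict": "MARGINAL"}))  # 1
```

In a pipeline step, calling `sys.exit(main("results.json"))` after the benchmark would fail the job whenever the verdict falls outside the allowed set. Passing `allowed=("GOOD", "MARGINAL")` to `exit_code` would loosen the gate.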