Scenario A / Chat latency
Approximates a short interactive exchange and emphasizes TTFT, total latency, and output throughput.
Metrics tracked: TTFT, Total latency, Output tokens per second, Token counts
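The chat-latency metrics above can be derived from a streamed generation. A minimal sketch, assuming a token-yielding iterator as the runtime interface (the function name and return keys are illustrative, not LocalBench's actual API):

```python
import time

def measure_chat_latency(stream):
    """Derive TTFT, total latency, token count, and output tokens/sec
    from an iterator that yields generated tokens (hypothetical interface)."""
    start = time.perf_counter()
    ttft = None
    tokens = 0
    for _ in stream:
        now = time.perf_counter()
        if ttft is None:
            ttft = now - start  # time to first token
        tokens += 1
    total = time.perf_counter() - start
    decode_time = total - (ttft or 0.0)  # time spent after the first token
    tps = tokens / decode_time if decode_time > 0 else 0.0
    return {"ttft_s": ttft, "total_s": total,
            "output_tokens": tokens, "output_tps": tps}
```

Output throughput is computed over decode time rather than total latency, so a slow prompt-processing phase does not distort the generation rate.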
V0-alpha early access
The methodology page is public before broad access because the benchmark story needs to be reviewable before it needs to be persuasive. The Standard mode uses a controlled profile so results are comparable within the same runtime family. YOLO mode preserves user-chosen settings for exploratory data and stays labeled separately. Public pages stay open for review while benchmark routes remain admin-gated during pre-alpha validation.
LocalBench is in V0-alpha. The public site is for inspection first: visitors can review the benchmark story, legal posture, and waitlist before broader access opens.
Mode policy
Standard mode uses a controlled benchmark configuration and never silently inherits arbitrary user presets.
Randomness
Low-randomness generation with fixed prompt templates and versioned workload definitions.
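A versioned workload definition can be captured as an immutable record that pins the prompt template and the low-randomness decoding parameters together. A sketch under assumed field names (the real schema may differ):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WorkloadDefinition:
    """Versioned workload: a fixed prompt template plus pinned
    low-randomness decoding parameters (field names are illustrative)."""
    workload_id: str
    version: str
    prompt_template: str
    temperature: float = 0.0   # deterministic decoding by default
    seed: int = 42
    max_output_tokens: int = 256

# Hypothetical example definition; bumping `version` signals that
# results are no longer comparable with earlier runs.
CHAT_LATENCY_V1 = WorkloadDefinition(
    workload_id="chat-latency",
    version="1.0.0",
    prompt_template="Summarize the following text in two sentences:\n{text}",
)
```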
Runtime separation
Ollama and LM Studio results stay separated in the MVP to avoid false apples-to-oranges comparisons.
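Runtime separation can be enforced at the data layer by bucketing results by runtime family before any ranking or aggregation. A minimal sketch with an assumed record shape:

```python
from collections import defaultdict

def group_by_runtime(results):
    """Bucket benchmark results by runtime family (e.g. 'ollama',
    'lm-studio') so aggregations never mix families. The dict record
    shape here is illustrative."""
    buckets = defaultdict(list)
    for r in results:
        buckets[r["runtime"]].append(r)
    return dict(buckets)
```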
Submission review
Uploaded benchmark data must be reviewable by the submitter before it is published.
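The review-before-publish requirement can be enforced as a small state machine in which publication is only reachable through review. The state names below are illustrative, not LocalBench's actual workflow:

```python
# Hypothetical submission lifecycle: an upload can only be
# published after it has passed review.
ALLOWED_TRANSITIONS = {
    "uploaded": {"in_review"},
    "in_review": {"approved", "rejected"},
    "approved": {"published"},
}

def advance(status, next_status):
    """Move a submission to `next_status`, rejecting any transition
    that would skip the review step."""
    if next_status not in ALLOWED_TRANSITIONS.get(status, set()):
        raise ValueError(f"illegal transition {status} -> {next_status}")
    return next_status
```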
Scenario B / Prompt processing
Stresses prompt processing with a larger fixed prompt and a minimal output budget.
Metrics tracked: TTFT, Prompt eval time, Prompt tokens per second, Total latency
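Prompt-processing throughput is the ratio of prompt tokens to prompt evaluation time. A minimal sketch (function and parameter names are illustrative):

```python
def prompt_tps(prompt_tokens, prompt_eval_seconds):
    """Prompt-processing throughput: tokens ingested per second
    of prompt evaluation time."""
    if prompt_eval_seconds <= 0:
        raise ValueError("prompt eval time must be positive")
    return prompt_tokens / prompt_eval_seconds
```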
Scenario C / Generation throughput
Measures steady-state decode throughput using a short prompt and a larger fixed output target.
Metrics tracked: TTFT, Completion tokens per second, Generation time, Output tokens
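Decode throughput is computed over generation time only, excluding prompt evaluation, so the short prompt does not skew the steady-state rate. A minimal sketch (names are illustrative):

```python
def decode_tps(output_tokens, generation_seconds):
    """Steady-state decode throughput: completion tokens per second
    of generation time (prompt eval time excluded)."""
    if generation_seconds <= 0:
        raise ValueError("generation time must be positive")
    return output_tokens / generation_seconds
```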