Performance Benchmark Report Generator

A Python toolchain that ingests daily CSV exports from Unreal Engine capture-point benchmarks and outputs a single self-contained HTML report. No server, no database, no dependencies beyond the Python standard library.

Live demo available
Real benchmark data with anonymised capture point names — all tabs and charts are fully interactive.

Open Report →

Preview

Report overview — summary cards and trend charts

Summary cards and trend charts on the Overview tab — each metric graded against configurable thresholds

The Problem

Every nightly CI run produces one CSV per map with frame time, GPU time, draw calls, memory, and physics counts for every capture point. Tracking regressions across builds meant manually diffing spreadsheets. The goal was a zero-friction pipeline anyone on the team could run without installing anything extra.

Pipeline

Three scripts, one output file.

download_benchmarks.py   ← crawls the build server, saves dated CSVs locally
        │
        ▼
  csvs/capture_point_data_YYYY-MM-DD.csv   (one per build day)
        │
        ▼
bench.py                 ← parses, aggregates, renders
        │
        ▼
  report/perf_report.html  ← self-contained, no server needed

run_bench.bat wraps both scripts so anyone can regenerate the report with a double-click — no terminal needed.

CSV Parsing

Unreal's CSV exports can come out as UTF-8-BOM, UTF-8, or latin-1 depending on the machine. The parser tries all three in sequence rather than crashing on an unexpected encoding.

def parse_csv(path: Path) -> list:
    for encoding in ("utf-8-sig", "utf-8", "latin-1"):
        try:
            with open(path, newline="", encoding=encoding) as f:
                ...
            return rows  # success — stop trying
        except UnicodeDecodeError:
            continue     # try next encoding

Numeric fields are coerced to float; anything that fails is kept as a string. Blank trailing rows common in Unreal exports are dropped by checking the Capture Point key is non-empty.

Threshold System

All performance targets live in one list of tuples at the top of the script. Change a number there and every badge, bar, and table cell in the report updates automatically.

THRESHOLDS = [
    # key               display name          unit   good    warn    lower_is_better
    ("FrameAvg",        "Frame Time Avg",     "ms",  16.67,  33.33,  True),
    ("GpuAvg",          "GPU Time Avg",       "ms",  14.00,  20.00,  True),
    ("GameAvg",         "Game Thread Avg",    "ms",   4.00,   6.00,  True),
    ("DrawCallsAvg",    "Draw Calls Avg",     "",  2500.0, 3500.0,   True),
    ("PrimitivesAvg",   "Primitives Avg",     "",  4e6,    6e6,      True),
    # ...
]

The grade() function returns "good", "warn", or "bad" — used directly as a CSS class, so there's no separate rendering logic to maintain.

Self-Contained Report

All parsed data is serialised to a single JSON object and injected into the HTML template as a const DATA = {...} assignment. The file needs no server or build step to open — just a browser.

The report has six tabs, all reading from that one DATA object:

Overview — summary cards with pass/warn/fail grading and trend charts across all builds.
Comparison — cross-build table with delta % per metric and a total delta from baseline.
Capture Points — live-searchable, filterable per-capture-point table.
Deep Dive — multi-select capture points, compare trends on a shared Chart.js chart with optional threshold overlays.
Worst / Best — ranked tables for worst performers, most improved, and most regressed.
Thresholds — reference view of all metric limits, rendered from the same data used for grading.

Handling Missing Data

Capture point names are collected into an ordered union across all datasets. A capture point added mid-sprint is backfilled with null in earlier builds so trend charts never break when the level changes between builds.

per_capture[cp] = (
    {m: row.get(m) for m in TABLE_METRICS}
    if row else
    {m: None for m in TABLE_METRICS}  # backfill missing with null
)

Chart.js's spanGaps: true then connects across those gaps rather than leaving holes in the line.

← Back to Tools