SQLite as a Backtest Results Store: Schema Design, Query Interface, and Sensitivity Heatmaps

The Problem: Ephemeral Results

Before the persistence layer, a backtest run produced:

Console output with key metrics (Sharpe, profit factor, max drawdown)
A JSONL event stream if the run was executed through the control plane
Nothing persistent. No history. No comparisons. No way to answer simple questions like "which parameter set produced the highest Sharpe on EUR_USD last week?"

The backtest engine itself was solid. The run context, event stream, and strategy catalog were well-structured. But the results vanished on exit.

Why SQLite

The existing architecture already used SQLite for the event store (events.db), managed through RuntimeDataPaths.defaultEventStorePath(). Adding a backtest_runs table to the same database meant zero additional infrastructure. No PostgreSQL migration, no Docker volumes, no cloud dependencies.

SQLite WAL (Write-Ahead Logging) mode solved the concurrency concern. Multiple CLI sessions and the control plane can write results simultaneously while the desktop GUI reads them. No lock contention under the expected load. At most a few hundred runs per day, each one a single INSERT.

WAL also means reads never block writes and vice versa. The GUI can render the history view while a nightly batch backtest saves results to the same database.

Schema Design

The backtest_runs table stores 20+ columns spanning three categories:

Run metadata: run_id (TEXT PRIMARY KEY), strategy_id, symbol, period_start, period_end, parameters (JSON), parameter_hash (SHA-256), initial_capital, created_at.

Performance metrics: final_equity, total_pnl, total_return_pct, total_trades, winning_trades, losing_trades, win_rate_pct, max_drawdown_pct, avg_trade_pnl, sharpe_ratio, sortino_ratio, profit_factor, calmar_ratio.

Large data: equity_curve (JSON array of double values), total_commission, total_slippage.

The parameter_hash deserves a closer look. It is a deterministic SHA-256 hash of the sorted parameter JSON keys and values. Sorting the keys alphabetically before serialization guarantees that {fastPeriod:20, slowPeriod:50} and {slowPeriod:50, fastPeriod:20} produce the same hash. Without this, comparing configurations across runs would require string matching on the raw parameters JSON, which breaks the moment JSON key ordering changes.

The equity curve is stored as a JSON array rather than a separate table. Normalization would add JOIN overhead for a data structure that is always loaded and displayed as a unit. A single TEXT column with a Jackson-serialized array keeps the query simple and the GUI fast.

Auto-Save on Completion

The persistence hook lives in the backtest engine completion callback. When BacktestEngine.run() finishes, it calls BacktestPersistenceService.saveResult(BacktestResult result, Map parameters).

The service handles three things:

Computing the deterministic parameter hash
Serializing the equity curve and parameters to JSON
Inserting the row via SqliteBacktestRunStore

Failed or interrupted runs do not write incomplete data. The save only triggers on clean completion.

A console confirmation follows every save: [BacktestPersistence] Saved run bt_20260606_184522_abc123 for strategy SmaCrossover (hash: e3b0c44...).

CLI Query Interface

The RunBacktest CLI accepts query flags that turn it from a run-and-exit tool into a database explorer:

--query, --strategy SmaCrossover, --symbol EUR_USD, --min-sharpe 1.5, --sort-by sharpe_ratio, --limit 10

This is the most-used feature in practice. Instead of re-running SmaCrossover to check which period combination was best, you run a single CLI command with query flags. The table output shows the run ID, strategy, symbol, parameters, and key metrics. No log parsing. No grep. No re-runs.

REST API and Desktop Heatmaps

The control plane exposes three endpoints:

GET /api/backtests -- filtered listing with same query parameters as the CLI
GET /api/backtests/{runId} -- full details including equity curve
GET /api/backtests/analytics/heatmap -- parameter sensitivity matrix

The heatmap endpoint is the most technically interesting. It accepts a strategy ID and a date range, groups runs by their parameter hash, and computes a 2D grid where each cell represents a unique parameter configuration. The grid rows and columns are the two most sensitive parameters (automatically detected by variance in the results). Cell color indicates the metric value, typically Sharpe or profit factor.

For SmaCrossover, the heatmap reveals the fast/slow period combinations that cluster around high Sharpe. The visual pattern is immediate: a diagonal band where slow = 2x-3x fast performs well, and a dead zone where the periods converge (both below 10 or both above 100).

Pareto Frontier Charts

The GET /api/backtests/analytics/pareto endpoint returns a set of risk-reward points plotted on a 2D plane (return vs drawdown, or Sharpe vs trade frequency). The Pareto frontier shows which runs are objectively optimal. No other run offers both higher return and lower drawdown.

This turns the backtest database from a lookup table into a decision support tool. Instead of manually scanning 50 runs to find the best trade-off, the frontier highlights the non-dominated configurations. Any run below the frontier is strictly worse than at least one run on the curve.

What We Did Not Build

Individual trade-level storage is deferred. The PRD explicitly scopes it out of the MVP. Storing every fill, SL hit, and TP exit for every run would balloon the database and add complexity to the schema. For now, the summary metrics and equity curve provide enough signal for parameter tuning and strategy selection. If trade-level forensics become necessary, they belong in a separate table with its own retention policy.

The Result

A backtest runs once and its result lives forever in a queryable store. The CLI turns from a run-and-forget tool into an analytical interface. The desktop GUI renders history views and heatmaps that reveal parameter sensitivity at a glance. The Pareto frontier eliminates guesswork from strategy selection.

And it all runs on a single SQLite file with WAL mode enabled. No services to deploy, no databases to provision, no credentials to rotate. The same database that stores the event stream now stores the backtest history. The architectural constraint (SQLite-only, no external infra) turned into an advantage.