Binary Bars and Byte Orders: Building a High-Performance Backtest Data Pipeline

The Problem with CSVs

When you backtest 15 forex strategies across 9 pairs over 20 years of H1 data, CSV files become a bottleneck. A single year of H1 data for one pair is a few hundred kilobytes, so twenty years times 9 pairs adds up to tens of megabytes of text. That is small on disk, but parsing it again on every run is slow, and the text layout wastes space.

The fix: a custom binary format that can be memory-mapped, so 1 million bars become readable in under 50 milliseconds without copying the file onto the Java heap.

The Binary Bar Format

Each bar is exactly 44 bytes, stored in a memory-mapped file:

┌──────────┬──────────┬──────────┬──────────┬──────────┬──────────┐
│ timestamp│   open   │   high   │   low    │  close   │  volume  │
│  (long)  │ (double) │ (double) │ (double) │ (double) │  (int)   │
│   8B     │   8B     │   8B     │   8B     │   8B     │   4B     │
└──────────┴──────────┴──────────┴──────────┴──────────┴──────────┘

44 bytes per bar, fixed-width. No variable-length fields, no parsing overhead. The file is just a flat array of records that you index into directly.
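As a quick cross-check of the layout, Python's struct module can compute the record size and round-trip a record. This is a minimal sketch: the format string '<qddddi' (little-endian, one 8-byte integer, four doubles, one 4-byte integer) mirrors the table above, while the names BAR, FIELD_OFFSETS, and the sample values are made up for illustration.

```python
import struct

# Hypothetical names for illustration; the layout mirrors the table above.
BAR = struct.Struct("<qddddi")  # little-endian, no padding: int64, 4 x float64, int32

FIELD_NAMES = ("timestamp", "open", "high", "low", "close", "volume")
FIELD_SIZES = (8, 8, 8, 8, 8, 4)

# Byte offset of each field within one record.
FIELD_OFFSETS = {}
offset = 0
for name, size in zip(FIELD_NAMES, FIELD_SIZES):
    FIELD_OFFSETS[name] = offset
    offset += size

# Round-trip one made-up bar through the packed form.
sample = (1_325_458_800_000, 1.2945, 1.2951, 1.2940, 1.2948, 1523)
packed = BAR.pack(*sample)
unpacked = BAR.unpack(packed)

print(BAR.size)                 # 44
print(FIELD_OFFSETS["volume"])  # 40
print(unpacked == sample)       # True
```

The '<' prefix matters for the size as well as the byte order: without it, struct uses native alignment and may pad the record beyond 44 bytes.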

I chose MappedByteBuffer from java.nio for the implementation. The key insight is that a mapped buffer lets the OS kernel handle paging: the file's pages live in the OS page cache, outside the Java heap, and are faulted in on demand. A 20-year H1 file with approximately 693,000 bars takes about 30 MB on disk (693,000 x 44 bytes = 30,492,000 bytes) and effectively zero Java heap.

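import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Objects;

public class BarStore {
    public static final int BAR_SIZE = 44;  // 8 + 4 * 8 + 4 bytes per record

    // Set in the constructor (omitted here); field types are assumed.
    // DATA_BYTE_ORDER is declared in "The Endianness Bug" below.
    private Path filePath;
    private String symbol;
    private MappedByteBuffer buffer;
    private int barCount;

    public void open() throws IOException {
        // The mapping stays valid after the channel is closed.
        try (var channel = FileChannel.open(filePath, StandardOpenOption.READ)) {
            long size = channel.size();
            this.barCount = (int) (size / BAR_SIZE);
            this.buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, size);
            this.buffer.order(DATA_BYTE_ORDER);
        }
    }

    public Bar get(int i) {
        Objects.checkIndex(i, barCount);  // reject out-of-range indexes early
        int pos = i * BAR_SIZE;           // byte offset of record i
        var ts = readTimestamp(buffer.getLong(pos));
        return new Bar(symbol, ts,
            buffer.getDouble(pos + 8),   // open
            buffer.getDouble(pos + 16),  // high
            buffer.getDouble(pos + 24),  // low
            buffer.getDouble(pos + 32),  // close
            buffer.getInt(pos + 40));    // volume
    }
}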

Access is O(1) by index and completes in under 1 microsecond (an estimate from the code's Javadoc). Binary search over 693,000 bars for date range queries completes in about 20 iterations, since log2(693,000) is roughly 19.4.
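The date-range lookup can be sketched in Python with mmap and a hand-written lower-bound search. This illustrates the technique under stated assumptions and is not the BarStore code: find_first_at_or_after and the synthetic hourly file are hypothetical, and timestamps are assumed to be epoch milliseconds sorted in ascending order.

```python
import math
import mmap
import os
import struct
import tempfile

BAR = struct.Struct("<qddddi")   # 44-byte record, little-endian
TS = struct.Struct("<q")         # the timestamp is the first 8 bytes of each record

def find_first_at_or_after(buf, bar_count, target_ts):
    """Lower-bound binary search: index of the first bar with timestamp >= target_ts."""
    lo, hi = 0, bar_count
    steps = 0
    while lo < hi:
        mid = (lo + hi) // 2
        (ts,) = TS.unpack_from(buf, mid * BAR.size)  # read only the timestamp field
        if ts < target_ts:
            lo = mid + 1
        else:
            hi = mid
        steps += 1
    return lo, steps

# Build a small synthetic file of hourly bars (made-up prices).
HOUR_MS = 3_600_000
start_ts = 1_325_376_000_000  # 2012-01-01T00:00:00Z
path = os.path.join(tempfile.mkdtemp(), "demo.bars")
with open(path, "wb") as f:
    for i in range(1000):
        f.write(BAR.pack(start_ts + i * HOUR_MS, 1.0, 1.0, 1.0, 1.0, 0))

with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
    bar_count = len(buf) // BAR.size
    index, steps = find_first_at_or_after(buf, bar_count, start_ts + 500 * HOUR_MS)

print(index)                            # 500
print(math.ceil(math.log2(693_000)))    # 20
```

Each probe touches a single 8-byte field, so the OS pages in only the handful of pages the search actually visits.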

The Conversion Pipeline

Historical data comes from Dukascopy, downloaded via dukascopy-node. A Python script converts the CSVs to the binary format using struct.pack:

# '<qddddi' = little-endian, long, double x4, int
f.write(struct.pack('<qddddi', ts, o, h, l, c, v))

The conversion is driven by scripts/download-data.sh, which handles the full pipeline:

1. Download CSV from Dukascopy (rate-limited, one pair at a time)
2. Convert to .bars binary format
3. Store in data/historical/bars/ alongside metadata
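The convert step might look roughly like the following. This is a sketch, not the project's script: the CSV column names (timestamp, open, high, low, close, volume), the assumption that timestamp is already in epoch milliseconds, and the function name csv_to_bars are all illustrative and should be checked against the actual dukascopy-node output.

```python
import csv
import io
import struct

BAR = struct.Struct("<qddddi")  # same layout as the struct.pack call above

def csv_to_bars(csv_file, out_file):
    """Convert CSV rows to packed 44-byte records; returns the number of bars written.

    Assumes a header row with the (hypothetical) column names used below.
    """
    count = 0
    for row in csv.DictReader(csv_file):
        out_file.write(BAR.pack(
            int(row["timestamp"]),        # assumed: epoch milliseconds
            float(row["open"]),
            float(row["high"]),
            float(row["low"]),
            float(row["close"]),
            int(float(row["volume"])),    # volume may arrive as a decimal string
        ))
        count += 1
    return count

# In-memory demo with two made-up rows.
demo_csv = io.StringIO(
    "timestamp,open,high,low,close,volume\n"
    "1325458800000,1.2945,1.2951,1.2940,1.2948,1523\n"
    "1325462400000,1.2948,1.2960,1.2946,1.2957,1810.0\n"
)
out = io.BytesIO()
written = csv_to_bars(demo_csv, out)
print(written, len(out.getvalue()))  # 2 88
```

Because volume is stored as a 4-byte signed integer, struct.pack raises struct.error for values outside the int32 range, which surfaces bad input at conversion time rather than at read time.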

The BarStore has a main() entry point too, so you can batch-convert existing CSVs from the Java side.

The Endianness Bug

For months, this worked perfectly on my x86-64 machine. Here is why it worked: x86 is little-endian, so the platform's native byte order (what ByteOrder.nativeOrder() reports) is little-endian there, and Python's struct.pack('<qddddi') also writes little-endian. Everything was accidentally aligned. Note that this is the native order, not the ByteBuffer default: a newly created or mapped ByteBuffer is big-endian on every platform until order() is called.

The problem would surface on any big-endian platform, such as a SPARC server or a network processor. The BarStore wrote files using a byte order it inherited from the platform, while the Python script used an explicit little-endian directive. The contract was implicit and fragile.

The fix was a single line declaring the byte order explicitly on the buffer:

private static final ByteOrder DATA_BYTE_ORDER = ByteOrder.LITTLE_ENDIAN;

The constant is applied in both the write() and open() paths. Now the contract with the Python pipeline is self-documenting and portable. If someone runs this on a big-endian system, the buffer still reads and writes little-endian, because the byte order is set explicitly rather than relying on platform coincidence.

The commit message (5711fe6, June 2 2026) tells the full story: "On x86-64 Linux the default happens to be LE, so existing data was correct by accident, but the mismatch would break on any big-endian platform."
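To make the failure mode concrete, here is a small Python sketch of what happens when a reader and a writer disagree on byte order. The bar values are made up; only the format string comes from the pipeline described above.

```python
import struct

bar = (1_325_458_800_000, 1.2945, 1.2951, 1.2940, 1.2948, 1523)

little = struct.pack("<qddddi", *bar)  # explicit little-endian
big = struct.pack(">qddddi", *bar)     # explicit big-endian

# Same length, different byte layout.
print(len(little), len(big))   # 44 44
print(little == big)           # False

# Reading little-endian bytes with the wrong order yields a wrong timestamp.
(wrong_ts,) = struct.unpack_from(">q", little, 0)
(right_ts,) = struct.unpack_from("<q", little, 0)
print(right_ts == bar[0], wrong_ts == bar[0])  # True False

# Each 8-byte field is simply byte-reversed between the two encodings.
print(little[:8] == big[:8][::-1])  # True
```

Nothing in the file itself says which order was used, which is why the agreement has to be stated in the code on both sides.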

Backward Compatibility

The BarStore handles legacy files that used epoch seconds instead of milliseconds. The readTimestamp method checks a heuristic:

private static Instant readTimestamp(long raw) {
    return raw > 1_000_000_000_000L
        ? Instant.ofEpochMilli(raw)
        : Instant.ofEpochSecond(raw);
}

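Read as milliseconds, 1 trillion corresponds to 2001-09-09T01:46:40Z; read as seconds, it would land tens of thousands of years in the future. So values over 1 trillion are clearly millis (September 2001 onward), and anything at or below is treated as seconds. The trade-off is that a millisecond timestamp earlier than September 2001 would be misread as seconds, so the heuristic only holds if the millisecond-format files start after that date. This lets the system read old files without a re-conversion step. A unit test (read_supportsLegacyEpochSeconds) validates this with a manually constructed binary buffer.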
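The same cutoff can be exercised in a few lines of Python, which makes its boundary behaviour easy to see. This is a sketch of the heuristic in readTimestamp above; the function name to_epoch_millis and the sample timestamps are hypothetical.

```python
from datetime import datetime, timezone

MILLIS_THRESHOLD = 1_000_000_000_000  # same cutoff as readTimestamp above

def to_epoch_millis(raw):
    """Normalize a stored timestamp to epoch milliseconds using the cutoff heuristic."""
    return raw if raw > MILLIS_THRESHOLD else raw * 1000

# The cutoff itself, read as milliseconds.
cutoff = datetime.fromtimestamp(MILLIS_THRESHOLD / 1000, tz=timezone.utc)
print(cutoff.isoformat())  # 2001-09-09T01:46:40+00:00

# The same instant stored as legacy seconds and as millis normalizes identically.
print(to_epoch_millis(1_325_458_800) == to_epoch_millis(1_325_458_800_000))  # True

# A millisecond timestamp before the cutoff is misread as seconds.
y2000_millis = 946_684_800_000  # 2000-01-01T00:00:00Z in milliseconds
print(to_epoch_millis(y2000_millis) == y2000_millis)  # False
```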

The Module Architecture

The full Trading Bridge project is a monorepo of 10 Maven modules plus a Laravel dashboard that sits outside the Maven reactor, with a strict zero-circular-dependency rule. Here are the modules:

Trading Bridge Architecture

trading-core          Domain models, Strategy interface, Indicators
trading-backtest      BacktestEngine, RunContext, reports
trading-data          BarStore, HistoricalDataLoader, OANDA client
trading-broker        Broker connectors (OANDA REST, IBKR TWS)
trading-strategies    45+ creative and imported strategies
trading-runtime       ControlPlane HTTP+WS, EventStore, promote gates
trading-examples      RunBacktest CLI, golden tests
trading-parser        StrategyQuant XML to Java conversion
trading-genetics      Genetic strategy search (offline)
trading-tui           JLine3 terminal client
dashboard/            Laravel control room (outside Maven reactor)

The dependency graph is acyclic by design. trading-core depends on no other trading-* module: it defines the domain models (Bar, Order, Strategy, Position), and every other module depends on it. With no cycles, each module can be built and tested against only its declared dependencies.
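A zero-cycle rule like this can be checked mechanically with a topological sort. The sketch below uses Python's graphlib (3.9+) and is only an illustration: the edges to trading-core are the ones stated above, the real graph has further edges that are not listed here, and build_order is a hypothetical helper rather than anything from the project.

```python
from graphlib import CycleError, TopologicalSorter

MODULES = [
    "trading-core", "trading-backtest", "trading-data", "trading-broker",
    "trading-strategies", "trading-runtime", "trading-examples",
    "trading-parser", "trading-genetics", "trading-tui",
]

# module -> set of modules it depends on. Only the edges to trading-core are
# taken from the text; any other edges in the real project are omitted.
DEPENDENCIES = {
    m: (set() if m == "trading-core" else {"trading-core"}) for m in MODULES
}

def build_order(deps):
    """Return a valid build order, or raise CycleError if the graph has a cycle."""
    return list(TopologicalSorter(deps).static_order())

order = build_order(DEPENDENCIES)
print(order[0])  # trading-core

# A deliberately broken graph to show the failure mode.
broken = dict(DEPENDENCIES)
broken["trading-core"] = {"trading-data"}
try:
    build_order(broken)
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)  # True
```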

The Data Flow

Here is how the binary bars feed into the backtest engine:

Download: dukascopy-node fetches historical H1 data as CSV
Convert: Python struct.pack('<qddddi') writes binary .bars files
Load: HistoricalDataLoader opens BarStore files via memory-mapped I/O
Feed: BacktestEngine iterates bars, calling each strategy's onBar()
Record: Results go to RunEvent JSONL and an SQLite event store

The Golden Baseline

Every backtest run is validated against a golden baseline. The canonical numbers for LondonOpenRangeBreakout on EUR/USD H1 2012 are:

8760 bars, 61 trades
Total return: 0.1397% ($139.67 on $100k capital)
Max drawdown: 0.048%
Tolerance: +/-1% on return, +/-0.01 pp on drawdown

A smaller CI subset (744 bars, 3 trades) runs on every push so no local historical data is needed for the basic smoke test.
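A tolerance check along these lines could be sketched as follows. Two assumptions are baked in and may not match the project: the +/-1% on return is read as relative to the baseline return, and bar and trade counts are required to match exactly. The Baseline type and matches_golden function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Baseline:
    bars: int
    trades: int
    total_return_pct: float
    max_drawdown_pct: float

# Canonical numbers quoted above for LondonOpenRangeBreakout, EUR/USD H1 2012.
GOLDEN = Baseline(bars=8760, trades=61, total_return_pct=0.1397, max_drawdown_pct=0.048)

RETURN_REL_TOL = 0.01        # assumed: +/-1% relative to the baseline return
DRAWDOWN_ABS_TOL_PP = 0.01   # +/-0.01 percentage points

def matches_golden(result, golden=GOLDEN):
    """True if counts match exactly (an assumption) and metrics are within tolerance."""
    return_ok = (
        abs(result.total_return_pct - golden.total_return_pct)
        <= RETURN_REL_TOL * abs(golden.total_return_pct)
    )
    drawdown_ok = (
        abs(result.max_drawdown_pct - golden.max_drawdown_pct) <= DRAWDOWN_ABS_TOL_PP
    )
    counts_ok = result.bars == golden.bars and result.trades == golden.trades
    return counts_ok and return_ok and drawdown_ok

print(matches_golden(Baseline(8760, 61, 0.1400, 0.050)))  # True
print(matches_golden(Baseline(8760, 61, 0.1500, 0.050)))  # False
print(matches_golden(Baseline(8760, 60, 0.1397, 0.048)))  # False
```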

The Endianness Lesson

This bug was never visible in production. It was a latent time bomb that only code review caught. The lesson is simple: whenever two systems agree on a binary format, state the byte order explicitly in both. Do not rely on platform defaults, even if both systems currently run on x86.

A lot of software engineering is like this -- fixing things that are technically correct but accidentally so. The endianness fix did not change any behavior on my machine. It changed the contract from implicit to explicit, which matters when the system grows beyond one developer on one architecture.

What I Would Do Differently

I would have pinned the BarStore's byte order on day one, before the first conversion script ran. ByteOrder.LITTLE_ENDIAN is a single constant that would have saved a commit and a documentation note. But the process of verifying the fix, comparing Python output bytes to Java expectations, was itself educational and led to a robust test for legacy file support.

This is one module in a larger trading infrastructure. The backtest engine, strategy promotion pipeline with qualification gates, and broker reconciliation system each have their own stories.