Skip to content

modern_fm

Benchmark plan

Matapanino/modern_fm

Benchmark Plan¶

Benchmarks start in Phase 2 (no fast path exists before the Rust backend).

Goals¶

Measure fit speed, predict throughput (rows/sec), peak memory
Compare dense vs CSR paths
Compare against the pure-Python reference (sanity floor)
Optional: compare against libffm / xLearn when installed

Datasets¶

synthetic dense classification (small/medium)
synthetic sparse classification (one-hot, ~1e5–1e6 features)
synthetic field-aware sparse classification (FFM)
small CTR-like dataset (Criteo-like sample, not full Criteo)
Kaggle-style tabular encoded dataset

Metrics¶

wall-clock training time, prediction throughput
logloss, AUC, balanced accuracy (multiclass)
peak memory

Rules¶

fixed seeds, report machine specs, run 3x and report median
do not tune the library to a benchmark (CLAUDE.md rule)

Scripts live in benchmarks/: bench_synthetic.py, bench_criteo_like.py, bench_against_libffm.py.