LDF v0.1 · extraction wall-clock
How fast does the extractor run?
Per-document wall-clock, throughput, and the relationship between source size and extraction time. Measurements come from the same run-benchmark.ts harness used in /storage.
Total wall-clock
3.48 s
36 documents
PDF avg / doc
110 ms
Σ 2.63 s
PPTX avg / doc
71 ms
Σ 853 ms
Throughput
4.6 MB/s
effective input bytes per second
§1 Wall-clock per document
Figure 1. Each row is one document; bar length is wall-clock.
PDFs (n = 24)
PPTXs (n = 12)
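The per-document figures are wall-clock times around a single extraction call. A minimal sketch of how such a timing could be taken follows; it is not the run-benchmark.ts code, and `timeExtraction`, `summarize`, and the injectable `now` clock are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper, NOT taken from run-benchmark.ts.
interface TimedResult<T> {
  name: string;
  wallClockMs: number;
  result: T;
}

// Times one synchronous extraction call.
// The clock is injectable so the helper can be tested deterministically.
// Date.now() only has millisecond resolution; a monotonic high-resolution
// clock is usually a better choice for real measurements.
function timeExtraction<T>(
  name: string,
  extract: () => T,
  now: () => number = () => Date.now(),
): TimedResult<T> {
  const start = now();
  const result = extract();
  const wallClockMs = now() - start;
  return { name, wallClockMs, result };
}

// Aggregates per-document timings into a corpus total (Σ) and an average.
function summarize(
  timings: { wallClockMs: number }[],
): { totalMs: number; avgMs: number } {
  const totalMs = timings.reduce((sum, t) => sum + t.wallClockMs, 0);
  const avgMs = timings.length > 0 ? totalMs / timings.length : 0;
  return { totalMs, avgMs };
}
```

Each bar in Figure 1 would correspond to one `wallClockMs` value, and the Σ and avg / doc figures to the output of `summarize` over one corpus.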
§2 Source-size vs wall-clock
Figure 2. Scatter of source-bytes against extraction wall-clock. A roughly linear relationship is expected; outliers usually reflect page count rather than byte count (a 50-slide outline-only deck takes longer than a 5-page raster-heavy document).
log-log fit slope ≈ 0.52 (wall-clock grows roughly as source bytes^0.52, i.e. sublinearly: a 10× larger source takes about 3.3× longer)
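A log-log slope is the exponent of a power-law fit: regressing log(wall-clock) on log(source bytes) by ordinary least squares gives the k in time ∝ bytes^k. The sketch below shows one way to compute it. It assumes plain least squares, which may not be the exact method behind the reported value, and `Sample` and `logLogSlope` are hypothetical names.

```typescript
// Hypothetical types and names; a sketch of an ordinary least-squares fit
// in log-log space, not necessarily the fit used for the reported slope.
interface Sample {
  bytes: number; // source size in bytes
  ms: number;    // extraction wall-clock in milliseconds
}

function logLogSlope(samples: Sample[]): number {
  // Logarithms are only defined for positive values. The base of the
  // logarithm does not affect the slope, so natural log is used here.
  const pts = samples
    .filter((s) => s.bytes > 0 && s.ms > 0)
    .map((s) => ({ x: Math.log(s.bytes), y: Math.log(s.ms) }));
  if (pts.length < 2) {
    throw new Error("need at least two samples with positive bytes and ms");
  }

  const n = pts.length;
  const meanX = pts.reduce((acc, p) => acc + p.x, 0) / n;
  const meanY = pts.reduce((acc, p) => acc + p.y, 0) / n;

  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (const p of pts) {
    sxy += (p.x - meanX) * (p.y - meanY);
    sxx += (p.x - meanX) * (p.x - meanX);
  }
  if (sxx === 0) {
    throw new Error("all samples have the same source size");
  }
  return sxy / sxx;
}

// Factor by which wall-clock grows when the source gets 10x larger.
function growthPerDecade(slope: number): number {
  return Math.pow(10, slope);
}
```

With a slope of 0.52, `growthPerDecade(0.52)` is about 3.3, so the fit predicts that a tenfold increase in source bytes roughly triples extraction time.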
§3 Throughput
| Corpus | Σ wall-clock | Σ source bytes | avg / doc | throughput |
|---|---|---|---|---|
| PDF | 2.63 s | 9.67 MB | 110 ms | 3.7 MB/s |
| PPTX | 853 ms | 6.37 MB | 71 ms | 7.5 MB/s |
| Combined | 3.48 s | 16.03 MB | 97 ms | 4.6 MB/s |
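The table's derived columns follow from the two Σ columns and the document counts (24 PDFs, 12 PPTXs). As a quick arithmetic check, the sketch below recomputes them. The `CorpusRow` shape and the function names are hypothetical, and the inputs are the rounded values shown on this page.

```typescript
// Hypothetical row shape; values are the rounded figures from the table,
// with document counts taken from the Figure 1 panel titles.
interface CorpusRow {
  corpus: string;
  wallClockS: number; // Σ wall-clock in seconds
  sourceMB: number;   // Σ source bytes in MB
  docs: number;       // number of documents
}

const pdf: CorpusRow = { corpus: "PDF", wallClockS: 2.63, sourceMB: 9.67, docs: 24 };
const pptx: CorpusRow = { corpus: "PPTX", wallClockS: 0.853, sourceMB: 6.37, docs: 12 };

// Combined totals are sums of the per-corpus totals.
function combine(rows: CorpusRow[]): CorpusRow {
  const total: CorpusRow = { corpus: "Combined", wallClockS: 0, sourceMB: 0, docs: 0 };
  for (const r of rows) {
    total.wallClockS += r.wallClockS;
    total.sourceMB += r.sourceMB;
    total.docs += r.docs;
  }
  return total;
}

// Throughput is total source size divided by total wall-clock.
function throughputMBps(row: CorpusRow): number {
  return row.sourceMB / row.wallClockS;
}

// Average per document is total wall-clock divided by document count.
function avgMsPerDoc(row: CorpusRow): number {
  return (row.wallClockS * 1000) / row.docs;
}

const combined = combine([pdf, pptx]);
for (const row of [pdf, pptx, combined]) {
  console.log(
    `${row.corpus}: ${avgMsPerDoc(row).toFixed(0)} ms/doc, ` +
      `${throughputMBps(row).toFixed(1)} MB/s`,
  );
}
```

Combined throughput is total bytes over total time, which gives about 4.6 MB/s. Averaging the two per-corpus rates instead would give about 5.6 MB/s, which overweights the smaller, faster PPTX corpus.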