LDF v0.1 · extraction wall-clock

How fast does the extractor run?

Per-document wall-clock, throughput, and the relationship between source size and extraction time. Measurements come from the same run-benchmark.ts harness used in /storage.
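As a rough illustration of what a per-document wall-clock measurement involves, the sketch below times one extraction call. It is not taken from run-benchmark.ts: `timeExtraction`, `TimingRow`, and the `extract` callback are hypothetical names, and the extractor is assumed to be synchronous for simplicity.

```typescript
// Hypothetical shape of one benchmark row; field names are illustrative only.
interface TimingRow {
  name: string;
  sourceBytes: number;
  wallClockMs: number;
}

// Times a single extraction call.
// `extract` stands in for the real extractor (assumed synchronous here).
// `now` is an injectable millisecond clock; Date.now is the default, and a
// higher-resolution clock can be passed in where one is available.
function timeExtraction(
  name: string,
  source: Uint8Array,
  extract: (bytes: Uint8Array) => unknown,
  now: () => number = Date.now,
): TimingRow {
  const start = now();
  extract(source);
  const wallClockMs = now() - start;
  return { name, sourceBytes: source.byteLength, wallClockMs };
}
```

Making the clock injectable keeps the helper testable with a fake clock and lets the caller choose the timer resolution.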

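Total wall-clock: 3.48 s (36 documents)
PDF avg / doc: 110 ms (Σ 2.63 s)
PPTX avg / doc: 71 ms (Σ 853 ms)
Throughput: 5.6 MB/s, effective input bytes per second. This figure equals the unweighted mean of the per-format rates in §3, (3.7 + 7.5) / 2; the pooled rate, Σ source bytes / Σ wall-clock, is 4.6 MB/s.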

§1 Wall-clock per document

Figure 1. Each row is one document, listed with its source size and extraction wall-clock, sorted by wall-clock in descending order.

PDFs (n = 24)

Chapter 5 Model Predictive Control.pdf · 1.32 MB 392 ms
diji01.pdf · 805.1 KB 377 ms
DIJJI.ai.pdf · 2.60 MB 344 ms
data0.pdf · 153.2 KB 312 ms
cal1-somepages.pdf · 1.06 MB 234 ms
rubin-pdf5.pdf · 1.12 MB 158 ms
Ali-Argun-Sayilgan-CV-ML.pdf · 117.4 KB 96 ms
2505.18706v3.pdf · 403.1 KB 94 ms
Chapter 5 Model Predictive Control-somepages.pdf · 298.5 KB 76 ms
cal1-somepages11.pdf · 191.3 KB 62 ms
06.pdf · 167.2 KB 50 ms
12-12.pdf · 191.1 KB 49 ms
08.pdf · 152.4 KB 47 ms
sat-complexnumbers0.pdf · 201.3 KB 45 ms
data0-10.pdf · 24.8 KB 44 ms
cal1-somepage1.pdf · 88.7 KB 41 ms
05.pdf · 132.1 KB 35 ms
data0-1.pdf · 22.3 KB 31 ms
07.pdf · 123.0 KB 28 ms
Chapter 5-11111.pdf · 126.9 KB 28 ms
2505.18706v3-9.pdf · 105.8 KB 26 ms
Ali_Argun_Sayilgan_CV.pdf · 104.1 KB 24 ms
letterspcwht.pdf · 131.2 KB 20 ms
matrixful1.pdf · 120.8 KB 19 ms

PPTXs (n = 12)

Chapter8-Pres.pptx · 1.39 MB 152 ms
cloud.pptx · 314.0 KB 124 ms
RUBIN UX UI.pptx · 510.5 KB 112 ms
teach-a-level-computing-1-data-structures-2018.pptx · 188.9 KB 97 ms
Recordkeeping_Software_Presentation.pptx · 972.0 KB 86 ms
1-Introduction.pptx · 1.14 MB 77 ms
onenote-math-features.pptx · 841.5 KB 58 ms
split_presentations_2.pptx · 154.1 KB 42 ms
charts-generated-basic.pptx · 68.7 KB 37 ms
charts-generated-extra.pptx · 68.6 KB 25 ms
PrimeFactorisation.pptx · 705.8 KB 22 ms
ink-maybedraw.pptx · 106.7 KB 21 ms

§2 Source-size vs wall-clock

Figure 2. Scatter of source bytes against extraction wall-clock on log-log axes. A roughly linear trend is expected; outliers usually reflect page count rather than byte count (for example, a 50-slide outline-only deck takes longer than a 5-page raster-heavy document).

[Figure 2 plot. x-axis: source size (log scale, 9.8 KB to 9.54 MB); y-axis: wall-clock (log scale, 10 ms to 10.00 s); series: PDF, PPTX, log-fit line. The plotted points are the per-document sizes and timings listed under Figure 1.]

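log-log fit slope ≈ 0.52. The slope is a dimensionless exponent: wall-clock scales roughly as (source bytes)^0.52, so each decade of source bytes multiplies extraction time by about 10^0.52 ≈ 3.3, which is sub-linear in byte count.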
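For readers who want to reproduce a fit of this kind, here is a minimal sketch of an ordinary least-squares fit in log-log space. The fitting method used by the report is not stated, so plain least squares is an assumption, and `logLogFit`, `factorPerDecade`, and `Point` are hypothetical names.

```typescript
interface Point {
  sourceBytes: number;
  wallClockMs: number;
}

// Ordinary least-squares fit of log10(wall-clock) against log10(source bytes).
// The slope is the exponent k in the power law: wallClock ≈ c * bytes^k.
function logLogFit(points: Point[]): { slope: number; intercept: number } {
  const pairs: Array<[number, number]> = points.map(
    (p): [number, number] => [Math.log10(p.sourceBytes), Math.log10(p.wallClockMs)],
  );
  const n = pairs.length;
  let sumX = 0;
  let sumY = 0;
  for (const [x, y] of pairs) {
    sumX += x;
    sumY += y;
  }
  const meanX = sumX / n;
  const meanY = sumY / n;

  let sxy = 0;
  let sxx = 0;
  for (const [x, y] of pairs) {
    sxy += (x - meanX) * (y - meanY);
    sxx += (x - meanX) * (x - meanX);
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Multiplicative growth in wall-clock for a tenfold increase in source size.
function factorPerDecade(slope: number): number {
  return Math.pow(10, slope);
}
```

With the reported slope, `factorPerDecade(0.52)` comes out at roughly 3.3, meaning a document ten times larger takes about 3.3 times as long under this fit.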

§3 Throughput

Corpus     Σ wall-clock   Σ source bytes   avg / doc   throughput
PDF        2.63 s         9.67 MB          110 ms      3.7 MB/s
PPTX       853 ms         6.37 MB          71 ms       7.5 MB/s
Combined   3.48 s         16.03 MB         97 ms       4.6 MB/s
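The throughput and average columns follow directly from the two totals in each row. The sketch below recomputes them from the rounded table values; the type and function names (`CorpusTotals`, `throughputMBps`, `avgMsPerDoc`, `pool`) are hypothetical and not taken from the harness.

```typescript
interface CorpusTotals {
  label: string;
  docs: number;
  totalMs: number; // Σ wall-clock in milliseconds
  totalMB: number; // Σ source bytes in megabytes
}

// Totals as printed in the table (already rounded there).
const pdf: CorpusTotals = { label: "PDF", docs: 24, totalMs: 2630, totalMB: 9.67 };
const pptx: CorpusTotals = { label: "PPTX", docs: 12, totalMs: 853, totalMB: 6.37 };

// Throughput is total input size divided by total wall-clock.
function throughputMBps(c: CorpusTotals): number {
  return c.totalMB / (c.totalMs / 1000);
}

// Average wall-clock per document.
function avgMsPerDoc(c: CorpusTotals): number {
  return c.totalMs / c.docs;
}

// Pools several corpora by summing their totals, so the combined rate is
// weighted by time spent rather than being a mean of the per-format rates.
function pool(label: string, parts: CorpusTotals[]): CorpusTotals {
  return parts.reduce(
    (acc, p) => ({
      label,
      docs: acc.docs + p.docs,
      totalMs: acc.totalMs + p.totalMs,
      totalMB: acc.totalMB + p.totalMB,
    }),
    { label, docs: 0, totalMs: 0, totalMB: 0 },
  );
}

const combined = pool("Combined", [pdf, pptx]);
for (const c of [pdf, pptx, combined]) {
  console.log(c.label, throughputMBps(c).toFixed(1), "MB/s", Math.round(avgMsPerDoc(c)), "ms/doc");
}
// PDF 3.7 MB/s 110 ms/doc
// PPTX 7.5 MB/s 71 ms/doc
// Combined 4.6 MB/s 97 ms/doc
```

Summing the rounded per-format sizes gives 16.04 MB rather than the table's 16.03 MB; the difference is rounding and does not change the combined rate at one decimal place.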