Architecture-Agnostic Curriculum Learning for Document Understanding: Empirical Evidence from Text-Only and Multimodal
Mohammed Hamdan, Vincenzo Dentamaro, Giuseppe Pirlo, Mohamed Cheriet

TL;DR
This study shows that progressive curriculum learning reduces training time for document understanding models, with benefits varying based on model capacity and task complexity, demonstrated through experiments on BERT and LayoutLMv3.
Contribution
It provides empirical evidence that progressive data scheduling can effectively reduce training time across different model architectures in document understanding.
Findings
Curriculum scheduling reduces wall-clock training time by approximately 33%.
The curriculum yields significant performance gains for BERT but not for LayoutLMv3.
Performance on the CORD dataset reaches a ceiling regardless of scheduling.
Abstract
We investigate whether progressive data scheduling -- a curriculum learning strategy that incrementally increases training data exposure (33% → 67% → 100%) -- yields consistent efficiency gains across architecturally distinct document understanding models. By evaluating BERT (text-only, 110M parameters) and LayoutLMv3 (multimodal, 126M parameters) on the FUNSD and CORD benchmarks, we establish that this schedule reduces wall-clock training time by approximately 33%, commensurate with the reduction from 10.0 to 6.67 effective epoch-equivalents of data. To isolate curriculum effects from compute reduction, we introduce matched-compute baselines (Standard-7) that control for total gradient updates. On the FUNSD dataset, the curriculum significantly outperforms the matched-compute baseline for BERT (ΔF1 = +0.023), constituting evidence…
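The 10.0 and 6.67 epoch-equivalent figures in the abstract can be reproduced with a short calculation. The sketch below is illustrative only: it assumes a 10-epoch budget split evenly across three phases that expose 1/3, 2/3, and all of the training set, which is one reading consistent with the reported numbers rather than a confirmed description of the authors' setup. The function name `effective_epochs` is hypothetical.

```python
from fractions import Fraction


def effective_epochs(total_epochs, data_fractions):
    """Epoch-equivalents of data seen when total_epochs are split evenly across phases.

    Assumption (not stated in the abstract): every phase runs for the same
    number of epochs, and phase i exposes data_fractions[i] of the dataset.
    """
    epochs_per_phase = Fraction(total_epochs, len(data_fractions))
    return sum(epochs_per_phase * f for f in data_fractions)


# Progressive schedule: one third, two thirds, then the full training set.
schedule = [Fraction(1, 3), Fraction(2, 3), Fraction(1)]

curriculum = effective_epochs(10, schedule)      # curriculum run
standard = effective_epochs(10, [Fraction(1)])   # full data for every epoch
saving = 1 - curriculum / standard               # relative reduction in data processed

print(f"curriculum: {float(curriculum):.2f} epoch-equivalents")
print(f"standard:   {float(standard):.2f} epoch-equivalents")
print(f"saving:     {float(saving):.0%}")
```

Under these assumptions the curriculum processes 20/3 ≈ 6.67 epoch-equivalents against 10 for standard training, a one-third reduction that matches the reported wall-clock saving. A matched-compute baseline of about 7 full-data epochs would be a natural reading of the name Standard-7, though the abstract does not spell this out.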
Taxonomy
Topics: Topic Modeling · Handwritten Text Recognition Techniques · Data Visualization and Analytics
