Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks
Le Xue, Mingfei Gao, Zeyuan Chen, Caiming Xiong, Ran Xu

TL;DR
This paper introduces a framework to evaluate the robustness of transformer-based form field extractors against various form attacks, revealing their vulnerability to perturbations and suggesting improvements.
Contribution
It presents 14 novel form transformations for robustness testing and provides a comprehensive analysis of model vulnerabilities using real-world invoices and receipts.
Findings
Models drop ~15% in F1 score under field-value variations
Disarrangement of input text causes ~15% F1 score decrease
Disruption of neighboring words reduces F1 score by ~10%
Abstract
We propose a novel framework to evaluate the robustness of transformer-based form field extraction methods via form attacks. We introduce 14 novel form transformations to evaluate the vulnerability of state-of-the-art field extractors to form attacks at both the OCR level and the form level, including OCR location/order rearrangement, form background manipulation and form field-value augmentation. We conduct robustness evaluation using real invoices and receipts, and perform a comprehensive analysis. Experimental results suggest that the evaluated models are very susceptible to form perturbations such as the variation of field-values (~15% drop in F1 score), the disarrangement of input text order (~15% drop in F1 score) and the disruption of the neighboring words of field-values (~10% drop in F1 score). Guided by the analysis, we make recommendations to improve the design of…
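To make the OCR-level attacks mentioned in the abstract more concrete, here is a minimal Python sketch of two perturbations (reordering OCR words and shifting their bounding boxes) plus a helper that reports an F1 drop. It is not the authors' implementation: `OcrWord`, `shuffle_order`, `shift_boxes`, and `f1_drop` are hypothetical names, the box format is assumed to be pixel coordinates `(x0, y0, x1, y1)`, and the drop is assumed to be an absolute difference in F1.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class OcrWord:
    """One OCR token (hypothetical structure, assumed for illustration)."""
    text: str
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def shuffle_order(words: List[OcrWord], seed: int = 0) -> List[OcrWord]:
    """Order attack: permute the reading order, leaving text and boxes intact."""
    rng = random.Random(seed)
    shuffled = list(words)  # copy so the input list is not mutated
    rng.shuffle(shuffled)
    return shuffled


def shift_boxes(words: List[OcrWord], max_shift: int, seed: int = 0) -> List[OcrWord]:
    """Location attack: translate each box by a random offset of up to max_shift pixels."""
    rng = random.Random(seed)
    shifted = []
    for word in words:
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        x0, y0, x1, y1 = word.box
        # A pure translation keeps the box width and height unchanged.
        shifted.append(replace(word, box=(x0 + dx, y0 + dy, x1 + dx, y1 + dy)))
    return shifted


def f1_drop(clean_f1: float, attacked_f1: float) -> float:
    """Absolute F1 drop between clean and attacked inputs (assumed definition)."""
    return clean_f1 - attacked_f1
```

In a robustness evaluation along these lines, one would run the same extractor on the original and the perturbed word lists and compare the two F1 scores; the extractor itself is deliberately left out of the sketch.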
Taxonomy
Topics: Digital and Cyber Forensics · Advanced Malware Detection Techniques · Security and Verification in Computing
