ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data
Yufan Shen, Chuwei Luo, Zhaoqing Zhu, Yang Chen, Qi Zheng, Zhi Yu, Jiajun Bu, Cong Yao

TL;DR
ProcTag introduces a novel process-oriented tagging method to evaluate and enhance document instruction datasets, significantly reducing the amount of data needed for effective training of document VQA models.
Contribution
The paper presents ProcTag, a new data-centric evaluation approach that assesses instruction efficacy through process-based tags, enabling better dataset filtering and sampling.
Findings
ProcTag outperforms existing evaluation methods.
With ProcTag-based sampling, only 30.5% of the instructions are needed to match the efficacy of the full dataset.
Sampling with ProcTag improves document VQA performance.
Abstract
Recently, large language models (LLMs) and multimodal large language models (MLLMs) have demonstrated promising results on the document visual question answering (VQA) task, particularly after training on document instruction datasets. An effective evaluation method for document instruction data is crucial in constructing instruction data with high efficacy, which, in turn, facilitates the training of LLMs and MLLMs for document VQA. However, most existing evaluation methods for instruction data are limited to the textual content of the instructions themselves, thereby hindering the effective assessment of document instruction datasets and constraining their construction. In this paper, we propose ProcTag, a data-oriented method that assesses the efficacy of document instruction data. ProcTag innovatively performs tagging on the execution process of instructions rather than the instruction…
Taxonomy
Topics: Data Quality and Management · Mathematics, Computing, and Information Processing · Semantic Web and Ontologies
