mForms : Multimodal Form-Filling with Question Answering

Larry Heck; Simon Heck; Anirudh Sundar

arXiv:2011.12340·cs.AI·March 26, 2024

TL;DR

This paper introduces mForms, a zero-shot multimodal form-filling method using question answering models, achieving high accuracy with minimal training data and providing a new dataset for future research.

Contribution

It reformulates form-filling as multimodal question answering, enabling zero-shot performance and introducing a new dataset for multimodal form-filling tasks.

Findings

1. Achieves state-of-the-art F1 of 0.97 on ATIS with limited training data.
2. Maintains robust accuracy in sparse training conditions.
3. Introduces a new multimodal form-filling dataset, mForms.

Abstract

This paper presents a new approach to form-filling by reformulating the task as multimodal natural language Question Answering (QA). The reformulation is achieved by first translating the elements on the GUI form (text fields, buttons, icons, etc.) to natural language questions, where these questions capture the element's multimodal semantics. After a match is determined between the form element (Question) and the user utterance (Answer), the form element is filled through a pre-trained extractive QA system. By leveraging pre-trained QA models and not requiring form-specific training, this approach to form-filling is zero-shot. The paper also presents an approach to further refine the form-filling by using multi-task training to incorporate a potentially large number of successive tasks. Finally, the paper introduces a multimodal natural language form-filling dataset Multimodal Forms…
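
The pipeline described in the abstract (form element to question, then extractive QA over the user utterance) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `FormElement`, `element_to_question`, `fill_form`, and the question template are hypothetical names chosen here, and `toy_extractive_qa` is a regex stand-in for the pre-trained extractive QA model the paper relies on. In practice, a real QA model would take its place behind the same `(question, context)` interface.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FormElement:
    """One GUI form element (hypothetical structure for illustration)."""
    name: str   # slot identifier used as the key in the filled form
    label: str  # visible text on the GUI element
    kind: str   # e.g. "text_field" or "button"


def element_to_question(element: FormElement) -> str:
    """Translate a form element into a natural language question.

    The template is an assumption; the paper's questions also capture
    multimodal semantics (icons, layout), which this sketch ignores.
    """
    return f"What is the {element.label.lower()}?"


# A QA model maps (question, context) to an answer span, or None.
QAModel = Callable[[str, str], Optional[str]]

# Toy cue table: question keyword -> regex capturing the answer span.
_TOY_CUES = {
    "departure": r"\bfrom ([A-Z]\w+)",
    "arrival": r"\bto ([A-Z]\w+)",
    "day": r"\bon ([A-Z]\w+)",
}


def toy_extractive_qa(question: str, context: str) -> Optional[str]:
    """Stand-in for a pre-trained extractive QA model (assumption)."""
    lowered = question.lower()
    for cue, pattern in _TOY_CUES.items():
        if cue in lowered:
            match = re.search(pattern, context)
            return match.group(1) if match else None
    return None


def fill_form(
    elements: List[FormElement],
    utterance: str,
    qa: QAModel = toy_extractive_qa,
) -> Dict[str, Optional[str]]:
    """Ask one question per element and fill it with the extracted span."""
    return {e.name: qa(element_to_question(e), utterance) for e in elements}


form = [
    FormElement("from_city", "Departure city", "text_field"),
    FormElement("to_city", "Arrival city", "text_field"),
    FormElement("day", "Travel day", "text_field"),
]
filled = fill_form(form, "I want to fly from Boston to Denver on Monday")
print(filled)
```

Because `fill_form` only depends on the `QAModel` signature, no form-specific training is needed in this sketch: adding a new element means adding a new question, which mirrors the zero-shot framing in the abstract.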

Taxonomy

Topics: Video Analysis and Summarization