FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction
Minh Van Nguyen, Nghia Trung Ngo, Bonan Min, Thien Huu Nguyen

TL;DR
FAMIE is an efficient active learning framework for multilingual information extraction that uses a proxy network and knowledge distillation to accelerate annotation without sacrificing model performance.
Contribution
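The paper introduces an active learning framework that pairs a small proxy network for fast data selection with a novel knowledge distillation mechanism that keeps the proxy synchronized with the main model, improving annotation efficiency in multilingual information extraction.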
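To make the role of the proxy network concrete, here is a minimal sketch of proxy-driven data selection. The summary does not say which selection criterion FAMIE uses, so the entropy-based uncertainty score below is an assumption, and `entropy`, `select_batch`, and `proxy_predict` are hypothetical names chosen for illustration rather than part of the FAMIE toolkit.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_batch(proxy_predict, unlabeled, batch_size):
    """Pick the examples the proxy model is least certain about.

    proxy_predict: hypothetical callable mapping an example to a list of
        label probabilities (stands in for the small proxy network).
    unlabeled: list of unlabeled examples.
    batch_size: number of examples to send to the annotator.
    """
    # Score every example with the cheap proxy instead of the large main model.
    scored = [(entropy(proxy_predict(x)), i) for i, x in enumerate(unlabeled)]
    # Highest entropy first; ties are broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [unlabeled[i] for _, i in scored[:batch_size]]
```

Because only the small proxy is evaluated during selection, the next batch can be prepared without waiting for the large main model to finish training.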
Findings
FAMIE achieves competitive performance in sequence labeling tasks.
FAMIE shortens the time annotators wait between annotation batches, compared with AL frameworks that train the main model and select data at every iteration.
The framework supports multiple languages effectively.
Abstract
This paper presents FAMIE, a comprehensive and efficient active learning (AL) toolkit for multilingual information extraction. FAMIE is designed to address a fundamental problem in existing AL frameworks where annotators need to wait for a long time between annotation batches due to the time-consuming nature of model training and data selection at each AL iteration. This hinders the engagement, productivity, and efficiency of annotators. Based on the idea of using a small proxy network for fast data selection, we introduce a novel knowledge distillation mechanism to synchronize the proxy network with the main large model (i.e., BERT-based) to ensure the appropriateness of the selected annotation examples for the main model. Our AL framework can support multiple languages. The experiments demonstrate the advantages of FAMIE in terms of competitive performance and time efficiency for…
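The synchronization step described in the abstract can be illustrated with a generic knowledge distillation loss. The abstract does not give FAMIE's exact objective, so this is only a sketch under assumptions: the main model acts as the teacher, the proxy as the student, and the loss is the KL divergence between temperature-softened label distributions. The function names and the `temperature` parameter are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(main_logits, proxy_logits, temperature=2.0):
    """KL(main || proxy) between softened label distributions.

    main_logits: logits from the large main model (teacher, assumed).
    proxy_logits: logits from the small proxy network (student, assumed).
    temperature: softening factor; the value 2.0 is an arbitrary choice.
    """
    p = softmax(main_logits, temperature)
    q = softmax(proxy_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
```

The loss is zero when the proxy reproduces the main model's distribution and grows as the two diverge, so minimizing it pulls the proxy's predictions toward those of the main model.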
Taxonomy
Topics: Machine Learning and Algorithms · Topic Modeling · Web Data Mining and Analysis
Methods: Knowledge Distillation
