PyTAIL: Interactive and Incremental Learning of NLP Models with Human in the Loop for Online Data
Shubhanshu Mishra, Jana Diesner

TL;DR
PyTAIL is a Python library that enables interactive, incremental NLP model training with human input, adapting features and models over time to handle evolving data streams, especially in social media text classification.
Contribution
It introduces a flexible human-in-the-loop framework that combines active learning with feature and rule updates for NLP models in streaming data environments.
Findings
PyTAIL achieves high accuracy using only 10% of the labeled data.
Active learning strategies improve labeling efficiency.
Tracking evaluation metrics on remaining data enhances model assessment.
Abstract
Online data streams make training machine learning models hard because of distribution shift and new patterns emerging over time. For natural language processing (NLP) tasks that use a collection of features based on lexicons and rules, it is important to adapt these features to the changing data. To address this challenge, we introduce PyTAIL, a Python library that allows a human-in-the-loop approach to actively train NLP models. PyTAIL enhances generic active learning, which only suggests new instances to label, by also suggesting new features like rules and lexicons to label. Furthermore, PyTAIL is flexible enough for users to accept, reject, or update rules and lexicons as the model is being trained. Finally, we simulate the performance of PyTAIL on existing social media benchmark datasets for text classification. We compare various active learning strategies on these…
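The loop the abstract describes (suggest an instance to label, suggest lexicon entries, let the user accept or reject them) can be illustrated with a minimal, standard-library sketch. This is not PyTAIL's API: every name below (`score`, `most_uncertain`, `suggest_terms`, `run_loop`, `accuracy`), the toy posts, the seed lexicon, and the stopword-based `accept` rule are hypothetical, and a simple lexicon scorer with uncertainty sampling stands in for whatever models and strategies the paper actually uses.

```python
from collections import Counter

# Hypothetical sketch, NOT PyTAIL's API. Labels are +1 / -1.

def score(text, lexicon):
    """Mean polarity of the tokens found in the lexicon (0.0 if none match)."""
    hits = [lexicon[t] for t in text.lower().split() if t in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

def most_uncertain(pool, lexicon):
    """Uncertainty sampling: the unlabeled text whose score is closest to 0."""
    return min(pool, key=lambda text: abs(score(text, lexicon)))

def suggest_terms(labeled, lexicon):
    """Propose (token, label) pairs for tokens seen under exactly one label."""
    seen = {}
    for text, label in labeled:
        for tok in set(text.lower().split()):
            if tok not in lexicon:
                seen.setdefault(tok, Counter())[label] += 1
    return sorted((tok, next(iter(c))) for tok, c in seen.items() if len(c) == 1)

def run_loop(pool, oracle, lexicon, accept, rounds):
    """Simulate labeling rounds; `oracle` plays the annotator, `accept` the reviewer."""
    pool, lexicon, labeled = list(pool), dict(lexicon), []
    for _ in range(min(rounds, len(pool))):
        text = most_uncertain(pool, lexicon)
        pool.remove(text)
        labeled.append((text, oracle[text]))      # instance label from the human
        for tok, label in suggest_terms(labeled, lexicon):
            if accept(tok, label):                # human accepts or rejects the term
                lexicon[tok] = label
    return lexicon, labeled, pool

def accuracy(texts, oracle, lexicon):
    """Accuracy on the given texts; ties (score 0) are predicted as +1."""
    correct = sum((1 if score(t, lexicon) >= 0 else -1) == oracle[t] for t in texts)
    return correct / len(texts)

# Toy data (assumed for illustration only).
posts = {
    "great phone love it": 1,
    "awful battery hate it": -1,
    "love the camera": 1,
    "hate the screen": -1,
}
seed_lexicon = {"great": 1, "awful": -1}
stopwords = {"it", "the"}

lexicon, labeled, remaining = run_loop(
    list(posts), posts, seed_lexicon,
    accept=lambda tok, label: tok not in stopwords,
    rounds=2,
)
remaining_accuracy = accuracy(remaining, posts, lexicon)
```

Evaluating on `remaining`, the posts that were never sent for labeling, mirrors the idea of tracking metrics on the remaining data during a simulated run.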
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Topic Modeling
Methods: Test
