Improving Text-based Early Prediction by Distillation from Privileged Time-Series Text
Jinghui Liu, Daniel Capurro, Anthony Nguyen, Karin Verspoor

TL;DR
This paper introduces LuPIET, a novel knowledge distillation approach that leverages privileged longer-term text data during training to improve early prediction accuracy in time-series NLP tasks.
Contribution
It presents the first application of privileged information learning for time-series in NLP, enhancing early prediction models through a new distillation method.
Findings
LuPIET improves early prediction performance across clinical and social media datasets.
Compared to transfer learning and mixed training, LuPIET provides more stable and consistent improvements.
Optimal text representation and window selection are crucial for maximizing LuPIET's effectiveness.
Abstract
Modeling text-based time-series to make predictions about a future event or outcome is an important task with a wide range of applications. The standard approach is to train and test the model using the same input window, but this approach neglects the data collected in longer input windows between the prediction time and the final outcome, which are often available during training. In this study, we propose to treat this neglected text as privileged information available during training to enhance early prediction modeling through knowledge distillation, presented as Learning using Privileged tIme-sEries Text (LuPIET). We evaluate the method on clinical and social media text, with four clinical prediction tasks based on clinical notes and two mental health prediction tasks based on social media posts. Our results show LuPIET is effective in enhancing text-based early predictions, though…
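The distillation idea in the abstract can be illustrated with a minimal sketch. It assumes a generic soft-label distillation objective, in which a teacher that saw the longer (privileged) window guides a student that only sees the early window. The abstract does not give the exact loss used by LuPIET, so the function names, the `alpha` weighting, and the `temperature` parameter below are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Hypothetical combined loss (an assumption, not taken from the paper).

    student_logits: output of the model given only the early input window.
    teacher_logits: output of a model given the longer, privileged window.
    alpha:          weight on the hard-label term (assumed hyperparameter).
    """
    # Hard-label term: the student must still predict the true outcome.
    hard = cross_entropy(softmax(student_logits), label)
    # Soft-label term: the student matches the teacher's softened distribution.
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # The T^2 factor is the usual rescaling of the soft term in distillation.
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft


# Toy example: the teacher (longer window) is more confident in class 1.
loss = distillation_loss(student_logits=[0.2, 0.4],
                         teacher_logits=[-1.0, 2.0],
                         label=1)
```

At test time only the student is used, so the privileged longer window is never required for inference; the teacher affects only the training signal.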
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Mental Health via Writing
Methods: Test
