Improving Transfer Learning for Sequence Labeling Tasks by Adapting Pre-trained Neural Language Models
David Dukić

TL;DR
This thesis advances transfer learning for sequence labeling by developing new adaptation methods for pre-trained neural language models, including multi-task learning, architectural modifications, and response-oriented fine-tuning, leading to improved performance.
Contribution
It introduces novel adaptation techniques for pre-trained models, such as multi-task signals, bidirectional layer communication, and supervised in-context fine-tuning for sequence labeling.
Findings
Incorporating an additional signal into a multi-task model improves domain transfer for event trigger detection.
Bidirectional information flow improves model performance.
Supervised in-context fine-tuning boosts sequence labeling accuracy.
Abstract
This doctoral thesis improves transfer learning for sequence labeling tasks by adapting pre-trained neural language models. The proposed improvements are threefold: a multi-task model that incorporates an additional signal, a method based on architectural modifications to autoregressive large language models, and a sequence labeling framework for autoregressive large language models that combines supervised in-context fine-tuning with response-oriented adaptation strategies. The first improvement addresses domain transfer for the event trigger detection task, which can be improved by incorporating an additional signal, obtained from a domain-independent text processing system, into a multi-task model. The second improvement involves modifying the model's architecture. For that purpose, a…
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
