Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5
Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Aline Villavicencio, Iryna Gurevych

TL;DR
This paper demonstrates that simple sequential fine-tuning of T5 models effectively transfers knowledge across tasks, achieving top performance on a complex figurative language inference and explanation task.
Contribution
It shows that straightforward sequential fine-tuning outperforms more complex multi-task learning in cross-task transfer for explainable NLP tasks.
Findings
Sequential fine-tuning yields high performance on interdependent tasks.
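In this setting, simple sequential fine-tuning outperforms the more complex multi-task model, which can be tuned to do well on the first target task but does less well on the second and struggles with overfitting.
The best model achieved the (tied) highest score on the FigLang2022 shared task.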
Abstract
We compare sequential fine-tuning with a model for multi-task learning in the context where we are interested in boosting performance on two tasks, one of which depends on the other. We test these models on the FigLang2022 shared task which requires participants to predict language inference labels on figurative language along with corresponding textual explanations of the inference predictions. Our results show that while sequential multi-task learning can be tuned to be good at the first of two target tasks, it performs less well on the second and additionally struggles with overfitting. Our findings show that simple sequential fine-tuning of text-to-text models is an extraordinarily powerful method for cross-task knowledge transfer while simultaneously predicting multiple interdependent targets. So much so, that our best model achieved the (tied) highest score on the task.
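The text-to-text setup described in the abstract, where one model emits both the inference label and its explanation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt prefix, the " explanation: " separator, and the names Example, to_text_pair, parse_output, and sequential_fine_tune are all hypothetical, and the training step is passed in as a caller-supplied function rather than tied to any particular library.

```python
from typing import Callable, List, NamedTuple, Sequence, Tuple, TypeVar

Model = TypeVar("Model")
TextPair = Tuple[str, str]

# Hypothetical separator between the label and the explanation in the target.
SEPARATOR = " explanation: "


class Example(NamedTuple):
    premise: str
    hypothesis: str
    label: str         # e.g. "Entailment" or "Contradiction"
    explanation: str   # free-text justification of the label


def to_text_pair(ex: Example) -> TextPair:
    """Render one example as a (source, target) string pair.

    The prefix and field names are assumptions made for illustration.
    """
    source = f"figurative nli premise: {ex.premise} hypothesis: {ex.hypothesis}"
    target = f"{ex.label}{SEPARATOR}{ex.explanation}"
    return source, target


def parse_output(text: str) -> Tuple[str, str]:
    """Split generated text back into (label, explanation).

    If the separator is missing, the whole text is treated as the label
    and the explanation is returned as an empty string.
    """
    label, _, explanation = text.partition(SEPARATOR)
    return label.strip(), explanation.strip()


def sequential_fine_tune(
    model: Model,
    stages: Sequence[Tuple[str, List[TextPair]]],
    train_step: Callable[[Model, List[TextPair]], Model],
) -> Model:
    """Fine-tune on each stage in order, carrying the model forward.

    `train_step` is a placeholder for whatever fine-tuning routine is used;
    each stage starts from the model produced by the previous stage.
    """
    for _stage_name, pairs in stages:
        model = train_step(model, pairs)
    return model
```

Because the label and the explanation share a single target string, the second prediction is conditioned on the first during decoding, which is one way to handle the task dependency the abstract mentions. The stage list would typically end with the target task's own training data, though the exact ordering of stages here is left to the caller.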
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Test
