Meta-Tsallis-Entropy Minimization: A New Self-Training Approach for Domain Adaptation on Text Classification

Menglong Lu; Zhen Huang; Zhiliang Tian; Yunxiang Zhao; Xuanyu Fei; Dongsheng Li

arXiv:2308.02746·cs.CL·August 8, 2023

TL;DR

This paper introduces Meta-Tsallis Entropy Minimization (MTEM), a meta-learning-based self-training method for domain adaptation in text classification. MTEM optimizes an instance-adaptive Tsallis entropy on the target domain to make pseudo-labeling more robust and to improve model transfer across domains.

Contribution

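The paper proposes MTEM, a meta-learning approach that optimizes an instance-adaptive Tsallis entropy for domain adaptation in text classification, together with an approximation of the second-order derivative to reduce computation cost and an annealing sampling mechanism for generating pseudo-labels.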

Findings

01

MTEM improves BERT's domain adaptation performance by an average of 4% on benchmark datasets.

02

The paper provides a theoretical proof of convergence for the meta-learning algorithm in MTEM.

03

MTEM effectively reduces sensitivity to prediction errors during self-training.

Abstract

Text classification is a fundamental task in natural language processing, and adapting text classification models across domains has broad applications. Self-training generates pseudo-examples from the model's predictions and iteratively trains on them, i.e., it minimizes the loss on the source domain and the Gibbs entropy on the target domain. However, Gibbs entropy is sensitive to prediction errors, so self-training tends to fail when the domain shift is large. In this paper, we propose Meta-Tsallis Entropy Minimization (MTEM), which applies a meta-learning algorithm to optimize the instance-adaptive Tsallis entropy on the target domain. To reduce the computation cost of MTEM, we propose a technique that approximates the second-order derivative involved in the meta-learning. To efficiently generate pseudo labels, we propose an annealing sampling…
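The contrast the abstract draws between Gibbs entropy and Tsallis entropy can be illustrated with a minimal sketch based on the standard textbook definitions. The function names and the fixed entropy index `q` are illustrative assumptions: the paper learns an instance-adaptive index through meta-learning, which this sketch does not attempt to reproduce.

```python
import math


def gibbs_entropy(probs):
    """Gibbs (Shannon) entropy: -sum(p * log p), skipping zero-probability classes."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def tsallis_entropy(probs, q):
    """Tsallis entropy: (1 - sum(p ** q)) / (q - 1), for an entropy index q > 0.

    At q = 1 the formula is undefined, so we return its limit,
    which is the Gibbs entropy.
    """
    if abs(q - 1.0) < 1e-8:
        return gibbs_entropy(probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Hypothetical predictions on two target-domain examples (binary classification).
confident = [0.9, 0.1]
uncertain = [0.5, 0.5]

# A single fixed q per run is an assumption made here for illustration only.
for q in (1.0, 1.5, 2.0):
    print(q, tsallis_entropy(confident, q), tsallis_entropy(uncertain, q))
```

In this sketch, `q` controls how the entropy weights each class probability, and `q = 1` recovers the Gibbs entropy used by standard self-training. Choosing `q` per instance is the part that MTEM delegates to meta-learning.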

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Cancer-related molecular mechanisms research · Text and Document Classification Technologies

Methods: Attention Is All You Need · Linear Layer · Adam · Dense Connections · Residual Connection · Dropout · WordPiece · Multi-Head Attention