Depth $F_1$: Improving Evaluation of Cross-Domain Text Classification by Measuring Semantic Generalizability

Parker Seegmiller, Joseph Gatto, and Sarah Masud Preum

arXiv:2406.14695 · cs.CL · June 24, 2024

TL;DR

Depth F1 is a new metric for cross-domain text classification that evaluates how well models perform on target samples dissimilar from the source, addressing limitations of existing evaluation methods.

Contribution

The paper introduces Depth F1, a novel metric that measures semantic generalizability by focusing on dissimilar target samples in cross-domain classification.

Findings

01 Depth F1 effectively highlights models' performance on dissimilar target samples.

02 Benchmarking shows varying model capabilities in semantic transfer.

03 Depth F1 complements existing metrics by providing deeper evaluation insights.

Abstract

Recent evaluations of cross-domain text classification models aim to measure the ability of a model to obtain domain-invariant performance in a target domain given labeled samples in a source domain. The primary strategy for this evaluation relies on assumed differences between source domain samples and target domain samples in benchmark datasets. This evaluation strategy fails to account for the similarity between source and target domains, and may mask cases where models fail to transfer learning to specific target samples that are highly dissimilar from the source domain. We introduce Depth $F_{1}$, a novel cross-domain text classification performance metric. Designed to be complementary to existing classification metrics such as $F_{1}$, Depth $F_{1}$ measures how well a model performs on target samples that are dissimilar from the source domain. We motivate this metric using standard…
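The idea in the abstract, scoring a model only on the target samples least similar to the source domain, can be illustrated with a rough sketch. This is not the paper's definition of Depth $F_{1}$: the dissimilarity score used here (mean cosine similarity between a target embedding and all source embeddings), the `fraction` cutoff, and the function names `binary_f1` and `f1_on_dissimilar` are all assumptions made for illustration only.

```python
import math
import numpy as np


def binary_f1(y_true, y_pred):
    """Plain binary F1 for labels in {0, 1}; returns 0.0 when undefined."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_on_dissimilar(source_emb, target_emb, y_true, y_pred, fraction=0.5):
    """F1 restricted to the target samples least similar to the source.

    ASSUMPTION: similarity is the mean cosine similarity to all source
    embeddings. The paper's actual scoring function may differ.
    """
    src = np.asarray(source_emb, dtype=float)
    tgt = np.asarray(target_emb, dtype=float)
    # Normalize rows so that dot products are cosine similarities.
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    tgt = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sim = (tgt @ src.T).mean(axis=1)
    # Keep the `fraction` of target samples with the lowest similarity.
    k = max(1, math.ceil(fraction * len(tgt)))
    idx = np.argsort(sim)[:k]
    return binary_f1(np.asarray(y_true)[idx], np.asarray(y_pred)[idx])


# Toy data: two target samples resemble the source, two do not.
source = [[1.0, 0.0], [1.0, 0.0]]
target = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
labels = [1, 1, 1, 0]
preds = [1, 1, 0, 0]

print("F1 on all target samples:", binary_f1(labels, preds))
print("F1 on dissimilar half:  ", f1_on_dissimilar(source, target, labels, preds))
```

In this toy case the model is correct on every target sample that resembles the source and misses the one positive among the dissimilar samples, so the subset score falls well below the overall score. That gap is the kind of masked failure the abstract describes.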

Code & Models

Repositories

pkseeg/df1 (official)

Taxonomy

Topics: Text and Document Classification Technologies