Quantitative Estimation of Target Task Performance from Unsupervised Pretext Task in Semi/Self-Supervised Learning

Lin-Han Jia; Si-Yu Han; Wen-Chao Hu; Jie-Jing Shao; Wen-Da Wei; Zhi Zhou; Lan-Zhe Guo; Yu-Feng Li

arXiv:2508.07299 · cs.LG · April 9, 2026

TL;DR

This paper develops a theoretical framework and a low-cost method for estimating target task performance in semi/self-supervised learning from the assumptions underlying unsupervised pretext tasks, before any training is run.

Contribution

The paper introduces a theory that ties the impact of a pretext task to the learnability, reliability, and completeness of its underlying assumption, and proposes a method for estimating target performance before training.

Findings

01 Estimated performance correlates strongly with actual performance.

02 A benchmark of over 100 pretext tasks is used to validate the estimation method.

03 The approach enables preemptive assessment of whether a pretext task suits the target scenario.
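
As a rough illustration of what "correlates strongly" means in practice, the sketch below computes Pearson and Spearman correlation between a list of estimated scores and a list of actual scores, one pair per pretext task. It is not the paper's evaluation code: the function names and the numbers in the usage example are hypothetical placeholders, and which correlation measure the authors report is not stated in this summary.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


if __name__ == "__main__":
    # Hypothetical placeholder scores, NOT results from the paper.
    estimated = [0.62, 0.71, 0.55, 0.80]
    actual = [0.60, 0.75, 0.50, 0.78]
    print("Pearson: ", pearson(estimated, actual))
    print("Spearman:", spearman(estimated, actual))
```

Spearman correlation is the more relevant of the two when the goal is to choose among candidate pretext tasks, since it only asks whether the estimate orders the tasks the same way the actual results do.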

Abstract

The effectiveness of unlabeled data in Semi/Self-Supervised Learning (SSL) depends on appropriate assumptions for specific scenarios, thereby enabling the selection of beneficial unsupervised pretext tasks. However, existing research has paid limited attention to assumptions in SSL, resulting in practical situations where the compatibility between the unsupervised pretext tasks and the target scenarios can only be assessed after training and validation. This paper centers on the assumptions underlying unsupervised pretext tasks and explores the feasibility of preemptively estimating the impact of unsupervised pretext tasks at low cost. Through rigorous derivation, we show that the impact of unsupervised pretext tasks on target performance depends on three factors: assumption learnability with respect to the model, assumption reliability with respect to the data, and assumption…
