Finetune-Informed Pretraining Boosts Downstream Performance

Atik Faysal; Mohammad Rostami; Reihaneh Gh. Roshan; Nikhil Muralidhar; Huaxia Wang

arXiv:2601.20884 · cs.LG · January 30, 2026

TL;DR

This paper introduces Finetune-Informed Pretraining (FIP), a simple, model-agnostic method that strengthens the representation of a designated target modality during multimodal pretraining, improving downstream performance when fine-tuning relies mainly on that modality.

Contribution

FIP biases pretraining toward the target modality by raising that modality's masking difficulty, loss weight, and decoder capacity, without modifying the shared encoder or requiring additional data or supervision.
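The three knobs named above can be illustrated with a minimal sketch. This is not the authors' implementation: the `ModalityConfig` class, the helper functions, the `"auxiliary"` modality name, and every numeric value (mask ratios, loss weights, decoder depths) are hypothetical choices made for illustration only. The sketch assumes each modality is a flat list of patch values and uses mean squared error on masked positions as the reconstruction loss.

```python
import random
from dataclasses import dataclass


@dataclass
class ModalityConfig:
    """Hypothetical per-modality pretraining knobs (names are illustrative)."""
    mask_ratio: float    # fraction of patches hidden from the encoder
    loss_weight: float   # multiplier on this modality's reconstruction loss
    decoder_depth: int   # number of decoder layers for this modality


def fip_configs(modalities, target, base=None, boost=None):
    """Give every modality the base settings and the target the boosted ones."""
    # Assumed values, not taken from the paper.
    base = base or ModalityConfig(mask_ratio=0.5, loss_weight=1.0, decoder_depth=2)
    boost = boost or ModalityConfig(mask_ratio=0.75, loss_weight=2.0, decoder_depth=4)
    return {m: (boost if m == target else base) for m in modalities}


def sample_mask(num_patches, mask_ratio, rng):
    """Return a boolean list with round(num_patches * mask_ratio) True entries."""
    num_masked = round(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def masked_mse(pred, target, mask):
    """Mean squared error computed over masked positions only."""
    errs = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    return sum(errs) / len(errs) if errs else 0.0


def total_loss(preds, targets, masks, configs):
    """Weighted sum of per-modality masked reconstruction losses."""
    return sum(
        configs[m].loss_weight * masked_mse(preds[m], targets[m], masks[m])
        for m in configs
    )
```

In this sketch the shared encoder is untouched: only the mask sampling, the loss weights, and the (here unused) `decoder_depth` setting differ between the target modality and the others. Calling `fip_configs(["constellation", "auxiliary"], target="constellation")` would give the constellation modality the harder masking and the larger loss weight.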

Findings

1. FIP improves downstream performance on wireless signal classification.

2. FIP does not require additional data or compute.

3. FIP is compatible with various multimodal masked modeling pipelines.

Abstract

Multimodal pretraining is effective for building general-purpose representations, but in many practical deployments, only one modality is heavily used during downstream fine-tuning. Standard pretraining strategies treat all modalities uniformly, which can lead to under-optimized representations for the modality that actually matters. We propose Finetune-Informed Pretraining (FIP), a model-agnostic method that biases representation learning toward a designated target modality needed at fine-tuning time. FIP combines higher masking difficulty, stronger loss weighting, and increased decoder capacity for the target modality, without modifying the shared encoder or requiring additional supervision. When applied to masked modeling on constellation diagrams for wireless signals, FIP consistently improves downstream fine-tuned performance with no extra data or compute. FIP is simple to…

Taxonomy

Topics: Speech and dialogue systems · Speech Recognition and Synthesis · Multimodal Machine Learning Applications