AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection