Segment-Level Attribution for Selective Learning of Long Reasoning Traces
Siyuan Wang, Yanchen Liu, Xiang Ren

TL;DR
This paper introduces a segment-level attribution method using integrated gradients to identify and focus on reflective reasoning segments in long chains of thought, improving model accuracy and efficiency.
Contribution
It proposes a novel segment-level attribution framework with selective finetuning to enhance reasoning quality in large models.
Findings
Improves reasoning accuracy across multiple datasets.
Reduces unnecessary verbose content in generated traces.
Enhances learning efficiency by focusing on important segments.
Abstract
Large Reasoning Models (LRMs) achieve strong reasoning performance by generating long chains of thought (CoTs), yet only a small fraction of these traces meaningfully contributes to answer prediction, while the majority contains repetitive or truncated content. Such output redundancy is further propagated after supervised finetuning (SFT), as models learn to imitate verbose but uninformative patterns, which can degrade performance. To this end, we incorporate integrated gradient attribution to quantify each token's influence on final answers and aggregate them into two segment-level metrics: (1) \textit{attribution strength} measures the overall attribution magnitude; and (2) \textit{direction consistency} captures whether tokens' attributions within a segment are uniformly positive or negative (high consistency), or a mixture of both (moderate consistency). Based on these two metrics,…
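As a rough illustration of the two segment-level metrics described in the abstract, the following Python sketch approximates integrated gradients with a midpoint Riemann sum and then aggregates per-token attributions into per-segment scores. Everything in it is an assumption made for illustration: the toy quadratic scoring function, the names `integrated_gradients` and `segment_metrics`, the normalization of strength by the trace's total absolute attribution, and the definition of consistency as |sum of attributions| divided by the sum of |attributions|. The abstract does not state the paper's exact formulas, so this is a sketch of the general idea and not the authors' implementation.

```python
import math


def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum.

    grad_fn(point) must return the gradient of the scalar output with
    respect to each input, as a list with the same length as `point`.
    """
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        for i in range(n):
            avg_grad[i] += grad[i] / steps
    # Scale the path-averaged gradient by the input displacement.
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def segment_metrics(token_attr, segments):
    """Aggregate token attributions into (strength, consistency) pairs.

    segments: list of (start, end) token index pairs, end exclusive.
    Both formulas are illustrative assumptions, not the paper's definitions.
    """
    total_abs = sum(abs(a) for a in token_attr)
    metrics = []
    for start, end in segments:
        attr = token_attr[start:end]
        abs_sum = sum(abs(a) for a in attr)
        # Strength: the segment's share of the total absolute attribution.
        strength = abs_sum / total_abs if total_abs > 0 else 0.0
        # Consistency: 1.0 if all signs agree, lower when signs cancel.
        consistency = abs(sum(attr)) / abs_sum if abs_sum > 0 else 0.0
        metrics.append((strength, consistency))
    return metrics


# Hypothetical toy "model": f(x) = sum(w_i * x_i**2), one scalar per token.
w = [1.0, 2.0, -1.0, 1.0, 0.5, 0.5]
x = [1.0] * 6
baseline = [0.0] * 6


def toy_grad(point):
    return [2.0 * wi * pi for wi, pi in zip(w, point)]


token_attr = integrated_gradients(toy_grad, x, baseline)
metrics = segment_metrics(token_attr, [(0, 2), (2, 4), (4, 6)])
```

In this toy trace the second segment mixes a negative and a positive attribution of equal size, so its consistency under the assumed formula is close to 0, while the first and third segments are uniformly positive and score close to 1. In a real LRM the attributions would come from gradients with respect to token embeddings, summed over the embedding dimension, which this sketch does not attempt to reproduce.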
Peer Reviews
Decision: ICLR 2026 Poster
- It is interesting to see the use of IG at the segment level to score the importance of different parts of the CoT. The definitions of Attribution Strength and Attribution Direction Consistency are thoughtful in that they are not purely driven by raw gradient magnitude, but instead attempt to capture properties specific to reasoning structure.
- Through extensive experiments, the authors present hyperparameter search, ablations, and comparisons with alternative importance measures.
- The writ
- The argument for using attribution strength based on the absolute value of IG may need to be strengthened. The paper motivates this choice by noting that exploratory reasoning can sometimes have negative IG values, and that such reasoning should not be thrown away. However, consider a segment that overall shows strongly negative IG values and only moderate consistency. Is it necessarily the case that such a segment corresponds to “necessary exploratory reasoning” (lines 155–156)? While useful
- Good analytical experiments.
- Reliance on a gradient-based method rather than an LLM as judge.
- The use of Attribution Strength combined with Attribution Direction Consistency provides a mechanism to distinguish critical reasoning from superficial or shallow content.
- Efficiency gains in terms of reasoning length (while maintaining high or comparable accuracy).
- Comprehensive ablation studies.
- Focus on only one LLM (Qwen)
- High computational cost for importance calculation (J=50)
- Sensitivity to hyperparameter selection: the performance of the framework hinges on two crucial, empirically determined thresholds: τ (strength threshold) and β (consistency threshold). The hyperparameter search indicates that model performance is sensitive to these values, as choosing a higher τ introduced more false positives and negatively impacted training performance. The selection process (maximizi
1. Evaluates both pruning-based and selective learning variants, tests random and prior-importance alternatives, and shows that selective learning is preferable to simple pruning, with the IG-based method outperforming confidence, entropy, and perplexity-based segment selection (Table 3, Page 8).
2. The use of normalized strength and moderate consistency to select segments is justified theoretically and supported by empirical analysis (Section 3.1, Figures 2 and 3; the ablation study in Table 2
> Completeness of the chosen keywords for segmentation
Are the keywords used for segmentation comprehensive? How were the keywords selected? The generalizability of this paragraph segmentation method across different models (e.g., DeepSeek-R1 and Qwen2.5-7B) requires further discussion.
> Limited theoretical analysis of attribution methods
Although the attribution metrics are intuitive and based on empirical motivations, there is insufficient discussion of their theoretical limitations or the
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Advanced Graph Neural Networks
