Step Rejection Fine-Tuning: A Practical Distillation Recipe
Igor Slinko, Ilia Zavidnyi, Egor Bogomolov, Yaroslav Zharov

TL;DR
This paper introduces Step Rejection Fine-Tuning (SRFT), a method for training LLM agents on unresolved trajectories: a critic LLM assesses the correctness of each step so that only correct steps contribute to training, which improves task resolution rates.
Contribution
SRFT provides a practical way to use unresolved trajectories in LLM agent training, improving performance over standard Rejection Fine-Tuning (RFT).
Findings
SRFT improves the resolution rate to 32.2% on SWE-bench Verified.
SRFT outperforms RFT by 1.3 percentage points in resolution rate.
Using a critic LLM to assess steps enhances learning from partial successes.
Abstract
Rejection Fine-Tuning (RFT) is a standard method for training LLM agents, where unsuccessful trajectories are discarded from the training set. In the context of SWE-bench tasks, this corresponds to filtering out runs where the submitted patch does not pass the tests. However, this approach discards unresolved trajectories, even though they form a large portion of all trajectories for hard tasks and may still be partially correct. In this work, we propose Step Rejection Fine-Tuning (SRFT), a practical way to leverage these unresolved trajectories. For this, we employ a critic LLM to assess the correctness of each step in a trajectory. Then, during training, we mask the loss for erroneous steps while retaining them in the context window. This way we ensure the model learns to recover from errors without reproducing them. Evaluation on SWE-bench Verified shows that while RFT…
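The masking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Step` class, the `is_correct` flag (standing in for the critic LLM's verdict), and the `build_labels` helper are hypothetical names introduced here. The sketch assumes the common convention of marking ignored positions with the label -100, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss`. It also leaves out environment observations, which a real agent trajectory would contain.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Label value skipped by PyTorch's CrossEntropyLoss by default (ignore_index=-100).
IGNORE_INDEX = -100


@dataclass
class Step:
    """One agent step in a trajectory (hypothetical structure for illustration)."""
    token_ids: List[int]  # tokens the agent generated for this step
    is_correct: bool      # assumed verdict from a critic LLM


def build_labels(steps: List[Step]) -> Tuple[List[int], List[int]]:
    """Concatenate all steps into one sequence and mask erroneous ones.

    Every step stays in input_ids, so later steps are still conditioned on
    the erroneous ones. Only steps judged correct receive real labels;
    erroneous steps get IGNORE_INDEX and contribute nothing to the loss.
    """
    input_ids: List[int] = []
    labels: List[int] = []
    for step in steps:
        input_ids.extend(step.token_ids)
        if step.is_correct:
            labels.extend(step.token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(step.token_ids))
    return input_ids, labels


# Toy trajectory: a correct step, an erroneous step, then a recovery step.
trajectory = [
    Step(token_ids=[11, 12], is_correct=True),
    Step(token_ids=[21, 22, 23], is_correct=False),
    Step(token_ids=[31], is_correct=True),
]
input_ids, labels = build_labels(trajectory)
print(input_ids)  # [11, 12, 21, 22, 23, 31]
print(labels)     # [11, 12, -100, -100, -100, 31]
```

In this toy trajectory the erroneous middle step remains visible as context for the recovery step but is excluded from the loss, which mirrors the stated goal of learning to recover from errors without reproducing them. By contrast, RFT would drop the whole trajectory if its final patch failed the tests.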
