torchtune: PyTorch native post-training library

Mark Obozov; Maxime Griot; Joseph Cummings; Evan Smothers; Felipe Mello; Rafi Ayub; Philip John Bontrager; Salman Mohammadi; Ariel Kwiatkowski; Nathan Azrak; Mircea Mironenco

arXiv:2605.21442·cs.LG·May 21, 2026

TL;DR

torchtune is a PyTorch-native library for post-training large language models, built around modularity, transparency, and efficiency.

Contribution

The paper introduces a flexible, transparent, and efficient post-training library for LLMs and reports that it outperforms existing frameworks in both performance and memory usage.

Findings

01. Provides strong performance across various settings

02. Achieves better memory efficiency than competitors

03. Maintains high flexibility for research iteration

Abstract

Modern LLMs typically require multistage training pipelines to achieve strong downstream performance, with post-training serving as the main interface for adapting open-weight models. We introduce torchtune, a PyTorch-native library designed to streamline the post-training lifecycle of LLMs, enabling efficient fine-tuning, experimentation, and deployment-oriented workflows. Unlike many existing fine-tuning frameworks, which often optimize for ease of use, specialized recipes, or hardware efficiency at the cost of transparency and extensibility, torchtune emphasizes modularity, hackability, and direct access to the underlying PyTorch components. In this paper, we present the design principles behind torchtune, describe how they are reflected in its model builders, training recipes, and distributed training stack, and evaluate the library across representative post-training settings. We…
