Customizing an LLM for Enterprise Software Engineering

Aditya Kini; Satish Chandra; Milad Hashemi; Saksham Thakur; Aditya Pandey; Vincent Nguyen; Marc Brockschmidt; Franjo Ivančić; Danny Tarlow; Parthasarathy Ranganathan; Petros Maniatis; Ahmed Omran; Zaheer Abbas; Anita Gergely; Martin Sevenich; Gufeng Zhang; Amy Hua; Alexander Frömmgen

arXiv:2605.16517 · cs.SE · May 21, 2026

TL;DR

This paper presents Gemini for Google, a specialized large language model adapted for enterprise software engineering, demonstrating significant improvements in developer productivity through extensive data curation and tailored training strategies.

Contribution

It introduces a comprehensive methodology for customizing LLMs for enterprise use, including data extraction, preparation, and deployment, with empirical validation on Google's internal ecosystem.

Findings

01. Gemini for Google reduced iterations per turn by 23%

02. Increased code survival rates by approximately 17%

03. Provided a blueprint for enterprise model adaptation

Abstract

Enterprise software development is a continuous evolutionary process, characterized by incremental additions, architectural revisions, production deployments and rigorous maintenance. These activities generate valuable data that modern LLMs could be finetuned on, to unlock additional tool possibilities for enterprise software engineering. While frontier LLMs are already very capable, this form of customization offers a compelling path for enterprise-specific optimization. We introduce Gemini for Google (GfG), an adaptation of Gemini specialized for Google's internal software engineering ecosystem. This paper details the model's end-to-end development, from curating a trillion-token proprietary dataset to implementing a mid-training strategy that mitigates catastrophic forgetting. In a large-scale blind A/B study across 29,000 developers, Gemini for Google significantly outperformed…
