A Note on Code Quality Score: LLMs for Maintainable Large Codebases
Sherman Wong, Jalaj Bhandari, Leo Zhou Fan Yang, Xylan Xu, Yi Zhuang, Cem Cayiroglu, Payal Bhuptani, Sheela Yadawad, Hung Duong

TL;DR
This paper presents the Code Quality Score (CQS) system, which uses fine-tuned Llama3 models and rules to automatically detect code issues and provide reviews, improving maintainability in large-scale software development.
Contribution
Introduction of the CQS system utilizing fine-tuned Llama3 models and rules for automated code quality assessment and critique in industrial settings.
Findings
Achieved high precision in identifying code issues.
Rolled out successfully to industrial developers.
Maintained 60% weekly user helpfulness rate.
Abstract
Maintaining code quality in large-scale software systems presents significant challenges, particularly in settings where large numbers of engineers work concurrently on a codebase. This paper introduces the Code Quality Score (CQS) system to automatically detect issues with a set of code changes and provide actionable insights. At its core, the CQS system is powered by two Llama3 models, fine-tuned (with SFT and offline RL approaches) to a) detect common code quality issues related to coding best practices and b) provide good "critiques" for LLM-generated code review, respectively. To maintain a good user experience, we layer the system with hand-crafted rules to filter out incorrect responses and hallucinations. Offline evaluations show that our CQS system is able to achieve an impressive precision rate for identifying valid issues. This system has already been rolled out to developers in…
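The layered design described in the abstract (an issue-detection model, a critique model, and hand-crafted rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Issue`, `run_pipeline`, the two example rules, and the stub detector and critic are all hypothetical, the critique model is simplified to an accept/reject decision, and the ordering of critic and rules is an assumption. The two models are represented as plain callables so the example runs without any model weights.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Issue:
    """A single candidate finding on a diff (hypothetical schema)."""
    file: str
    line: int
    message: str


# Hypothetical stand-ins for the two fine-tuned models and the rule layer.
Detector = Callable[[str], List[Issue]]   # diff -> candidate issues
Critic = Callable[[str, Issue], bool]     # diff, issue -> keep this issue?
Rule = Callable[[str, Issue], bool]       # diff, issue -> passes the rule?


def rule_line_in_diff(diff: str, issue: Issue) -> bool:
    """Reject issues that point at a line outside the diff (a likely hallucination)."""
    return 1 <= issue.line <= len(diff.splitlines())


def rule_nonempty_message(diff: str, issue: Issue) -> bool:
    """Reject issues whose message is empty or whitespace only."""
    return bool(issue.message.strip())


def run_pipeline(diff: str, detector: Detector, critic: Critic,
                 rules: List[Rule]) -> List[Issue]:
    """Detect issues, let the critic veto weak ones, then apply the rule filters."""
    kept = []
    for issue in detector(diff):
        if not critic(diff, issue):
            continue  # the critic judged this finding unhelpful
        if all(rule(diff, issue) for rule in rules):
            kept.append(issue)
    return kept


# Stub models, for illustration only.
def fake_detector(diff: str) -> List[Issue]:
    return [
        Issue("a.py", 1, "Missing docstring"),
        Issue("a.py", 99, "Unused variable"),  # line does not exist in the diff
        Issue("a.py", 2, "   "),               # empty message
        Issue("a.py", 2, "Rename x"),          # vetoed by the stub critic
    ]


def fake_critic(diff: str, issue: Issue) -> bool:
    return not issue.message.startswith("Rename")


diff = "+ def f(x):\n+     return x\n"
result = run_pipeline(diff, fake_detector, fake_critic,
                      [rule_line_in_diff, rule_nonempty_message])
```

In this sketch only the first candidate survives: one is vetoed by the stub critic and two are dropped by the rules. Putting cheap deterministic rules after the model outputs reflects the abstract's stated goal of filtering incorrect responses before they reach developers.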
