Time Travel: LLM-Assisted Semantic Behavior Localization with Git Bisect
Yujing Wang, Weize Hong

TL;DR
This paper introduces a novel LLM-augmented framework for semantic fault localization in Git bisect, improving success rates and reducing bisect time by handling noisy, non-deterministic software behaviors.
Contribution
It integrates structured reasoning and fine-tuning of LLMs into Git bisect, addressing challenges of flaky tests and semantic divergence in modern software development.
Findings
6.4 percentage point increase in success rate
Up to 2x reduction in bisect time
Effective handling of noisy, non-deterministic faults
Abstract
We present a novel framework that integrates Large Language Models (LLMs) into the Git bisect process for semantic fault localization. Traditional bisect assumes deterministic predicates and binary failure states, assumptions often violated in modern software development due to flaky tests, non-monotonic regressions, and semantic divergence from upstream repositories. Our system augments bisect traversal with structured chain-of-thought reasoning, enabling commit-by-commit analysis under noisy conditions. We evaluate multiple open-source and proprietary LLMs for their suitability and fine-tune DeepSeekCoderV2 using QLoRA on a curated dataset of semantically labeled diffs. We adopt a weak supervision workflow to reduce annotation overhead, incorporating human-in-the-loop corrections and self-consistency filtering. Experiments across multiple open-source projects show a 6.4 point absolute…
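To make the flaky-predicate problem concrete, here is a minimal sketch of a bisect loop that repeats a noisy check and takes the majority verdict before narrowing the range. It is an illustration only, not the authors' method: the paper uses LLM reasoning over diffs, whereas this sketch substitutes simple majority voting, and the names `majority_verdict`, `bisect_first_bad`, and the `votes` parameter are hypothetical.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_verdict(is_bad: Callable[[str], bool], commit: str, votes: int = 5) -> bool:
    """Run a possibly flaky predicate `votes` times and return the majority answer.

    A tie (possible only with an even `votes`) is treated as "good".
    """
    results = Counter(is_bad(commit) for _ in range(votes))
    return results[True] > results[False]


def bisect_first_bad(
    commits: Sequence[str],
    is_bad: Callable[[str], bool],
    votes: int = 5,
) -> str:
    """Return the first bad commit.

    Assumes `commits` is ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and the regression is monotonic (once bad,
    always bad). Only the predicate is allowed to be noisy.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if majority_verdict(is_bad, commits[mid], votes):
            hi = mid
        else:
            lo = mid
    return commits[hi]
```

Repeating the check multiplies the cost of each bisect step by `votes`, and it does nothing for non-monotonic regressions, which is part of the motivation the abstract gives for a semantic approach.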
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Software Testing and Debugging Techniques
