OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng, Ge Zhang, Tianhao Shen, Xueling Liu, Bill Yuchen Lin, Jie Fu, Wenhu Chen, and Xiang Yue

TL;DR
OpenCodeInterpreter is an open-source system that combines code generation, execution, and iterative refinement, achieving performance close to GPT-4 and bridging the gap between open models and proprietary systems.
Contribution
It introduces a family of open-source code systems with execution and feedback integration, supported by a large multi-turn interaction dataset, enhancing code accuracy and refinement.
Findings
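OpenCodeInterpreter-33B achieves an average accuracy of 83.2 across HumanEval and MBPP, and 76.4 on their EvalPlus versions.
Performance is close to GPT-4's 84.2 (76.2 on the EvalPlus versions), particularly when the model receives feedback synthesized by GPT-4.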
The system significantly narrows the gap between open-source and proprietary code generation models.
Abstract
The introduction of large language models has significantly advanced code generation. However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. Supported by Code-Feedback, a dataset featuring 68K multi-turn interactions, OpenCodeInterpreter integrates execution and human feedback for dynamic code refinement. Our comprehensive evaluation of OpenCodeInterpreter across key benchmarks such as HumanEval, MBPP, and their enhanced versions from EvalPlus reveals its exceptional performance. Notably, OpenCodeInterpreter-33B achieves an accuracy of 83.2 (76.4) on the average (and plus versions) of HumanEval and MBPP, closely rivaling GPT-4's 84.2 (76.2) and…
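The generate, execute, and refine cycle described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `run_code`, `refine_loop`, and `toy_generate` are hypothetical names, the "model" is a hard-coded stand-in that returns a buggy program first and a fixed one once it sees an error message, and the code runs in-process with `exec`, where a real system would presumably use an isolated sandbox.

```python
import contextlib
import io
import traceback
from typing import Callable, List, Tuple


def run_code(code: str) -> Tuple[bool, str]:
    """Execute a snippet and return (succeeded, stdout or traceback text)."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # Assumption: in-process exec for brevity; not safe for untrusted code.
            exec(code, {"__name__": "__main__"})
    except Exception:
        return False, traceback.format_exc()
    return True, buffer.getvalue()


def refine_loop(
    task: str,
    generate: Callable[[str, str], str],
    max_turns: int = 3,
) -> List[Tuple[str, bool, str]]:
    """Generate code, run it, and feed any error text back for another attempt."""
    history: List[Tuple[str, bool, str]] = []
    feedback = ""
    for _ in range(max_turns):
        code = generate(task, feedback)
        ok, output = run_code(code)
        history.append((code, ok, output))
        if ok:
            break
        feedback = output  # execution feedback drives the next turn
    return history


def toy_generate(task: str, feedback: str) -> str:
    """Hypothetical stand-in for a model call: buggy first, fixed after feedback."""
    if not feedback:
        return "print(sum([1, 2, 3]) / 0)"
    return "print(sum([1, 2, 3]))"


history = refine_loop("Print the sum of 1, 2 and 3.", toy_generate)
for turn, (code, ok, output) in enumerate(history, start=1):
    print(f"turn {turn}: ok={ok}")
```

In this toy run the first turn fails with a `ZeroDivisionError`, the traceback is passed back as feedback, and the second turn succeeds. Replacing `toy_generate` with a call to an actual code model, and the traceback with richer feedback, would be the natural extension.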
Code & Models
- 🤗 m-a-p/OpenCodeInterpreter-DS-6.7B · model · 433 downloads · 135 likes
- 🤗 m-a-p/OpenCodeInterpreter-DS-33B · model · 26 downloads · 148 likes
- 🤗 m-a-p/OpenCodeInterpreter-CL-7B · model · 18 downloads · 11 likes
- 🤗 m-a-p/OpenCodeInterpreter-CL-13B · model · 15 downloads · 9 likes
- 🤗 m-a-p/OpenCodeInterpreter-CL-34B · model · 167 downloads · 14 likes
- 🤗 m-a-p/OpenCodeInterpreter-CL-70B · model · 83 downloads · 24 likes
- 🤗 LoneStriker/OpenCodeInterpreter-CL-70B-GGUF · model · 114 downloads · 3 likes
- 🤗 LoneStriker/OpenCodeInterpreter-CL-7B-GGUF · model · 136 downloads · 1 like
- 🤗 LoneStriker/OpenCodeInterpreter-CL-7B-3.0bpw-h6-exl2 · model · 1 download
- 🤗 LoneStriker/OpenCodeInterpreter-CL-7B-4.0bpw-h6-exl2 · model · 1 download
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Software Testing and Debugging Techniques · Logic, programming, and type systems
Methods: Linear Layer · Dropout · Dense Connections · Label Smoothing · Adam · Attention Is All You Need · Softmax · Multi-Head Attention · Layer Normalization · Residual Connection
