Learning based Methods for Code Runtime Complexity Prediction
Jagriti Sikka, Kushal Satya, Yaman Kumar, Shagun Uppal, Rajiv Ratn Shah, Roger Zimmermann

TL;DR
This paper explores machine learning approaches for approximately predicting the runtime complexity of code, proposing a new dataset and establishing baseline models to support code analysis and development tools.
Contribution
It introduces CoRCoD, the first annotated dataset for code complexity, and compares feature-based and embedding-based models for complexity prediction.
Findings
Feature engineering and code embeddings achieve competitive results.
The dataset enables benchmarking of complexity prediction models.
Potential applications include automated grading and IDE tools.
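To make the feature-engineering idea concrete, here is a minimal sketch of extracting hand-crafted features from source code. It is an illustration only: the two features (`num_loops`, `max_loop_depth`), the helper name `loop_features`, and the use of Python's `ast` module are assumptions chosen for this example, not the paper's actual feature set or pipeline.

```python
import ast

# Node types treated as loops in this sketch (an assumption; comprehensions
# and recursion are deliberately ignored to keep the example small).
LOOP_NODES = (ast.For, ast.While)


def loop_features(source: str) -> dict:
    """Count loops and measure the deepest loop nesting in Python source."""
    tree = ast.parse(source)

    def walk(node, depth):
        # `depth` is the number of loops enclosing (and including) `node`.
        count, deepest = 0, depth
        for child in ast.iter_child_nodes(node):
            is_loop = isinstance(child, LOOP_NODES)
            child_count, child_deepest = walk(child, depth + 1 if is_loop else depth)
            count += child_count + (1 if is_loop else 0)
            deepest = max(deepest, child_deepest)
        return count, deepest

    num_loops, max_depth = walk(tree, 0)
    return {"num_loops": num_loops, "max_loop_depth": max_depth}


# Hypothetical sample program: one doubly nested loop plus one separate loop.
SAMPLE = """
def pairs(xs):
    total = 0
    for a in xs:
        for b in xs:
            total += a * b
    while total > 10:
        total //= 2
    return total
"""

features = loop_features(SAMPLE)
print(features)
```

A feature vector like this could then be passed to any standard classifier that predicts a complexity class. Loop depth alone is only a rough proxy, since recursion and library calls also affect runtime.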
Abstract
Predicting the runtime complexity of a program is an arduous task. In fact, even for humans, it requires subtle analysis and comprehensive knowledge of algorithms to predict time complexity with high fidelity, given any code. By Turing's proof of the undecidability of the halting problem, exactly determining the complexity of arbitrary code is impossible in general. Nevertheless, an approximate solution to this task can give developers real-time feedback on the efficiency of their code. In this work, we model this problem as a machine learning task and check its feasibility with thorough analysis. Due to the lack of any open source dataset for this task, we propose our own annotated dataset CoRCoD: Code Runtime Complexity Dataset, extracted from online judges. We establish baselines using two different approaches: feature engineering and code embeddings, to achieve state-of-the-art results and compare their…
