TL;DR
This paper introduces RIdiom, a hybrid knowledge-driven approach combining rules and large language models to accurately detect and refactor Python code into idiomatic forms, outperforming existing methods.
Contribution
It presents a novel hybrid framework integrating rule-based and LLM techniques for Pythonic idiom refactoring, addressing limitations of previous approaches.
Findings
RIdiom achieves over 90% accuracy and F1-score on nine established idioms.
The approach outperforms Prompt-LLM in all evaluated metrics.
It maintains high precision while significantly improving recall and F1-score.
Abstract
Pythonic idioms are highly valued and widely used in the Python programming community. However, many Python users find it challenging to use Pythonic idioms. Neither a rule-based approach nor an LLM-only approach is sufficient to overcome three persistent challenges of code idiomatization: code miss, wrong detection, and wrong refactoring. Motivated by the determinism of rules and the adaptability of LLMs, we propose a hybrid approach consisting of three modules. We not only write prompts that instruct LLMs to complete tasks but also invoke Analytic Rule Interfaces (ARIs) to accomplish tasks. The ARIs are Python code generated by prompting LLMs. We first construct a knowledge module with three elements, ASTscenario, ASTcomponent and Condition, and prompt LLMs to generate Python code for incorporation into an ARI library for subsequent use. After that,…
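To make the idea of a deterministic, AST-based detection rule concrete, here is a minimal sketch of one such rule for a single idiom (rewriting an append loop as a list comprehension). It is not RIdiom's ARI library: the function names, the matching logic, and the loose mapping to the scenario, component, and condition elements are all assumptions for illustration, built only on Python's standard `ast` module.

```python
import ast


def _is_empty_list_assign(stmt: ast.stmt) -> bool:
    """Component check (hypothetical): `name = []` with a single plain-name target."""
    return (
        isinstance(stmt, ast.Assign)
        and len(stmt.targets) == 1
        and isinstance(stmt.targets[0], ast.Name)
        and isinstance(stmt.value, ast.List)
        and not stmt.value.elts
    )


def _is_append_loop(stmt: ast.stmt, name: str) -> bool:
    """Component check (hypothetical): a `for` loop whose only statement is `name.append(expr)`."""
    if not isinstance(stmt, ast.For) or stmt.orelse or len(stmt.body) != 1:
        return False
    inner = stmt.body[0]
    if not (isinstance(inner, ast.Expr) and isinstance(inner.value, ast.Call)):
        return False
    call = inner.value
    func = call.func
    if not (
        isinstance(func, ast.Attribute)
        and func.attr == "append"
        and isinstance(func.value, ast.Name)
        and func.value.id == name
        and len(call.args) == 1
        and not call.keywords
    ):
        return False
    # Condition (assumed for this sketch): the appended expression must not
    # read the list being built, otherwise a comprehension changes behavior.
    return not any(
        isinstance(n, ast.Name) and n.id == name for n in ast.walk(call.args[0])
    )


def find_list_comprehension_candidates(source: str) -> list[int]:
    """Return sorted line numbers of `for` loops that could become list comprehensions."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        body = getattr(node, "body", None)
        if not isinstance(body, list):
            continue  # e.g. lambda bodies are expressions, not statement lists
        # Scenario (assumed): an empty-list assignment immediately followed by the loop.
        for prev, stmt in zip(body, body[1:]):
            if _is_empty_list_assign(prev) and _is_append_loop(stmt, prev.targets[0].id):
                hits.append(stmt.lineno)
    return sorted(hits)


SAMPLE = """
result = []
for x in range(10):
    result.append(x * 2)

total = []
for y in data:
    total.append(y)
    print(y)
"""

print(find_list_comprehension_candidates(SAMPLE))  # prints [3]
```

Only the first loop is reported; the second has an extra statement in its body, so rewriting it as a comprehension would drop the `print` call. A purely rule-based check like this is deterministic but narrow, which is the kind of trade-off the abstract cites as motivation for pairing rules with LLMs.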
