Perish or Flourish? A Holistic Evaluation of Large Language Models for Code Generation in Functional Programming
Nguyet-Anh H. Lang, Eric Lang, Thanh Le-Cong, Bach Le, Quyet-Thang Huynh

TL;DR
This paper introduces FPEval, a comprehensive framework for evaluating large language models on functional programming tasks, revealing performance gaps and style issues in code generation for Haskell, OCaml, and Scala.
Contribution
It presents FPEval, a new benchmark and evaluation framework for assessing LLMs on functional programming languages, including static analysis and repair capabilities.
Findings
Newer LLMs perform better than older ones but still have high error rates in functional languages.
Generated code often follows imperative patterns, affecting maintainability.
Static analysis feedback helps LLMs partially self-repair code issues.
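To make the second finding concrete, here is a hypothetical Haskell illustration of what "imperative patterns" can look like in a functional language. It is not taken from the paper or its benchmark. The task (summing the squares of the even numbers in a list) and both function names are assumptions chosen for this sketch.

```haskell
import Control.Monad (forM_, when)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Imperative-flavoured version (hypothetical example): a mutable
-- accumulator is updated inside a loop, which forces the result into IO.
sumEvenSquaresImperative :: [Int] -> IO Int
sumEvenSquaresImperative xs = do
  acc <- newIORef 0
  forM_ xs $ \x ->
    when (even x) $ modifyIORef' acc (+ (x * x))
  readIORef acc

-- Idiomatic version (hypothetical example): a pure pipeline that
-- filters the even numbers, squares them, and sums the results.
sumEvenSquares :: [Int] -> Int
sumEvenSquares = sum . map (\x -> x * x) . filter even
```

Loaded in GHCi, `sumEvenSquares [1 .. 10]` evaluates to 220, and `sumEvenSquaresImperative [1 .. 10]` yields the same value inside `IO`. Both versions pass the same tests, so a gap like this only shows up through style-oriented checks such as static analysis.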
Abstract
Functional programming provides strong foundations for developing reliable and secure software systems, yet its adoption remains limited due to its steep learning curve. Recent advances in Large Language Models (LLMs) for code generation present new opportunities to lower these barriers. However, extensive evaluations of LLMs largely focus on imperative programming languages, and their capabilities in functional programming (FP) languages remain underexplored. To address this gap, we introduce FPEval, a holistic evaluation framework built on FPBench, a new benchmark of 721 programming tasks across three difficulty levels in three mainstream FP languages: Haskell, OCaml, and Scala. FPEval provides comprehensive evaluation infrastructure, combining test validation against comprehensive test suites with static analysis tools, to assess both functional correctness and code style and…
Taxonomy
Topics: Software Engineering Research · Scientific Computing and Data Management · Logic, programming, and type systems
