MojoBench: Language Modeling and Benchmarks for Mojo
Nishat Raihan, Joanna C. S. Santos, Marcos Zampieri

TL;DR
MojoBench introduces a framework and benchmark dataset for evaluating and improving language models' ability to generate code in the Mojo programming language. Its Mojo-Coder model reports a 30-35% improvement over GPT-4o and Claude-3.5-Sonnet, and the study offers insights into how LLMs adapt to underrepresented programming languages.
Contribution
This work is the first to develop a dedicated benchmark and pretrained model for Mojo code generation, expanding LLM evaluation to emerging programming languages.
Findings
Mojo-Coder outperforms GPT-4o and Claude-3.5-Sonnet by 30-35%
MojoBench provides a comprehensive evaluation framework for Mojo code generation
Insights into LLM behavior with underrepresented programming languages
Abstract
The Mojo programming language (PL), recently introduced by Modular, has received considerable attention in the scientific community due to its claimed significant speed boost over Python. Despite advancements in code Large Language Models (LLMs) across various PLs, Mojo remains unexplored in this context. To address this gap, we introduce MojoBench, the first framework for Mojo code generation. MojoBench includes HumanEval-Mojo, a benchmark dataset designed for evaluating code LLMs on Mojo, and Mojo-Coder, the first LLM pretrained and finetuned for Mojo code generation, which supports instructions in 5 natural languages (NLs). Our results show that Mojo-Coder achieves a 30-35% performance improvement over leading models like GPT-4o and Claude-3.5-Sonnet. Furthermore, we provide insights into LLM behavior with underrepresented and unseen PLs, offering potential strategies for enhancing…
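The abstract does not say which metric underlies the 30-35% figure. HumanEval-style benchmarks are commonly scored with pass@k, so the following is a minimal sketch of that estimator under the assumption that HumanEval-Mojo is scored similarly. The function names `pass_at_k` and `benchmark_score` are hypothetical and do not come from MojoBench.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all unit tests
    k: budget of samples the metric is allowed to draw
    Computes 1 - C(n - c, k) / C(n, k).
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failing samples: every draw of k contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem.

    Assumption: the benchmark score is the plain mean across problems.
    """
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With 10 samples per problem, `benchmark_score([(10, 3), (10, 0)], k=1)` averages a per-problem pass@1 of 0.3 and 0.0. Whether the reported 30-35% gap is in absolute points or relative terms under such a metric is not recoverable from the abstract.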
