MojoFrame: Dataframe Library in Mojo Language
Shengya Huang, Zhaoheng Li, Derek Werner, Yongjoo Park

TL;DR
MojoFrame is the first dataframe library for Mojo. It supports core relational operations and user-defined functions (UDFs), and reports up to a 4.60x speedup over existing dataframe libraries.
Contribution
Introduces MojoFrame, the first Mojo-native dataframe library. It supports core relational operations and UDFs, and is built on Mojo's tensor for fast operations on numeric columns.
Findings
Supports all TPC-H query operations with promising performance
Achieves up to 4.60x speedup over other dataframe libraries
Demonstrates effective integration of numeric and non-numeric data in Mojo
Abstract
Mojo is an emerging programming language built on MLIR (Multi-Level Intermediate Representation) and supports JIT (Just-in-Time) compilation. It enables transparent hardware-specific optimizations (e.g., for CPUs and GPUs), while allowing users to express their logic in a user-friendly, Python-like syntax. Mojo has demonstrated strong performance on tensor operations; however, its capabilities for relational operations (e.g., filtering, join, and group-by aggregation), which are common in data science workflows, remain unexplored. To date, no dataframe implementation exists in the Mojo ecosystem. In this paper, we introduce the first Mojo-native dataframe library, called MojoFrame, that supports core relational operations and user-defined functions (UDFs). MojoFrame is built on top of Mojo's tensor to achieve fast operations on numeric columns, while utilizing a cardinality-aware approach to…
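To make the three relational operations named in the abstract concrete (filtering, join, and group-by aggregation), here is a minimal Python sketch over a column-oriented table stored as a dict of lists. It is not MojoFrame's API or implementation. The helper names (`filter_rows`, `inner_join`, `group_sum`), the sample tables, and the hash-join strategy are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical helpers: a "table" here is a dict mapping column name -> list of values.

def filter_rows(table, column, predicate):
    """Keep only the rows whose value in `column` satisfies `predicate`."""
    keep = [i for i, v in enumerate(table[column]) if predicate(v)]
    return {name: [values[i] for i in keep] for name, values in table.items()}

def inner_join(left, right, key):
    """Hash join on `key`; assumes the non-key column names do not collide."""
    index = defaultdict(list)  # key value -> row positions in `right`
    for j, k in enumerate(right[key]):
        index[k].append(j)
    out = {name: [] for name in left}
    out.update({name: [] for name in right if name != key})
    for i, k in enumerate(left[key]):
        for j in index.get(k, ()):  # probe; unmatched left rows are dropped
            for name in left:
                out[name].append(left[name][i])
            for name in right:
                if name != key:
                    out[name].append(right[name][j])
    return out

def group_sum(table, key, value):
    """Group by `key` and sum `value`, preserving first-seen group order."""
    totals = {}
    for k, v in zip(table[key], table[value]):
        totals[k] = totals.get(k, 0) + v
    return {key: list(totals), value: list(totals.values())}

# Made-up sample data, not taken from TPC-H.
orders = {
    "order_id": [1, 2, 3, 4],
    "cust_id": [10, 20, 10, 30],
    "price": [5.0, 12.0, 8.0, 20.0],
}
customers = {
    "cust_id": [10, 20, 30],
    "segment": ["retail", "wholesale", "retail"],
}

expensive = filter_rows(orders, "price", lambda p: p > 6.0)
joined = inner_join(expensive, customers, "cust_id")
result = group_sum(joined, "segment", "price")
print(result)
```

The chained calls mirror the shape of a simple analytical query: filter, join on a key, then aggregate per group. A real dataframe library would use typed, contiguous column storage rather than Python lists.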
Taxonomy
Topics: Advanced Database Systems and Queries · Scientific Computing and Data Management · Parallel Computing and Optimization Techniques
