An Analysis into the Performance and Memory Usage of MATLAB Strings
Travis Near

TL;DR
This paper compares MATLAB's cell arrays and string arrays, demonstrating that string arrays are significantly faster, more memory-efficient, and easier to use for textual data processing.
Contribution
The study provides a comprehensive analysis of MATLAB string arrays, highlighting their performance, memory advantages, and improved usability over cell arrays.
Findings
String arrays often run 2x to 40x faster than cell arrays on common string benchmarks.
String arrays have better data locality and reduced metadata overhead.
String arrays offer more expressive syntax with automatic conversions and vectorized methods.
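As a rough illustration of the syntax difference named in the last finding, the sketch below builds the same three words in both containers. The variable names and sample data are made up for this example and are not taken from the paper. It assumes MATLAB R2017a or later, where double-quoted string literals are available.

```matlab
% Cell array of character vectors (the older container).
c = {'alpha', 'beta', 'gamma'};
% String array (double-quoted literals require R2017a or later).
s = ["alpha", "beta", "gamma"];

% Appending a number to every element.
% The cell version needs an explicit num2str call and a per-element function.
cOut = cellfun(@(x) [x, '_', num2str(7)], c, 'UniformOutput', false);
% The string version converts the number and expands the scalars automatically.
sOut = s + "_" + 7;          % ["alpha_7", "beta_7", "gamma_7"]

% Vectorized functions act on every element of the string array at once.
lens = strlength(s);         % [5 4 5]
tf   = startsWith(s, "a");   % logical [1 0 0]
up   = upper(s);             % ["ALPHA", "BETA", "GAMMA"]
```

Several of these functions, such as `startsWith` and `upper`, also accept cell arrays of character vectors, so the example mainly shows the shorter syntax rather than any capability unique to string arrays.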
Abstract
MATLAB is a mathematical computing environment used by many engineers, mathematicians, and students to process and understand their data. Managing textual data is important to all data science. MATLAB supports two textual data containers: (1) cell arrays of character vectors and (2) string arrays. This research showcases the strengths of string arrays over cell arrays by quantifying their performance, memory contiguity, syntax readability, interface fluidity, and autocomplete capabilities. The results demonstrate that string arrays often run 2x to 40x faster than cell arrays on common string benchmarks, are optimized for data locality by reducing metadata overhead, and offer a more expressive syntax due to their automatic data type conversions and vectorized methods.
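A comparison of this kind could be set up along the following lines. This is a minimal sketch, not the paper's benchmark: the array size `n`, the test data, and the choice of concatenation as the operation are all assumptions made for illustration, and the measured ratio will vary with MATLAB release, hardware, and workload.

```matlab
% Hypothetical micro-benchmark; n and the test data are illustrative choices.
n = 1e5;
s = "item" + (1:n)';             % n-by-1 string array: "item1", "item2", ...
c = cellstr(s);                  % same text as a cell array of character vectors

% timeit calls each function handle repeatedly and returns a typical time in seconds.
tCell = timeit(@() strcat(c, '_suffix'));
tStr  = timeit(@() s + "_suffix");

fprintf('cell: %.4f s, string: %.4f s, ratio: %.1fx\n', tCell, tStr, tCell / tStr);

% whos lists the bytes MATLAB attributes to each variable.
whos c s
```

Using `timeit` rather than a single `tic`/`toc` pair reduces the influence of warm-up and one-off scheduling noise on the measurement.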
Taxonomy
Topics: Advanced Data Storage Technologies · Algorithms and Data Compression · Parallel Computing and Optimization Techniques
