# ROSA: R Optimizations with Static Analysis

**Authors:** Rathijit Sen, Jianqiao Zhu, Jignesh M. Patel, and Somesh Jha

arXiv: 1704.02996 · 2017-07-04

## TL;DR

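ROSA is a static analysis framework for R that infers program properties (reaching definitions, live variables, aliased variables, and variable types) and uses them to drive program transformations and interpretive optimizations, reducing execution time and memory consumption relative to both CRAN R and Microsoft R Open.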

## Contribution

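ROSA introduces a static analysis approach for R whose inferred properties enable multiple program transformations and interpretive optimizations. These improve performance and space efficiency, easing the execution-time and memory limits that in practice restrict the size of data R can analyze.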

## Key findings

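- Substantial reductions in execution time over both CRAN R and Microsoft R Open.
- Substantial reductions in memory consumption over the same two baselines.
- Static analysis (reaching definitions, live variables, aliased variables, variable types) enables transformations such as C++ code translation, strength reduction, vectorization, and code motion, as well as interpretive optimizations that avoid redundant object copies and perform in-place evaluations.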

## Abstract

R is a popular language and programming environment for data scientists. It is increasingly co-packaged with both relational and Hadoop-based data platforms and can often be the most dominant computational component in data analytics pipelines. Recent work has highlighted inefficiencies in executing R programs, both in terms of execution time and memory requirements, which in practice limit the size of data that can be analyzed by R. This paper presents ROSA, a static analysis framework to improve the performance and space efficiency of R programs. ROSA analyzes input programs to determine program properties such as reaching definitions, live variables, aliased variables, and types of variables. These inferred properties enable program transformations such as C++ code translation, strength reduction, vectorization, and code motion, in addition to interpretive optimizations such as avoiding redundant object copies and performing in-place evaluations. An empirical evaluation shows substantial reductions by ROSA in execution time and memory consumption over both CRAN R and Microsoft R Open.
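To make the link between an inferred property and an optimization concrete, here is a minimal sketch of live-variable analysis, one of the properties the abstract lists. It is not ROSA's implementation. It assumes a hypothetical straight-line mini-IR in which each statement is a `(target, uses)` pair, and it assumes no two names alias the same object; the names `live_out_sets` and `reusable_inputs` are invented for illustration. An input that is dead after a statement is a candidate for in-place evaluation, since its storage could be reused without a copy.

```python
from typing import List, Set, Tuple

# Hypothetical mini-IR: each statement is (defined_variable, used_variables).
Stmt = Tuple[str, Tuple[str, ...]]


def live_out_sets(program: List[Stmt], live_at_exit: Set[str]) -> List[Set[str]]:
    """Backward live-variable analysis over straight-line code.

    Returns, for each statement, the set of variables live immediately after it.
    """
    live = set(live_at_exit)
    result: List[Set[str]] = [set() for _ in program]
    for i in range(len(program) - 1, -1, -1):
        target, uses = program[i]
        result[i] = set(live)   # live-out of statement i
        live.discard(target)    # the definition kills the target
        live.update(uses)       # the uses become live before the statement
    return result


def reusable_inputs(program: List[Stmt], live_at_exit: Set[str]) -> List[List[str]]:
    """For each statement, list inputs whose old value is dead afterwards.

    Assumes no aliasing between names; a real analysis must also consult
    alias information before reusing storage.
    """
    outs = live_out_sets(program, live_at_exit)
    reusable = []
    for (target, uses), out in zip(program, outs):
        # An input is reusable if it is not live afterwards, or if it is
        # overwritten by this very statement (e.g. x <- x + 1).
        reusable.append(sorted(u for u in set(uses) if u not in out or u == target))
    return reusable


# Illustrative program, written as R-like comments:
program: List[Stmt] = [
    ("a", ()),          # a <- <input>
    ("b", ("a",)),      # b <- a * 2
    ("c", ("b",)),      # c <- b + 1
    ("d", ("a", "c")),  # d <- a + c
]

for stmt, dead in zip(program, reusable_inputs(program, {"d"})):
    print(stmt, "reusable inputs:", dead)
```

In this example `b` is dead after `c <- b + 1`, so under the stated assumptions the result could be computed in `b`'s storage, whereas `a` must be preserved until the final statement.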

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.02996/full.md

## Figures

15 figures with captions in the complete paper: https://tomesphere.com/paper/1704.02996/full.md

## References

53 references — full list in the complete paper: https://tomesphere.com/paper/1704.02996/full.md

---
Source: https://tomesphere.com/paper/1704.02996