Random problems with R

Kellie Ottoboni; Philip B. Stark

arXiv:1809.06520·cs.MS·November 14, 2018·1 cites


TL;DR

This paper identifies a bias in R's random sampling method due to quantization effects, and proposes a fix by generating random integers directly from random bits, improving sampling uniformity.

Contribution

It highlights the bias in R's current random integer generation and introduces a simple, effective method to produce unbiased random samples using random bits.

Findings

01 R's current method for generating random integers yields a non-uniform distribution, so sampling is biased.

02 Constructing integers directly from random bits avoids this bias.

03 Python's numpy.random.randint() takes this bit-based approach.
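A minimal sketch of the bit-based approach mentioned in the findings: draw just enough random bits to cover the range, and discard any draw that falls outside it. The function names `uniform_below` and `random_int_1_to_m` are hypothetical, chosen for illustration; this is not the code from NumPy or from the cryptorandom repository, and it assumes the supplied bit source is itself unbiased.

```python
import random


def uniform_below(m, getrandbits=random.getrandbits):
    """Return an integer in {0, ..., m-1}, built from random bits.

    Hypothetical helper for illustration. `getrandbits(k)` is assumed to
    return an integer uniform on {0, ..., 2**k - 1}.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    if m == 1:
        return 0  # only one possible value; no bits needed
    k = (m - 1).bit_length()  # smallest k with 2**k >= m
    while True:
        r = getrandbits(k)
        if r < m:  # accept; otherwise discard and redraw
            return r


def random_int_1_to_m(m, getrandbits=random.getrandbits):
    """Return an integer in {1, ..., m}, shifting the result above by one."""
    return uniform_below(m, getrandbits) + 1


if __name__ == "__main__":
    print([random_int_1_to_m(6) for _ in range(10)])
```

Because every accepted value corresponds to exactly one k-bit pattern, each of the m outcomes is equally likely. The cost is an occasional redraw: fewer than half of the draws are rejected on average, since 2**k < 2m.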

Abstract

R (Version 3.5.1 patched) has an issue with its random sampling functionality. R generates random integers between $1$ and $m$ by multiplying random floats by $m$, taking the floor, and adding $1$ to the result. Well-known quantization effects in this approach result in a non-uniform distribution on $\{1, \dots, m\}$. The difference, which depends on $m$, can be substantial. Because the sample function in R relies on generating random integers, random sampling in R is biased. There is an easy fix: construct random integers directly from random bits, rather than multiplying a random float by $m$. That is the strategy taken in Python's numpy.random.randint() function, among others. Example source code in Python is available at https://github.com/statlab/cryptorandom/blob/master/cryptorandom/cryptorandom.py (see functions getrandbits() and randbelow_from_randbits()).
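The quantization effect described in the abstract can be seen exactly in a toy model. The sketch below assumes the random float takes one of 2**w equally likely values k / 2**w, and counts how many of them map to each integer under the multiply, floor, add-one rule. The word size `w` and the function names are assumptions made for illustration; they are not R's actual internals.

```python
from collections import Counter
from fractions import Fraction


def floor_method_counts(m, w):
    """Count how many of the 2**w floats k / 2**w land on each value of
    floor(m * u) + 1. Integer arithmetic keeps the floor exact."""
    counts = Counter()
    for k in range(2 ** w):
        counts[(m * k) // (2 ** w) + 1] += 1
    return counts


def max_min_ratio(counts):
    """Ratio of the most likely outcome to the least likely one."""
    return Fraction(max(counts.values()), min(counts.values()))


if __name__ == "__main__":
    counts = floor_method_counts(m=6, w=4)  # 16 floats, 6 outcomes
    print(dict(sorted(counts.items())))
    # {1: 3, 2: 3, 3: 2, 4: 3, 5: 3, 6: 2}
    print(max_min_ratio(counts))
    # 3/2
```

Since 16 is not a multiple of 6, the 16 equally likely floats cannot be split evenly among 6 outcomes, so some integers are hit three times and others twice. When m divides 2**w (for example m=8, w=4), the counts are equal and the bias disappears in this model.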


Code & Models

Repositories

statlab/cryptorandom (official)


Taxonomy

Topics: Chaos-based Image/Signal Encryption · Algorithms and Data Compression · Cellular Automata and Applications