Linearity of Data and Linear Probability Space

Christopher M. Rembold

arXiv:1904.01494 · math.ST · April 3, 2019 · 1 cites

TL;DR

This paper explores the linearity properties of different data types, emphasizing transformations like log odds for bounded data to achieve linearity, and discusses methods for analyzing untidy data.

Contribution

It introduces a framework for understanding data linearity based on boundedness and proposes transforming probabilities into a linear probability space using log odds.

Findings

1) Means and standard deviations are more accurate when computed on log odds (W) values than on raw probabilities.

2) Two-side bounded data, such as probabilities, are not linear without transformation.

3) A method for analyzing untidy data is discussed.

Abstract

Some data is linearly additive, other data is not. In this paper, I discuss types of data based on the boundedness of the data and their linearity. 1) Unbounded data can be linear. 2) One-side bounded data is usually log transformed to be linear. 3) Two-side bounded data is not linear. 4) Untidy data do not fit in these categories. Probabilities are an example of two-side bounded data; they should be transformed into a linear probability space by taking the logarithm of the odds ratio (log10 odds), which is termed Weight (W). Calculations of means and standard deviations are more accurate when performed on W values than on probabilities. A method to analyze untidy data is discussed.
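The transformation described in the abstract can be illustrated with a minimal Python sketch. It assumes that W is the base-10 logarithm of the odds, W = log10(p / (1 - p)), which is how the abstract describes it; the function names `to_weight`, `from_weight`, and `mean_in_weight_space` are hypothetical and are not taken from the paper.

```python
import math


def to_weight(p):
    """Map a probability in (0, 1) to W = log10(p / (1 - p)).

    Assumption: this is the 'Weight' (log10 odds) named in the abstract.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log10(p / (1.0 - p))


def from_weight(w):
    """Inverse map: convert a W value back to a probability."""
    odds = 10.0 ** w
    return odds / (1.0 + odds)


def mean_in_weight_space(probs):
    """Average probabilities as W values, then map the mean back."""
    weights = [to_weight(p) for p in probs]
    return from_weight(sum(weights) / len(weights))


# Illustrative values only; these are not data from the paper.
probs = [0.5, 0.9, 0.99]
arithmetic_mean = sum(probs) / len(probs)
weight_mean = mean_in_weight_space(probs)
print(arithmetic_mean, weight_mean)
```

The two means generally differ because the probability scale is compressed near 0 and 1, whereas W is unbounded in both directions. Probabilities of exactly 0 or 1 have no finite W, so this sketch rejects them rather than guessing how the paper handles that case.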

Taxonomy

Topics: Statistical and numerical algorithms · Scientific Research and Discoveries · Numerical Methods and Algorithms