Linearity of Data and Linear Probability Space
Christopher M. Rembold

TL;DR
This paper explores the linearity properties of different data types, emphasizing transformations like log odds for bounded data to achieve linearity, and discusses methods for analyzing untidy data.
Contribution
It introduces a framework for understanding data linearity based on boundedness and proposes transforming probabilities into a linear probability space using log odds.
Findings
Means and standard deviations are more accurate when calculated on log odds (W) values than on raw probabilities.
Two-sided bounded data are inherently non-linear without transformation.
Methods for analyzing untidy data are discussed.
Abstract
Some data is linearly additive, other data is not. In this paper, I discuss types of data based on the boundedness of the data and their linearity. 1) Unbounded data can be linear. 2) One-side bounded data is usually log transformed to be linear. 3) Two-side bounded data is not linear. 4) Untidy data do not fit in these categories. An example of two-sided bounded data is probabilities, which should be transformed into a linear probability space by taking the logarithm of the odds ratio (log10 odds), which is termed Weight (W). Calculations of means and standard deviations are more accurate when performed on W values than when performed on probabilities. A method to analyze untidy data is discussed.
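The W transformation described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it assumes the odds are p / (1 - p), so that W = log10(p / (1 - p)), and the function names `prob_to_weight` and `weight_to_prob` and the sample probabilities are hypothetical, chosen only for the example.

```python
import math


def prob_to_weight(p):
    """Map a probability to W = log10(p / (1 - p)); defined only for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log10(p / (1.0 - p))


def weight_to_prob(w):
    """Inverse map: p = 10**w / (1 + 10**w), written in an equivalent form."""
    return 1.0 / (1.0 + 10.0 ** (-w))


# Illustrative values only (not data from the paper).
probs = [0.9, 0.99, 0.999]

# Mean taken directly on the bounded probability scale.
mean_p = sum(probs) / len(probs)

# Mean taken in W space, then mapped back to a probability.
weights = [prob_to_weight(p) for p in probs]
mean_w = sum(weights) / len(weights)
mean_via_w = weight_to_prob(mean_w)

print(f"mean of probabilities:      {mean_p:.4f}")
print(f"mean of W values:           {mean_w:.4f}")
print(f"W mean back as probability: {mean_via_w:.4f}")
```

The two averages differ because equal steps in W correspond to unequal steps in probability near the bounds at 0 and 1. The abstract's claim is that the W-space calculation is the more accurate one; the sketch only shows that the two calculations disagree.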
Taxonomy
Topics: Statistical and numerical algorithms · Scientific Research and Discoveries · Numerical Methods and Algorithms
