Successive normalization of rectangular arrays

Richard A. Olshen; Bala Rajaratnam

arXiv:1010.0520 · math.ST · December 12, 2013

TL;DR

This paper introduces and analyzes a method for successively normalizing rectangular data arrays across both rows and columns, ensuring data are on comparable footing for high-dimensional scientific data analysis.

Contribution

The paper proposes a natural successive normalization technique for rectangular arrays and studies its convergence properties, with implementation on simulated and real scientific data.

Findings

01 Successive normalization converges under certain conditions.

02 The method effectively standardizes high-dimensional data.

03 Applications demonstrate improved data comparability.

Abstract

Standard statistical techniques often require transforming data to have mean $0$ and standard deviation $1$. Typically, this process of "standardization" or "normalization" is applied across subjects when each subject produces a single number. High throughput genomic and financial data often come as rectangular arrays where each coordinate in one direction concerns subjects who might have different status (case or control, say), and each coordinate in the other designates "outcome" for a specific feature, for example, "gene," "polymorphic site" or some aspect of financial profile. It may happen, when analyzing data that arrive as a rectangular array, that one requires BOTH the subjects and the features to be "on the same footing." Thus there may be a need to standardize across rows and columns of the rectangular matrix. There arises the question as to how to achieve this double…
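The idea of standardizing rows and columns in turn can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `standardize_rows` and `successive_normalize`, the rows-then-columns order, the use of the population standard deviation (`ddof=0`), the stopping tolerance, and the random test array are all choices made here rather than details given on this page.

```python
import numpy as np


def standardize_rows(x):
    """Give every row mean 0 and standard deviation 1 (population sd, ddof=0)."""
    centered = x - x.mean(axis=1, keepdims=True)
    return centered / centered.std(axis=1, keepdims=True)


def successive_normalize(x, tol=1e-10, max_iter=10_000):
    """Alternate row and column standardization until the array stops changing.

    Assumed stopping rule: the largest entrywise change over one full sweep
    (rows, then columns) falls below `tol`. Returns the final array and the
    number of sweeps performed.
    """
    x = np.asarray(x, dtype=float)
    for sweep in range(1, max_iter + 1):
        previous = x
        x = standardize_rows(x)        # rows: mean 0, sd 1
        x = standardize_rows(x.T).T    # columns: mean 0, sd 1
        if np.max(np.abs(x - previous)) < tol:
            return x, sweep
    return x, max_iter


# Hypothetical example data: 5 "subjects" by 8 "features".
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 8))
result, sweeps = successive_normalize(data)

print("sweeps:", sweeps)
print("max |row mean|:", np.abs(result.mean(axis=1)).max())
print("max |row sd - 1|:", np.abs(result.std(axis=1) - 1).max())
```

A single pass is generally not enough, because standardizing the columns disturbs the row standard deviations set in the previous step; that is why the sketch repeats the sweep. A row or column with zero variance would cause a division by zero here, so constant rows or columns need separate handling.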
