Generalized Alpha Investing: Definitions, Optimality Results, and Application to Public Databases

Ehud Aharoni; Saharon Rosset

arXiv:1307.0522 · stat.ME · July 3, 2013

TL;DR

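This paper introduces a generalized alpha investing procedure for controlling the false discovery measure mFDR when testing a possibly infinite stream of null hypotheses, derives an expected reward optimal (ERO) version that is more powerful than Alpha Investing, and applies the approach to the management of large public databases.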

Contribution

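It develops a general procedure for mFDR control that includes Alpha Investing as a special case, shows that this procedure can be optimized into an expected reward optimal (ERO) version, and applies it to quality preserving databases (QPD) to save costs while controlling false discovery.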

Findings

01

In common, practical situations, the ERO version of the generalized procedure is more powerful than Alpha Investing.

02

The generalized procedure controls mFDR and contains Alpha Investing as a special case.

03

Applied to quality preserving databases (QPD), the approach formalizes public database management that saves costs while controlling false discovery.

Abstract

The increasing prevalence and utility of large, public databases necessitate the development of appropriate methods for controlling false discovery. Motivated by this challenge, we discuss the generic problem of testing a possibly infinite stream of null hypotheses. In this context, Foster and Stine (2008) suggested a novel method named Alpha Investing for controlling a false discovery measure known as mFDR. We develop a more general procedure for controlling mFDR, of which Alpha Investing is a special case. We show that in common, practical situations, the general procedure can be optimized to produce an expected reward optimal (ERO) version, which is more powerful than Alpha Investing. We then present the concept of quality preserving databases (QPD), originally introduced in Aharoni et al. (2011), which formalizes efficient public database management to simultaneously save costs…
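
For orientation, the baseline that the abstract generalizes can be sketched in a few lines. This is a minimal illustration of the original Alpha Investing rule as commonly stated for Foster and Stine (2008), not the generalized or ERO procedure of this paper. The tester starts with wealth αη, each test at level α_j costs α_j / (1 − α_j) when the null is not rejected, and each rejection earns a payout ω ≤ α. The function name `alpha_investing`, the parameter `spend_fraction`, and the rule of betting a fixed fraction of the largest affordable level are illustrative assumptions, not taken from the source.

```python
from typing import List, Optional, Sequence, Tuple


def alpha_investing(
    p_values: Sequence[float],
    alpha: float = 0.05,
    eta: float = 1.0,
    payout: Optional[float] = None,
    spend_fraction: float = 0.5,
) -> Tuple[List[int], float]:
    """Sketch of the baseline Alpha Investing rule on a stream of p-values.

    Returns the indices of rejected hypotheses and the remaining alpha-wealth.
    `spend_fraction` is a hypothetical spending policy chosen for illustration.
    """
    if not 0.0 < spend_fraction <= 1.0:
        raise ValueError("spend_fraction must be in (0, 1]")
    wealth = alpha * eta                      # initial alpha-wealth W(0)
    omega = alpha if payout is None else payout  # reward earned per rejection
    rejections: List[int] = []

    for j, p in enumerate(p_values):
        if wealth <= 0.0:
            break                             # nothing left to invest
        # Largest affordable level solves level / (1 - level) = wealth,
        # i.e. level = wealth / (1 + wealth); we bet a fraction of it.
        level = spend_fraction * wealth / (1.0 + wealth)
        if p <= level:
            rejections.append(j)
            wealth += omega                   # a discovery earns the payout
        else:
            wealth -= level / (1.0 - level)   # a failed test costs wealth
    return rejections, wealth


if __name__ == "__main__":
    rejected, remaining = alpha_investing([0.0001, 0.9, 0.9, 0.01])
    print(rejected, remaining)
```

Because the level is always capped at wealth / (1 + wealth), a failed test can never drive the wealth below zero, which is what allows the stream of hypotheses to be open-ended. The generalized procedure described in the abstract relaxes how costs and payouts are tied to the test level; that relationship is not reproduced here.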

Taxonomy

Topics: Data Quality and Management · Privacy-Preserving Technologies in Data · Machine Learning and Algorithms