Deducing Truth from Correlation

Janis Nötzel; Walter Swetly

arXiv:1412.5831·cs.IT·May 15, 2018

TL;DR

This paper introduces 'dependent component analysis' and shows that three copies of the data, each passed through independent and invertible noise, suffice to infer the true underlying distribution. The approach is related in spirit to independent component analysis.

Contribution

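The paper demonstrates that three copies of the true data, each subject to independent and invertible noise, are sufficient to recover the true distribution when the data and the observations share the same finite alphabet. The authors call this approach dependent component analysis.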

Findings

01

When the true data and the observations share the same finite alphabet, exactly three noisy copies are needed to determine p to the highest possible precision.

02

Invertibility is activated through multiple parallel data uses.

03

Generalizations to different alphabet sizes are provided.

Abstract

This work is motivated by a question at the heart of unsupervised learning approaches: Assume we are collecting a number K of (subjective) opinions about some event E from K different agents. Can we infer E from them? Prima facie this seems impossible, since the agents may be lying. We model this task by letting the events be distributed according to some distribution p and the task is to estimate p under unknown noise. Again, this is impossible without additional assumptions. We report here the finding of very natural such assumptions - the availability of multiple copies of the true data, each under independent and invertible (in the sense of matrices) noise, is already sufficient: If the true distribution and the observations are modeled on the same finite alphabet, then the number of such copies needed to determine p to the highest possible precision is exactly three! This result…
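The setting described in the abstract can be made concrete with a minimal sketch of the forward model only. It assumes each noise process is a column-stochastic matrix W with W[y][x] = P(y | x) on the same finite alphabet as p. The function name `joint_of_three_copies` and all numerical values are hypothetical, chosen purely for illustration, and the paper's actual recovery procedure is not reproduced here.

```python
from itertools import product


def joint_of_three_copies(p, channels):
    """Return q(y1, y2, y3) = sum_x p[x] * W1[y1][x] * W2[y2][x] * W3[y3][x].

    p        : list of probabilities over the alphabet {0, ..., n-1}
    channels : three n-by-n matrices, each with W[y][x] = P(y | x)
    """
    w1, w2, w3 = channels
    n = len(p)
    q = {}
    for y1, y2, y3 in product(range(n), repeat=3):
        # The three copies share the same hidden symbol x, but the noise
        # acting on each copy is independent given x.
        q[(y1, y2, y3)] = sum(
            p[x] * w1[y1][x] * w2[y2][x] * w3[y3][x] for x in range(n)
        )
    return q


# Hypothetical example on a binary alphabet (values are assumptions).
p = [0.7, 0.3]
w1 = [[0.9, 0.2], [0.1, 0.8]]  # each column sums to 1; determinant is nonzero
w2 = [[0.8, 0.3], [0.2, 0.7]]
w3 = [[0.6, 0.1], [0.4, 0.9]]

q = joint_of_three_copies(p, (w1, w2, w3))
```

An observer sees only samples from q, never p or the noise matrices. The abstract's claim is that, under the stated independence and invertibility assumptions, three such copies already determine p to the highest possible precision.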
