Distribution Estimation with Side Information

Haricharan Balasundaram; Andrew Thangaraj

arXiv:2601.08535·cs.IT·January 19, 2026

Open Access

TL;DR

This paper studies how side information, such as word similarities inferred from embeddings or a known split of the alphabet into higher and lower probability sets, can improve discrete distribution estimation from i.i.d. samples. It pairs a theoretical analysis with simulations.

Contribution

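It introduces two models of side information: a local model, where the unknown distribution lies in the neighborhood of a known distribution, and a partial ordering model, where the alphabet is partitioned into known higher and lower probability sets. For both, it characterizes the resulting improvement in squared-error risk.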

Findings

01. In both models, side information improves the squared-error estimation risk.

02. The theoretical analysis characterizes the size of this improvement for each model.

03. Simulations on natural language and synthetic data illustrate these gains.

Abstract

We consider the classical problem of discrete distribution estimation using i.i.d. samples in a novel scenario where additional side information is available on the distribution. In large alphabet datasets such as text corpora, such side information arises naturally through word semantics/similarities that can be inferred by closeness of vector word embeddings, for instance. We consider two specific models for side information: a local model where the unknown distribution is in the neighborhood of a known distribution, and a partial ordering model where the alphabet is partitioned into known higher and lower probability sets. In both models, we theoretically characterize the improvement in a suitable squared-error risk because of the available side information. Simulations over natural language and synthetic data illustrate these gains.
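
To make the local model concrete, the following is a minimal illustrative sketch and not the estimator analyzed in the paper. It assumes the side information is a known reference distribution q close to the true distribution p, and compares the plain empirical estimate with a simple shrinkage toward q under the sum of squared errors. The shrinkage weight lam, the function names, and the example distributions are all hypothetical choices made for illustration.

```python
import random
from collections import Counter


def empirical(samples, k):
    """Empirical (relative-frequency) estimate over the alphabet {0, ..., k-1}."""
    counts = Counter(samples)
    n = len(samples)
    return [counts[i] / n for i in range(k)]


def shrink_toward(p_hat, q, lam):
    """Convex combination of the empirical estimate and the known reference q.

    This is an assumed, generic way to use side information; lam in [0, 1]
    controls how strongly the estimate is pulled toward q.
    """
    return [(1 - lam) * a + lam * b for a, b in zip(p_hat, q)]


def squared_error(p, est):
    """Sum of squared errors between the true distribution and an estimate."""
    return sum((a - b) ** 2 for a, b in zip(p, est))


def average_risks(p, q, n, lam, trials, seed=0):
    """Monte Carlo estimate of the risk of both estimators from n i.i.d. samples."""
    rng = random.Random(seed)
    k = len(p)
    risk_emp = risk_shr = 0.0
    for _ in range(trials):
        samples = rng.choices(range(k), weights=p, k=n)
        p_hat = empirical(samples, k)
        risk_emp += squared_error(p, p_hat)
        risk_shr += squared_error(p, shrink_toward(p_hat, q, lam))
    return risk_emp / trials, risk_shr / trials


# Hypothetical example: the true p is a small perturbation of a known uniform q.
k = 50
q = [1 / k] * k
p = [1 / k + (0.004 if i % 2 == 0 else -0.004) for i in range(k)]
risk_emp, risk_shr = average_risks(p, q, n=100, lam=0.5, trials=500)
print(f"empirical risk: {risk_emp:.5f}, shrinkage risk: {risk_shr:.5f}")
```

In this toy setting the alphabet is large relative to the sample size and q is close to p, so pulling the estimate toward q trades a small bias for a large variance reduction. If q were far from p, the same shrinkage could increase the risk, which is why the closeness assumption matters.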

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques