TL;DR
This paper explores computational problems related to total variation distance, providing a polynomial-time algorithm for mixture equivalence and proving hardness results for estimating TV distance in Ising models.
Contribution
It introduces a simple polynomial-time algorithm for mixture equivalence and establishes computational hardness for TV distance estimation in Ising models.
Findings
Polynomial-time algorithm for mixture distribution equivalence
Hardness result for TV distance estimation in Ising models
Implications for computational limits in probabilistic models
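To make the Ising-model finding concrete, here is a minimal brute-force sketch on a tiny instance. It assumes the common parameterization $P(\sigma) \propto \exp\left(\sum_{(u,v)} J_{uv}\sigma_u\sigma_v + \sum_v h_v\sigma_v\right)$ with $\sigma \in \{-1,+1\}^n$. The function names, the graph, and all numeric values are hypothetical choices for illustration and do not come from the paper. Enumeration takes $2^n$ time, so this shows the quantity being estimated, not an efficient method. It also checks a standard identity: pinning one spin with a very large external field yields a model whose TV distance from the original is (nearly) the marginal probability of the opposite spin value.

```python
from itertools import product
from math import exp


def ising_pmf(n, couplings, fields):
    """Return {sigma: probability} by enumerating all 2^n spin configurations.

    Assumed form: P(sigma) proportional to
    exp(sum_{(u,v)} J_uv * s_u * s_v + sum_v h_v * s_v).
    """
    weights = {}
    for sigma in product((-1, 1), repeat=n):
        energy = sum(J * sigma[u] * sigma[v] for (u, v), J in couplings.items())
        energy += sum(h * s for h, s in zip(fields, sigma))
        weights[sigma] = exp(energy)
    z = sum(weights.values())  # partition function
    return {sigma: w / z for sigma, w in weights.items()}


def tv_distance(p, q):
    """TV distance = half the L1 distance between two pmfs on the same support."""
    return 0.5 * sum(abs(p[s] - q[s]) for s in p)


# Hypothetical 3-vertex path 0 - 1 - 2 with illustrative parameters.
couplings = {(0, 1): 0.5, (1, 2): -0.3}
base = ising_pmf(3, couplings, [0.0, 0.4, -0.2])

# Same model, but a large field on vertex 0 effectively pins sigma_0 = +1.
pinned = ising_pmf(3, couplings, [20.0, 0.4, -0.2])

# Marginal probability that vertex 0 has spin -1 in the base model.
marginal_minus = sum(prob for s, prob in base.items() if s[0] == -1)

print(tv_distance(base, pinned), marginal_minus)  # the two values nearly coincide
```

The identity used here is $d_{TV}(\mu, \mu(\cdot \mid A)) = 1 - \mu(A)$ for any event $A$ with $\mu(A) > 0$; the large field only approximates conditioning, so the two printed values agree up to a tiny error.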
Abstract
We investigate some previously unexplored (or underexplored) computational aspects of total variation (TV) distance. First, we give a simple deterministic polynomial-time algorithm for checking equivalence between mixtures of product distributions, over arbitrary alphabets. This corresponds to a special case, whereby the TV distance between the two distributions is zero. Second, we prove that unless $\mathsf{NP} \subseteq \mathsf{RP}$, it is impossible to efficiently estimate the TV distance between arbitrary Ising models, even in a bounded-error randomized setting.
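As a small illustration of the first claim, the sketch below computes the TV distance $d_{TV}(P,Q) = \frac{1}{2}\sum_x |P(x) - Q(x)|$ between two mixtures of product distributions by enumerating every outcome. This is not the paper's algorithm: it runs in time exponential in the dimension $n$, and it assumes a binary alphabet for brevity. All function names and parameter values are hypothetical. It only shows that two differently parameterized mixtures can define the same distribution, which is exactly the TV-distance-zero case that equivalence checking decides.

```python
from itertools import product


def product_pmf(probs, x):
    """Probability of bit-vector x when coordinate i is Bernoulli(probs[i]), independently."""
    p = 1.0
    for q, bit in zip(probs, x):
        p *= q if bit == 1 else 1.0 - q
    return p


def mixture_pmf(weights, components, x):
    """Probability of x under the mixture sum_i weights[i] * product(components[i])."""
    return sum(w * product_pmf(c, x) for w, c in zip(weights, components))


def tv_distance(mix_p, mix_q, n):
    """Brute-force TV distance over all 2^n outcomes (exponential time)."""
    total = 0.0
    for x in product((0, 1), repeat=n):
        total += abs(mixture_pmf(*mix_p, x) - mixture_pmf(*mix_q, x))
    return 0.5 * total


# Each mixture is a pair (weights, components); the values are illustrative only.
P = ([0.5, 0.5], [[0.2, 0.5], [0.8, 0.5]])  # two components
Q = ([1.0], [[0.5, 0.5]])                   # one component, same distribution as P
R = ([1.0], [[0.5, 0.9]])                   # differs from Q in the second coordinate

print(tv_distance(P, Q, 2))  # approximately 0.0, up to floating-point rounding
print(tv_distance(Q, R, 2))  # approximately 0.4
```

Here `P` and `Q` have different numbers of components yet agree on every outcome, so comparing parameters directly is not enough to decide equivalence.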
Peer Reviews
Decision: ICLR 2025 Spotlight
The main contribution of this paper is the first result: deciding whether two mixtures of product distributions are the same. Suppose we have $k$ product distributions $P_1, P_2, \ldots, P_k$, where each $P_i$ is an $n$-dimensional product distribution, i.e., $X \sim P_i$ is a vector $(X_1, X_2, \ldots, X_n)$. The paper takes the prefix of $X$, namely $X^{\leq j} = (X_1, X_2, \ldots, X_j)$ for $j \leq n$. This distribution is denoted by $P^{\leq j}_i$. Then, they consider the mixture of $P^{\leq
The hardness result follows from standard results. Proposition 6 provides a self-reduction for the Ising model, which is used in the standard counting-to-sampling reduction. Therefore, the proof of Proposition 6 could be omitted. Proposition 8 essentially states that one can fix the value of a vertex $v$ by adjusting the function $h(v)$, allowing the TV distance to encode the marginal distribution. The relationship between Theorem 1 and Theorem 2 is not very strong, as they pertain to different
The paper addresses important computational questions about the total variation distance, which is fundamental in probability and statistics. The algorithm for equivalence checking of mixtures of product distributions is new and provides a practical solution to a non-trivial problem. The hardness result bridges complexity theory and statistical measures and provides insight into why certain computational tasks are hard. The proofs are well written and the results are accessible.
The paper could elaborate on practical applications of the equivalence-checking algorithm, including its performance on real-world data. The hardness result could be pushed further by considering approximation algorithms under different complexity assumptions. More examples or case studies would make the paper more applicable and helpful.
- Mixtures of products are a fundamental family of probability distributions and checking their equivalence is one of the most basic questions about them.
- The algorithm uses an interesting novel idea of keeping track of bases for the solution spaces of certain equations. This idea might find applications for testing equivalence of other classes of distributions.
- The problem of estimating the total variation distance between two Ising model distributions is quite natural, as Ising models ar
- The algorithm for mixtures of product distributions can only check whether $P = Q$ exactly. The paper would be stronger if it gave an algorithm for approximating the distance between $P$ and $Q$.
- The paper rules out an FPRAS for the TV distance between a pair of Ising models, but a constant-factor approximation algorithm could still exist. The paper would be stronger if it also addressed this question (i.e., if it showed that this is also hard, or gave an algorithm).
