Asymptotic enumeration of admixed arrays and a different independence heuristic
Alan J. Aw

TL;DR
This paper develops an asymptotic enumeration framework for admixed arrays, a class of paired binary matrices arising in analyses of large-scale genetic data, and identifies a correction factor of $1/\sqrt[4]{e}$ in the independence heuristic.
Contribution
It introduces admixed arrays, derives exact formulas and asymptotic expansions, and uncovers a new correction factor in the independence heuristic for constrained binary matrices.
Findings
Exact enumeration formulas for marginal constraints
Asymptotic expansion with explicit fourth-moment term
Correction factor of $1/\sqrt[4]{e}$ (that is, $e^{-1/4}$) in the independence heuristic
Abstract
We introduce a class of paired binary matrices called admixed arrays, which arise in analyses of large-scale genetic data and can be viewed as weighted edge colorings of complete bipartite graphs. This combinatorial structure gives rise to two natural families of marginal constraints: a row-sum constraint and a paired column-sum constraint, the latter inducing an inequality among entries of the matrix pair. We study the enumeration of admixed arrays under these constraints in dense regimes. First, we obtain exact formulas for the sizes of the families defined by each constraint in isolation and derive a finite-size criterion characterizing when one constraint is more restrictive than the other. In the large-dimension limit, this comparison simplifies to an entropy inequality, yielding an information-theoretic interpretation and a quantifiable error bound in the semi-regular case. We…
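For readers unfamiliar with the term, the following is a minimal sketch of what an independence heuristic computes. It uses the classical setting of a single 0-1 matrix with prescribed row and column sums, not the admixed arrays or paired column-sum constraint defined in the paper, whose exact definitions are not reproduced here. The heuristic treats "all row sums are correct" and "all column sums are correct" as independent events among matrices with a fixed number of ones. The function names `independence_estimate` and `brute_force_count` are hypothetical and chosen for illustration only.

```python
from itertools import combinations, product
from math import comb


def independence_estimate(row_sums, col_sums):
    """Classical independence-heuristic estimate of the number of 0-1
    matrices with the given margins (illustrative; not the paper's formula)."""
    m, n = len(row_sums), len(col_sums)
    total = sum(row_sums)
    if total != sum(col_sums):
        return 0.0  # inconsistent margins: no such matrix exists
    # Number of matrices satisfying the row constraints alone.
    rows_ok = 1
    for r in row_sums:
        rows_ok *= comb(n, r)
    # Number of matrices satisfying the column constraints alone.
    cols_ok = 1
    for c in col_sums:
        cols_ok *= comb(m, c)
    # Among the comb(m*n, total) matrices with `total` ones, pretend the two
    # events are independent: count ~ rows_ok * cols_ok / comb(m*n, total).
    return rows_ok * cols_ok / comb(m * n, total)


def brute_force_count(row_sums, col_sums):
    """Exact count by enumeration; feasible only for very small matrices."""
    n = len(col_sums)
    # For each row, list every possible set of column positions holding a 1.
    per_row = [list(combinations(range(n), r)) for r in row_sums]
    count = 0
    for choice in product(*per_row):
        col_totals = [0] * n
        for support in choice:
            for j in support:
                col_totals[j] += 1
        if col_totals == list(col_sums):
            count += 1
    return count


if __name__ == "__main__":
    margins = [2, 2, 2, 2]
    exact = brute_force_count(margins, margins)
    estimate = independence_estimate(margins, margins)
    print(exact, estimate, exact / estimate)
```

The ratio printed at the end is the kind of quantity a correction factor describes: how far the exact count sits from the naive independent estimate. At this tiny size the ratio is not expected to match any asymptotic constant, and the example makes no claim about the value derived in the paper.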
