On sampling and modeling complex systems
Matteo Marsili, Iacopo Mastromatteo, Yasser Roudi

TL;DR
This paper explores how limited sampling and partial knowledge affect modeling complex systems, revealing that the distribution of known variables follows a Boltzmann form and identifying conditions for predictability and informativeness of samples.
Contribution
It introduces a generic framework linking sampling limitations to system modeling, deriving the Boltzmann distribution form, and characterizing sample informativeness and variable relevance.
Findings
Distribution of known variables follows Boltzmann form.
Predictability is limited when the number of relevant variables exceeds a threshold.
Zipf's law emerges at the crossover between under-sampling and sufficient sampling.
Abstract
The study of complex systems is limited by the fact that only a few variables are accessible for modeling and sampling, and these are not necessarily the most relevant ones for explaining the system's behavior. In addition, empirical data typically undersample the space of possible states. We study a generic framework where a complex system is seen as a system of many interacting degrees of freedom, which are known only in part, that optimize a given function. We show that the underlying distribution with respect to the known variables has the Boltzmann form, with a temperature that depends on the number of unknown variables. In particular, when the unknown part of the objective function decays faster than exponential, the temperature decreases as the number of variables increases. We show, in the representative case of the Gaussian distribution, that models are predictable only when the number of…
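The Boltzmann claim in the abstract can be checked numerically with a small toy model. The sketch below is an illustration under stated assumptions, not the authors' code. It assumes the objective splits into a known part u(s) plus an unknown part drawn i.i.d. from a Gaussian for each of m hidden configurations, and that the system picks the overall maximizer. The names `sample_max_gaussian` and `estimate_beta`, the values in `u`, and the choice of m are all hypothetical. The comparison value sqrt(2 ln m)/sigma is the standard extreme-value estimate for Gaussian maxima and is only approximate at finite m.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, used for inverse-CDF sampling


def sample_max_gaussian(m, sigma, rng):
    """Draw the maximum of m i.i.d. N(0, sigma^2) variables.

    Uses P(max <= x) = Phi(x / sigma)^m, so the maximum can be sampled
    as sigma * Phi^{-1}(U^(1/m)) with U uniform on (0, 1).
    """
    while True:
        p = rng.random() ** (1.0 / m)
        if 0.0 < p < 1.0:  # inv_cdf is undefined at the endpoints
            return sigma * STD_NORMAL.inv_cdf(p)


def estimate_beta(u, m, sigma=1.0, trials=20000, seed=0):
    """Estimate the inverse temperature seen by the known variable s.

    Assumed toy model: each known state s has a known payoff u[s] plus the
    best of m hidden Gaussian payoffs; the system sits in the state with the
    largest total. If p(s) is close to exp(beta * u[s]) / Z, then beta can be
    read off the log-frequency ratio of the highest and lowest states.
    Expects u sorted in increasing order.
    """
    rng = random.Random(seed)
    counts = [0] * len(u)
    for _ in range(trials):
        scores = [u_s + sample_max_gaussian(m, sigma, rng) for u_s in u]
        counts[scores.index(max(scores))] += 1
    lo, hi = 0, len(u) - 1
    if counts[lo] == 0 or counts[hi] == 0:
        raise ValueError("too few samples to estimate beta; raise trials")
    return math.log(counts[hi] / counts[lo]) / (u[hi] - u[lo])


if __name__ == "__main__":
    u = [0.0, 0.3, 0.6]  # hypothetical known payoffs
    for m in (10, 10_000):
        beta_hat = estimate_beta(u, m)
        beta_evt = math.sqrt(2.0 * math.log(m))  # extreme-value estimate, sigma = 1
        print(f"m={m}: fitted beta={beta_hat:.2f}, sqrt(2 ln m)={beta_evt:.2f}")
```

In this toy setting the fitted inverse temperature grows with m, which matches the abstract's statement that the temperature decreases as the number of unknown variables increases when the unknown part has tails lighter than exponential.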
