Algorithmic Data Analytics, Small Data Matters and Correlation versus Causation
Hector Zenil

TL;DR
This paper reviews how algorithmic information theory can help distinguish correlation from causation and demonstrates that small data, through local approximations, can yield valuable insights into complex systems.
Contribution
It highlights the role of algorithmic complexity in understanding complex systems and proposes that small data can be effective with appropriate local models.
Findings
Correlation and causation are distinguished via algorithmic complexity.
Small data can provide meaningful insights through local approximations.
Long-range models may require infinite computation, but short-range estimations are feasible.
Abstract
This review covers aspects of algorithmic information theory that may contribute to a framework for formulating questions about complex, highly unpredictable systems. We start by contrasting Shannon entropy with Kolmogorov-Chaitin complexity, a contrast that epitomizes the difference between correlation and causation. We then survey classical results from algorithmic complexity and algorithmic probability, highlighting their deep connection to the study of automata frequency distributions. We end by showing that long-range algorithmic prediction models for economic and biological systems may require infinite computation, whereas locally approximated short-range estimations are possible, which shows how small data can deliver insights into important features of complex "Big Data".
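The contrast between Shannon entropy and Kolmogorov-Chaitin complexity drawn in the abstract can be illustrated with a minimal sketch. It is not the paper's method: Kolmogorov-Chaitin complexity is uncomputable, so the sketch assumes zlib-compressed length as a crude upper-bound stand-in, and the function names and the two test strings are hypothetical choices made for illustration. Both strings contain exactly the same number of 0s and 1s, so their symbol-frequency entropy is identical, yet only one of them has a short generating description.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy_bits_per_symbol(s: str) -> float:
    """Empirical Shannon entropy of the symbol frequencies in s."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_size_bytes(s: str) -> int:
    """zlib-compressed length of s.

    Assumption: used only as a rough, computable upper-bound proxy for
    Kolmogorov-Chaitin complexity, which cannot be computed exactly.
    """
    return len(zlib.compress(s.encode("ascii"), 9))


n = 2000

# Generated by a tiny rule ("repeat 01"), so it has a short description.
periodic = "01" * (n // 2)

# Same symbols in a pseudo-random order (fixed seed for reproducibility).
rng = random.Random(0)
symbols = list(periodic)
rng.shuffle(symbols)
shuffled = "".join(symbols)

for name, s in [("periodic", periodic), ("shuffled", shuffled)]:
    h = shannon_entropy_bits_per_symbol(s)
    c = compressed_size_bytes(s)
    print(f"{name}: entropy = {h:.3f} bits/symbol, compressed = {c} bytes")
```

The entropy estimate sees only symbol frequencies and so cannot tell the two strings apart, while the compressed length is much smaller for the rule-generated string. General-purpose compressors detect only simple statistical regularities, so this proxy should be read as a loose illustration of the distinction rather than a measurement of algorithmic complexity.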
Taxonomy
Topics: Computability, Logic, AI Algorithms · Evolutionary Algorithms and Applications
