Better together? Statistical learning in models made of modules
Pierre E. Jacob (Harvard University), Lawrence M. Murray (Uppsala University), Chris C. Holmes (University of Oxford), Christian P. Robert (Université Paris-Dauphine, PSL Research University, University of Warwick)

TL;DR
This paper explores the advantages of modular statistical models over full models in situations with heterogeneous data and potential misspecification, providing criteria to guide the choice between them.
Contribution
It introduces principled criteria for selecting modular versus full-model approaches in complex, misspecified settings across various applied fields.
Findings
Modular approaches can prevent contamination from misspecified modules.
Criteria for choosing between modular and full models are proposed.
Modular methods are advantageous in many applied statistical contexts.
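The contamination effect in the first finding can be illustrated with a small conjugate example. The sketch below is a hypothetical toy model, not taken from the paper: module 1 observes y1_i ~ N(theta1, 1), module 2 observes y2_j ~ N(theta1 + theta2, 1), and the prior on the bias theta2 is assumed (wrongly) to be tightly concentrated near zero. The function name, parameter names, and default prior scales are all illustrative assumptions. It compares the full-model posterior for theta1 with the posterior obtained from module 1 alone.

```python
def theta1_posteriors(ybar1, n1, ybar2, n2, prior_sd1=1.0, prior_sd2=0.1):
    """Return ((mean, var) under the full model, (mean, var) from module 1 only).

    Hypothetical toy model with unit observation variance:
      module 1: y1_i ~ N(theta1, 1),           theta1 ~ N(0, prior_sd1^2)
      module 2: y2_j ~ N(theta1 + theta2, 1),  theta2 ~ N(0, prior_sd2^2)
    ybar1 and ybar2 are the sample means of the two data sets.
    """
    prior_prec = 1.0 / prior_sd1 ** 2

    # Module 1 alone: standard conjugate normal update for theta1.
    mod_prec = prior_prec + n1
    mod_mean = n1 * ybar1 / mod_prec

    # Full model: integrating out theta2 gives
    #   ybar2 | theta1 ~ N(theta1, 1/n2 + prior_sd2^2),
    # so module 2 contributes precision w2 and pulls theta1 toward ybar2.
    w2 = 1.0 / (1.0 / n2 + prior_sd2 ** 2)
    full_prec = mod_prec + w2
    full_mean = (n1 * ybar1 + w2 * ybar2) / full_prec

    return (full_mean, 1.0 / full_prec), (mod_mean, 1.0 / mod_prec)


if __name__ == "__main__":
    # Sample means as if theta1 = 0 and the true bias theta2 = 1,
    # which the N(0, 0.1^2) prior on theta2 treats as very unlikely.
    full, modular = theta1_posteriors(ybar1=0.0, n1=100, ybar2=1.0, n2=1000)
    print("full model   : mean=%.3f var=%.4f" % full)
    print("module 1 only: mean=%.3f var=%.4f" % modular)
```

Under these assumed numbers, the full model reports a smaller posterior variance for theta1 yet centers it far from the value that module 1 supports, which is the kind of contamination a modular approach is meant to limit.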
Abstract
In modern applications, statisticians are faced with integrating heterogeneous data modalities relevant for an inference, prediction, or decision problem. In such circumstances, it is convenient to use a graphical model to represent the statistical dependencies, via a set of connected "modules", each relating to a specific data modality, and drawing on specific domain expertise in their development. In principle, given data, the conventional statistical update then allows for coherent uncertainty quantification and information propagation through and across the modules. However, misspecification of any module can contaminate the estimate and update of others, often in unpredictable ways. In various settings, particularly when certain modules are trusted more than others, practitioners have preferred to avoid learning with the full model in favor of approaches that restrict the…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Bayesian Inference · Data Analysis with R
