Homeostasis phenomenon in predictive inference when using a wrong learning model: a tale of random split of data into training and test sets
Min-ge Xie, Zheshi Zheng

TL;DR
This paper uses a conformal prediction procedure to show that predictive inference remains valid even with a wrong model when the IID assumption holds, but usually not when that assumption is violated, which underscores the importance of good modeling practices.
Contribution
It demonstrates the homeostasis property of predictive inference under the IID assumption, shows that the property often fails when that assumption is violated, and argues that model quality still matters for prediction accuracy.
Findings
Under the IID assumption, prediction remains valid even with a wrong model.
The homeostasis property is often disrupted when the IID assumption is violated.
Better model estimation typically leads to more accurate prediction in both IID and non-IID scenarios.
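The validity claim in the first finding can be made concrete with the standard split conformal coverage guarantee. This is a sketch under assumptions that are not stated in this summary: the source only mentions "a conformal prediction procedure", so the split variant, the absolute-residual score, and the symbols $\hat\mu$, $R_i$, $\hat q$, and $\alpha$ are chosen here for illustration.

```latex
% Assumed setup: (X_1,Y_1),\dots,(X_n,Y_n),(X_{n+1},Y_{n+1}) are exchangeable (e.g., IID),
% and \hat\mu is ANY fitted model, possibly wrong, trained on data independent of these points.
\[
R_i = \lvert Y_i - \hat\mu(X_i) \rvert, \quad i = 1,\dots,n,
\qquad
\hat q = \text{the } \lceil (1-\alpha)(n+1) \rceil\text{-th smallest of } R_1,\dots,R_n
\quad (\hat q = \infty \text{ if } \lceil (1-\alpha)(n+1) \rceil > n),
\]
\[
\mathbb{P}\bigl( Y_{n+1} \in [\,\hat\mu(X_{n+1}) - \hat q,\ \hat\mu(X_{n+1}) + \hat q\,] \bigr) \;\ge\; 1-\alpha .
\]
```

The bound does not require $\hat\mu$ to be correct. It rests only on the test residual being exchangeable with the calibration residuals, so a wrong model costs interval width rather than coverage. When the test point is chosen in a targeted way, that exchangeability no longer holds and the guarantee can fail.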
Abstract
This note uses a conformal prediction procedure to provide further support for several points discussed by Professor Efron (Efron, 2020) concerning prediction, estimation and the IID assumption. It aims to convey the following messages: (1) Under the IID assumption (e.g., a random split of data into training and test sets), prediction is indeed an easier task than estimation, since prediction has a 'homeostasis property' in this case: even if the model used for learning is completely wrong, the prediction results remain valid. (2) If the IID assumption is violated (e.g., a targeted prediction on specific individuals), the homeostasis property is often disrupted and the prediction results under a wrong model are usually invalid. (3) Better model estimation typically leads to more accurate prediction in both IID and non-IID cases. Good modeling and estimation practices are important and, in…
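The three messages can be illustrated with a small simulation. The following is a minimal sketch and not the authors' procedure: it assumes a split conformal method with absolute-residual scores, a hypothetical data-generating process (y = x² plus Gaussian noise), a deliberately wrong linear model, and a better quadratic model. All function and variable names are illustrative.

```python
import numpy as np

def simulate(n, rng, x=None):
    """Hypothetical truth: y = x^2 + noise. Pass x to force targeted covariates."""
    x = rng.uniform(-2, 2, n) if x is None else x
    y = x**2 + rng.normal(0, 0.5, len(x))
    return x, y

def split_conformal(x_train, y_train, x_cal, y_cal, degree, alpha=0.1):
    """Fit a polynomial of the given degree, then calibrate a residual quantile."""
    coefs = np.polyfit(x_train, y_train, degree)
    predict = lambda x: np.polyval(coefs, x)
    scores = np.sort(np.abs(y_cal - predict(x_cal)))
    n = len(scores)
    k = int(np.ceil((1 - alpha) * (n + 1)))  # conformal rank
    q = scores[k - 1] if k <= n else np.inf  # half-width of the interval
    return predict, q

def coverage(predict, q, x, y):
    """Fraction of test responses falling inside predict(x) +/- q."""
    return float(np.mean(np.abs(y - predict(x)) <= q))

rng = np.random.default_rng(0)

# Random split of one IID sample into training and calibration halves.
x, y = simulate(2000, rng)
perm = rng.permutation(len(x))
tr, cal = perm[:1000], perm[1000:]

# IID test points versus a targeted prediction at a single covariate value.
x_iid, y_iid = simulate(5000, rng)
x_tgt, y_tgt = simulate(5000, rng, x=np.full(5000, 2.0))

results = {}
for name, degree in [("wrong (linear)", 1), ("better (quadratic)", 2)]:
    predict, q = split_conformal(x[tr], y[tr], x[cal], y[cal], degree)
    results[name] = {
        "width": 2 * q,
        "iid": coverage(predict, q, x_iid, y_iid),
        "targeted": coverage(predict, q, x_tgt, y_tgt),
    }

for name, r in results.items():
    print(f"{name}: width={r['width']:.2f}, "
          f"IID coverage={r['iid']:.3f}, targeted coverage={r['targeted']:.3f}")
```

In this setup the linear model should still cover close to the nominal 90% on IID test points, at the price of wide intervals, while its coverage at the targeted point x = 2 should fall far below nominal. The quadratic model should give narrower intervals and keep reasonable coverage in both cases, in line with message (3).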
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Statistical Methods and Models · Machine Learning and Data Classification
