Discussion of "Data fission: splitting a single data point"
Anna Neufeld, Ameer Dharamshi, Lucy L. Gao, Daniela Witten, Jacob Bien

TL;DR
This discussion of Leiner et al. [2023] examines data fission, a generalization of sample splitting. It describes P1 fission operations beyond the Gaussian and Poisson distributions, gives practical guidance on applying P2 fission with a focus on logistic regression, and interprets P2 fission as a remedy for distributional misspecification.
Contribution
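It describes how the authors' own work provides P1 fission operations for a wide variety of distribution families, offers insight into when P1 fission is possible, and gives practical guidance on applying P2 fission, with a special focus on logistic regression.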
Findings
Provides P1 fission methods for multiple distribution families.
Offers practical guidance for implementing P2 fission.
Interprets P2 fission as a solution for distributional misspecification.
Abstract
Leiner et al. [2023] introduce an important generalization of sample splitting, which they call data fission. They consider two cases of data fission: P1 fission and P2 fission. While P1 fission is extremely useful and easy to use, Leiner et al. [2023] provide P1 fission operations only for the Gaussian and the Poisson distributions. They provide little guidance on how to apply P2 fission operations in practice, leaving the reader unsure of how to apply data fission outside of the Gaussian and Poisson settings. In this discussion, we describe how our own work provides P1 fission operations in a wide variety of families and offers insight into when P1 fission is possible. We also provide guidance on how to actually apply P2 fission in practice, with a special focus on logistic regression. Finally, we interpret P2 fission as a remedy for distributional misspecification when carrying out…
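For readers unfamiliar with the idea, the following is a minimal sketch of what a P1 fission operation looks like in the two settings the abstract says Leiner et al. [2023] cover: a Gaussian with known variance, and a Poisson. The function names, the tuning parameters `tau` and `p`, and the simulated data are illustrative assumptions and are not taken from this page. In both cases one observation is split into two independent pieces, so one piece can be used for selection and the other for inference.

```python
import numpy as np


def gaussian_fission(x, sigma, tau=1.0, rng=None):
    """Split X ~ N(mu, sigma^2) into two independent Gaussian pieces.

    Assumes sigma is known. With Z ~ N(0, sigma^2) drawn independently,
    X + tau*Z and X - Z/tau are jointly Gaussian with zero covariance
    (sigma^2 - sigma^2 = 0), hence independent.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.normal(0.0, sigma, size=np.shape(x))
    return x + tau * z, x - z / tau


def poisson_fission(x, p=0.5, rng=None):
    """Split X ~ Poisson(lam) by binomial thinning.

    Drawing F | X ~ Binomial(X, p) gives F ~ Poisson(p * lam) and
    X - F ~ Poisson((1 - p) * lam), and the two pieces are independent.
    """
    rng = np.random.default_rng() if rng is None else rng
    f = rng.binomial(x, p)
    return f, x - f


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Simulated Gaussian data (hypothetical mean 2.0, known sigma 1.5).
    x = rng.normal(2.0, 1.5, size=100_000)
    f, g = gaussian_fission(x, sigma=1.5, tau=1.0, rng=rng)
    print("Gaussian pieces, sample correlation:", np.corrcoef(f, g)[0, 1])

    # Simulated Poisson counts (hypothetical rate 4.0).
    counts = rng.poisson(4.0, size=100_000)
    a, b = poisson_fission(counts, p=0.3, rng=rng)
    print("Poisson pieces, sample correlation:", np.corrcoef(a, b)[0, 1])
```

In both functions the sample correlation between the two pieces should be close to zero. Larger `tau` (or `p`) puts more noise on, or more counts into, the first piece, which trades information between the selection and inference stages.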
Taxonomy
Topics: Big Data Technologies and Applications
