Model selection for the segmentation of multiparameter exponential family distributions
Alice Cleynen, Emilie Lebarbier

TL;DR
This paper develops a penalized log-likelihood method for segmenting univariate distributions from the multiparameter exponential family. It addresses the choice of the number of segments and is assessed through a simulation study and an application on real data.
Contribution
It introduces a penalized log-likelihood estimator, with a proven oracle inequality, for segmentation in the exponential family, and studies the case of categorical variables in more detail.
Findings
Estimator satisfies an oracle inequality.
Performance validated via simulation study.
Effective application on real categorical data.
Abstract
We consider the segmentation problem of univariate distributions from the exponential family with multiple parameters. In segmentation, the choice of the number of segments remains a difficult issue due to the discrete nature of the change-points. In this general exponential family distribution framework, we propose a penalized log-likelihood estimator where the penalty is inspired by papers of L. Birgé and P. Massart. The resulting estimator is proved to satisfy an oracle inequality. We then further study the particular case of categorical variables by comparing the values of the key constants when derived from the specification of our general approach and when obtained by working directly with the characteristics of this distribution. Finally, a simulation study is conducted to assess the performance of our criterion for the exponential distribution, and an application on real data…
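To make the idea of penalized log-likelihood segmentation concrete, here is a minimal sketch for the exponential distribution, the case named for the simulation study. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the best segmentation for each number of segments K is found by plain dynamic programming, and the penalty has the generic shape K(c1 log(n/K) + c2) often associated with Birgé and Massart type results. The constants c1 and c2 are placeholders and are not taken from the paper.

```python
import math
import random


def segment_cost(prefix, i, j):
    """Negative maximized log-likelihood of y[i:j] under a single
    exponential rate (MLE rate = length / sum). Assumes positive data."""
    m = j - i
    s = prefix[j] - prefix[i]
    return m * (math.log(s / m) + 1.0)


def best_costs(y, k_max):
    """Dynamic programming: cost[k][j] is the smallest total cost of
    splitting y[0:j] into exactly k segments; back[k][j] stores the
    start index of the last segment in that optimal split."""
    n = len(y)
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)
    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k_max + 1)]
    back = [[0] * (n + 1) for _ in range(k_max + 1)]
    cost[0][0] = 0.0
    for k in range(1, k_max + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + segment_cost(prefix, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i
    return cost, back


def select_segmentation(y, k_max, c1=2.0, c2=5.0):
    """Pick the number of segments minimizing cost + penalty.
    c1 and c2 are hypothetical constants, not values from the paper."""
    n = len(y)
    cost, back = best_costs(y, k_max)

    def criterion(k):
        return cost[k][n] + k * (c1 * math.log(n / k) + c2)

    k_hat = min(range(1, k_max + 1), key=criterion)
    # Backtrack to recover segment start indices, then drop the leading 0.
    starts = []
    j = n
    for k in range(k_hat, 0, -1):
        j = back[k][j]
        starts.append(j)
    starts.reverse()
    return k_hat, starts[1:]


if __name__ == "__main__":
    # Simulated data: rate 1.0 for 100 points, then rate 0.05 for 100 points.
    random.seed(0)
    y = [random.expovariate(1.0) for _ in range(100)]
    y += [random.expovariate(0.05) for _ in range(100)]
    print(select_segmentation(y, k_max=5))
```

The function returns the selected number of segments and the list of internal change-point indices. The triple loop costs on the order of k_max times n squared operations, which is fine for a sketch but would need pruning or a faster algorithm for long series.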
