Approximating photo-$z$ PDFs for large surveys
A.I. Malz, P.J. Marshall, S.J. Schmidt, M.L. Graham, J. DeRose, R. Wechsler

TL;DR
This paper introduces a Python tool for compressing galaxy photo-$z$ PDFs, compares different storage formats, and provides best practices for efficient approximation in large surveys.
Contribution
It presents a new Python package, $\texttt{qp}$, for effective photo-$z$ PDF compression and evaluates various storage formats using realistic mock datasets.
Findings
Quantiles and samples outperform step functions in PDF approximation.
Best practices depend on the properties of the PDFs and the fidelity metrics used.
The approach improves storage efficiency while maintaining scientific accuracy.
Abstract
Modern galaxy surveys produce redshift probability density functions (PDFs) in addition to traditional photometric redshift (photo-$z$) point estimates. However, the storage of photo-$z$ PDFs may present a challenge with increasingly large catalogs, as we face a trade-off between the accuracy of subsequent science measurements and the limitation of finite storage resources. This paper presents $\texttt{qp}$, a Python package for manipulating parametrizations of 1-dimensional PDFs, as suitable for photo-$z$ PDF compression. We use $\texttt{qp}$ to investigate the performance of three simple PDF storage formats (quantiles, samples, and step functions) as a function of the number of stored parameters on two realistic mock datasets, representative of upcoming surveys with different data qualities. We propose some best practices for choosing a photo-$z$ PDF approximation scheme and…
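
To make the three storage formats named in the abstract concrete, here is a minimal sketch that reduces a single mock PDF to N stored numbers in each format. It does not use $\texttt{qp}$ or reproduce the paper's procedure. The two-component Gaussian mixture, the redshift range [0, 3], the evenly spaced quantile levels, and the function names are all assumptions made for illustration.

```python
import random
from statistics import NormalDist

# Hypothetical bimodal photo-z PDF: a two-component Gaussian mixture.
# The weights, means, and widths are illustrative, not from the paper.
WEIGHTS = (0.7, 0.3)
COMPONENTS = (NormalDist(0.5, 0.05), NormalDist(1.2, 0.1))


def cdf(z):
    """Cumulative distribution function of the mock mixture PDF."""
    return sum(w * c.cdf(z) for w, c in zip(WEIGHTS, COMPONENTS))


def quantiles(n, lo=0.0, hi=3.0, tol=1e-9):
    """Redshifts at n evenly spaced CDF levels k/(n+1), found by bisection."""
    out = []
    for k in range(1, n + 1):
        target = k / (n + 1)
        a, b = lo, hi
        while b - a > tol:
            mid = 0.5 * (a + b)
            if cdf(mid) < target:
                a = mid
            else:
                b = mid
        out.append(0.5 * (a + b))
    return out


def samples(n, seed=0):
    """n random draws from the mixture: pick a component, then draw from it."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        comp = rng.choices(COMPONENTS, weights=WEIGHTS)[0]
        out.append(rng.gauss(comp.mean, comp.stdev))
    return out


def step_function(n, lo=0.0, hi=3.0):
    """Mean density in n equal-width bins on [lo, hi] (piecewise-constant PDF)."""
    width = (hi - lo) / n
    edges = [lo + i * width for i in range(n + 1)]
    return [(cdf(edges[i + 1]) - cdf(edges[i])) / width for i in range(n)]
```

Calling `quantiles(10)`, `samples(10)`, and `step_function(10)` each yields ten numbers per galaxy, so the formats can be compared at equal storage cost. In this sketch the step function spends most of its bins on redshift ranges where the density is close to zero, while quantiles and samples concentrate where the probability mass lies.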
