l_bnl_compress: Lossy but not lossy compression, a python script to compress MX data
Herbert J Bernstein, Jean Jakoncic

TL;DR
This paper introduces a Python script for compressing macromolecular crystallographic data using lossy and lossless methods, with examples and future plans for AI integration.
Contribution
The novel contribution is a Python-based implementation of lossy compression techniques for NeXus/HDF5 MX data, with open-source code and examples.
Findings
The script uses JPEG-2000 and HCompress for lossy compression of crystallographic data.
Examples demonstrate compression on datasets including lysozyme and endonuclease.
Future plans include parallel processing and AI supervision for adaptive compression.
Abstract
l_bnl_compress.py is a Python script implementing the lossy compressions described in [1] for macromolecular crystallographic (MX) diffraction data. The lossy compressions use pixel-by-pixel binning, image-by-image summing, JPEG-2000 Daubechies (DB) wavelet compression from the movie industry, and HCompress Haar (now known as DB0) wavelet compression from astronomy, which are combined with the usual lossless MX compressions [2]. The original version of the necessary software was based on the CBF [3] representation of the data and written in C and bash. However, much of the current pool of macromolecular crystallographic data, especially for Dectris Eiger detectors, is written in NeXus/HDF5 NXmx format [4]. The new version of the software, named l_bnl_compress.py, is a Python script using h5py [5] with the astropy module to provide access to HCompress [6] lossy compression and the glymur…
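The two simplest lossy steps named in the abstract, pixel-by-pixel binning and image-by-image summing, can be illustrated with a minimal sketch. This is not the code from l_bnl_compress.py: the function names `bin_pixels` and `sum_frames` and their parameters are hypothetical, frames are plain nested lists rather than the HDF5 datasets the script reads, and the sentinel values that real detector images use for masked or saturated pixels are not handled.

```python
# Illustrative sketch only; names and parameters are assumptions,
# not the l_bnl_compress.py interface.

def bin_pixels(image, factor=2):
    """Sum each factor x factor block of pixels into one output pixel."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be multiples of the bin factor")
    return [
        [
            # Add up every pixel in the block whose top-left corner is (r, c).
            sum(image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def sum_frames(frames, group=2):
    """Add consecutive groups of `group` frames pixel by pixel.

    A trailing group with fewer than `group` frames is summed as-is.
    """
    summed = []
    for start in range(0, len(frames), group):
        chunk = frames[start:start + group]
        # zip(*chunk) pairs up matching rows; zip(*rows) pairs up matching pixels.
        summed.append([[sum(px) for px in zip(*rows)] for rows in zip(*chunk)])
    return summed


if __name__ == "__main__":
    frame = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
    print(bin_pixels(frame, 2))
    print(len(sum_frames([frame, frame, frame], 2)))
```

Both operations preserve the total photon count while reducing the number of stored values: binning by a factor of 2 quarters the pixels per image, and summing pairs of frames halves the number of images. The spatial or rotational resolution given up is what makes them lossy.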
