Cows, Pigs and People: Enhanced Intensity-Based Clustering of Isomorphous Multi-Crystal Datasets in the Presence of Subtle Variations
Amy J Thompson, James Beilsten-Edmands, Cicely Tam, Juan Sanchez-Weatherby, James Sandy, Halina Mikolajek, Danny Axford, Sofia Jaho, Michael A Hough, Graeme Winter

TL;DR
This paper introduces improved methods for clustering isomorphous multi-crystal datasets, enabling clearer separation of structurally similar crystals with subtle differences.
Contribution
The paper introduces automated clustering methods in DIALS that enable unambiguous separation of isomorphous crystals with subtle structural differences.
Findings
Improved clustering methods successfully separate bovine, porcine, and human insulin crystals with isomorphous lattices.
Weighting of pairwise correlation coefficients and spatial density-based clustering algorithms enhance data separation.
The methods are now integrated into the DIALS framework for high-throughput data analysis.
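The weighting and density-based clustering named in the findings can be illustrated on toy data. This is a minimal sketch, not the DIALS implementation: the shrinkage weight `n / (n + shrink)`, the DBSCAN-style routine, every function name, and the synthetic intensities are assumptions made for illustration only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def weighted_distance(a, b, shrink=2.0):
    """Distance 1 - w * cc over the reflections two datasets share.

    The weight w = n / (n + shrink) is a hypothetical choice that
    down-weights correlations estimated from few common reflections.
    """
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 3:
        return 1.0  # too little overlap to trust any correlation
    cc = pearson([a[h] for h in common], [b[h] for h in common])
    return 1.0 - (n / (n + shrink)) * cc


def dbscan(dist, eps, min_samples):
    """DBSCAN-style clustering on a precomputed distance matrix.

    Returns one label per dataset; -1 marks noise.
    """
    n = len(dist)
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbours = [j for j in range(n) if dist[i][j] <= eps]
        if len(neighbours) < min_samples:
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = [k for k in range(n) if dist[j][k] <= eps]
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Synthetic intensities keyed by Miller index (illustrative values only).
hkls = [(h, 0, 0) for h in range(1, 9)]
pattern_a = [10, 50, 20, 80, 30, 60, 15, 40]
pattern_b = [12, 48, 22, 78, 60, 25, 45, 14]
noise = [1, -1, 2, -2, 1, -1, 0, 1]


def make(values):
    return dict(zip(hkls, values))


datasets = [
    make(pattern_a),
    make([1.1 * x for x in pattern_a]),
    make([x + d for x, d in zip(pattern_a, noise)]),
    make(pattern_b),
    make([0.9 * x for x in pattern_b]),
    make([x + d for x, d in zip(pattern_b, noise)]),
]

dist = [[weighted_distance(a, b) for b in datasets] for a in datasets]
labels = dbscan(dist, eps=0.3, min_samples=2)
print(labels)
```

In this toy case the first three datasets share one label and the last three share another. Real data would need `eps` and `min_samples` tuned to the observed spread of correlations.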
Abstract
The high-throughput data collection capabilities of modern X-ray facilities are challenging data processing pipelines to keep pace, while also remaining accessible to non-expert users. Rigorous analysis of multi-crystal data is necessary to sort through the data deluge, and the problem of which datasets to merge becomes an interesting scientific question. Lattice non-isomorphism is a key issue which has been well addressed (and automated) through techniques such as unit cell clustering (Foadi et al., 2013). The effective separation of structurally isomorphous datasets with subtle differences (such as bound ligands, conformational changes or amino acid mutations) is a more challenging problem but has the promise to separate meaningfully different structures from crystal populations. Previous work has used the hierarchical clustering analysis of pairwise correlation coefficients to…
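The hierarchical clustering of pairwise correlation coefficients mentioned in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract does not say which linkage or cut-off earlier work used, so average linkage, the distance 1 - CC, the 0.05 threshold, and the hand-written correlation matrix are all illustrative choices.

```python
def average_linkage(dist, threshold):
    """Agglomerative clustering with average linkage.

    Repeatedly merges the two closest clusters until the smallest
    inter-cluster distance exceeds `threshold`.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical pairwise correlation coefficients for five datasets.
cc = [
    [1.00, 0.98, 0.97, 0.90, 0.91],
    [0.98, 1.00, 0.96, 0.89, 0.90],
    [0.97, 0.96, 1.00, 0.91, 0.92],
    [0.90, 0.89, 0.91, 1.00, 0.97],
    [0.91, 0.90, 0.92, 0.97, 1.00],
]

# Convert similarity to distance so that highly correlated pairs are close.
dist = [[1.0 - value for value in row] for row in cc]

clusters = average_linkage(dist, threshold=0.05)
print(clusters)
```

With these made-up values the datasets split into two groups, indices 0 to 2 and indices 3 to 4. The result depends strongly on the threshold, which is one reason small differences between isomorphous crystals are hard to separate this way.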
