Semantic distillation: a method for clustering objects by their contextual specificity
Thomas Sierocinski (IRMAR), Anthony Le Béchec, Nathalie Théret, Dimitri Petritis (IRMAR)

TL;DR
This paper introduces a novel fuzzy hierarchical clustering method called semantic distillation, inspired by quantum measurement theory, to analyze and cluster experimental data, for example grouping genes by their specificity in DNA array data.
Contribution
It unifies concepts from information retrieval (IR), statistical data analysis, and quantum measurement theory to develop a new clustering technique for complex experimental data.
Findings
Successfully applied to DNA array data
Effectively clusters genes by specificity
Demonstrates cross-disciplinary approach
Abstract
Techniques for data mining, latent semantic analysis, contextual search of databases, etc. were developed long ago by computer scientists working on information retrieval (IR). Experimental scientists from all disciplines, who must analyse large collections of raw experimental data (astronomical, physical, biological, etc.), have developed powerful methods for their statistical analysis and for clustering, categorising, and classifying objects. Finally, physicists have developed a theory of quantum measurement, unifying the logical, algebraic, and probabilistic aspects of queries into a single formalism. The purpose of this paper is twofold: first, to show that, when formulated at an abstract level, problems from IR, from statistical data analysis, and from physical measurement theories are very similar and hence can profitably be cross-fertilised; and, secondly, to propose a novel…
Taxonomy
Topics: DNA and Biological Computing · Gene expression and cancer classification · Fractal and DNA sequence analysis
