Multiclass classification utilising an estimated algorithmic probability prior
Kamaludin Dingle, Pau Batlle, Houman Owhadi

TL;DR
This paper demonstrates how approximations of algorithmic probability can improve multiclass classification accuracy, especially with limited training data, by estimating class probabilities from shape complexities in the RNA molecule sequence-to-shape map.
Contribution
It introduces a novel approach of using algorithmic probability as a prior in machine learning, applied to RNA shape classification, showing benefits in small data regimes.
Findings
Prior improves classification accuracy with limited data
Algorithmic probability estimates aid in real-world ML tasks
Method outperforms baseline in small sample scenarios
Abstract
Methods of pattern recognition and machine learning are applied extensively in science, technology, and society. Hence, any advances in related theory may translate into large-scale impact. Here we explore how algorithmic information theory, especially algorithmic probability, may aid in a machine learning task. We study a multiclass supervised classification problem, namely learning the RNA molecule sequence-to-shape map, where the different possible shapes are taken to be the classes. The primary motivation for this work is a proof of concept example, where a concrete, well-motivated machine learning task can be aided by approximations to algorithmic probability. Our approach is based on directly estimating the class (i.e., shape) probabilities from shape complexities, and using the estimated probabilities as a prior in a Gaussian process learning problem. Naturally, with a large…
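The idea of turning shape complexities into class priors can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a zlib compressed length as a crude stand-in for the complexity of a shape, assumes the prior takes the form P(shape) ∝ 2^(-scale · K(shape)), and combines the prior with per-class log-likelihoods through plain Bayes' rule rather than a Gaussian process. The function names, the `scale` parameter, and the dot-bracket strings are all hypothetical.

```python
import math
import zlib


def complexity_bits(shape: str) -> int:
    """Crude complexity proxy (assumption): zlib-compressed length in bits."""
    return 8 * len(zlib.compress(shape.encode("ascii"), 9))


def complexity_log_prior(shapes, scale=1.0):
    """Log of a prior proportional to 2^(-scale * K(shape)), normalised over shapes."""
    log_w = {s: -scale * complexity_bits(s) * math.log(2) for s in shapes}
    # Log-sum-exp for a numerically stable normalising constant.
    m = max(log_w.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in log_w.values()))
    return {s: v - log_z for s, v in log_w.items()}


def map_class(log_likelihood, log_prior):
    """Pick the class maximising log-likelihood + log-prior (MAP estimate)."""
    return max(log_prior, key=lambda s: log_likelihood[s] + log_prior[s])


# Hypothetical dot-bracket shapes used purely for illustration.
shapes = ["((((....))))", "(.(..).)(..)", "............"]
log_prior = complexity_log_prior(shapes)

# With no evidence (flat likelihood) the prior alone decides the class.
flat = {s: 0.0 for s in shapes}
print(map_class(flat, log_prior))
```

With little or no data the simplest shape under the proxy wins; as the likelihood term grows with more training data, it dominates the prior, which matches the small-data emphasis of the summary above.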
Taxonomy
Topics: Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms · Machine Learning and Data Classification
Methods: Gaussian Process
