pyLEMMINGS: Large Margin Multiple Instance Classification and Ranking for Bioinformatics Applications
Amina Asif, Wajid Arshad Abbasi, Farzeen Munir, Asa Ben-Hur, and Fayyaz ul Amir Afsar Minhas

TL;DR
pyLEMMINGS introduces efficient large margin algorithms for multiple instance classification and ranking, significantly improving accuracy and computational speed in bioinformatics applications.
Contribution
The paper presents large margin algorithms for multiple instance classification and ranking based on stochastic sub-gradient optimization, and provides them as a scalable and accurate software suite called pyLEMMINGS.
Findings
Successfully identified protein functional segments such as binding sites and amyloid cores.
Achieved state-of-the-art performance on bioinformatics and benchmark datasets.
Over 100-fold faster than heuristic methods with improved accuracy.
Abstract
Motivation: A major challenge in the development of machine learning based methods in computational biology is that data may not be accurately labeled due to the time and resources required for experimentally annotating properties of proteins and DNA sequences. Standard supervised learning algorithms assume accurate instance-level labeling of training data. Multiple instance learning is a paradigm for handling such labeling ambiguities. However, the widely used large-margin classification methods for multiple instance learning are heuristic in nature with high computational requirements. In this paper, we present stochastic sub-gradient optimization large margin algorithms for multiple instance classification and ranking, and provide them in a software suite called pyLEMMINGS. Results: We have tested pyLEMMINGS on a number of bioinformatics problems as well as benchmark datasets.…
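To make the idea of a stochastic sub-gradient large margin classifier for bags more concrete, here is a minimal sketch in Python. It is not the pyLEMMINGS API or the authors' exact formulation. It assumes that a bag is scored by its highest-scoring instance, uses a Pegasos-style step size proportional to 1/(lam * t), and starts from a simple mean-difference heuristic. The function names (`bag_score`, `train_mil_sgd`, `predict`), the parameters, and the toy data are all hypothetical.

```python
import numpy as np


def bag_score(w, bag):
    """Score a bag by its highest-scoring instance; return (score, index)."""
    scores = bag @ w
    i = int(np.argmax(scores))
    return float(scores[i]), i


def train_mil_sgd(bags, labels, lam=0.1, n_iters=500, seed=0):
    """Pegasos-style stochastic sub-gradient training on bag-level hinge loss.

    Assumption (not taken from the paper): a bag's score is the maximum
    instance score, and the sub-gradient uses that maximising instance.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=float)
    # Heuristic start (an assumption): mean instance of the positive bags
    # minus mean instance of the negative bags.
    pos = np.vstack([b for b, y in zip(bags, labels) if y > 0])
    neg = np.vstack([b for b, y in zip(bags, labels) if y < 0])
    w = pos.mean(axis=0) - neg.mean(axis=0)
    for t in range(1, n_iters + 1):
        k = int(rng.integers(len(bags)))      # sample one bag at random
        bag, y = bags[k], labels[k]
        score, i = bag_score(w, bag)          # uses w before the update
        # Step size offset by one so the first step does not erase the start.
        eta = 1.0 / (lam * (t + 1))
        w = (1.0 - eta * lam) * w             # shrinkage from the L2 regulariser
        if y * score < 1.0:                   # bag-level margin is violated
            w = w + eta * y * bag[i]          # step along the hinge sub-gradient
    return w


def predict(w, bags):
    """Label each bag +1 or -1 by the sign of its maximum instance score."""
    return np.array([1 if bag_score(w, b)[0] > 0 else -1 for b in bags])


# Toy usage: each positive bag hides one positive instance among negatives.
bags = [
    np.array([[-2.0], [2.0], [-1.0]]),
    np.array([[-1.0], [-3.0], [2.0]]),
    np.array([[-1.0], [-2.0]]),
    np.array([[-3.0], [-1.0], [-2.0]]),
]
labels = [1, 1, -1, -1]
w = train_mil_sgd(bags, labels)
print(predict(w, bags))
```

Each update touches a single bag, so the cost per step depends on the bag size rather than on the size of the whole training set. The instance returned by `bag_score` can also be read as the candidate segment responsible for a positive bag label.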
Taxonomy
Topics: Gene expression and cancer classification · Machine Learning and Data Classification · Image Retrieval and Classification Techniques
