milearn: A Python Package for Multi-Instance Machine Learning
Dmitry Zankov, Pavlo Polishchuk, Michal Sobieraj, Mario Barbatti

TL;DR
milearn is a versatile Python package that simplifies multi-instance learning by integrating classical and neural methods with hyperparameter tuning, demonstrated across diverse synthetic datasets.
Contribution
The paper introduces a unified, scikit-learn-compatible framework for MIL algorithms, with hyperparameter optimization tailored to small datasets and dedicated support for key instance detection.
Findings
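Demonstrated on synthetic MIL benchmarks covering digit classification and regression, molecular property prediction, and protein-protein interaction prediction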
Supports both classical and neural network models
Includes hyperparameter optimization for small datasets
Abstract
We introduce milearn, a Python package for multi-instance learning (MIL) that follows the familiar scikit-learn fit/predict interface while providing a unified framework for both classical and neural-network-based MIL algorithms for regression and classification. The package also includes built-in hyperparameter optimization designed specifically for small MIL datasets, enabling robust model selection in data-scarce scenarios. We demonstrate the versatility of milearn across a broad range of synthetic MIL benchmark datasets, including digit classification and regression, molecular property prediction, and protein-protein interaction (PPI) prediction. Special emphasis is placed on the key instance detection (KID) problem, for which the package provides dedicated support.
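To make the setting concrete, the following is a minimal sketch of the multi-instance data layout and of a fit/predict-style bag classifier. It is not milearn's API: the class MeanPoolBagClassifier, the function key_instance_index, and the toy data are all hypothetical names and values chosen for illustration, and only the Python standard library is used. A bag is a list of instance feature vectors, and only the bag carries a label. The key instance heuristic shown here (pick the instance farthest from the negative-class centroid) is a crude stand-in for the KID methods the abstract refers to.

```python
from statistics import fmean


class MeanPoolBagClassifier:
    """Hypothetical baseline (not part of milearn): mean-pool each bag into
    one vector, then classify by the nearest class centroid."""

    def fit(self, bags, labels):
        pooled = [self._pool(bag) for bag in bags]
        self.centroids_ = {}
        for label in sorted(set(labels)):
            members = [p for p, y in zip(pooled, labels) if y == label]
            # Centroid = per-feature mean of the pooled bags of this class.
            self.centroids_[label] = [fmean(col) for col in zip(*members)]
        return self

    def predict(self, bags):
        return [self._nearest(self._pool(bag)) for bag in bags]

    @staticmethod
    def _pool(bag):
        # Per-feature mean over all instances in the bag.
        return [fmean(col) for col in zip(*bag)]

    def _nearest(self, vec):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(vec, self.centroids_[label]))
        return min(self.centroids_, key=sq_dist)


def key_instance_index(model, bag, negative_label=0):
    """Crude KID heuristic (an assumption, not milearn's method): return the
    index of the instance farthest from the negative-class centroid."""
    center = model.centroids_[negative_label]

    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(bag[i], center))

    return max(range(len(bag)), key=sq_dist)


# Toy data: negative bags hold only instances near (0, 0); positive bags
# additionally hold one "key" instance near (5, 5).
train_bags = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.0], [0.0, 0.2], [0.1, 0.1]],
    [[0.0, 0.1], [5.0, 5.1]],
    [[0.2, 0.0], [4.9, 5.0], [0.1, 0.1]],
]
train_labels = [0, 0, 1, 1]

test_bags = [
    [[0.1, 0.1], [5.0, 4.9]],
    [[0.0, 0.0], [0.1, 0.2]],
]

model = MeanPoolBagClassifier().fit(train_bags, train_labels)
predictions = model.predict(test_bags)
key_idx = key_instance_index(model, test_bags[0])
print(predictions, key_idx)
```

Mean pooling dilutes a single key instance as bags grow, which is one reason dedicated MIL algorithms and attention-style pooling exist; the sketch only shows the shape of the problem.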
Taxonomy
Topics: Image Retrieval and Classification Techniques · Machine Learning and Data Classification · Machine Learning in Bioinformatics
