TL;DR
This paper introduces a new shapelet-based formulation for Multiple-Instance Learning that considers all possible shapelets, providing a richer classifier class, with theoretical justification and practical algorithms for large datasets.
Contribution
It formulates MIL over all possible shapelets, shows that the problem reduces through LPBoost to polynomial-size Difference of Convex (DC) programs, and offers heuristics for large datasets, supported by theoretical guarantees.
Findings
The proposed algorithm achieves accuracy comparable to that of existing methods.
Heuristics enable application to large datasets with reasonable computational time.
The theoretical results justify the heuristics used in some previous work.
Abstract
We propose a new formulation of Multiple-Instance Learning (MIL), in which a unit of data consists of a set of instances called a bag. The goal is to find a good classifier of bags based on the similarity with a "shapelet" (or pattern), where the similarity of a bag with a shapelet is the maximum similarity of instances in the bag. In previous work, some of the training instances are chosen as shapelets with no theoretical justification. In our formulation, we use all possible, and thus infinitely many shapelets, resulting in a richer class of classifiers. We show that the formulation is tractable, that is, it can be reduced through Linear Programming Boosting (LPBoost) to Difference of Convex (DC) programs of finite (actually polynomial) size. Our theoretical result also gives justification to the heuristics of some of the previous work. The time complexity of the proposed algorithm…
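To make the bag-level similarity in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the Gaussian similarity, the `gamma` parameter, and the function names `instance_similarity`, `bag_similarity`, and `classify_bag` are all assumptions chosen for illustration, and the finite list of shapelets stands in for the infinite set the formulation considers. Only the rule that a bag's similarity to a shapelet is the maximum over its instances comes from the abstract.

```python
import math


def instance_similarity(x, shapelet, gamma=1.0):
    """Assumed similarity: Gaussian kernel between one instance and a shapelet."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, shapelet))
    return math.exp(-gamma * sq_dist)


def bag_similarity(bag, shapelet, gamma=1.0):
    """Similarity of a bag with a shapelet: the maximum over the bag's instances."""
    return max(instance_similarity(x, shapelet, gamma) for x in bag)


def classify_bag(bag, shapelets, weights, bias=0.0, gamma=1.0):
    """Hypothetical classifier: sign of a weighted sum of bag-shapelet similarities."""
    score = bias + sum(
        w * bag_similarity(bag, s, gamma) for w, s in zip(weights, shapelets)
    )
    return 1 if score >= 0 else -1


# A bag of two 2-dimensional instances; the second instance equals the shapelet.
bag = [[0.0, 0.0], [1.0, 1.0]]
shapelet = [1.0, 1.0]
similarity = bag_similarity(bag, shapelet)
label = classify_bag(bag, [shapelet], weights=[1.0], bias=-0.5)
```

Because the maximum is taken over instances, a single well-matching instance is enough to give the whole bag a high similarity, which is what lets one pattern characterize a bag.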
