On the String Kernel Pre-Image Problem with Applications in Drug Discovery
Sébastien Giguère, Amélie Rolland, François Laviolette, and Mario Marchand

TL;DR
This paper introduces a new low-complexity upper bound for the string kernel pre-image problem, enabling efficient search algorithms with applications in drug discovery, particularly in identifying druggable peptides.
Contribution
It develops a novel upper bound for the string kernel pre-image problem and demonstrates its effectiveness in a branch and bound algorithm for drug discovery.
Findings
The upper bound is used successfully in a branch and bound search for druggable peptides
The upper bound has low computational complexity
The upper bound is valid for many string kernels
Abstract
The pre-image problem has to be solved during inference by most structured output predictors. For string kernels, this problem corresponds to finding the string associated to a given input. An algorithm capable of solving or finding good approximations to this problem would have many applications in computational biology and other fields. This work uses a recent result on combinatorial optimization of linear predictors based on string kernels to develop, for the pre-image, a low complexity upper bound valid for many string kernels. This upper bound is used with success in a branch and bound searching algorithm. Applications and results in the discovery of druggable peptides are presented and discussed.
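To make the branch and bound idea concrete, here is a minimal sketch of best-first search over fixed-length strings. It is not the bound developed in the paper: it assumes a simplified setting in which the predictor is a plain weighted sum over k-mers (a spectrum-kernel-style linear score, with no normalization term), and it uses a deliberately crude upper bound, namely the score of the prefix built so far plus the largest k-mer weight for every k-mer still to be placed. The names `kmer_score`, `branch_and_bound`, and the toy weights are hypothetical and chosen only for illustration.

```python
import heapq


def kmer_score(s, weights, k):
    """Sum the weights of all k-mers in s; k-mers not in `weights` count as 0."""
    return sum(weights.get(s[i:i + k], 0.0) for i in range(len(s) - k + 1))


def branch_and_bound(weights, alphabet, k, n):
    """Return (best_string, best_score) over all strings of length n >= k.

    Assumes `weights` is a non-empty dict mapping k-mers to real weights.
    """
    # Largest gain any single k-mer can contribute (unlisted k-mers give 0).
    w_max = max(max(weights.values()), 0.0)
    total_kmers = n - k + 1

    # Max-heap via negated bounds; each entry is (-upper_bound, prefix).
    heap = [(-w_max * total_kmers, "")]
    while heap:
        neg_bound, prefix = heapq.heappop(heap)
        if len(prefix) == n:
            # For a complete string the bound equals its exact score, and no
            # other node has a larger bound, so this string is optimal.
            return prefix, -neg_bound
        for letter in alphabet:
            child = prefix + letter
            placed = kmer_score(child, weights, k)
            remaining = total_kmers - max(len(child) - k + 1, 0)
            heapq.heappush(heap, (-(placed + w_max * remaining), child))
    return None


# Toy example (illustrative weights, not taken from the paper).
toy_weights = {"AB": 1.0, "BA": 0.5, "AA": -1.0}
best_string, best_score = branch_and_bound(toy_weights, "AB", k=2, n=4)
print(best_string, best_score)
```

Because the bound never underestimates the best completion of a prefix, the first complete string popped from the heap is guaranteed to be a maximizer. A tighter bound, such as the one the paper proposes, would prune more of the search tree and matter far more for a 20-letter amino acid alphabet than for this two-letter toy.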
Taxonomy
Topics: Machine Learning in Bioinformatics · Advanced biosensing and bioanalysis techniques · Machine Learning and Algorithms
