SparseChem: Fast and accurate machine learning model for small molecules
Adam Arany, Jaak Simm, Martijn Oldenhof, Yves Moreau

TL;DR
SparseChem is a software package that enables fast, accurate machine learning on high-dimensional sparse biochemical data, supporting various models and accessible via command line and Python.
Contribution
It introduces a versatile, efficient library for machine learning on large-scale sparse biochemical datasets, with easy integration and broad functionality.
Findings
Supports millions of features and compounds
Allows training of classification, regression, and censored regression models
Accessible via command line and Python
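To make the scale claim concrete, here is a minimal sketch of how a very wide, mostly empty feature matrix can be held in memory. It uses SciPy's CSR format and is not taken from SparseChem itself; the helper name `build_sparse_features`, the toy fingerprint indices, and the feature count of one million are illustrative assumptions.

```python
import numpy as np
from scipy import sparse

def build_sparse_features(active_bits, n_features):
    """Build a CSR matrix from per-compound lists of active feature indices.

    Hypothetical helper for illustration; not part of SparseChem.
    """
    counts = [len(bits) for bits in active_bits]
    # Row index of every nonzero entry: compound i repeated once per active bit.
    rows = np.repeat(np.arange(len(active_bits)), counts)
    cols = np.concatenate([np.asarray(bits, dtype=np.int64) for bits in active_bits])
    data = np.ones(len(cols), dtype=np.float32)
    return sparse.csr_matrix((data, (rows, cols)), shape=(len(active_bits), n_features))

# Three toy compounds described by the indices of their active fingerprint bits.
active_bits = [[3, 17, 999_999], [17, 42], [0, 3, 500_000, 999_999]]
X = build_sparse_features(active_bits, n_features=1_000_000)

# Only the nonzero entries are stored, so memory grows with the number of
# active bits rather than with compounds x features.
print(X.shape, X.nnz)
```

A dense float32 array of the same shape would need about 12 MB for these three rows alone, while the sparse matrix stores only nine values plus their indices.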
Abstract
SparseChem provides fast and accurate machine learning models for biochemical applications. In particular, the package supports very high-dimensional sparse inputs, e.g., millions of features and millions of compounds. Classification, regression, and censored regression models, or combinations of them, can be trained from the command line. Additionally, the library can be accessed directly from Python. Source code and documentation are freely available under the MIT License on GitHub.
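Censored regression may be less familiar than the other two model types: some measurements are only known as bounds (for example, "activity is at least 6"). The sketch below shows one common way to handle this, a one-sided squared error, written in NumPy. It is an illustration under stated assumptions, not SparseChem's implementation; the function name `censored_squared_error` and the censor encoding (0 for exact values, +1 when the true value is at or above the target, -1 when it is at or below) are assumptions made for this example.

```python
import numpy as np

def censored_squared_error(pred, target, censor):
    """Per-sample squared error that respects censored targets.

    Assumed encoding (illustrative only):
      censor ==  0: target is an exact measurement
      censor == +1: true value is >= target (penalize only pred < target)
      censor == -1: true value is <= target (penalize only pred > target)
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    censor = np.asarray(censor)
    diff = target - pred
    # For censored samples, keep the error only when the prediction is on
    # the wrong side of the bound; otherwise the error is zero.
    one_sided = np.maximum(censor * diff, 0.0)
    return np.where(censor == 0, diff, one_sided) ** 2

pred = [5.0, 5.0, 5.0, 5.0]
target = [6.0, 6.0, 4.0, 4.0]
censor = [0, 1, 1, -1]
losses = censored_squared_error(pred, target, censor)
print(losses)  # [1. 1. 0. 1.]
```

The third sample incurs no loss because a prediction of 5 is consistent with "true value at least 4", whereas the fourth is penalized because 5 violates "true value at most 4".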
Taxonomy
Topics: Computational Drug Discovery Methods · Machine Learning in Materials Science · Analytical Chemistry and Chromatography
