Contextual Regression: An Accurate and Conveniently Interpretable Nonlinear Model for Mining Discovery from Scientific Data
Chengyu Liu, Wei Wang

TL;DR
Contextual regression is a hybrid model that combines a neural network embedding with a dot product layer, achieving both high accuracy and interpretability on nonlinear scientific data and yielding new biological insights.
Contribution
The paper introduces a hybrid neural network architecture that is both accurate and interpretable on nonlinear data, filling a gap in scientific data mining.
Findings
Achieved high-fidelity recovery of feature contributions on simulated data under random noise levels up to 200%.
Outperformed the state-of-the-art method in accuracy when predicting open chromatin sites in the human genome.
Uncovered two new histone marks related to open chromatin.
Abstract
Machine learning algorithms such as linear regression, SVM, and neural networks have played an increasingly important role in the process of scientific discovery. However, none of them is both interpretable and accurate on nonlinear datasets. Here we present contextual regression, a method that combines these two desirable properties using a hybrid architecture of a neural network embedding and a dot product layer. We demonstrate its high prediction accuracy and sensitivity through the task of predictive feature selection on a simulated dataset and the application of predicting open chromatin sites in the human genome. On the simulated data, our method achieved high-fidelity recovery of feature contributions under random noise levels up to 200%. On the open chromatin dataset, the application of our method not only outperformed the state-of-the-art method in terms of accuracy, but also…
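The hybrid architecture described in the abstract can be illustrated with a minimal NumPy sketch. It assumes that the embedding network maps each input to a per-sample weight vector, and that the dot product layer combines those weights with the same input features. The layer sizes, the tanh activation, the untrained random parameters, and the names `init_params`, `context_weights`, and `predict` are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def init_params(n_features, n_hidden, seed=0):
    """Random, untrained parameters for a one-hidden-layer embedding network (assumed shape)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_features, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_features)),
        "b2": np.zeros(n_features),
    }


def context_weights(params, X):
    """Embedding network: map each sample to one weight per input feature."""
    hidden = np.tanh(X @ params["W1"] + params["b1"])
    return hidden @ params["W2"] + params["b2"]


def predict(params, X):
    """Dot product layer: prediction is the per-sample dot product of weights and features."""
    weights = context_weights(params, X)
    contributions = weights * X  # per-feature contribution to each prediction
    return contributions.sum(axis=1), contributions


# Toy usage on random data (4 samples, 5 features).
params = init_params(n_features=5, n_hidden=8)
X = np.random.default_rng(1).normal(size=(4, 5))
y_hat, contributions = predict(params, X)
```

Under these assumptions, each prediction is locally a linear model, so `contributions[i, j]` can be read as the share of feature `j` in the prediction for sample `i`. This is one plausible reading of how a design of this kind stays interpretable while the weights themselves depend nonlinearly on the input.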
Taxonomy
Topics: Machine Learning and Data Classification · Gene expression and cancer classification · Machine Learning in Bioinformatics
Methods: Support Vector Machine
