Conjecturing-Based Discovery of Patterns in Data

J.P. Brooks; D.J. Edwards; C.E. Larson; N. Van Cleemput

arXiv:2011.11576 · cs.LG · July 18, 2023 · 1 citation

TL;DR

This paper introduces a conjecturing machine framework that discovers feature relationships in data, including nonlinear bounds and boolean expressions, and applies it to COVID-19 patient data to identify risk factors.

Contribution

The paper presents a conjecturing framework that recovers known feature relationships and suggests new ones. It also compares the framework with a previously proposed symbolic regression framework on the ability to recover equations that hold among features in a dataset.

Findings

01 Recovered known nonlinear and boolean relationships among features.

02 Suggested potential COVID-19 risk factors that are confirmed in the medical literature.

03 Applied the conjecturing approach to real-world, patient-level data on COVID-19 outcomes.

Abstract

We propose the use of a conjecturing machine that suggests feature relationships in the form of bounds involving nonlinear terms for numerical features and boolean expressions for categorical features. The proposed Conjecturing framework recovers known nonlinear and boolean relationships among features from data. In both settings, true underlying relationships are revealed. We then compare the method to a previously-proposed framework for symbolic regression on the ability to recover equations that are satisfied among features in a dataset. The framework is then applied to patient-level data regarding COVID-19 outcomes to suggest possible risk factors that are confirmed in the medical literature.
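
To make the idea of a conjectured bound concrete, here is a minimal brute-force sketch. It is not the authors' algorithm or the code in their repository. It assumes a toy dataset, a tiny hand-picked expression grammar, and hypothetical helper names (`candidate_expressions`, `conjecture_upper_bounds`). It keeps every candidate expression that upper-bounds a target feature on all rows and ranks the survivors by how many rows they hold with equality on.

```python
import itertools

# Toy data (an assumption for illustration): "area" equals width * height by construction.
rows = [
    {"width": 2.0, "height": 3.0, "area": 6.0},
    {"width": 1.0, "height": 4.0, "area": 4.0},
    {"width": 5.0, "height": 2.0, "area": 10.0},
]

def candidate_expressions(features):
    """Yield (label, function) pairs from a small, hand-picked expression grammar."""
    for f in features:
        yield f, (lambda r, f=f: r[f])
        yield f"{f}^2", (lambda r, f=f: r[f] ** 2)
    for a, b in itertools.combinations(features, 2):
        yield f"{a} + {b}", (lambda r, a=a, b=b: r[a] + r[b])
        yield f"{a} * {b}", (lambda r, a=a, b=b: r[a] * r[b])

def conjecture_upper_bounds(rows, target, features, tol=1e-9):
    """Return (label, tight_count) for each expression e with target <= e on every row."""
    kept = []
    for label, expr in candidate_expressions(features):
        values = [expr(r) for r in rows]
        # Keep the candidate only if the bound holds on every observation.
        if all(r[target] <= v + tol for r, v in zip(rows, values)):
            # Count the rows where the bound holds with equality.
            tight = sum(abs(r[target] - v) <= tol for r, v in zip(rows, values))
            kept.append((label, tight))
    # Bounds that are tight on more rows come first.
    return sorted(kept, key=lambda item: -item[1])

bounds = conjecture_upper_bounds(rows, "area", ["width", "height"])
for label, tight in bounds:
    print(f"area <= {label}  (tight on {tight} of {len(rows)} rows)")
```

On this toy data only `width * height` survives as an upper bound, and it is tight on every row, which is how a bound can point to an underlying equation. A practical system would need a richer grammar and a principled way to choose among the many bounds that hold.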

Code & Models

Repositories

nvcleemp/conjecturing (official)

Taxonomy

Topics: Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification