Prove Symbolic Regression is NP-hard by Symbol Graph
Jinglu Song, Qiang Lu, Bozhou Tian, Jingwen Zhang, Jake Luo, Zhiguang Wang

TL;DR
This paper shows that symbolic regression (SR) is NP-hard by introducing symbol graphs and linking SR to the degree-constrained Steiner Arborescence problem (DCSAP), which is known to be NP-hard.
Contribution
It introduces symbol graphs as a representation of the whole mathematical expression space and proves the NP-hardness of symbolic regression by connecting it to a known NP-hard problem, the degree-constrained Steiner Arborescence problem (DCSAP).
Findings
Symbol graphs effectively represent the entire expression space.
Symbolic regression (SR) is NP-hard because of its connection to the degree-constrained Steiner Arborescence problem (DCSAP).
The complexity of SR is established through a reduction involving the degree-constrained Steiner Arborescence problem.
Abstract
Symbolic regression (SR) is the task of discovering a symbolic expression that fits a given data set from the space of mathematical expressions. Despite the abundance of research surrounding the SR problem, there's a scarcity of works that confirm its NP-hard nature. Therefore, this paper introduces the concept of a symbol graph as a comprehensive representation of the entire mathematical expression space, effectively illustrating the NP-hard characteristics of the SR problem. Leveraging the symbol graph, we establish a connection between the SR problem and the task of identifying an optimally fitted degree-constrained Steiner Arborescence (DCSAP). The complexity of DCSAP, which is proven to be NP-hard, directly implies the NP-hard nature of the SR problem.
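To make the search problem described in the abstract concrete, here is a minimal sketch of brute-force symbolic regression over a toy expression space. It is not the paper's symbol-graph construction or its proof. The symbol set (`x`, `1`, `+`, `*`), the depth limit, and the names `enumerate_expressions`, `evaluate`, `mse`, and `brute_force_sr` are all assumptions chosen for illustration.

```python
import itertools

# Hypothetical toy symbol set (an assumption, not taken from the paper).
LEAVES = ["x", "1"]
BINARY_OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def enumerate_expressions(depth):
    """Return all distinct expression trees of depth <= `depth`.

    A tree is either a leaf string or a tuple (op, left, right).
    """
    if depth == 0:
        return list(LEAVES)
    smaller = enumerate_expressions(depth - 1)
    combos = [
        (op, left, right)
        for op in BINARY_OPS
        for left, right in itertools.product(smaller, repeat=2)
    ]
    # dict.fromkeys removes duplicates while keeping insertion order.
    return list(dict.fromkeys(smaller + combos))


def evaluate(expr, x):
    """Evaluate an expression tree at the input value x."""
    if expr == "x":
        return x
    if expr == "1":
        return 1.0
    op, left, right = expr
    return BINARY_OPS[op](evaluate(left, x), evaluate(right, x))


def mse(expr, data):
    """Mean squared error of an expression on (x, y) pairs."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)


def brute_force_sr(data, depth):
    """Return an expression of depth <= `depth` with the lowest error."""
    return min(enumerate_expressions(depth), key=lambda e: mse(e, data))


# Toy data set sampled from y = x^2 + 1.
data = [(x, x * x + 1) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
best = brute_force_sr(data, depth=2)
```

In this toy setting the number of candidates follows N(d+1) = 2 + 2·N(d)², so it grows from 2 to 10 to 202 as the depth limit goes from 0 to 2. That growth only illustrates why exhaustive search becomes impractical; it is not itself an NP-hardness argument, which the paper obtains through DCSAP.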
Taxonomy
Topics: Evolutionary Algorithms and Applications · Machine Learning and Data Classification · Face and Expression Recognition
Methods: Sparse Evolutionary Training
