
TL;DR
This paper explores methods to extract human-readable equations from neural networks, aiming to improve interpretability when the underlying patterns are describable by simple mathematical expressions.
Contribution
It proposes a novel approach to interpret neural networks by deriving explicit equations that approximate the model's behavior.
Findings
Successful extraction of readable equations from neural networks
Enhanced interpretability of complex models in physics-related problems
Potential for simplifying neural network explanations in real-world applications
Abstract
Artificial neural networks are often too complex and too deep for a human to understand, which is why they are usually referred to as black boxes. For many real-world problems, the underlying pattern is itself so complicated that no analytic solution exists. In some cases, however, such as the laws of physics, the pattern can be described by relatively simple mathematical expressions. In those cases, we want a readable equation rather than a black box. In this paper, we look for a way to explain a network and extract a human-readable equation that describes the model.
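To make the goal concrete, here is a minimal sketch of one simple way to read an equation off a black-box model. It is not the paper's method, which the abstract does not specify. It assumes the model has learned a power law, queries it on sampled inputs, and fits the exponents by least squares in log space. The function `black_box`, the kinetic-energy target, and the symbols `m`, `v`, and `E` are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np


def black_box(m, v):
    """Hypothetical stand-in for a trained network's forward pass.

    Returns kinetic energy 0.5 * m * v**2 with a small multiplicative
    error that mimics an imperfectly trained model (an assumption).
    """
    return 0.5 * m * v**2 * (1.0 + 1e-3 * np.sin(7.0 * m + 3.0 * v))


def fit_power_law(model, n_samples=200, seed=0):
    """Fit y ~ c * m**a * v**b to the model's outputs."""
    rng = np.random.default_rng(seed)
    # Sample strictly positive inputs so the logarithms are defined.
    m = rng.uniform(0.5, 2.0, n_samples)
    v = rng.uniform(0.5, 2.0, n_samples)
    y = model(m, v)

    # log y = log c + a * log m + b * log v is linear in (log c, a, b),
    # so ordinary least squares recovers the constant and the exponents.
    design = np.column_stack([np.ones(n_samples), np.log(m), np.log(v)])
    coeffs, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    log_c, a, b = coeffs
    return float(np.exp(log_c)), float(a), float(b)


c, a, b = fit_power_law(black_box)
print(f"E ~ {c:.2f} * m^{a:.2f} * v^{b:.2f}")
```

The log-space fit only works when the model's outputs are positive and the underlying law is a single monomial. Richer expressions would need a different search, such as symbolic regression over a library of candidate terms.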
Taxonomy
Topics: Neural Networks and Applications · Explainable Artificial Intelligence (XAI) · Time Series Analysis and Forecasting
