Interpretable Models in ANNs

Yang Li

arXiv:2011.12424·cs.LG·November 26, 2020

Open Access

TL;DR

This paper explores methods to extract human-readable equations from neural networks, aiming to improve interpretability when the underlying patterns are describable by simple mathematical expressions.

Contribution

The paper proposes an approach to interpreting neural networks by deriving explicit equations that approximate the model's behavior.

Findings

01 Successful extraction of readable equations from neural networks

02 Enhanced interpretability of complex models in physics-related problems

03 Potential for simplifying neural network explanations in real-world applications

Abstract

Artificial neural networks are often too complex and too deep for a human to understand, so they are usually referred to as black boxes. For many real-world problems, the underlying pattern is itself so complicated that no analytic solution exists. In some cases, however, such as the laws of physics, the pattern can be described by relatively simple mathematical expressions. In those cases, we want a readable equation rather than a black box. In this paper, we try to find a way to explain a network and extract a human-readable equation that describes the model.
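To make the idea of "extracting a readable equation" concrete, here is a minimal toy sketch. It is not the paper's method, which the abstract does not specify. It assumes a hypothetical power law y = 3 * a^2 / b as the hidden pattern and a model that is just one linear layer on log-transformed inputs, fitted by least squares rather than gradient descent. All names (`a`, `b`, `to_equation`) are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy data: y = 3 * a^2 / b stands in for a simple physical law.
rng = np.random.default_rng(0)
a = rng.uniform(1.0, 5.0, size=200)
b = rng.uniform(1.0, 5.0, size=200)
y = 3.0 * a**2 / b

# One linear layer on log-inputs: log y = w_a * log a + w_b * log b + bias.
# Solved in closed form with least squares (an assumption for brevity).
X = np.column_stack([np.log(a), np.log(b), np.ones_like(a)])
w_a, w_b, bias = np.linalg.lstsq(X, np.log(y), rcond=None)[0]

def to_equation(w_a, w_b, bias, digits=2):
    """Read the fitted weights back as a power-law expression."""
    c = np.exp(bias)  # the bias becomes the multiplicative constant
    return f"y = {c:.{digits}f} * a^{w_a:.{digits}f} * b^{w_b:.{digits}f}"

equation = to_equation(w_a, w_b, bias)
print(equation)
```

In this restricted setting the weights are the exponents and the exponentiated bias is the constant, so the fitted model can be read directly as an equation. Deeper networks with general nonlinearities do not admit such a direct reading, which is the harder problem the abstract describes.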

Taxonomy

Topics: Neural Networks and Applications · Explainable Artificial Intelligence (XAI) · Time Series Analysis and Forecasting