Feature Gradient Flow for Interpreting Deep Neural Networks in Head and Neck Cancer Prediction
Yinzhu Jin, Jonathan C. Garneau, P. Thomas Fletcher

TL;DR
This paper proposes feature gradient flow, a novel interpretability technique for deep neural networks, and demonstrates its effectiveness in understanding models predicting head and neck cancer metastasis from CT images.
Contribution
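The paper introduces feature gradient flow, which measures how well human-interpretable features agree with a model's gradient flow, and a regularization term that trains networks so their gradients align with those of chosen interpretable features.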
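To make the idea of "agreement with the model's gradients" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the score used here (mean absolute cosine between the model's input gradient and a feature's input gradient) is an assumption chosen for illustration, and `toy_model`, `region_mean`, and `make_noise_feature` are hypothetical stand-ins for a trained network, an interpretable feature, and a baseline noise feature.

```python
import math
import random

def numerical_gradient(fn, x, eps=1e-5):
    """Central-difference gradient of a scalar function fn at point x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((fn(hi) - fn(lo)) / (2 * eps))
    return grad

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def alignment_score(model, feature, points):
    """Assumed score: mean |cos| between model and feature gradients."""
    scores = [
        abs(cosine(numerical_gradient(model, x), numerical_gradient(feature, x)))
        for x in points
    ]
    return sum(scores) / len(scores)

# Hypothetical "model": its output depends only on the first four inputs.
def toy_model(x):
    return math.tanh(sum(x[:4]))

# Hypothetical interpretable feature: mean of the same four inputs.
def region_mean(x):
    return sum(x[:4]) / 4

# Hypothetical noise baseline: a random linear function of the input.
def make_noise_feature(rng, dim):
    weights = [rng.gauss(0, 1) for _ in range(dim)]
    return lambda x: sum(w * xi for w, xi in zip(weights, x))

rng = random.Random(0)
points = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(20)]

feature_score = alignment_score(toy_model, region_mean, points)
noise_scores = [
    alignment_score(toy_model, make_noise_feature(rng, 8), points)
    for _ in range(20)
]
noise_score = sum(noise_scores) / len(noise_scores)

print("interpretable feature:", feature_score)
print("noise baseline:", noise_score)
```

In this toy setup the interpretable feature scores close to 1 because its gradient points in the same direction as the model's gradient, while the noise features score lower on average. Comparing the two numbers mirrors the paper's idea of judging a feature against a noise baseline.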
Findings
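Comparing a feature's gradient flow measure against a baseline noise feature indicates how important that feature is to the model.
Regularization improves model interpretability without sacrificing accuracy.
The method is applied to a convolutional neural network that predicts distant metastasis of head and neck cancer from CT images.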
Abstract
This paper introduces feature gradient flow, a new technique for interpreting deep learning models in terms of features that are understandable to humans. The gradient flow of a model locally defines nonlinear coordinates in the input data space representing the information the model is using to make its decisions. Our idea is to measure the agreement of interpretable features with the gradient flow of a model. To evaluate the importance of a particular feature to the model, we compare that feature's gradient flow measure with that of a baseline noise feature. We then develop a technique for training neural networks to be more interpretable by adding a regularization term to the loss function that encourages the model gradients to align with those of chosen interpretable features. We test our method in a convolutional neural network prediction of distant metastasis of head and…
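The regularization idea in the abstract can be sketched as a task loss plus a penalty that shrinks as the model's input gradients align with the gradients of a chosen feature. The code below is an illustrative assumption, not the paper's formulation: the penalty `1 - mean cos^2`, the weight `lam`, the linear logistic model, and the toy data are all hypothetical choices made so the example stays small and checkable.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def task_loss(weights, inputs, labels):
    """Mean binary cross-entropy of a logistic model with logit = w . x."""
    total = 0.0
    for x, y in zip(inputs, labels):
        logit = sum(w * xi for w, xi in zip(weights, x))
        # -log(sigmoid(logit)) for y == 1, -log(1 - sigmoid(logit)) for y == 0
        total += math.log1p(math.exp(-logit if y == 1 else logit))
    return total / len(inputs)

def alignment_penalty(weights, feature_grads):
    """Assumed penalty: 1 - mean cos^2(model gradient, feature gradient).

    For this linear model the input gradient of the logit equals `weights`
    at every data point, so no automatic differentiation is needed.
    """
    squared = [cosine(weights, g) ** 2 for g in feature_grads]
    return 1.0 - sum(squared) / len(squared)

def regularized_loss(weights, inputs, labels, feature_grads, lam):
    """Task loss plus lam times the alignment penalty."""
    return task_loss(weights, inputs, labels) + lam * alignment_penalty(
        weights, feature_grads
    )

# Toy data: the label depends on the sign of the first coordinate.
inputs = [[1.0, 0.2], [-1.0, -0.1], [0.8, -0.3], [-0.7, 0.4]]
labels = [1, 0, 1, 0]

# Hypothetical interpretable feature whose gradient points along x0.
feature_grads = [[1.0, 0.0] for _ in inputs]

aligned = [2.0, 0.0]     # gradient parallel to the feature gradient
misaligned = [2.0, 2.0]  # gradient 45 degrees away from it

for name, w in [("aligned", aligned), ("misaligned", misaligned)]:
    print(
        name,
        "penalty:", alignment_penalty(w, feature_grads),
        "total:", regularized_loss(w, inputs, labels, feature_grads, lam=1.0),
    )
```

In this sketch the aligned weights incur no penalty, while the misaligned weights pay half of `lam`, so a training procedure minimizing the total would be nudged toward models whose gradients follow the chosen feature. For a deep network the input gradients would come from automatic differentiation rather than a closed form.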
Taxonomy
Methods: ALIGN
