Searching for internal symbols underlying deep learning
Jung H. Lee, Sujith Vijayan

TL;DR
This paper investigates whether deep neural networks develop internal abstract codes that can enhance decision-making reliability and safety, by combining segmentation models and unsupervised learning to uncover these internal symbols.
Contribution
It introduces a novel approach to extract and analyze internal abstract codes in DNNs, extending prior work on understanding high-level features in deep learning.
Findings
Internal codes can be extracted by combining foundation segmentation models with unsupervised learning
Abstract internal codes potentially contribute to more reliable and safer decision-making
The study reports evidence consistent with internal symbolic representations in DNNs
Abstract
Deep learning (DL) enables deep neural networks (DNNs) to automatically learn complex tasks or rules from given examples without instructions or guiding principles. Because we do not engineer DNNs' functions, their decisions are extremely difficult to diagnose, and multiple lines of research have sought to explain the principles of their operation. Notably, one line of research suggests that DNNs may learn concepts, the high-level features that are recognizable to humans. In this study, we extend this line of research and hypothesize that DNNs can develop abstract codes that can be used to augment DNNs' decision-making. To test this hypothesis, we combine foundation segmentation models and unsupervised learning to extract internal codes and identify potential uses of abstract codes to make DL's decision-making more reliable and safer.
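The abstract does not say how segmentation and unsupervised learning are combined, so the following is only an illustrative sketch of one plausible reading, not the authors' method. It assumes that a segmentation model (not called here) has already produced a binary mask per segment, that a hidden-layer feature map is available as an H x W x C grid, and that masked average pooling followed by k-means is an acceptable stand-in for the unsupervised step. The names masked_average and kmeans are hypothetical helpers written for this example.

```python
import random


def masked_average(feature_map, mask):
    """Average the feature vectors at positions where mask is nonzero.

    feature_map: H x W x C nested lists (assumed hidden-layer activations).
    mask: H x W nested lists of 0/1 (assumed output of a segmentation model).
    """
    channels = len(feature_map[0][0])
    total = [0.0] * channels
    count = 0
    for row_feats, row_mask in zip(feature_map, mask):
        for vec, m in zip(row_feats, row_mask):
            if m:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no positions")
    return [t / count for t in total]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means; returns (labels, centroids).

    Each label is the index of the nearest centroid and plays the role of a
    discrete "code" for the corresponding segment in this sketch.
    """
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In use, one would pool a feature vector for every segment of every image with masked_average, pass the collected vectors to kmeans, and treat the resulting labels as candidate codes to inspect. The number of clusters k and the choice of layer are free parameters that the abstract does not specify.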
Taxonomy
Topics: Software Engineering Research · Machine Learning and Data Classification · Topic Modeling
