Editorial: Information theory meets deep neural networks: theory and applications

Anguo Zhang; Qichun Zhang; Kai Zhao

PMC · DOI: 10.3389/fnins.2024.1448517 · July 15, 2024

Open Access

Keywords

artificial neural networks, information theory, information bottleneck, deep learning - artificial intelligence, deep neural networks (DNNs)

We are delighted to introduce this Research Topic, titled “Information Theory Meets Deep Neural Networks: Theory and Applications”. Deep neural networks (DNNs) have become a focal point in machine learning research, achieving impressive results across various tasks. However, understanding their inner workings remains challenging (Samek et al., 2021; Gawlikowski et al., 2023). Information theory, a mathematical framework for representing and analyzing information, has been widely applied to study fundamental characteristics of data, such as structure and distribution. In the context of DNNs, information theory has been instrumental in explaining and optimizing their performance (Zhang and Li, 2019; Zhang et al., 2022, 2023). For instance, information bottleneck theory has shed light on the abstract representations learned by neural networks, while entropy and mutual information have been used to evaluate model complexity and generalization performance (Wu et al., 2023). This Research Topic explores the intersection of information theory and DNNs, an intersection that has profoundly shaped how neural networks are understood, improved, and applied. The synergy between these disciplines offers promising avenues for developing more efficient, robust, and interpretable AI systems. In this Research Topic, we present four papers that illustrate the breadth and depth of research at this intersection, highlighting innovative methodologies and their applications in various domains.
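For readers less familiar with the quantities named above, the following is a minimal sketch of how entropy, mutual information, and the standard information bottleneck Lagrangian, I(X;T) - beta * I(T;Y), can be computed for small discrete distributions. The function names, the toy joint distributions, and the value of beta are illustrative assumptions; none of them is taken from the contributed papers, which use their own variants.

```python
import math

def entropy(p):
    """Shannon entropy H(X) in bits of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal over rows (X)
    py = [sum(col) for col in zip(*joint)]    # marginal over columns (Y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:                       # 0 * log(0) is treated as 0
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

def ib_objective(i_xt, i_ty, beta):
    """Standard information bottleneck Lagrangian (to be minimized):
    compress X into T (small I(X;T)) while keeping T informative
    about Y (large I(T;Y)). beta sets the trade-off."""
    return i_xt - beta * i_ty

# Toy joint distributions (hypothetical values for illustration only).
independent = [[0.25, 0.25], [0.25, 0.25]]   # X and Y share no information
copy = [[0.5, 0.0], [0.0, 0.5]]              # Y is an exact copy of X

print(entropy([0.5, 0.5]))
print(mutual_information(independent))
print(mutual_information(copy))
print(ib_objective(i_xt=1.0, i_ty=1.0, beta=2.0))
```

In this toy setting a fair binary variable carries one bit of entropy, independent variables share zero bits, and a perfect copy shares the full one bit. In real DNNs these quantities must be estimated from samples, which is considerably harder than this table-based calculation suggests.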

You and Wang proposed a novel approach to genealogy layout recognition. Recognizing the significance of genealogies in cultural heritage, the authors introduced a sublinear information bottleneck (SIB) for feature extraction and a two-stage deep learning model combining SIB-ResNet and SIB-YOLOv5. This method surpassed existing state-of-the-art techniques, offering promising results in identifying and localizing components in genealogy images. This advancement aids not only genealogy research but also the preservation of cultural heritage through improved recognition technologies.

Li and Peng addressed the challenges of synthetic aperture radar (SAR) automatic target recognition (ATR). The study introduced a data augmentation technique that mitigates SAR image noise and a weighted ResNet with residual strain control. This approach not only enhances computational efficiency but also improves recognition accuracy, significantly reducing training time and data requirements. The experimental results demonstrated the superior performance of this method, paving the way for more efficient SAR ATR systems.

Alazeb et al. shifted the focus to robotic environments and scene classification. The paper presented a robust framework for multi-object detection and scene understanding, leveraging advanced visual sensor technologies and deep learning models. By integrating preprocessing, semantic segmentation, feature extraction, and object recognition, the proposed system achieved remarkable accuracy on standard datasets such as PASCAL VOC-12, Cityscapes, and Caltech 101. This work represents a significant step forward in enhancing the capabilities of vision-based systems in various applications, from autonomous driving to augmented reality.

Finally, Chen et al. delved into the theoretical aspects of neural network training. The authors proposed a novel method for adaptive learning rate estimation in restricted Boltzmann machines (RBMs) using rectified linear units (ReLUs). By providing mathematical expressions for calculating the adaptive learning step, this approach optimizes the learning rate dynamically, improving generalization ability and reducing the loss function more effectively than traditional methods. This theoretical contribution offers valuable insights into the optimization of unsupervised learning algorithms.

In conclusion, this Research Topic showcases innovative research at the crossroads of information theory and deep neural networks. The contributions presented here not only advance theoretical understanding but also demonstrate practical applications that hold the potential to transform various fields. We extend our gratitude to the authors for their exceptional work and to the reviewers for their rigorous evaluation. We hope this Research Topic inspires further research and collaboration in this exciting domain.

Author contributions

AZ: Resources, Writing – original draft. QZ: Writing – review & editing. KZ: Writing – review & editing.

References

Gawlikowski J., Tassi C. R. N., Ali M., Lee J., Humt M., Feng J., et al. (2023). A survey of uncertainty in deep neural networks. Artif. Intellig. Rev. 56, 1513–1589. doi: 10.1007/s10462-023-10562-9
Samek W., Montavon G., Lapuschkin S., Anders C. J., Müller K.-R. (2021). Explaining deep neural networks and beyond: A review of methods and applications. Proc. IEEE 109, 247–278. doi: 10.1109/JPROC.2021.3060483
Wu J., Huang Y., Gao M., Gao Z., Zhao J., Shi J., et al. (2023). Exponential information bottleneck theory against intra-attribute variations for pedestrian attribute recognition. IEEE Trans. Inform. Forens. Secur. 18, 5623–5635. doi: 10.1109/TIFS.2023.3311584
Zhang A., Li X., Gao Y., Niu Y. (2022). Event-driven intrinsic plasticity for spiking convolutional neural networks. IEEE Trans. Neural Netw. Learn. Syst. 33, 1986–1995. doi: 10.1109/TNNLS.2021.3084955. PMID: 34106868
Zhang A., Shi J., Wu J., Zhou Y., Yu W. (2023). Low latency and sparse computing spiking neural networks with self-driven adaptive threshold plasticity. IEEE Trans. Neural Netw. Learn. Syst. 1–12. doi: 10.1109/TNNLS.2023.3300514. PMID: 37581976
Zhang W., Li P. (2019). Information-theoretic intrinsic plasticity for online unsupervised learning in spiking neural networks. Front. Neurosci. 13:31. doi: 10.3389/fnins.2019.00031. PMID: 30804736. PMCID: PMC6371195