Supervised level-wise pretraining for recurrent neural network initialization in multi-class classification
Dino Ienco, Roberto Interdonato, Raffaele Gaetano

TL;DR
This paper introduces a supervised, data-aware layer-wise pretraining method for RNNs that improves initialization and generalization in multi-class classification tasks, demonstrated across speech and remote sensing benchmarks.
Contribution
To the authors' knowledge, it presents the first data-aware RNN initialization strategy, built on a taxonomy of classification problems derived automatically from model behavior.
Findings
Data-aware initialization of RNN parameters improves generalization to unseen data.
The strategy was tested on four benchmarks from two domains: Speech Recognition and Remote Sensing.
Supervised layer-wise pretraining supports more effective training of RNN-based classifiers.
Abstract
Recurrent Neural Networks (RNNs) can be seriously affected by the initial parameter assignment, which may result in poor generalization performance on new, unseen data. To tackle this crucial issue in the context of RNN-based classification, we propose a new supervised layer-wise pretraining strategy to initialize network parameters. The proposed approach leverages a data-aware strategy that sets up a taxonomy of classification problems automatically derived from the model behavior. To the best of our knowledge, despite the great interest in RNN-based classification, this is the first data-aware strategy dealing with the initialization of such models. The proposed strategy has been tested on four benchmarks coming from two different domains, i.e., Speech Recognition and Remote Sensing. Results underline the significance of our approach and point out that data-aware…
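The abstract does not spell out how the taxonomy of classification problems is built, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm. It assumes that "model behavior" is summarized by a confusion matrix from a preliminary model, and that the taxonomy is a coarse-to-fine sequence of label maps in which the hardest classes are separated first. The function names, the error-rate ranking, and the `-1` "rest" label are all hypothetical.

```python
def class_error_rates(confusion):
    """Per-class error rate (1 - recall); rows of `confusion` are true classes."""
    rates = []
    for c, row in enumerate(confusion):
        total = sum(row)
        rates.append(1.0 - row[c] / total if total else 0.0)
    return rates


def build_problem_sequence(confusion):
    """Return label maps ordered from the coarsest to the full problem.

    Assumed scheme (illustrative only): classes are ranked by decreasing
    error rate. At level k the k hardest classes keep their own label and
    every remaining class is merged into a single "rest" label (-1).
    The last level is the original multi-class problem.
    """
    rates = class_error_rates(confusion)
    order = sorted(range(len(rates)), key=lambda c: -rates[c])
    levels = []
    for k in range(1, len(order) + 1):
        kept = set(order[:k])
        levels.append({c: (c if c in kept else -1) for c in range(len(order))})
    return levels


if __name__ == "__main__":
    # Toy 3-class confusion matrix; class 1 is the most confused.
    toy = [[9, 1, 0],
           [2, 5, 3],
           [0, 2, 8]]
    for level, mapping in enumerate(build_problem_sequence(toy), start=1):
        print(level, mapping)
```

Under this reading, each label map would relabel the training targets for one pretraining stage, and the weights learned at one stage would initialize the next, ending with the full multi-class problem. How the stages map onto network layers is not stated in the abstract and is left out here.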
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
