Contextual Gating within the Transformer Stack: Synergistic Feature Modulation for Enhanced Lyrical Classification and Calibration
M.A. Gameiro

TL;DR
This paper presents the SFL Transformer, a novel model that integrates structural features into the self-attention mechanism of a Transformer to improve lyrical classification accuracy and calibration.
Contribution
It introduces a Contextual Gating mechanism that modulates deep semantic features with structural cues within the Transformer stack, enhancing performance and reliability.
Findings
Achieved an accuracy of 0.9910 and a macro F1 score of 0.9910, surpassing the previously published SFL model.
Maintained a low Expected Calibration Error (ECE = 0.0081) and a low log loss (0.0489).
Validated that mid-stack feature integration improves discriminative power.
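The two calibration figures above can be made concrete with a small sketch. This summary does not say how ECE was binned, so the code below assumes the common equal-width binning over prediction confidence with 10 bins. The function names and the bin count are illustrative choices, not the paper's evaluation code.

```python
import math

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (assumed scheme, not confirmed by the paper).

    confidences: predicted probability of the chosen class, one per example
    correct:     1 if the prediction was right, else 0
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls into the last bin
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(o for _, o in members) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of examples
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece

def binary_log_loss(pos_probs, labels, eps=1e-15):
    """Mean negative log-likelihood for binary labels (0 or 1)."""
    total = 0.0
    for p, y in zip(pos_probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)
```

A model is well calibrated when its confidence matches its hit rate, so an ECE near zero means that predictions made with, say, 90% confidence are right about 90% of the time.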
Abstract
This study introduces a significant architectural advancement in feature fusion for lyrical content classification by integrating auxiliary structural features directly into the self-attention mechanism of a pre-trained Transformer. I propose the SFL Transformer, a novel deep learning model that uses a Contextual Gating mechanism (an Intermediate SFL) to modulate the sequence of hidden states within the BERT encoder stack, rather than fusing features at the final output layer. This approach modulates the deep, contextualized semantic features (H_seq) using low-dimensional structural cues (F_struct). The model is applied to a challenging binary classification task derived from UMAP-reduced lyrical embeddings. The SFL Transformer achieved an Accuracy of 0.9910 and a Macro F1 score of 0.9910, significantly improving on the state of the art established by the previously published SFL model…
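The idea of modulating H_seq with F_struct can be illustrated with a minimal sketch. The abstract as shown here does not give the gating equation, so the code assumes one plausible form: a sigmoid gate computed from the structural features and multiplied element-wise into every token's hidden vector. The names `contextual_gate`, `w_gate`, and `b_gate` are hypothetical, and plain Python lists stand in for tensors to keep the example self-contained.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def contextual_gate(h_seq, f_struct, w_gate, b_gate):
    """Scale each token's hidden vector by a gate derived from structural features.

    h_seq:    T token vectors of length d (hidden states H_seq from a mid-stack layer)
    f_struct: k structural features (F_struct)
    w_gate:   d x k weight matrix (hypothetical learned parameters)
    b_gate:   d biases (hypothetical learned parameters)
    """
    # One gate value in (0, 1) per hidden dimension:
    # gate[j] = sigmoid(sum_i w_gate[j][i] * f_struct[i] + b_gate[j])
    gate = [
        sigmoid(sum(w * f for w, f in zip(row, f_struct)) + b)
        for row, b in zip(w_gate, b_gate)
    ]
    # The same gate is applied at every token position (element-wise product)
    return [[h * g for h, g in zip(token, gate)] for token in h_seq]

# Usage: two tokens, hidden size 2, three structural features.
h_seq = [[2.0, -4.0], [1.0, 0.0]]
f_struct = [0.3, 0.7, 0.1]
w_gate = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # zero weights give a neutral gate of 0.5
b_gate = [0.0, 0.0]
gated = contextual_gate(h_seq, f_struct, w_gate, b_gate)
```

In a design like the one described, the gated states would then be passed to the remaining encoder layers, which is what distinguishes mid-stack modulation from fusing features at the final output layer.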
Taxonomy
Topics: Topic Modeling · Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
