EnTri: Ensemble Learning with Tri-level Representations for Explainable Scene Recognition
Amirhossein Aminimehr, Amirali Molaei, Erik Cambria

TL;DR
EnTri is an ensemble learning framework for scene recognition that combines visual features at three levels of detail to improve classification accuracy and provide interpretable explanations of its predictions.
Contribution
The paper introduces a tri-level feature representation and an extension algorithm that generates visual and textual explanations, improving both accuracy and interpretability in scene recognition.
Findings
Achieves state-of-the-art accuracy on benchmark datasets
Provides visual and textual explanations for scene classification
Demonstrates improved interpretability without sacrificing accuracy
Abstract
Scene recognition based on deep learning has made significant progress, but its performance is still limited by inter-class similarities and intra-class dissimilarities. Furthermore, prior research has focused primarily on improving classification accuracy and has given less attention to interpretable, precise scene classification. We therefore propose EnTri, an ensemble scene recognition framework that applies ensemble learning to a hierarchy of visual features. EnTri represents features at three distinct levels of detail: pixel level, semantic segmentation level, and object class and frequency level. By incorporating distinct feature encoding schemes of differing complexity and leveraging ensemble strategies, our approach aims to improve classification accuracy while enhancing transparency and interpretability…
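To make the tri-level idea concrete, here is a minimal sketch of two pieces the abstract describes: an object class and frequency encoding, and an ensemble over the three feature levels. The abstract does not say which ensemble strategy or encoding EnTri actually uses, so weighted soft voting and a normalized count vector are assumptions, and every name below (`object_frequency_vector`, `soft_vote`, `predict`) is hypothetical rather than taken from the paper.

```python
from collections import Counter


def object_frequency_vector(detected_labels, vocabulary):
    """Encode detected object labels as normalized frequencies over a fixed vocabulary.

    Assumption: labels outside the vocabulary are ignored.
    """
    counts = Counter(detected_labels)
    total = sum(counts[name] for name in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[name] / total for name in vocabulary]


def soft_vote(level_probabilities, weights=None):
    """Weighted average of the class-probability vectors from each feature level.

    Assumption: soft voting stands in for whatever ensemble strategy the paper uses.
    """
    if weights is None:
        weights = [1.0] * len(level_probabilities)
    weight_sum = sum(weights)
    n_classes = len(level_probabilities[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, level_probabilities)) / weight_sum
        for c in range(n_classes)
    ]


def predict(level_probabilities, class_names, weights=None):
    """Return the class with the highest fused probability, plus the fused vector."""
    fused = soft_vote(level_probabilities, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return class_names[best], fused
```

In this sketch, each level (pixel, semantic segmentation, object frequency) would have its own classifier producing a probability vector over scene classes, and `predict` fuses the three. Keeping the per-level vectors separate until the final step is also what would let one report which level drove a given decision.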
