Context-based Deep Learning Architecture with Optimal Integration Layer   for Image Parsing

Ranju Mandal; Basim Azam; and Brijesh Verma

arXiv:2204.06214·cs.CV·April 14, 2022

Context-based Deep Learning Architecture with Optimal Integration Layer for Image Parsing

Ranju Mandal, Basim Azam, and Brijesh Verma

PDF

Open Access

TL;DR

This paper introduces a three-layer deep learning architecture that explicitly integrates visual and contextual information for improved image parsing, utilizing genetic algorithms for optimal fusion of features.

Contribution

It presents a novel three-layer architecture with an optimal fusion layer using genetic algorithms, enhancing the integration of visual and contextual data in image parsing.

Findings

01

Improved accuracy on benchmark datasets

02

Stable predictions with optimized network weights

03

Effective integration of context and visual features

Abstract

Deep learning models have been efficient lately on image parsing tasks. However, deep learning models are not fully capable of exploiting visual and contextual information simultaneously. The proposed three-layer context-based deep architecture is capable of integrating context explicitly with visual information. The novel idea here is to have a visual layer to learn visual characteristics from binary class-based learners, a contextual layer to learn context, and then an integration layer to learn from both via genetic algorithm-based optimal fusion to produce a final decision. The experimental outcomes when evaluated on benchmark datasets are promising. Further analysis shows that optimized network weights can improve performance and make stable predictions.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAdvanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques