Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs

Daniel Louzada Fernandes; Marcos Henrique Fonseca Ribeiro; Fabio Ribeiro Cerqueira; Michel Melo Silva

arXiv:2202.05331·cs.CV·February 17, 2022

TL;DR

This paper combines dense captioning, a set of domain filters, and a language model for abstractive summarization to generate more interpretable and relevant descriptions of webinar images for visually impaired users.

Contribution

It introduces a domain-specific image captioning method tailored for webinars, improving description relevance for visually impaired users.

Findings

1. Generated descriptions are more interpretable and focused on relevant information.

2. The approach outperforms traditional captioning methods in the target domain.

3. The enhanced descriptions help visually impaired users better understand webinar images.

Abstract

Several services for people with visual disabilities have emerged recently due to achievements in Assistive Technologies and Artificial Intelligence areas. Despite the growth in assistive systems availability, there is a lack of services that support specific tasks, such as understanding the image context presented in online content, e.g., webinars. Image captioning techniques and their variants are limited as Assistive Technologies as they do not match the needs of visually impaired people when generating specific descriptions. We propose an approach for generating context of webinar images combining a dense captioning technique with a set of filters, to fit the captions in our domain, and a language model for the abstractive summary task. The results demonstrated that we can produce descriptions with higher interpretability and focused on the relevant information for that group of…
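
The pipeline named in the abstract (dense captions, then filters, then an abstractive summary) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `RegionCaption` type, the confidence threshold, the duplicate filter, and the `join_sentences` placeholder are all assumptions, and the abstract does not say which filters or which language model the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RegionCaption:
    """One dense-captioning output: a phrase describing an image region."""
    text: str
    confidence: float  # hypothetical score in [0, 1]


def filter_captions(captions: List[RegionCaption],
                    min_confidence: float = 0.5) -> List[str]:
    """Keep confident captions and drop case-insensitive duplicates.

    These two filters are illustrative assumptions, not the paper's filter set.
    """
    seen = set()
    kept = []
    # Visit the most confident captions first so duplicates keep the best one.
    for cap in sorted(captions, key=lambda c: c.confidence, reverse=True):
        key = " ".join(cap.text.lower().split())
        if cap.confidence < min_confidence or key in seen:
            continue
        seen.add(key)
        kept.append(cap.text.strip())
    return kept


def join_sentences(sentences: List[str]) -> str:
    """Placeholder for the summarizer: it only joins the captions.

    A real system would pass the filtered captions to a language model
    trained for abstractive summarization.
    """
    return " ".join(s.rstrip(".") + "." for s in sentences)


def describe_image(captions: List[RegionCaption],
                   summarize: Callable[[List[str]], str] = join_sentences,
                   min_confidence: float = 0.5) -> str:
    """Run the filters, then hand the surviving captions to the summarizer."""
    return summarize(filter_captions(captions, min_confidence))
```

Passing the summarizer as a parameter keeps the filtering stage testable on its own and lets any summarization model be swapped in for the placeholder.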
