Describing image focused in cognitive and visual details for visually impaired people: An approach to generating inclusive paragraphs
Daniel Louzada Fernandes, Marcos Henrique Fonseca Ribeiro, Fabio Ribeiro Cerqueira, Michel Melo Silva

TL;DR
This paper presents a novel approach combining dense captioning, filtering, and language models to generate more interpretable and relevant image descriptions for visually impaired users in online content.
Contribution
It introduces a domain-specific image captioning method tailored for webinars, improving description relevance for visually impaired users.
Findings
Generated descriptions are more interpretable and focused on relevant information.
The approach outperforms traditional captioning methods in the target domain.
Enhanced assistive descriptions aid better understanding of webinar images.
Abstract
Several services for people with visual disabilities have emerged recently due to achievements in Assistive Technologies and Artificial Intelligence areas. Despite the growth in assistive systems availability, there is a lack of services that support specific tasks, such as understanding the image context presented in online content, e.g., webinars. Image captioning techniques and their variants are limited as Assistive Technologies as they do not match the needs of visually impaired people when generating specific descriptions. We propose an approach for generating context of webinar images combining a dense captioning technique with a set of filters, to fit the captions in our domain, and a language model for the abstractive summary task. The results demonstrated that we can produce descriptions with higher interpretability and focused on the relevant information for that group of…
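The pipeline the abstract outlines (dense captions, then filters, then a summarizing language model) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `RegionCaption` fields, the three filter criteria (confidence, region area, near-duplicate wording), the threshold values, and the `join_summarizer` stand-in, which simply concatenates sentences where a real system would call an abstractive language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RegionCaption:
    """One dense-captioning output (hypothetical structure)."""
    text: str     # caption for a single image region
    score: float  # captioner confidence in [0, 1]
    area: float   # fraction of the image covered by the region


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two captions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def filter_captions(
    captions: List[RegionCaption],
    min_score: float = 0.5,       # assumed threshold
    min_area: float = 0.01,       # assumed threshold
    max_similarity: float = 0.6,  # assumed threshold
) -> List[RegionCaption]:
    """Drop low-confidence, tiny-region, and near-duplicate captions."""
    kept: List[RegionCaption] = []
    # Visit the most confident captions first so duplicates keep the best one.
    for cap in sorted(captions, key=lambda c: c.score, reverse=True):
        if cap.score < min_score or cap.area < min_area:
            continue
        if any(jaccard(cap.text, k.text) > max_similarity for k in kept):
            continue
        kept.append(cap)
    return kept


def join_summarizer(sentences: List[str]) -> str:
    """Placeholder for the language model: joins captions into one paragraph."""
    return " ".join(s.strip().capitalize().rstrip(".") + "." for s in sentences)


def describe(
    captions: List[RegionCaption],
    summarize: Callable[[List[str]], str] = join_summarizer,
) -> str:
    """Filter the dense captions, then hand the survivors to the summarizer."""
    return summarize([c.text for c in filter_captions(captions)])


captions = [
    RegionCaption("a man wearing glasses", 0.9, 0.2),
    RegionCaption("man wearing glasses", 0.8, 0.2),         # near-duplicate
    RegionCaption("a white wall", 0.3, 0.5),                # low confidence
    RegionCaption("a bookshelf behind the man", 0.7, 0.3),
]
paragraph = describe(captions)
```

Passing the summarizer as a callable keeps the filtering stage independent of whichever language model is used, so `join_summarizer` could be swapped for a real abstractive model without touching the filters.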
