Understanding attention-based encoder-decoder networks: a case study with chess scoresheet recognition

Sergio Y. Hayashi; Nina S. T. Hirata

arXiv:2406.06538·cs.CV·June 12, 2024

TL;DR

This paper investigates how attention-based encoder-decoder neural networks learn to read handwritten chess scoresheets, focusing on understanding the learning process rather than just prediction accuracy.

Contribution

It characterizes the learning process of such networks by analyzing subtask interactions and factors affecting training, providing insights into their internal mechanisms.

Findings

1. Identifies three key subtasks: input-output alignment, sequential pattern recognition, and handwriting recognition.

2. Reveals competition, collaboration, and dependence relations among the subtasks.

3. Provides guidance on balancing factors for effective training.

Abstract

Deep neural networks are widely used for complex prediction tasks. There is plenty of empirical evidence of their successful end-to-end training for a diversity of tasks. Success is often measured based solely on the final performance of the trained network, and explanations of when, why and how they work are less emphasized. In this paper we study encoder-decoder recurrent neural networks with attention mechanisms for the task of reading handwritten chess scoresheets. Rather than prediction performance, our concern is to better understand how learning occurs in this type of network. We characterize the task in terms of three subtasks, namely input-output alignment, sequential pattern recognition, and handwriting recognition, and experimentally investigate which factors affect their learning. We identify competition, collaboration and dependence relations between the subtasks, and…
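For readers unfamiliar with the input-output alignment subtask named in the abstract, the following is a minimal sketch of how an attention mechanism produces a soft alignment between one decoding step and the encoder outputs. It is an illustration only, not the authors' model: the dot-product scoring function, the function names `softmax` and `attend`, and the toy vectors are all assumptions chosen for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Return (weights, context) for a single decoding step.

    Scores are plain dot products (an assumption for illustration;
    the paper's scoring function may differ).
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc))
        for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical encoder features, e.g. one vector per image column.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [0.0, 2.0]

weights, context = attend(decoder_state, encoder_states)
# The input position receiving the largest weight is the one the
# decoder is "aligned" with at this step.
aligned_position = max(range(len(weights)), key=weights.__getitem__)
```

In this toy setting the weights form a probability distribution over input positions, so learning the alignment amounts to learning to concentrate that distribution on the part of the input relevant to the current output symbol.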
