Advancing TDFN: Precise Fixation Point Generation Using Reconstruction Differences
Shuguang Wang, Yuanjing Wang

TL;DR
This paper improves the TDFN model by using image reconstruction differences to guide fixation point generation, leading to more accurate fixations and better classification performance with fewer fixations.
Contribution
It introduces a training method for the fixation point generator based on reconstruction differences, improving the accuracy and efficiency of TDFN.
Findings
Achieves highly accurate fixation points
Significantly improves classification accuracy
Reduces the number of fixations needed
Abstract
Wang and Wang (2025) proposed the Task-Driven Fixation Network (TDFN) based on the fixation mechanism, which leverages low-resolution information along with high-resolution details near fixation points to accomplish specific visual tasks. The model employs reinforcement learning to generate fixation points. However, training reinforcement learning models is challenging, particularly when aiming to generate pixel-level accurate fixation points on high-resolution images. This paper introduces an improved fixation point generation method by leveraging the difference between the reconstructed image and the input image to train the fixation point generator. This approach directs fixation points to areas with significant differences between the reconstructed and input images. Experimental results demonstrate that this method achieves highly accurate fixation points, significantly enhances the…
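The idea of steering fixations toward regions where the reconstruction differs most from the input can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how the difference is turned into a training signal, so the function name `fixation_target`, the box-filter smoothing, the `window` parameter, and the use of an argmax are all assumptions made for illustration. The sketch only shows how a candidate target location could be derived from a reconstruction-difference map.

```python
import numpy as np

def fixation_target(image, reconstruction, window=5):
    """Hypothetical helper: return (row, col) of the largest local reconstruction error.

    Assumption: the target is the centre of the window with the highest summed
    absolute difference. The paper may define its training target differently.
    """
    if image.shape != reconstruction.shape:
        raise ValueError("image and reconstruction must have the same shape")
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    # Per-pixel absolute difference; average over channels if present.
    diff = np.abs(image.astype(np.float64) - reconstruction.astype(np.float64))
    if diff.ndim == 3:
        diff = diff.mean(axis=2)

    # Box-filter the difference map with an integral image so that a single
    # noisy pixel does not dominate the choice of target.
    pad = window // 2
    padded = np.pad(diff, pad, mode="constant")
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = diff.shape
    sums = (integral[window:window + h, window:window + w]
            - integral[:h, window:window + w]
            - integral[window:window + h, :w]
            + integral[:h, :w])

    # Location of the window with the largest summed difference.
    row, col = np.unravel_index(np.argmax(sums), sums.shape)
    return int(row), int(col)
```

In a training loop, a location like this could serve as a supervised regression target for the fixation point generator in place of a reinforcement learning reward, though whether the paper uses the difference map in exactly this way is not stated in the abstract.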
Taxonomy
Topics: Handwritten Text Recognition Techniques
