Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers
Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire, Mario Sznaier, Octavia Camps

TL;DR
This paper introduces JPDVT, a diffusion transformer-based method that effectively solves complex image and video jigsaw puzzles, including those with missing pieces, by generating and utilizing content-conditioned positional information.
Contribution
The paper presents a novel diffusion transformer approach for jigsaw puzzle solving that outperforms existing discriminative models, especially with many puzzle elements and missing pieces.
Findings
Achieves state-of-the-art results on multiple datasets.
Effective in puzzles with a large number of elements.
Handles missing puzzle pieces successfully.
Abstract
Solving image and video jigsaw puzzles poses the challenging task of rearranging image fragments or video frames from unordered sequences to restore meaningful images and video sequences. Existing approaches often hinge on discriminative models tasked with predicting either the absolute positions of puzzle elements or the permutation actions applied to the original data. Unfortunately, these methods face limitations in effectively solving puzzles with a large number of elements. In this paper, we propose JPDVT, an innovative approach that harnesses diffusion transformers to address this challenge. Specifically, we generate positional information for image patches or video frames, conditioned on their underlying visual content. This information is then employed to accurately assemble the puzzle pieces in their correct positions, even in scenarios involving missing pieces. Our method…
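The final assembly step described in the abstract, placing each piece according to generated positional information, can be illustrated with a small sketch. This is not the authors' implementation: the diffusion model is not shown, the function name `assemble` and its inputs are hypothetical, and the greedy nearest-cell rule is an assumption chosen for brevity. It takes one predicted (row, column) estimate per available piece and snaps each piece to the closest free grid cell, leaving cells empty when pieces are missing.

```python
from itertools import product


def assemble(pred_rc, grid_h, grid_w):
    """Greedily snap pieces to the free grid cells nearest their predicted positions.

    pred_rc: list of (row, col) float estimates, one per available piece.
             Hypothetical stand-in for a model's positional output.
    Returns a grid_h x grid_w nested list of piece indices; None marks an
    empty cell (for example, where a piece is missing).
    """
    cells = list(product(range(grid_h), range(grid_w)))

    # Every (squared distance, piece index, cell) triple, closest first.
    pairs = sorted(
        ((pr - r) ** 2 + (pc - c) ** 2, p, (r, c))
        for p, (pr, pc) in enumerate(pred_rc)
        for (r, c) in cells
    )

    board = [[None] * grid_w for _ in range(grid_h)]
    placed, taken = set(), set()
    for _, p, cell in pairs:
        # Skip pairs whose piece is already placed or whose cell is occupied.
        if p in placed or cell in taken:
            continue
        board[cell[0]][cell[1]] = p
        placed.add(p)
        taken.add(cell)
    return board


# A 2x2 puzzle with one missing piece: three noisy position estimates.
board = assemble([(0.9, 1.1), (0.1, -0.1), (1.2, 0.0)], grid_h=2, grid_w=2)
print(board)  # [[1, None], [2, 0]]
```

A greedy rule keeps the sketch short; an optimal one-to-one assignment (for example, the Hungarian algorithm) would be a natural alternative when predictions are noisy enough that pieces compete for the same cell.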
Taxonomy
Topics: Image Processing and 3D Reconstruction · Archaeological Research and Protection · Forensic Anthropology and Bioarchaeology Studies
Methods: Jigsaw · Diffusion
