Domain Generalization through Spatial Relation Induction over Visual Primitives

Dat Nguyen; Duc-Duy Nguyen

arXiv:2605.06043·cs.CV·May 8, 2026

TL;DR

This paper introduces PARSE, a novel domain generalization framework that explicitly models visual primitives and their spatial relations, leading to improved classification robustness across domains.

Contribution

PARSE explicitly factors visual recognition into primitives and their relations, enabling end-to-end learning of structural compositions for better domain generalization.

Findings

01  PARSE improves accuracy by over 4.5 percentage points on CUB-DG.

02  PARSE remains competitive with existing methods on DomainBed.

03  The approach models spatial relations with differentiable predicates.

Abstract

Domain generalization requires identifying stable representations that support reliable classification across domains. Most existing methods seek such stability through improving the training process, for example, through model selection strategies, data augmentation, or feature-alignment objectives. Although these strategies can be effective, they leave the representation learning of structural composition implicit, which may limit performance on compositional domain generalization benchmarks. In this work, we propose Primitive-Aware Relational Structure for domain gEneralization (PARSE), an image classification framework that factors visual recognition into visual primitives and their relational composition. We represent these compositions using soft binary, ternary, and quaternary predicates over primitive locations, yielding differentiable measures of spatial alignment that can be…
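To make the idea of soft predicates over primitive locations concrete, here is a minimal sketch of one binary and one ternary predicate. It is an illustration only and is not taken from the paper: the predicate names (`soft_left_of`, `soft_collinear`), the `temperature` and `scale` parameters, and the choice of sigmoid and Gaussian relaxations are all assumptions, and locations are assumed to be (x, y) pairs normalized to [0, 1]. A quaternary predicate would presumably follow the same pattern over four locations.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_left_of(a, b, temperature=0.1):
    """Hypothetical soft binary predicate.

    Returns a value in (0, 1) that approaches 1 as point `a` lies
    further to the left of point `b`, and 0.5 when their x-coordinates
    match. `temperature` (an assumed parameter) controls sharpness.
    """
    return sigmoid((b[0] - a[0]) / temperature)


def soft_collinear(a, b, c, scale=0.05):
    """Hypothetical soft ternary predicate.

    Returns 1.0 when `a`, `b`, `c` are exactly collinear and decays
    smoothly toward 0 as the triangle they span grows. `scale` (an
    assumed parameter) sets how quickly the score decays.
    """
    # Twice the signed area of triangle (a, b, c); zero iff collinear.
    area2 = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return math.exp(-((area2 / scale) ** 2))
```

Both functions are smooth in their inputs, so if the locations came from a differentiable primitive detector, gradients could flow through the predicate scores. Whether PARSE uses these particular relaxations is not stated in the abstract.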
