Per-Domain Generalizing Policies: On Validation Instances and Scaling Behavior

Timo P. Gros, Nicola J. Müller, Daniel Fiser, Isabel Valera, Verena Wolf, Jörg Hoffmann

arXiv:2505.00439 · cs.LG · May 2, 2025

TL;DR

This paper introduces a method that generates the validation set dynamically during training, increasing instance size as long as doing so is informative and feasible. In experiments, it improves the scaling behavior of GNN policies in all 9 domains used.

Contribution

It proposes a dynamic validation set generation technique, replacing the fixed validation sets of prior work, and a refined methodology for evaluating the scaling behavior of policies.

Findings

01 Dynamic validation improves the scaling behavior of GNN policies in all 9 domains used.

02 Systematic test instance generation guarantees a given confidence in coverage performance for each instance size.

03 Validation instances larger than the training instances are one key to scaling from small training instances to large test instances.

Abstract

Recent work has shown that successful per-domain generalizing action policies can be learned. Scaling behavior, from small training instances to large test instances, is the key objective; and the use of validation instances larger than training instances is one key to achieve it. Prior work has used fixed validation sets. Here, we introduce a method generating the validation set dynamically, on the fly, increasing instance size so long as informative and feasible. We also introduce refined methodology for evaluating scaling behavior, generating test instances systematically to guarantee a given confidence in coverage performance for each instance size. In experiments, dynamic validation improves scaling behavior of GNN policies in all 9 domains used.
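The abstract does not say how the confidence guarantee on coverage is obtained. As a rough illustration only, the sketch below uses Hoeffding's inequality, a standard textbook bound that is not necessarily the one the authors use, to estimate how many test instances per instance size would be enough for the measured coverage to lie within a margin epsilon of the true coverage with probability at least 1 - delta. The function names and the parameters epsilon and delta are assumptions made for this example.

```python
import math


def instances_needed(epsilon: float, delta: float) -> int:
    """Number of i.i.d. test instances so that measured coverage is within
    +/- epsilon of true coverage with probability >= 1 - delta.

    Assumption: two-sided Hoeffding bound, n >= ln(2 / delta) / (2 * epsilon^2).
    This is an illustrative choice, not the paper's stated procedure.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def coverage_interval(solved: int, total: int, delta: float) -> tuple[float, float]:
    """Hoeffding confidence interval for coverage (fraction of instances solved),
    clipped to [0, 1]."""
    if total <= 0 or not (0 <= solved <= total):
        raise ValueError("need 0 <= solved <= total and total > 0")
    coverage = solved / total
    half_width = math.sqrt(math.log(2.0 / delta) / (2.0 * total))
    return max(0.0, coverage - half_width), min(1.0, coverage + half_width)


if __name__ == "__main__":
    # Hypothetical settings: margin 0.1, confidence 95%.
    n = instances_needed(epsilon=0.1, delta=0.05)
    print(f"test instances per size: {n}")
    print(coverage_interval(solved=150, total=200, delta=0.05))
```

Under this bound the required number of instances depends only on epsilon and delta, so the same count could be generated for every instance size. Tighter intervals for binomial proportions exist and would need fewer instances.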
