Transfer Learning with Synthetic Corpora for Spatial Role Labeling and Reasoning

Roshanak Mirzaee; Parisa Kordjamshidi

arXiv:2210.16952·cs.CL·November 7, 2022

TL;DR

This paper introduces new synthetic and real-world datasets for spatial language tasks, demonstrating that pretraining with synthetic data enhances model performance, especially with limited target domain data.

Contribution

It provides two novel datasets for spatial question answering and role labeling, and shows synthetic data pretraining improves spatial language model performance.

Findings

01 Pretraining with synthetic data boosts SOTA results.

02 Synthetic datasets cover diverse spatial relations.

03 Performance gains are significant with small target data.

Abstract

Recent research shows that synthetic data as a source of supervision helps pretrained language models (PLMs) transfer to new target tasks and domains. However, this idea is less explored for spatial language. We provide two new data resources on multiple spatial language processing tasks. The first dataset is synthesized for transfer learning on spatial question answering (SQA) and spatial role labeling (SpRL). Compared to previous SQA datasets, we include a larger variety of spatial relation types and spatial expressions. Our data generation process is easily extendable with new spatial expression lexicons. The second one is a real-world SQA dataset with human-generated questions built on an existing corpus with SpRL annotations. This dataset can be used to evaluate spatial language processing models in realistic situations. We show pretraining with automatically generated data…

Code & Models

Repositories

hlr/spartun (Official)

Taxonomy

Topics: Speech and dialogue systems · Multimodal Machine Learning Applications · Geographic Information Systems Studies