MobQA: A Benchmark Dataset for Semantic Understanding of Human Mobility Data through Question Answering

Hikaru Asano; Hiroki Ouchi; Akira Kasuga; Ryo Yonetani

arXiv:2508.11163·cs.CL·August 18, 2025

TL;DR

MobQA introduces a comprehensive dataset to evaluate large language models' ability to understand and interpret human mobility data through various question types, highlighting current strengths and limitations in semantic reasoning.

Contribution

The paper presents MobQA, a new benchmark dataset with diverse question types to assess LLMs' semantic understanding of human mobility data.

Findings

01. LLMs perform well on factual data retrieval.

02. Significant challenges remain in semantic reasoning tasks.

03. Trajectory length affects model performance.

Abstract

This paper presents MobQA, a benchmark dataset designed to evaluate the semantic understanding capabilities of large language models (LLMs) for human mobility data through natural language question answering. While existing models excel at predicting human movement patterns, it remains unclear how well they can interpret the underlying reasons or semantic meaning of those patterns. MobQA provides a comprehensive evaluation framework for LLMs to answer questions about diverse human GPS trajectories spanning daily to weekly granularities. It comprises 5,800 high-quality question-answer pairs across three complementary question types: factual retrieval (precise data extraction), multiple-choice reasoning (semantic inference), and free-form explanation (interpretive description), all of which require spatial, temporal, and semantic reasoning. Our evaluation of major LLMs reveals strong…
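To make the three question types concrete, the following is a minimal Python sketch of how a question-answer record and a per-type accuracy score might be organized. It is not the MobQA schema or the authors' evaluation code: the names `QuestionType`, `QAItem`, `exact_match`, and `accuracy_by_type`, the field names, and the use of normalized exact match are all assumptions made for illustration. Free-form explanations are left unscored because the abstract does not say how they are judged.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class QuestionType(Enum):
    # The three question types named in the abstract (labels are hypothetical).
    FACTUAL = "factual_retrieval"
    MULTIPLE_CHOICE = "multiple_choice_reasoning"
    FREE_FORM = "free_form_explanation"


@dataclass
class QAItem:
    # Hypothetical record layout; the real dataset fields may differ.
    trajectory_id: str
    question_type: QuestionType
    question: str
    reference_answer: str


def _normalize(text: str) -> str:
    # Lowercase and collapse whitespace before comparing strings.
    return " ".join(text.lower().split())


def exact_match(item: QAItem, prediction: str) -> Optional[bool]:
    # Assumption: exact match is only meaningful for closed-form answers.
    # Free-form explanations return None (no score) in this sketch.
    if item.question_type is QuestionType.FREE_FORM:
        return None
    return _normalize(prediction) == _normalize(item.reference_answer)


def accuracy_by_type(
    items: List[QAItem], predictions: List[str]
) -> Dict[QuestionType, float]:
    # Aggregate exact-match accuracy separately for each scored question type.
    totals: Dict[QuestionType, int] = {}
    correct: Dict[QuestionType, int] = {}
    for item, prediction in zip(items, predictions):
        result = exact_match(item, prediction)
        if result is None:
            continue
        totals[item.question_type] = totals.get(item.question_type, 0) + 1
        correct[item.question_type] = correct.get(item.question_type, 0) + int(result)
    return {qtype: correct[qtype] / totals[qtype] for qtype in totals}
```

Reporting accuracy per question type rather than as one pooled number keeps retrieval-style questions and reasoning-style questions separate, which matches the way the findings above distinguish the two.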
