Towards Understanding End-of-trip Instructions in a Taxi Ride Scenario
Deepthi Karkada, Ramesh Manuvinakurike, Kallirroi Georgila

TL;DR
This paper presents a new dataset of human descriptions for end-of-trip locations in taxi scenarios, supporting research in visual and language understanding tasks.
Contribution
It introduces a novel dataset with annotations and a scheme for understanding end-of-trip location descriptions in taxi scenarios.
Findings
Dataset includes synthetic and real-world images with detailed annotations.
Pilot experiment demonstrates potential for visual reference resolution.
Supports various visual and language tasks in transportation contexts.
Abstract
We introduce a dataset containing human-authored descriptions of target locations in an "end-of-trip in a taxi ride" scenario. We describe our data collection method and a novel annotation scheme that supports understanding of such descriptions of target locations. Our dataset contains target location descriptions for both synthetic and real-world images as well as visual annotations (ground truth labels, dimensions of vehicles and objects, coordinates of the target location, distance and direction of the target location from vehicles and objects) that can be used in various visual and language tasks. We also perform a pilot experiment on how the corpus could be applied to visual reference resolution in this domain.
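The visual annotations listed in the abstract include the distance and direction of the target location from vehicles and objects. The following is a minimal sketch of how such values could be derived from coordinates. It is not the authors' annotation code: the `SceneObject` class, its field names, and the `distance_and_direction` helper are hypothetical, and the sketch assumes a plain 2D Cartesian plane with angles measured counterclockwise from the positive x-axis, which may differ from the dataset's actual conventions.

```python
import math
from dataclasses import dataclass


@dataclass
class SceneObject:
    """Hypothetical annotated object: a label, a center point, and dimensions."""
    label: str
    x: float
    y: float
    width: float
    height: float


def distance_and_direction(obj: SceneObject, target: tuple[float, float]) -> tuple[float, float]:
    """Return (distance, angle in degrees) from the object's center to the target.

    The angle is normalized to [0, 360) and measured counterclockwise from
    the positive x-axis (an assumption made for this sketch).
    """
    dx = target[0] - obj.x
    dy = target[1] - obj.y
    distance = math.hypot(dx, dy)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return distance, angle


if __name__ == "__main__":
    # Illustrative values only, not taken from the dataset.
    car = SceneObject(label="parked car", x=0.0, y=0.0, width=4.5, height=1.8)
    dist, angle = distance_and_direction(car, (3.0, 4.0))
    print(f"{car.label}: distance={dist:.2f}, direction={angle:.1f} degrees")
```

In image coordinates the y-axis usually points downward, so a real pipeline would need to fix that convention before interpreting the angle as a compass-like direction.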
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Speech and dialogue systems
