Multinational Address Parsing: A Zero-Shot Evaluation
Marouane Yassine, David Beauchemin, François Laviolette, and Luc Lamontagne

TL;DR
This paper investigates zero-shot transfer learning for address parsing across multiple countries using neural networks, an attention mechanism, and domain adversarial training, achieving state-of-the-art results for most of the tested countries without training on those countries' addresses.
Contribution
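It introduces a zero-shot transfer approach for address parsing: neural networks trained on some countries' addresses are applied to other countries with no further training, with an attention mechanism and domain adversarial training added to improve generalization.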
Findings
State-of-the-art performance in most countries tested
Attention and adversarial training improve zero-shot transfer
Incomplete addresses impact model performance
Abstract
Address parsing consists of identifying the segments that make up an address, such as a street name or a postal code. Because of its importance for tasks like record linkage, address parsing has been approached with many techniques, the latest relying on neural networks. While these models yield notable results, previous work on neural networks has only focused on parsing addresses from a single source country. This paper explores the possibility of transferring the address parsing knowledge acquired by training deep learning models on some countries' addresses to others with no further training in a zero-shot transfer learning setting. We also experiment using an attention mechanism and a domain adversarial training algorithm in the same zero-shot transfer setting to improve performance. Both methods yield state-of-the-art performance for most of the tested countries while giving good…
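The zero-shot setting described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the tag names, the toy addresses, the rule-based `toy_parser` (a stand-in for a trained neural model), and the helper names `zero_shot_split` and `token_accuracy` are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Set, Tuple

# One example = (tokens, tags). Tag names are illustrative, not the paper's tag set.
Example = Tuple[List[str], List[str]]


def zero_shot_split(
    data: Dict[str, List[Example]], train_countries: Set[str]
) -> Tuple[List[Example], Dict[str, List[Example]]]:
    """Pool the training countries; keep every other country for evaluation only."""
    train = [ex for c, exs in data.items() if c in train_countries for ex in exs]
    held_out = {c: exs for c, exs in data.items() if c not in train_countries}
    return train, held_out


def token_accuracy(
    parser: Callable[[List[str]], List[str]], examples: List[Example]
) -> float:
    """Fraction of tokens whose predicted tag matches the gold tag."""
    correct = total = 0
    for tokens, gold in examples:
        pred = parser(tokens)
        correct += sum(p == g for p, g in zip(pred, gold))
        total += len(gold)
    return correct / total if total else 0.0


def toy_parser(tokens: List[str]) -> List[str]:
    """Hypothetical rule-based stand-in for a model trained on the pooled data."""
    tags, seen_number = [], False
    for tok in tokens:
        if tok.isdigit() and not seen_number:
            tags.append("StreetNumber")
            seen_number = True
        elif any(ch.isdigit() for ch in tok):
            tags.append("PostalCode")
        else:
            tags.append("StreetName")
    return tags


# Toy data keyed by country code (made up for illustration).
data = {
    "CA": [(["350", "rue", "des", "Lilas", "G1L", "1B6"],
            ["StreetNumber", "StreetName", "StreetName", "StreetName",
             "PostalCode", "PostalCode"])],
    "DE": [(["Hauptstrasse", "12", "10115"],
            ["StreetName", "StreetNumber", "PostalCode"])],
}

train, held_out = zero_shot_split(data, {"CA"})
for country, examples in held_out.items():
    print(country, token_accuracy(toy_parser, examples))
```

The point of the split is that no address from a held-out country is ever seen during training, so the per-country score measures transfer rather than in-domain fit.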
Taxonomy
Topics: Topic Modeling · Data-Driven Disease Surveillance · Data Quality and Management
