Template-Based Schema Matching of Multi-Layout Tenancy Schedules: A Comparative Study of a Template-Based Hybrid Matcher and the ALITE Full Disjunction Model
Tim Uilkema, Yao Ma, Seyed Sahand Mohammadi Ziabari, Joep van Vliet

TL;DR
This paper introduces a hybrid, template-based schema matching approach for tenancy schedules. By combining schema-based and instance-based metrics, it improves alignment accuracy and usability over Full Disjunction models such as ALITE.
Contribution
It presents a novel hybrid matcher that integrates schema-based and instance-based metrics with globally optimal assignment via the Hungarian Algorithm, improving schema matching for multi-layout tenancy schedules.
Findings
Peak F1-score of 0.881 with the hybrid matcher
Overall null percentage of 45.7% after grid search optimization
Superior performance compared to ALITE on ground truth datasets
Abstract
The lack of standardized tabular formats for tenancy schedules across real estate firms creates significant inefficiencies in data integration. Existing automated integration methods, such as Full Disjunction (FD)-based models like ALITE, prioritize completeness but result in schema bloat, sparse attributes, and limited business usability. We propose a novel hybrid, template-based schema matcher that aligns multi-layout tenancy schedules to a predefined target schema. The matcher combines schema-based metrics (Jaccard, Levenshtein) and instance-based metrics (data types, distributions) with globally optimal assignments determined via the Hungarian Algorithm. Evaluation against a manually labeled ground truth demonstrates substantial improvements, with grid search optimization yielding a peak F1-score of 0.881 and an overall null percentage of 45.7%. On a separate ground truth of 20 semantically…
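To make the matching idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It scores column-name pairs with a blend of token-level Jaccard similarity and normalized Levenshtein similarity, then picks the one-to-one assignment with the highest total score. Several things here are assumptions for illustration: the column names, the equal 0.5 weighting, the underscore tokenization, and the function names. The instance-based metrics (data types, distributions) are omitted. The optimal assignment is found by exhaustive search over permutations as a stand-in for the Hungarian Algorithm, which reaches the same optimum in polynomial time; the exhaustive version is only practical for a handful of columns.

```python
from itertools import permutations


def jaccard(a, b):
    """Jaccard similarity between the underscore-separated token sets of two names."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def levenshtein(a, b):
    """Classic edit distance (insert, delete, substitute), row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def name_similarity(a, b, w_jaccard=0.5):
    """Blend of the two schema-based metrics; the weight is a hypothetical choice."""
    a, b = a.lower(), b.lower()
    lev_sim = 1 - levenshtein(a, b) / max(len(a), len(b), 1)
    return w_jaccard * jaccard(a, b) + (1 - w_jaccard) * lev_sim


def best_assignment(source_cols, target_cols):
    """Globally optimal one-to-one mapping from source to target columns.

    Exhaustive search (assumes len(source_cols) <= len(target_cols));
    the Hungarian Algorithm would return an assignment with the same total score.
    """
    best_score, best_perm = float("-inf"), None
    for perm in permutations(range(len(target_cols)), len(source_cols)):
        score = sum(name_similarity(s, target_cols[t])
                    for s, t in zip(source_cols, perm))
        if score > best_score:
            best_score, best_perm = score, perm
    return {s: target_cols[t] for s, t in zip(source_cols, best_perm)}


# Hypothetical source layout and target template, for illustration only.
source = ["tenant_nm", "rent_annual", "lease_start"]
template = ["tenant_name", "annual_rent", "lease_start_date", "unit_id"]
mapping = best_assignment(source, template)
print(mapping)
```

In this toy example the template column `unit_id` receives no source column; target attributes left unmatched like this are what show up as nulls in the integrated table.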
