Turath-150K: Image Database of Arab Heritage
Dani Kiyasseh, Rasheed El-Bouri

TL;DR
Turath-150K is a large-scale image dataset capturing Arab cultural objects, activities, and scenarios, aiming to diversify training data for neural networks and promote research in under-represented regions.
Contribution
The paper introduces Turath-150K, an image database of the Arab world, together with three benchmark subsets (Turath Standard, Art, and UNESCO). It demonstrates the limitations of ImageNet-pretrained models on these benchmarks and aims to foster more inclusive machine learning research.
Findings
Existing ImageNet-pretrained models perform poorly on Arab cultural images.
Training on Turath improves classification accuracy on Arab-specific benchmarks.
Turath is intended to encourage research in under-represented cultural domains.
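The first two findings rest on comparing classification accuracy between models on the same benchmark. As a minimal sketch of that comparison, assuming plain top-1 accuracy is the metric (the summary does not say which metric the paper reports), the function below scores two sets of predictions against the same ground truth. The function name, the label strings, and the toy predictions are all hypothetical and are not taken from Turath.

```python
from typing import Sequence


def top1_accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the fraction of images whose predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty benchmark")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical toy labels, invented for illustration only (not Turath data).
true_labels = ["class_a", "class_b", "class_c", "class_b"]
# Stand-in for a model that was never trained on the benchmark's classes.
baseline_preds = ["class_x", "class_y", "class_c", "class_b"]
# Stand-in for a model trained on the benchmark's own training split.
trained_preds = ["class_a", "class_b", "class_c", "class_y"]

baseline_acc = top1_accuracy(baseline_preds, true_labels)
trained_acc = top1_accuracy(trained_preds, true_labels)
print(f"baseline: {baseline_acc:.2f}, trained: {trained_acc:.2f}")
```

Scoring both models on the identical held-out set is what makes the two numbers comparable; the values printed here come from the invented toy labels and say nothing about the paper's actual results.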
Abstract
Large-scale image databases remain largely biased towards objects and activities encountered in a select few cultures. This absence of culturally-diverse images, which we refer to as the hidden tail, limits the applicability of pre-trained neural networks and inadvertently excludes researchers from under-represented regions. To begin remedying this issue, we curate Turath-150K, a database of images of the Arab world that reflect objects, activities, and scenarios commonly found there. In the process, we introduce three benchmark databases, Turath Standard, Art, and UNESCO, specialised subsets of the Turath dataset. After demonstrating the limitations of existing networks pre-trained on ImageNet when deployed on such benchmarks, we train and evaluate several networks on the task of image classification. As a consequence of Turath, we hope to engage machine learning researchers in…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Remote-Sensing Image Classification
