OPTED: Open Preprocessed Trachoma Eye Dataset Using Zero-Shot SAM 3 Segmentation
Kibrom Gebremedhin, Hadush Hailu, Bruk Gebregziabher

TL;DR
The paper introduces OPTED, an open-source preprocessed dataset for trachoma detection, built with zero-shot SAM 3 segmentation and accompanied by a reproducible pipeline for preprocessing clinical eyelid images.
Contribution
The paper presents a novel, reproducible pipeline that uses SAM 3 for automated region-of-interest extraction and assembles the results into a dataset for trachoma classification.
Findings
Achieved a 99.5% detection rate with the best-performing text prompt
Produced high-quality, standardized images suitable for machine learning
Released dataset and code openly for research use
Abstract
Trachoma remains the leading infectious cause of blindness worldwide, with Sub-Saharan Africa bearing over 85% of the global burden and Ethiopia alone accounting for more than half of all cases. Yet publicly available preprocessed datasets for automated trachoma classification are scarce, and none originate from the most affected region. Raw clinical photographs of eyelids contain significant background noise that hinders direct use in machine learning pipelines. We present OPTED, an open-source preprocessed trachoma eye dataset constructed using the Segment Anything Model 3 (SAM 3) for automated region-of-interest extraction. We describe a reproducible four-step pipeline: (1) text-prompt-based zero-shot segmentation of the tarsal conjunctiva using SAM 3, (2) background removal and bounding-box cropping with alignment, (3) quality filtering based on confidence scores, and (4) Lanczos…
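Steps (2) and (3) of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' released code: it assumes the segmentation model has already returned a binary mask and a confidence score for each image, and the function name `crop_masked_region` and the `min_score` threshold of 0.5 are hypothetical choices made for illustration. The alignment part of step (2) and the truncated step (4) are left out because the text above does not specify them.

```python
import numpy as np


def crop_masked_region(image, mask, score, min_score=0.5):
    """Zero out the background and crop to the mask's bounding box.

    image: H x W x 3 uint8 array (the raw photograph).
    mask:  H x W array, nonzero where the region of interest was segmented.
    score: confidence reported for the mask (assumed to lie in [0, 1]).
    min_score: hypothetical quality threshold; not a value from the paper.

    Returns the cropped array, or None if the sample is filtered out.
    """
    # Quality filtering: drop low-confidence segmentations.
    if score < min_score:
        return None

    mask = np.asarray(mask).astype(bool)
    if not mask.any():
        return None  # nothing was segmented

    # Background removal: set every pixel outside the mask to black.
    cleaned = image.copy()
    cleaned[~mask] = 0

    # Bounding box: first and last rows/columns that contain mask pixels.
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1

    return cleaned[top:bottom, left:right]


if __name__ == "__main__":
    # Synthetic stand-in for a photograph and a model-produced mask.
    image = np.full((6, 8, 3), 200, dtype=np.uint8)
    mask = np.zeros((6, 8), dtype=bool)
    mask[1:4, 2:7] = True
    crop = crop_masked_region(image, mask, score=0.9)
    print(crop.shape)
```

Returning `None` for rejected samples keeps the filtering decision in one place, so a caller can simply skip those images when writing the dataset to disk.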
