MaskIt: Masking for efficient utilization of incomplete public datasets for training deep learning models
Ankit Kariryaa

TL;DR
This paper introduces MaskIt, a masking technique that enables training deep learning models on incomplete datasets by restricting training to the regions where labels are available. It is demonstrated on a street-tree dataset from Hamburg, Germany, where the model reaches 78.4% accuracy within the masked region.
Contribution
The paper proposes a novel masking approach that allows effective training on incomplete datasets by integrating masks as model inputs, improving data utilization.
Findings
Achieved 78.4% accuracy in predicting trees within masked regions.
Successfully trained deep learning models on incomplete datasets using the masking method.
Demonstrated the approach on a real-world city dataset from Hamburg, Germany.
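The summary does not say how the 78.4% figure is computed. One plausible reading is pixel-wise accuracy counted only inside the mask, which can be sketched as follows. The function name `masked_accuracy` and the toy arrays are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def masked_accuracy(pred, target, mask):
    """Fraction of correctly predicted pixels, counting only pixels where mask is 1.

    Hypothetical helper: the paper's exact metric is not given in this summary.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    mask = np.asarray(mask).astype(bool)
    n_valid = mask.sum()
    if n_valid == 0:
        return float("nan")  # no labelled pixels to evaluate
    correct = (pred == target) & mask
    return float(correct.sum() / n_valid)

# Toy example: only the top row lies inside the mask (e.g. along a road).
pred = np.array([[1, 0], [1, 1]])
target = np.array([[1, 1], [0, 1]])
mask = np.array([[1, 1], [0, 0]])
acc = masked_accuracy(pred, target, mask)
```

Pixels outside the mask are ignored entirely, so a prediction there is neither rewarded nor penalised.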
Abstract
A major challenge in training deep learning models is the lack of high-quality, complete datasets. In this paper, we present a masking approach for training deep learning models on a publicly available but incomplete dataset. For example, the city of Hamburg, Germany maintains a list of trees along its roads, but this dataset contains no information about trees at private homes or in parks. To train a deep learning model on such a dataset, we mask the street trees and aerial images with the road network. The road network used to create the mask is downloaded from OpenStreetMap and marks the area where training data is available. The mask is passed to the model as one of the inputs and is also applied to the output. Our model learns to predict trees only in the masked region, with 78.4% accuracy.
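The masking idea in the abstract can be illustrated with a minimal NumPy sketch, assuming a binary road mask of shape (H, W), an aerial image of shape (H, W, C), and a per-pixel tree probability map as model output. The helper names `add_mask_channel` and `masked_bce`, and the choice of binary cross-entropy, are assumptions made for illustration. The abstract does not specify the architecture or loss.

```python
import numpy as np

def add_mask_channel(image, mask):
    """Zero out pixels outside the mask and append the mask as an extra input channel.

    image: (H, W, C) array, mask: (H, W) array of 0/1. Hypothetical helper.
    """
    mask = mask.astype(image.dtype)
    masked_image = image * mask[..., None]
    return np.concatenate([masked_image, mask[..., None]], axis=-1)

def masked_bce(prob, label, mask, eps=1e-7):
    """Binary cross-entropy averaged over masked pixels only (assumed loss)."""
    mask = mask.astype(float)
    prob = np.clip(prob, eps, 1.0 - eps)
    per_pixel = -(label * np.log(prob) + (1.0 - label) * np.log(1.0 - prob))
    # Applying the mask to the output: unlabelled pixels contribute nothing.
    return float((per_pixel * mask).sum() / max(mask.sum(), 1.0))

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))   # stand-in for an aerial image tile
mask = np.zeros((4, 4))
mask[1, :] = 1                  # stand-in for a road corridor
label = np.zeros((4, 4))
label[1, 2] = 1                 # a street tree recorded in the dataset

x = add_mask_channel(image, mask)       # model input with mask channel
prob = np.full((4, 4), 0.5)             # stand-in for model output
loss_a = masked_bce(prob, label, mask)

prob_b = prob.copy()
prob_b[3, 3] = 0.99                     # change a prediction outside the mask
loss_b = masked_bce(prob_b, label, mask)
```

Because the loss is weighted by the mask, `loss_a` and `loss_b` are equal: predictions outside the road corridor, where the dataset has no labels, do not affect training.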
Taxonomy
Topics: Remote Sensing and LiDAR Applications · Species Distribution and Climate Change · Remote Sensing in Agriculture
