A Survey on Machine Learning Techniques for Auto Labeling of Video, Audio, and Text Data
Shikun Zhang, Omid Jafari, Parth Nagarkar

TL;DR
This survey reviews machine learning techniques aimed at reducing data labeling costs for video, audio, and text, emphasizing auto annotation methods to improve efficiency and model robustness.
Contribution
It provides a comprehensive overview of existing auto labeling techniques across multiple data modalities, highlighting recent advances and research trends.
Findings
Transfer learning reduces data annotation needs.
Auto labeling techniques improve efficiency in data preparation.
Research focus is shifting towards automated annotation methods.
Abstract
Machine learning has been used to perform tasks in many different domains, such as classification, object detection, image segmentation, and natural language analysis. Data labeling has always been one of the most important tasks in machine learning. However, labeling large amounts of data increases the monetary cost of machine learning. As a result, researchers started to focus on reducing data annotation and labeling costs. Transfer learning was designed and is widely used as an efficient approach that can reasonably reduce the negative impact of limited data, which in turn reduces the data preparation cost. Although transferring previous knowledge from a source domain reduces the amount of data needed in a target domain, large amounts of annotated data are still required to build robust models and improve prediction accuracy. Therefore, researchers started to…
Taxonomy
Topics: Music and Audio Processing · Video Analysis and Summarization · Image Retrieval and Classification Techniques
