Bias in Automated Image Colorization: Metrics and Error Types
Frank Stapel, Floris Weers, Doina Bucur

TL;DR
This paper investigates biases in automated image colorization, introducing new metrics for local and regional bias, and categorizing common error types to better understand and evaluate colorization quality.
Contribution
The paper presents novel bias measurements and an error categorization for assessing GAN-based image colorization, demonstrated on the DeOldify model.
Findings
Colorized images show a general desaturation effect
A blue shift is pervasive across the results
Color shifts differ among image categories
Abstract
We measure the color shifts present in colorized images from the ADE20K dataset, when colorized by the automatic GAN-based DeOldify model. We introduce fine-grained local and regional bias measurements between the original and the colorized images, and observe many colorization effects. We confirm a general desaturation effect, and also provide novel observations: a shift towards the training average, a pervasive blue shift, different color shifts among image categories, and a manual categorization of colorization errors in three classes.
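To make the idea of a signed color shift between an original image and its colorized version concrete, the following is a minimal sketch. It is not the authors' metric: the abstract does not define the local and regional measurements, so the color space (RGB plus HSV saturation), the grid partition, and the function names `global_shift` and `regional_shift` are all assumptions made for illustration. Images are assumed to be float arrays of shape (H, W, 3) with values in [0, 1].

```python
import numpy as np

def saturation(rgb):
    """HSV saturation of an RGB array with values in [0, 1]."""
    mx = rgb.max(axis=-1)
    mn = rgb.min(axis=-1)
    # Saturation is defined as 0 for black pixels (max == 0).
    safe_mx = np.where(mx > 0, mx, 1.0)
    return np.where(mx > 0, (mx - mn) / safe_mx, 0.0)

def global_shift(original, colorized):
    """Mean signed shift (colorized minus original) per channel and in saturation."""
    d_rgb = (colorized - original).mean(axis=(0, 1))
    d_sat = (saturation(colorized) - saturation(original)).mean()
    return {"red": d_rgb[0], "green": d_rgb[1], "blue": d_rgb[2], "saturation": d_sat}

def regional_shift(original, colorized, grid=4):
    """Mean signed RGB shift in each cell of a grid x grid partition.

    Hypothetical stand-in for a regional measurement; assumes H and W >= grid.
    """
    h, w, _ = original.shape
    diff = colorized - original
    row_edges = np.linspace(0, h, grid + 1).astype(int)
    col_edges = np.linspace(0, w, grid + 1).astype(int)
    out = np.zeros((grid, grid, 3))
    for i in range(grid):
        for j in range(grid):
            cell = diff[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            out[i, j] = cell.mean(axis=(0, 1))
    return out

# Synthetic example: an orange patch that comes back duller and bluer.
original = np.ones((8, 8, 3)) * np.array([0.8, 0.4, 0.2])
colorized = np.ones((8, 8, 3)) * np.array([0.6, 0.4, 0.4])

shift = global_shift(original, colorized)
cells = regional_shift(original, colorized, grid=4)
print(shift)
print(cells.shape)
```

In this synthetic case the blue component of the shift is positive and the saturation component is negative, which is the sign pattern one would expect for the blue shift and desaturation effects described in the abstract. Averaging such shifts over a dataset, or per image category, would be one way to look for systematic bias under these assumptions.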
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
Methods: Colorization
