Diffusion Model with Clustering-based Conditioning for Food Image Generation
Yue Han, Jiangpeng He, Mridul Gupta, Edward J. Delp, Fengqing Zhu

TL;DR
This paper introduces ClusDiff, a clustering-based conditional diffusion model that generates high-quality synthetic food images to improve data augmentation and class balance in food image analysis.
Contribution
The paper presents ClusDiff, a novel clustering-based training framework for diffusion models, enhancing synthetic food image quality and diversity for data augmentation.
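The summary above does not spell out how the clustering-based conditioning is built, so the following is only a minimal sketch of one plausible reading: cluster image features within each food class, then use the (class, cluster) pair as the conditioning label for the diffusion model. The functions `kmeans` and `cluster_conditions`, the choice of plain k-means, and the toy 2-D feature vectors are all assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def cluster_conditions(features, class_labels, k):
    """Return one (class, cluster) pair per sample, clustering within each class."""
    by_class = defaultdict(list)
    for i, y in enumerate(class_labels):
        by_class[y].append(i)
    conditions = [None] * len(features)
    for y, idxs in by_class.items():
        # Never ask for more clusters than the class has samples.
        assign = kmeans([features[i] for i in idxs], min(k, len(idxs)))
        for i, a in zip(idxs, assign):
            conditions[i] = (y, a)
    return conditions

# Hypothetical feature vectors; in practice these would come from an image encoder.
features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [1.0, 1.0], [9.0, 9.0]]
labels = ["pizza"] * 4 + ["sushi"] * 2
conds = cluster_conditions(features, labels, k=2)
```

Under this reading, each (class, cluster) pair would be embedded and passed to the denoiser in place of a plain class label, so that visually distinct variants of the same dish get distinct conditions.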
Findings
ClusDiff outperforms existing generative models on Food-101.
Synthetic images from ClusDiff improve food classification accuracy.
ClusDiff helps mitigate class imbalance in food datasets.
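The last finding can be made concrete with a small sketch. The summary does not say how synthetic images are allocated across classes, so the rule below (top up every class to the size of the largest one) is an assumption, and `synthetic_quota` is a hypothetical helper name.

```python
from collections import Counter

def synthetic_quota(labels, target=None):
    """Number of synthetic images to generate per class to reach `target` samples.

    If `target` is None, every class is topped up to the largest class size.
    """
    counts = Counter(labels)
    goal = target if target is not None else max(counts.values())
    # Classes already at or above the goal need no synthetic images.
    return {cls: max(0, goal - n) for cls, n in counts.items()}

# Toy imbalanced label list (hypothetical class names).
quota = synthetic_quota(["ramen"] * 5 + ["tacos"] * 2 + ["paella"])
```

The resulting dictionary can then drive per-class sampling from a conditional generator, requesting more images for the rarer classes.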
Abstract
Image-based dietary assessment serves as an efficient and accurate solution for recording and analyzing nutrition intake using eating occasion images as input. Deep learning-based techniques are commonly used to perform image analysis such as food classification, segmentation, and portion size estimation, which rely on large amounts of annotated food images for training. However, such data dependency poses significant barriers to real-world applications, because acquiring a substantial, diverse, and balanced set of food images can be challenging. One potential solution is to use synthetic food images for data augmentation. Although existing work has explored generative adversarial network (GAN) based architectures for generation, the quality of synthetic food images remains subpar. In addition, while diffusion-based generative models have shown promising results…
Taxonomy
Methods: Diffusion
