SeeDS: Semantic Separable Diffusion Synthesizer for Zero-shot Food Detection

Pengfei Zhou, Weiqing Min, Yang Zhang, Jiajun Song, Ying Jin, and Shuqiang Jiang

arXiv:2310.04689 · cs.CV · October 10, 2023

TL;DR

SeeDS is a framework that combines semantic feature synthesis with a diffusion model to improve zero-shot food detection, achieving state-of-the-art results on the ZSFooD and UECFOOD-256 datasets.

Contribution

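The paper proposes the SeeDS framework, whose two modules, the Semantic Separable Synthesizing Module (S$^{3}$M) and the Region Feature Denoising Diffusion Model (RFDDM), synthesize discriminative and diversified features for zero-shot food detection, addressing the semantic complexity and intra-class diversity of food classes.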

Findings

1. Achieves state-of-the-art zero-shot food detection performance on the ZSFooD and UECFOOD-256 datasets.
2. Maintains effectiveness on general zero-shot detection datasets such as PASCAL VOC and MS COCO.
3. Demonstrates the benefit of semantic feature synthesis and diffusion models in fine-grained recognition.

Abstract

Food detection is becoming a fundamental task in food computing that supports various multimedia applications, including food recommendation and dietary monitoring. To deal with real-world scenarios, food detection needs to localize and recognize novel food objects that are not seen during training, demanding Zero-Shot Detection (ZSD). However, the complexity of semantic attributes and intra-class feature diversity poses challenges for ZSD methods in distinguishing fine-grained food classes. To tackle this, we propose the Semantic Separable Diffusion Synthesizer (SeeDS) framework for Zero-Shot Food Detection (ZSFD). SeeDS consists of two modules: a Semantic Separable Synthesizing Module (S$^{3}$M) and a Region Feature Denoising Diffusion Model (RFDDM). The S$^{3}$M learns the disentangled semantic representation for complex food attributes from ingredients and cuisines, and synthesizes…
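
The abstract names a denoising diffusion model over region features (RFDDM), but this listing does not give its equations. As a rough illustration only, the sketch below shows the standard DDPM forward (noising) step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, applied to a single feature vector. The function names, the linear beta schedule, the 1000-step horizon, and the 4-dimensional feature are all assumptions made for illustration; none of them are taken from the paper or from the lancezpf/seeds repository, and the learned denoiser and any semantic conditioning are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed, common default)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_feature(feature, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) for one feature vector at 0-based step t.

    Returns the noisy vector and the Gaussian noise that produced it; a
    denoiser would typically be trained to predict that noise.
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in feature]
    noisy = [signal * x + sigma * e for x, e in zip(feature, noise)]
    return noisy, noise


rng = random.Random(0)  # fixed seed so the sketch is repeatable
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
region_feature = [0.5, -1.2, 0.3, 2.0]  # hypothetical 4-d region feature
noisy, noise = noise_feature(region_feature, 999, alpha_bars, rng)
```

At the last step almost no signal remains, so `noisy` is close to pure Gaussian noise; a generative model of this kind would learn to reverse the process step by step to produce new feature vectors.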

Code & Models

Repositories

lancezpf/seeds (PyTorch, official)

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Diffusion · Softmax · Linear Layer · Synthesizer