TagGPT: Large Language Models are Zero-shot Multimodal Taggers

Chen Li; Yixiao Ge; Jiayong Mao; Dian Li; Ying Shan

arXiv:2304.03022 · cs.IR · April 7, 2023 · 1 citation

TL;DR

TagGPT leverages large language models with prompt engineering to perform zero-shot multimodal tag extraction and tagging, improving multimedia content distribution without task-specific training.

Contribution

This work introduces TagGPT, a modular zero-shot tagging system using LLMs and sentence embeddings, capable of handling various modalities and outperforming existing taggers.

Findings

01. Effective zero-shot multimodal tagging demonstrated on public datasets.

02. TagGPT outperforms existing hashtag and tagger methods.

03. Flexible modular framework adaptable to different LLMs and embeddings.

Abstract

Tags are pivotal in facilitating the effective distribution of multimedia content in various applications in the contemporary Internet era, such as search engines and recommendation systems. Recently, large language models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. In this work, we propose TagGPT, a fully automated system capable of tag extraction and multimodal tagging in a completely zero-shot fashion. Our core insight is that, through elaborate prompt engineering, LLMs are able to extract and reason about proper tags given textual clues of multimodal data, e.g., OCR, ASR, title, etc. Specifically, to automatically build a high-quality tag set that reflects user intent and interests for a specific application, TagGPT predicts large-scale candidate tags from a series of raw data via prompting LLMs, filtered with frequency and semantics. Given a new…
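The tag-set construction step in the abstract (candidate tags proposed by an LLM, then filtered by frequency and semantics) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the thresholds, and the token-overlap similarity are all assumptions. In particular, `jaccard` is only a toy stand-in for the sentence-embedding similarity the paper refers to, and the LLM prompting step is assumed to have already produced the per-item candidate lists.

```python
from collections import Counter


def jaccard(a: str, b: str) -> float:
    """Toy similarity (assumption): token-set overlap standing in for
    cosine similarity between sentence embeddings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def build_tag_set(candidates_per_item, min_count=2, sim_threshold=0.5,
                  similarity=jaccard):
    """Hypothetical helper: filter LLM-proposed candidate tags.

    candidates_per_item: one list of candidate tags per raw data item,
    assumed to come from an earlier LLM prompting step.
    """
    # Frequency filter: count each normalized tag at most once per item.
    counts = Counter(
        tag
        for tags in candidates_per_item
        for tag in {t.strip().lower() for t in tags if t.strip()}
    )
    # Sort by descending count, then alphabetically, for a stable order.
    frequent = sorted(
        (t for t, c in counts.items() if c >= min_count),
        key=lambda t: (-counts[t], t),
    )

    # Semantic filter: keep a tag only if it is not too similar
    # to a more frequent tag that was already kept.
    kept = []
    for tag in frequent:
        if all(similarity(tag, k) < sim_threshold for k in kept):
            kept.append(tag)
    return kept


candidates = [
    ["street food", "Travel"],
    ["street food", "travel", "food street"],
    ["travel vlog", "street food"],
    ["travel vlog", "cooking"],
]
tag_set = build_tag_set(candidates)
```

In practice the `similarity` argument would be replaced with a function backed by whichever sentence-embedding model is in use; the greedy keep-the-more-frequent-tag rule is one simple design choice among several.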

Code & Models

Repositories

tencentarc/taggpt (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Web Data Mining and Analysis