Iterative Definition Refinement for Zero-Shot Classification via LLM-Based Semantic Prototype Optimization

Naeem Rehmat; Muhammad Saad Saeed; Ijaz Ul Haq; Khalid Malik

arXiv:2604.27335·cs.CV·May 1, 2026

TL;DR

This paper introduces an iterative, LLM-driven method that refines category definitions for zero-shot web content classification, improving accuracy without retraining any model.

Contribution

The paper proposes a training-free, adaptive framework that uses LLMs to iteratively optimize category definitions, improving zero-shot classification performance.

Findings

01. Iterative refinement improves classification accuracy across models.

02. The approach reduces semantic overlap caused by ambiguous definitions.

03. A new benchmark dataset with 10 URL categories and 1,000 samples per class is introduced.

Abstract

Web filtering systems rely on accurate web content classification to block cyber threats, prevent data exfiltration, and ensure compliance. However, classification is increasingly difficult due to the dynamic and rapidly evolving nature of the modern web. Embedding-based zero-shot approaches map content and category descriptions into a shared semantic space, enabling label assignment without labeled training data, but remain highly sensitive to definition quality. Poorly specified or ambiguous definitions create semantic overlap in the embedding space, leading to systematic misclassification. In this paper, we propose a training-free, adaptive iterative definition refinement framework that improves zero-shot web content classification by progressively optimizing category definitions rather than updating model parameters. Using LLMs as feedback-driven definition optimizers, we…
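
The idea described in the abstract (assign the label whose definition lies closest to the content in a shared embedding space, then rewrite definitions when they cause errors) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not the authors' implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, `propose_stub` is a hypothetical placeholder for the LLM that rewrites definitions, and the small labeled feedback set and the "keep only if accuracy improves" rule are guesses, since the abstract is truncated before it describes the feedback signal.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; a stand-in for a real sentence-embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(content, definitions):
    # Zero-shot step: pick the category whose definition is most similar.
    vec = embed(content)
    return max(definitions, key=lambda label: cosine(vec, embed(definitions[label])))

def accuracy(samples, definitions):
    hits = sum(classify(text, definitions) == label for text, label in samples)
    return hits / len(samples)

def propose_stub(definitions, errors):
    # Hypothetical placeholder for an LLM call: it simply appends each
    # misclassified text to the definition of its true category.
    updated = dict(definitions)
    for text, true_label, _predicted in errors:
        updated[true_label] += " " + text
    return updated

def refine(definitions, samples, propose, rounds=3):
    # Assumed acceptance rule: keep a candidate definition set only if it
    # improves accuracy on the feedback samples. No model weights change.
    best, best_acc = dict(definitions), accuracy(samples, definitions)
    for _ in range(rounds):
        errors = [(t, y, classify(t, best)) for t, y in samples if classify(t, best) != y]
        if not errors:
            break
        candidate = propose(best, errors)
        cand_acc = accuracy(samples, candidate)
        if cand_acc > best_acc:
            best, best_acc = candidate, cand_acc
    return best, best_acc

# Contrived data: the "shopping" definition is vague enough to attract news text.
definitions = {
    "shopping": "site with products and online content",
    "news": "site with articles",
}
samples = [
    ("online content from reporters and daily headlines", "news"),
    ("buy products with checkout", "shopping"),
    ("articles from reporters", "news"),
]
before = accuracy(samples, definitions)
refined, after = refine(definitions, samples, propose_stub)
print(before, after)
```

On this contrived data the first sample is initially pulled toward "shopping" because the vague definition shares the words "online content" with it; after one refinement round the "news" definition is specific enough to win. In a realistic setup, `embed` would be replaced by an actual embedding model and `propose_stub` by a prompt to an LLM.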

Code & Models

Repositories

naeemrehmat/B2MWT-10C (GitHub)
