Pest-Thinker: Learning to Think and Reason like Entomologists via Reinforcement Learning
Xueheng Li, Yu Wang, Tao Hu, Ji Huang, Ke Cao, Qize Yang, Rui Li, Jie Zhang, Chengjun Xie

TL;DR
Pest-Thinker is a reinforcement learning framework that trains multimodal large language models to reason about pest morphology, improving crop pest identification through new datasets and structured reasoning.
Contribution
The paper introduces Pest-Thinker, which combines knowledge-driven RL, two new pest benchmarks (QFSD and AgriInsect), and Chain-of-Thought reasoning to improve pest recognition and morphological understanding.
Findings
Significant improvement in pest morphological reasoning accuracy.
Effective generalization to out-of-domain pest species.
Enhanced visual understanding through structured reasoning.
Abstract
Pest-induced crop losses pose a major threat to global food security and sustainable agricultural development. While recent advances in Multimodal Large Language Models (MLLMs) have shown strong potential for visual understanding and smart agriculture, their direct application to pest recognition remains limited due to the domain's unique challenges such as high inter-species complexity, intra-species variability, and the scarcity of expert-annotated data. In this work, we introduce Pest-Thinker, a knowledge-driven reinforcement learning (RL) framework that enables MLLMs to reason over fine-grained pest morphology. We first construct two high-definition pest benchmarks, QFSD and AgriInsect, comprising diverse species and expert-annotated morphological traits. Leveraging these datasets, we synthesize Chain-of-Thought (CoT) reasoning trajectories to facilitate structured learning of…
