Embedding-based Retrieval with LLM for Effective Agriculture Information Extracting from Unstructured Data

Ruoling Peng; Kang Liu; Po Yang; Zhipeng Yuan; Shunbao Li

arXiv:2308.03107 · cs.AI · August 8, 2023 · 22 cites

Open Access

TL;DR

This paper presents an approach that combines embedding-based retrieval with large language models to automatically extract structured agricultural pest data from unstructured documents, achieving better accuracy than existing methods while maintaining efficiency.

Contribution

It introduces a methodology that pairs embedding-based retrieval with a domain-agnostic, general pre-trained LLM to extract data from agricultural texts automatically, with minimal or no human intervention.

Findings

01

Achieves consistently better accuracy than existing methods on the benchmark.

02

Maintains efficiency in processing unstructured agricultural documents.

03

Effectively extracts entities and attributes for pest identification.

Abstract

Pest identification is a crucial aspect of pest control in agriculture. However, most farmers cannot accurately identify pests in the field, and few structured data sources are available for rapid querying. In this work, we explore using a domain-agnostic, general pre-trained large language model (LLM) to extract structured data from agricultural documents with minimal or no human intervention. We propose a methodology that uses embedding-based retrieval to retrieve and filter text, followed by LLM question answering to automatically extract entities and attributes from the documents and transform them into structured data. Compared with existing methods, our approach achieves consistently better accuracy on the benchmark while maintaining efficiency.
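
The retrieve-then-ask pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the bag-of-words `embed` function stands in for a real embedding model, `ask_llm` is a hypothetical callable wrapping whatever LLM is used, and the function names, prompt wording, and `top_k` parameter are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(chunks, query, top_k=2):
    """Return the top_k chunks most similar to the query (retrieval and filtering)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:top_k]


def build_prompt(question, passages):
    """Assemble a question-answering prompt from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def extract_record(chunks, entity, attributes, ask_llm, top_k=2):
    """Ask one question per attribute and collect the answers as structured data.

    ask_llm is a hypothetical callable (prompt -> answer string) supplied by the caller.
    """
    record = {"entity": entity}
    for attr in attributes:
        question = f"What is the {attr} of the {entity}?"
        passages = retrieve(chunks, question, top_k)
        record[attr] = ask_llm(build_prompt(question, passages))
    return record
```

To try it, pass a list of text chunks from a document, an entity name such as "aphid", a list of attribute names, and any function that maps a prompt string to an answer string. Retrieving only the most relevant chunks keeps each prompt short, which is one plausible reason such a pipeline can stay efficient on long documents.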

Taxonomy

Topics: Smart Agriculture and AI · Advanced Text Analysis Techniques