An Empirical Study on Information Extraction using Large Language Models
Ridong Han, Chaohao Yang, Tao Peng, Prayag Tiwari, Xiang Wan, Lu Liu, Benyou Wang

TL;DR
This paper evaluates GPT-4's information extraction capabilities, compares it with state-of-the-art methods, and explores prompt-based techniques to enhance its performance, revealing current limitations and potential improvements.
Contribution
The study provides a comprehensive assessment of GPT-4's IE abilities and introduces prompt-based methods to improve LLMs' extraction performance, highlighting existing gaps.
Findings
GPT-4 lags behind SOTA IE methods in performance
Prompt-based techniques can improve GPT-4's IE ability
Remaining issues suggest further research is needed
Abstract
Human-like large language models (LLMs), especially the most powerful and popular ones in OpenAI's GPT family, have proven to be very helpful for many natural language processing (NLP) related tasks. Therefore, various attempts have been made to apply LLMs to information extraction (IE), which is a fundamental NLP task that involves extracting information from unstructured plain text. To demonstrate the latest representative progress in LLMs' information extraction ability, we assess the information extraction ability of GPT-4 (the latest version of GPT at the time of writing this paper) from four perspectives: Performance, Evaluation Criteria, Robustness, and Error Types. Our results suggest a visible performance gap between GPT-4 and state-of-the-art (SOTA) IE methods. To alleviate this problem, considering the LLMs' human-like characteristics, we propose and analyze the effects of a…
Taxonomy
Topics: Technology and Data Analysis · Diverse Approaches in Healthcare and Education Studies · Computational and Text Analysis Methods
Methods: Cosine Annealing · Absolute Position Encodings · Label Smoothing · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Linear Warmup With Cosine Annealing · Attention Dropout · Discriminative Fine-Tuning
