Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest
Letian Peng, Zilong Wang, Feng Yao, Jingbo Shang

TL;DR
This paper introduces Cuckoo, a versatile information extraction (IE) model that acts as a free rider on existing LLM training data, improving IE without additional data collection.
Contribution
Cuckoo is the first IE model to utilize LLM pre-training and post-training data, enabling effective few-shot adaptation and continuous evolution with LLM advancements.
Findings
Cuckoo outperforms existing pre-trained IE models in few-shot settings.
It is trained on 102.6M extractive data instances converted from LLM pre-training and post-training data.
Cuckoo adapts effectively to both traditional IE and complex instruction-following IE tasks.
Abstract
Massive high-quality data, both pre-training raw texts and post-training annotations, have been carefully prepared to incubate advanced large language models (LLMs). In contrast, for information extraction (IE), pre-training data, such as BIO-tagged sequences, are hard to scale up. We show that IE models can act as free riders on LLM resources by reframing next-token prediction into extraction for tokens already present in the context. Specifically, our proposed next tokens extraction (NTE) paradigm learns a versatile IE model, Cuckoo, with 102.6M extractive data converted from LLM's pre-training and post-training data. Under the few-shot setting, Cuckoo adapts effectively to traditional and complex instruction-following IE with better performance than existing pre-trained IE models. As a free rider, Cuckoo can naturally evolve with the ongoing advancements in LLM…
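To make the reframing in the abstract concrete, here is a minimal sketch of how a piece of raw text might be turned into one extractive training instance: if the tokens that come next already appear earlier in the context, they are tagged there with BIO labels instead of being predicted. The function name `make_nte_instance`, the whitespace tokenization, the `max_span` limit, and the choice to tag only the first, longest match are all illustrative assumptions, not the authors' actual conversion pipeline.

```python
from typing import List, Tuple


def make_nte_instance(
    tokens: List[str], split: int, max_span: int = 8
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: build one next-tokens-extraction style instance.

    tokens[:split] is the context; tokens[split:] is the continuation that a
    language model would normally be asked to predict. If a prefix of the
    continuation already occurs in the context, that occurrence is tagged
    with BIO labels. Otherwise every context token is labeled "O".
    """
    context, continuation = tokens[:split], tokens[split:]
    labels = ["O"] * len(context)

    # Try the longest candidate span first, then shorter ones (assumption).
    for n in range(min(max_span, len(continuation)), 0, -1):
        target = continuation[:n]
        for start in range(len(context) - n + 1):
            if context[start:start + n] == target:
                labels[start] = "B"
                for k in range(start + 1, start + n):
                    labels[k] = "I"
                return context, labels
    return context, labels


# Toy example with whitespace tokenization (an assumption for illustration).
tokens = "Tom lives in Paris . Where does Tom live ? Paris".split()
context, labels = make_nte_instance(tokens, split=len(tokens) - 1)
for token, label in zip(context, labels):
    print(f"{token}\t{label}")
```

In this toy case the continuation "Paris" is found in the context, so the instance becomes a tagging problem over the context rather than a generation problem, which is the sense in which ordinary LLM text can double as IE training data.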
Code & Models
- 🤗 KomeijiForce/Cuckoo-C4 (model)
- 🤗 KomeijiForce/Cuckoo-C4-Instruct (model)
- 🤗 KomeijiForce/Cuckoo-C4-Rainbow (model)
- 🤗 KomeijiForce/Cuckoo-C4-Super-Rainbow (model)
- 🤗 KomeijiForce/Cuckoo-Super-Rainbow-Text-Classification (model)
- 🤗 KomeijiForce/cuckoo-deberta-large-c4 (model)
Taxonomy
Topics: Aquatic life and conservation · Rabbits: Nutrition, Reproduction, Health · Aquaculture Nutrition and Growth
