TL;DR
This paper introduces ToMMeR, a lightweight model (<300K parameters) that detects entity mentions in text from early LLM layers. It reaches 93% zero-shot recall across 13 NER benchmarks and, when extended with span classification heads, competitive NER performance (80-87% F1).
Contribution
ToMMeR demonstrates that structured entity representations are present in early transformer layers and can be recovered with a small, efficient model.
Findings
ToMMeR achieves 93% recall zero-shot on 13 NER benchmarks.
Diverse architectures converge on similar mention boundaries (DICE >75%).
Extended with span classification heads, ToMMeR reaches 80-87% F1 on standard NER benchmarks.
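The recall and DICE figures above are both overlap measures on sets of predicted spans. The sketch below shows one common way to compute them, assuming exact-match spans given as (start, end) token offsets. The span representation, the function names, and the toy data are illustrative assumptions, not the paper's evaluation code, and the paper may match or aggregate spans differently.

```python
from typing import Iterable, Set, Tuple

# Assumption: a mention is a (start, end) token-offset pair, end exclusive.
Span = Tuple[int, int]


def dice(spans_a: Iterable[Span], spans_b: Iterable[Span]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) between two span sets."""
    a: Set[Span] = set(spans_a)
    b: Set[Span] = set(spans_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty predictions agree fully
    return 2 * len(a & b) / (len(a) + len(b))


def recall(predicted: Iterable[Span], gold: Iterable[Span]) -> float:
    """Fraction of gold mentions that appear exactly among the predictions."""
    p: Set[Span] = set(predicted)
    g: Set[Span] = set(gold)
    if not g:
        return 1.0  # convention chosen here: nothing to recover
    return len(p & g) / len(g)


# Toy data (hypothetical): mentions predicted by two probes on one sentence.
model_a = {(0, 2), (5, 6), (9, 12)}
model_b = {(0, 2), (5, 6), (8, 12)}
gold = {(0, 2), (9, 12)}

agreement = dice(model_a, model_b)  # two shared spans out of 3 + 3
coverage = recall(model_a, gold)    # both gold spans are predicted
```

Exact-match overlap is the strictest choice: in the toy data, spans (9, 12) and (8, 12) differ by one token and count as a disagreement. A boundary-tolerant or token-level variant would give higher agreement on the same predictions.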
Abstract
Identifying which text spans refer to entities - mention detection - is both foundational for information extraction and a known performance bottleneck. We introduce ToMMeR, a lightweight model (<300K parameters) probing mention detection capabilities from early LLM layers. Across 13 NER benchmarks, ToMMeR achieves 93% recall zero-shot, with an estimated 90% precision under a human-calibrated LLM-judge protocol, showing that ToMMeR rarely produces spurious predictions despite high recall. Cross-model analysis reveals that diverse architectures (14M-15B parameters) converge on similar mention boundaries (DICE >75%), confirming that mention detection emerges naturally from language modeling. When extended with span classification heads, ToMMeR achieves competitive NER performance (80-87% F1 on standard benchmarks). Our work provides evidence that structured entity representations exist in…
Code & Models
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L6_R64model· ♡ 1♡ 1
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L0_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L1_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L2_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L3_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L4_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L5_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L7_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L8_R64model
- 🤗llm2ner/ToMMeR-Llama-3.2-1B_L9_R64model
