Shifting NER into High Gear: The Auto-AdvER Approach
Filippos Ventirozos, Ioanna Nteka, Tania Nandy, Jozef Baca, Peter Appleby, Matthew Shardlow

TL;DR
This paper introduces Auto-AdvER, a specialised NER schema and dataset for car advertisements. It describes how the schema was developed, reports annotation quality, and evaluates several language models on the dataset, with the aim of improving text analytics for automotive listings.
Contribution
The paper develops a novel NER schema and dataset for car ads, and evaluates the performance of multiple language models on this domain, advancing domain-specific NER research.
Findings
LLMs outperform smaller encoder-only models on the Auto-AdvER NER task
Inter-annotator agreement reached an F1-score of 92%
LLMs are effective but costly for specialized NER tasks
Abstract
This paper presents a case study on the development of Auto-AdvER, a specialised named entity recognition schema and dataset for text in the car advertisement genre. Developed with industry needs in mind, Auto-AdvER is designed to enhance text mining analytics in this domain and contributes a linguistically unique NER dataset. We present a schema consisting of three labels: "Condition", "Historic" and "Sales Options". We outline the guiding principles for annotation, describe the methodology for schema development, and show the results of an annotation study demonstrating inter-annotator agreement of 92% F1-Score. Furthermore, we compare the performance by using encoder-only models: BERT, DeBERTaV3 and decoder-only open and closed source Large Language Models (LLMs): Llama, Qwen, GPT-4 and Gemini. Our results show that the class of LLMs outperforms the smaller encoder-only models.…
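The inter-annotator agreement figure in the abstract is an F1-score, which can be computed by treating one annotator's spans as the reference and the other's as predictions. The sketch below is a minimal illustration, not the authors' evaluation code: the `Span` type, the `pairwise_f1` function, the exact-match criterion (same start, end, and label), and the example offsets are all assumptions introduced here. Only the three label names come from the abstract.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A labelled character span (hypothetical representation)."""
    start: int
    end: int
    label: str


# The three labels named in the abstract.
LABELS = {"Condition", "Historic", "Sales Options"}


def pairwise_f1(annotator_a: set[Span], annotator_b: set[Span]) -> float:
    """Exact-match F1 between two annotators' span sets.

    Annotator A is treated as the reference and B as the prediction.
    Under exact matching the score is symmetric in A and B.
    """
    if not annotator_a and not annotator_b:
        return 1.0  # both annotators agree that there is nothing to label
    matched = len(annotator_a & annotator_b)
    if matched == 0:
        return 0.0
    precision = matched / len(annotator_b)
    recall = matched / len(annotator_a)
    return 2 * precision * recall / (precision + recall)


# Made-up offsets: the annotators agree on two spans and disagree on
# where the third span starts.
a = {Span(0, 18, "Condition"), Span(25, 45, "Historic"), Span(50, 70, "Sales Options")}
b = {Span(0, 18, "Condition"), Span(25, 45, "Historic"), Span(52, 70, "Sales Options")}
score = pairwise_f1(a, b)
```

Exact matching is the strictest choice. A relaxed criterion that credits overlapping spans would give higher agreement on the same annotations, so the criterion matters when comparing figures across studies.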
Taxonomy
Topics: Neural Networks and Applications · Fault Detection and Control Systems
Methods: Attention Is All You Need · Position-Wise Feed-Forward Layer · Softmax · Byte Pair Encoding · Linear Layer · Linear Warmup With Linear Decay · Multi-Head Attention · Weight Decay · WordPiece
