Fine-Tuned Language Models for Domain-Specific Summarization and Tagging

Jun Wang; Fuming Lin; Yuyu Chen

arXiv:2510.25460 · cs.CL · October 30, 2025

TL;DR

This paper introduces a pipeline combining fine-tuned large language models with named entity recognition to improve domain-specific text summarization and tagging, especially in evolving sub-cultural languages and slang.

Contribution

It demonstrates that instruction fine-tuning enhances summarization and tagging accuracy across general and domain-specific datasets, with transferability of reasoning capabilities across languages.

Findings

01 Instruction fine-tuning improves accuracy significantly.

02 Domain-specific fine-tuning outperforms general models.

03 Models effectively support real-time information management.

Abstract

This paper presents a pipeline integrating fine-tuned large language models (LLMs) with named entity recognition (NER) for efficient domain-specific text summarization and tagging. The authors address the challenge posed by rapidly evolving sub-cultural languages and slang, which complicate automated information extraction and law enforcement monitoring. By leveraging the LLaMA Factory framework, the study fine-tunes LLMs on both general-purpose and custom domain-specific datasets, particularly in the political and security domains. The models are evaluated using BLEU and ROUGE metrics, demonstrating that instruction fine-tuning significantly enhances summarization and tagging accuracy, especially for specialized corpora. Notably, the LLaMA3-8B-Instruct model, despite its initial limitations in Chinese comprehension, outperforms its Chinese-trained counterpart after domain-specific…
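To make the evaluation step concrete, here is a minimal sketch of a ROUGE-N F1 score, the kind of n-gram overlap measure the abstract refers to. It is an illustration, not the authors' evaluation code: the function name `rouge_n_f1`, the whitespace tokenization, and the lowercasing are all assumptions, and the abstract does not say which ROUGE variant or tokenizer the study used. Chinese text in particular would need character-level or segmenter-based tokens rather than `str.split()`.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """ROUGE-N F1 between a reference summary and a candidate summary.

    Hypothetical helper for illustration; tokens are lowercased and split
    on whitespace, which is an assumption and not taken from the paper.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        # Count every contiguous n-token window in the text.
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    # Counter intersection keeps the minimum count per n-gram (clipped matches).
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the model tags each post with named entities"
    candidate = "the model tags posts with entities"
    print(rouge_n_f1(reference, candidate, n=1))
    print(rouge_n_f1(reference, candidate, n=2))
```

Comparing the score of a base model's output against that of a fine-tuned model's output on the same reference is the basic shape of the comparison the abstract describes. A maintained metric library would be the safer choice for reported results, since stemming and tokenization details change the numbers.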
