Cross-Domain Content Generation with Domain-Specific Small Language Models
Ankit Maloo, Abhinav Garg

TL;DR
This paper presents a method for small language models to generate domain-specific content across multiple datasets by employing knowledge expansion with frozen layers, avoiding catastrophic forgetting and improving multi-domain generation.
Contribution
It introduces a knowledge expansion strategy with frozen layers enabling small models to handle multiple domains without catastrophic forgetting.
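As a rough illustration of the idea, the sketch below freezes the parameters a model already has and trains only newly added layers. It is a framework-free toy, not the authors' implementation: the `Layer` class, the scalar weights, the identity initialisation of new layers, and the squared-error objective are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Layer:
    """Toy layer that multiplies its input by one scalar weight (hypothetical)."""
    weight: float
    trainable: bool = True


def forward(layers, x):
    """Apply every layer in order: y = x * w_1 * w_2 * ... * w_n."""
    for layer in layers:
        x = layer.weight * x
    return x


def expand_and_freeze(layers, n_new):
    """Freeze all existing layers, then append n_new trainable layers.

    New layers start at weight 1.0 (an assumption of this sketch) so the
    expanded model initially computes the same function as the original.
    """
    for layer in layers:
        layer.trainable = False
    return layers + [Layer(weight=1.0) for _ in range(n_new)]


def sgd_step(layers, x, target, lr):
    """One gradient step on (y - target)^2 that skips frozen layers."""
    y = forward(layers, x)
    grad_y = 2.0 * (y - target)
    # Compute all gradients before changing any weight.
    grads = []
    for i, layer in enumerate(layers):
        others = prod(l.weight for j, l in enumerate(layers) if j != i)
        grads.append(grad_y * x * others)  # dL/dw_i
    for layer, grad in zip(layers, grads):
        if layer.trainable:
            layer.weight -= lr * grad


# Layers trained on the first domain (weights are made-up values).
base = [Layer(0.5), Layer(2.0)]
expanded = expand_and_freeze(base, n_new=1)
sgd_step(expanded, x=1.0, target=3.0, lr=0.1)
frozen_weights = [layer.weight for layer in expanded[:2]]
new_weight = expanded[2].weight
```

After the step, the two original weights are unchanged and only the appended layer has moved, which is the property that protects first-domain behaviour from being overwritten. In PyTorch the same effect is usually obtained by setting `requires_grad = False` on the existing parameters.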
Findings
Custom tokenizers improve generation quality.
Adapting a single model to both domains with LoRA or standard fine-tuning is ineffective, often failing to produce meaningful outputs.
Knowledge expansion with frozen layers enables multi-domain content generation.
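One intuition for the tokenizer finding is vocabulary coverage: a vocabulary built from in-domain text leaves fewer tokens unknown. The sketch below is a simplified, word-level illustration under that assumption. The paper does not describe its tokenizers in this excerpt, and the tiny corpora, function names, and `max_size` parameter here are invented for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(corpus, max_size):
    """Keep the max_size most frequent tokens in the corpus as the vocabulary."""
    counts = Counter(tok for doc in corpus for tok in tokenize(doc))
    return {tok for tok, _ in counts.most_common(max_size)}


def unknown_rate(text, vocab):
    """Fraction of tokens in text that are missing from vocab."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(tok not in vocab for tok in tokens) / len(tokens)


# Hypothetical miniature corpora standing in for the two datasets.
stories = ["the dragon slept under the mountain", "a girl found the lost key"]
recipes = ["whisk the eggs and sugar", "fold the flour into the butter"]

story_vocab = build_vocab(stories, max_size=50)
recipe_vocab = build_vocab(recipes, max_size=50)

sample = "whisk the flour and sugar"
rate_with_recipe_vocab = unknown_rate(sample, recipe_vocab)
rate_with_story_vocab = unknown_rate(sample, story_vocab)
```

On this toy data the recipe sentence is fully covered by the recipe vocabulary, while the story vocabulary recognises only "the". Real subword tokenizers degrade more gracefully than this word-level version, but a mismatched vocabulary still tends to split domain terms into more, less meaningful pieces.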
Abstract
Generating domain-specific content using small language models poses challenges, especially when dealing with multiple distinct datasets with minimal overlap. In this study, we explore methods to enable a small language model to produce coherent and relevant outputs for two different domains: stories (Dataset A) and recipes (Dataset B). Our initial experiments show that training individual models on each dataset yields satisfactory results, with each model generating appropriate content within its domain. We find that utilizing custom tokenizers tailored to each dataset significantly enhances generation quality compared to using a generic tokenizer. Attempts to adapt a single model to both domains using Low-Rank Adaptation (LoRA) or standard fine-tuning do not yield substantial results, often failing to produce meaningful outputs. Moreover, full fine-tuning without freezing the model's…
Taxonomy
Topics: Topic Modeling · Wikis in Education and Collaboration · Semantic Web and Ontologies
