Cross-Domain Content Generation with Domain-Specific Small Language Models

Ankit Maloo; Abhinav Garg

arXiv:2409.17171·cs.CL·October 3, 2024

Open Access

TL;DR

This paper presents a method for small language models to generate domain-specific content across multiple datasets by employing knowledge expansion with frozen layers, avoiding catastrophic forgetting and improving multi-domain generation.

Contribution

The paper introduces a knowledge expansion strategy with frozen layers that lets a small model handle multiple domains without catastrophic forgetting.
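As a rough illustration of the idea, the sketch below freezes the layers of an already-trained toy model, appends one new trainable layer, and trains only that layer on a second domain. It is not the authors' implementation: the scalar `Layer` class, the helper names (`expand_and_freeze`, `sgd_step`, `train`), the identity initialization, and the synthetic "domain B" data are all assumptions chosen to keep the example dependency-free.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """A scalar affine layer y = w * x + b (toy stand-in for a model block)."""
    w: float
    b: float
    trainable: bool = True


def forward(layers, x):
    """Return the input followed by every layer's output."""
    acts = [x]
    for layer in layers:
        acts.append(layer.w * acts[-1] + layer.b)
    return acts


def sgd_step(layers, x, target, lr):
    """One squared-error SGD step; frozen layers pass gradients but never update."""
    acts = forward(layers, x)
    error = acts[-1] - target
    grad = 2.0 * error  # d(loss)/d(output)
    for i in reversed(range(len(layers))):
        layer = layers[i]
        grad_below = grad * layer.w  # computed before this layer's weights change
        if layer.trainable:
            layer.w -= lr * grad * acts[i]
            layer.b -= lr * grad
        grad = grad_below
    return error ** 2


def expand_and_freeze(pretrained, n_new=1):
    """Freeze every existing layer and append identity-initialized trainable ones."""
    for layer in pretrained:
        layer.trainable = False
    return pretrained + [Layer(w=1.0, b=0.0) for _ in range(n_new)]


def train(layers, data, epochs, lr):
    """Run SGD over the data and return the summed loss of the last epoch."""
    total = 0.0
    for _ in range(epochs):
        total = sum(sgd_step(layers, x, y, lr) for x, y in data)
    return total


# Hypothetical "pretrained" weights standing in for a model trained on domain A.
base = [Layer(0.5, 0.1), Layer(2.0, -0.3)]
snapshot = [(layer.w, layer.b) for layer in base]

model = expand_and_freeze(base)

# Synthetic domain B: a mapping the frozen layers alone do not produce.
domain_b = [(x, 3.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]

initial_loss = train(model, domain_b, epochs=1, lr=0.0)  # lr=0 only measures loss
final_loss = train(model, domain_b, epochs=200, lr=0.05)
```

Because the original weights are never written to, whatever the frozen layers computed for the first domain is preserved by construction; only the appended layer absorbs the new domain. In a real framework the same effect is usually obtained by disabling gradient updates on the existing parameters.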

Findings

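01. Custom tokenizers tailored to each dataset improve generation quality compared with a generic tokenizer.

02. Adapting a single model to both domains with LoRA or standard fine-tuning is ineffective, often failing to produce meaningful outputs.

03. Knowledge expansion with frozen layers enables multi-domain content generation.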
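The tokenizer finding can be made concrete with a small coverage comparison. The sketch below is a deliberate simplification: it uses a word-level vocabulary rather than the subword tokenizers typically used with language models, and the two tiny corpora, the sample sentence, and the helper names (`build_vocab`, `unknown_rate`) are invented for illustration, not taken from the paper.

```python
from collections import Counter


def build_vocab(corpus, max_size):
    """Keep the max_size most frequent lowercase words in the corpus."""
    counts = Counter(word for text in corpus for word in text.lower().split())
    return {word for word, _ in counts.most_common(max_size)}


def unknown_rate(vocab, text):
    """Fraction of words in text that the vocabulary cannot represent."""
    words = text.lower().split()
    return sum(word not in vocab for word in words) / len(words)


# Hypothetical corpora: a generic one and a recipe-domain one.
generic_corpus = [
    "the cat sat on the mat",
    "once upon a time there was a dog",
]
recipe_corpus = [
    "whisk the eggs and fold in the flour",
    "simmer the sauce and add the flour",
]

generic_vocab = build_vocab(generic_corpus, max_size=50)
recipe_vocab = build_vocab(recipe_corpus, max_size=50)

sample = "whisk the flour and simmer the eggs"
generic_unknown = unknown_rate(generic_vocab, sample)
recipe_unknown = unknown_rate(recipe_vocab, sample)
```

On this toy input the domain-built vocabulary covers every word of the recipe sentence, while the generic one misses most of them. A subword tokenizer would not drop words outright, but a mismatched vocabulary tends to split domain terms into many more pieces, which is one plausible reason a tailored tokenizer helps a small model.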

Abstract

Generating domain-specific content using small language models poses challenges, especially when dealing with multiple distinct datasets with minimal overlap. In this study, we explore methods to enable a small language model to produce coherent and relevant outputs for two different domains: stories (Dataset A) and recipes (Dataset B). Our initial experiments show that training individual models on each dataset yields satisfactory results, with each model generating appropriate content within its domain. We find that utilizing custom tokenizers tailored to each dataset significantly enhances generation quality compared to using a generic tokenizer. Attempts to adapt a single model to both domains using Low-Rank Adaptation (LoRA) or standard fine-tuning do not yield substantial results, often failing to produce meaningful outputs. Moreover, full fine-tuning without freezing the model's…

Taxonomy

Topics: Topic Modeling · Wikis in Education and Collaboration · Semantic Web and Ontologies