Weaver: Foundation Models for Creative Writing
Tiannan Wang, Jiamin Chen, Qingrui Jia, Shuai Wang, Ruoyu Fang, Huilin Wang, Zhaowei Gao, Chunzhao Xie, Chuou Xu, Jihong Dai, Yibin Liu, Jialong Wu, Shengwei Ding, Long Li, Zhiwei Huang, Xinle Deng, Teng Yu, Gangan Ma, Han Xiao, Zixin Chen, Danjun Xiang, Yunxia Wang

TL;DR
Weaver is a family of specialized large language models optimized for creative and professional writing, outperforming larger generalist models like GPT-4 in writing tasks, with capabilities for retrieval augmentation and tool integration.
Contribution
Introduces Weaver, a new family of domain-specific LLMs for content creation, with novel training, fine-tuning, and alignment methods tailored for high-quality writing.
Findings
Weaver models outperform larger generalist LLMs on writing benchmarks.
Weaver Ultra surpasses GPT-4 in various writing scenarios.
Weaver supports retrieval-augmented generation and tool calling for enhanced writing assistance.
Abstract
This work introduces Weaver, our first family of large language models (LLMs) dedicated to content creation. Weaver is pre-trained on a carefully selected corpus that focuses on improving the writing capabilities of large language models. We then fine-tune Weaver for creative and professional writing purposes and align it to the preferences of professional writers using a suite of novel methods for instruction data synthesis and LLM alignment, making it able to produce more human-like texts and follow more diverse instructions for content creation. The Weaver family consists of Weaver Mini (1.8B), Weaver Base (6B), Weaver Pro (14B), and Weaver Ultra (34B), which suit different applications and can be dynamically dispatched by a routing agent according to query complexity to balance response quality and computation cost. Evaluation on a carefully curated benchmark for…
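The abstract says a routing agent dispatches queries across the four model sizes according to query complexity, but it does not say how complexity is measured. The sketch below only illustrates the general idea of tiered dispatch. The model names and parameter counts come from the abstract; the `complexity_score` heuristic, the `max_score` thresholds, and the `route` function are hypothetical stand-ins, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str              # model name, as listed in the abstract
    params_billion: float  # parameter count, as listed in the abstract
    max_score: float       # HYPOTHETICAL: highest complexity this tier handles


# Ordered from cheapest to most capable. Thresholds are illustrative only.
TIERS = [
    Tier("Weaver Mini", 1.8, 0.25),
    Tier("Weaver Base", 6, 0.50),
    Tier("Weaver Pro", 14, 0.75),
    Tier("Weaver Ultra", 34, 1.00),
]

# HYPOTHETICAL cue words suggesting a more demanding writing request.
CONSTRAINT_WORDS = {"must", "style", "tone", "rewrite", "outline", "chapter"}


def complexity_score(query: str) -> float:
    """Toy stand-in for a complexity estimate, returned in [0, 1].

    Combines query length with a count of constraint-like cue words.
    A real router would more likely use a trained classifier.
    """
    words = query.split()
    length_term = min(len(words) / 40, 1.0)
    hits = sum(1 for w in words if w.lower().strip(".,;:!?") in CONSTRAINT_WORDS)
    constraint_term = min(hits / 4, 1.0)
    return 0.5 * length_term + 0.5 * constraint_term


def route(query: str) -> Tier:
    """Pick the smallest tier whose threshold covers the query's score."""
    score = complexity_score(query)
    for tier in TIERS:
        if score <= tier.max_score:
            return tier
    return TIERS[-1]  # fall back to the largest model


if __name__ == "__main__":
    for q in ["Fix this typo", "Rewrite the chapter outline; the tone must stay formal"]:
        print(q, "->", route(q).name)
```

Choosing the smallest tier that covers the score is one simple way to express the trade-off the abstract describes: easy requests go to cheaper models, and only demanding ones reach the largest model.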
Taxonomy
Topics: Creativity in Education and Neuroscience · Artistic and Creative Research
Methods: Linear Layer · Byte Pair Encoding · Residual Connection · Dropout · Layer Normalization · Multi-Head Attention · Adam · Softmax · Attention Is All You Need · Dense Connections
