DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text Generation in E-commerce Title and Review Summarization
Xueying Zhang, Yunjiang Jiang, Yue Shang, Zhaomeng Cheng, Chi Zhang, Xiaochuan Fan, Yun Xiao, Bo Long

TL;DR
DSGPT is a domain-specific pre-training method for transformer-based text generation that effectively utilizes limited data and improves summarization tasks in e-commerce without requiring extensive labeled datasets.
Contribution
The paper introduces DS-GPT, a pre-training approach that uses a small domain-specific dataset and requires no product-related human-labeled data, improving e-commerce text summarization.
Findings
Significant improvement in title and review summarization accuracy.
Effective use of limited domain data for pre-training.
No need for product-related human-labeled data.
Abstract
We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation and apply it to the product title and review summarization problems on E-commerce mobile display. First, we adopt a decoder-only transformer architecture, which fits well for fine-tuning tasks by combining input and output all together. Second, we demonstrate utilizing only small amount of pre-training data in related domains is powerful. Pre-training a language model from a general corpus such as Wikipedia or the Common Crawl requires tremendous time and resource commitment, and can be wasteful if the downstream tasks are limited in variety. Our DSGPT is pre-trained on a limited dataset, the Chinese short text summarization dataset (LCSTS). Third, our model does not require product-related human-labeled data. For title summarization task, the state of art explicitly uses additional background…
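The abstract's first point, a decoder-only model fine-tuned on input and output combined into one sequence, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the paper: the helper `build_example`, the separator and end-of-sequence token ids, and the choice to mask the source tokens out of the loss. The value -100 matches the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
from typing import List, Tuple

# Label value skipped by the loss; -100 is the default ignore_index of
# PyTorch's CrossEntropyLoss. Masking the source is an assumption here.
IGNORE_INDEX = -100


def build_example(
    source_ids: List[int],
    target_ids: List[int],
    sep_id: int,
    eos_id: int,
) -> Tuple[List[int], List[int]]:
    """Pack one (source, summary) pair into a single decoder-only sequence.

    Hypothetical helper: the layout is source + [SEP] + target + [EOS].
    Labels copy the target and end-of-sequence tokens, and mask the source
    and separator positions so that only the summary contributes to the loss.
    """
    input_ids = list(source_ids) + [sep_id] + list(target_ids) + [eos_id]
    labels = [IGNORE_INDEX] * (len(source_ids) + 1) + list(target_ids) + [eos_id]
    return input_ids, labels


# Toy token ids standing in for a tokenized product title and its summary.
input_ids, labels = build_example([5, 6, 7], [8, 9], sep_id=1, eos_id=2)
print(input_ids)
print(labels)
```

Shifting the labels by one position for next-token prediction is left to the training loop or model, as is common in causal language model implementations. Whether the authors mask the source tokens or train on the full sequence is not stated in the abstract.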
