Graphmax for Text Generation
Liu Bin, Yin Guosheng

TL;DR
Graphmax introduces a graph-based regularization method for large language models, integrating scene-specific co-occurrence data to improve task-aligned text generation and translation quality.
Contribution
The paper proposes a novel graphmax function that combines global language model knowledge with local scene-specific co-occurrence information for enhanced text generation.
Findings
Improves performance across multiple NLP tasks.
Enhances topic consistency in generated text.
Participants can distinguish graphmax-generated text from softmax-generated text.
Abstract
In text generation, a large language model (LM) chooses each new word based only on the previously selected words in its context, using the softmax function. Nevertheless, the co-occurrence statistics of words in a scene-specific corpus are valuable in choosing the next word, and can help keep the topic of the generated text aligned with the current task. To fully exploit the co-occurrence information, we propose a graphmax function for task-specific text generation. Using graph-based regularization, graphmax enables the final word choice to be determined by both the global knowledge from the LM and the local knowledge from the scene-specific corpus. The traditional softmax function is regularized with a graph total variation (GTV) term, which incorporates the local knowledge into the LM and encourages the model to consider the statistical…
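The abstract is cut off before the exact objective is stated, so the following is only a rough sketch of the general idea, not the authors' formulation. It assumes the standard variational view of softmax (the maximizer of p·z + H(p) over the probability simplex, where H is Shannon entropy) and subtracts a weighted GTV penalty, lam * sum over i<j of W[i, j] * |p[i] - p[j]|. The names `graphmax_sketch`, `gtv`, `lam`, `W`, `steps`, and `step0` are hypothetical, the co-occurrence matrix `W` is assumed symmetric, and the solver (exponentiated subgradient ascent with a decaying step) is an illustrative choice rather than the paper's algorithm.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log of the ordinary softmax.
    shifted = z - np.max(z)
    return shifted - np.log(np.sum(np.exp(shifted)))

def gtv(p, W):
    # Graph total variation for a symmetric weight matrix W:
    # sum over unordered pairs (i, j) of W[i, j] * |p[i] - p[j]|.
    return 0.5 * np.sum(W * np.abs(p[:, None] - p[None, :]))

def graphmax_sketch(z, W, lam=0.5, steps=500, step0=0.5):
    # Assumed objective (not taken from the paper):
    #   maximize  p.z + H(p) - lam * gtv(p, W)  over the simplex.
    # With lam = 0 the maximizer is exactly softmax(z).
    z = np.asarray(z, dtype=float)
    W = np.asarray(W, dtype=float)
    log_p = log_softmax(z)  # start from the plain softmax solution
    for t in range(steps):
        p = np.exp(log_p)
        # Subgradient of the GTV term with respect to each p[i].
        sign = np.sign(p[:, None] - p[None, :])
        grad = z - log_p - 1.0 - lam * np.sum(W * sign, axis=1)
        # Multiplicative update in log space, then renormalize so that
        # the probabilities sum to one.
        log_p = log_p + (step0 / np.sqrt(t + 1.0)) * grad
        m = np.max(log_p)
        log_p = log_p - (m + np.log(np.sum(np.exp(log_p - m))))
    return np.exp(log_p)

# Toy vocabulary of three words; words 0 and 1 co-occur in the
# (hypothetical) scene-specific corpus, word 2 is unconnected.
logits = np.array([2.0, 0.0, 0.0])
cooc = np.array([[0.0, 1.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0]])

p_soft = np.exp(log_softmax(logits))
p_graph = graphmax_sketch(logits, cooc, lam=0.5)
print("softmax :", p_soft)
print("graphmax:", p_graph)
```

In this toy setup the penalty pulls the probabilities of the two linked words toward each other, which is one way local co-occurrence knowledge could reshape the LM's next-word distribution. Setting `lam=0` recovers the plain softmax.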
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Softmax
