LLM-based Affective Text Generation Quality Based on Different Quantization Values
Yarik Menchaca Resendiz, Roman Klinger

TL;DR
This study investigates how different quantization levels affect the quality, memory usage, and inference time of large language models in affective text generation, highlighting trade-offs between efficiency and accuracy.
Contribution
It provides a comprehensive analysis of quantization impacts on affective text generation across multiple models, revealing memory savings and accuracy trade-offs.
Findings
76% memory reduction with quantization
Up to 10 percentage points decrease in F1 score for larger models
Inference time roughly doubles with lower precision
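The memory finding can be put in perspective with back-of-the-envelope arithmetic on weight storage. The sketch below is illustrative only: the 7-billion-parameter count is a hypothetical example, not a model from the paper, and the estimate covers weights alone (no activations, KV cache, or quantization metadata), so it will not exactly match measured GPU RAM figures such as the 76% reported above.

```python
def weight_memory_gib(num_params: int, bits: int) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * bits / 8 / 2**30


def relative_saving(bits_from: int, bits_to: int) -> float:
    """Fraction of weight memory saved when moving between bit widths."""
    return 1 - bits_to / bits_from


if __name__ == "__main__":
    params = 7_000_000_000  # hypothetical model size, for illustration only
    for bits in (32, 16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_gib(params, bits):6.2f} GiB")
    print(f"16-bit -> 4-bit saving: {relative_saving(16, 4):.0%}")
```

Under these assumptions, going from 16-bit to 4-bit weights saves three quarters of the weight memory, which is in the same range as the reported reduction.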
Abstract
Large language models exhibit a remarkable capacity in language generation and comprehension. These advances enable AI systems to produce more human-like and emotionally engaging text. However, these models rely on a large number of parameters, requiring significant computational resources for training and inference. In some scenarios, accessing these resources can be challenging (e.g., budget or hardware limitations). Techniques like reducing precision bits can make models more memory-efficient, reducing the computational resources needed, at the cost of reduced accuracy. This paper addresses the trade-off between different quantization values, GPU RAM utilization, and text quality in affective text generation (e.g., "I really enjoy running in the snow-covered forest"). To evaluate, we use an emotion classifier and ten seed prompts to generate affective text. We test three setups of…
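Classifier-based evaluation of this kind can be sketched as follows: each generated text has a target emotion (the one requested in the prompt) and a predicted emotion (the label an emotion classifier assigns to the output), and the two are compared with an F1 score. This is a minimal sketch, not the authors' pipeline: the macro averaging, the emotion labels, and the example data are all assumptions made for illustration.

```python
from collections import Counter


def macro_f1(targets: list[str], predictions: list[str]) -> float:
    """Macro-averaged F1 between target emotions and classifier predictions."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    labels = sorted(set(targets) | set(predictions))
    if not labels:
        return 0.0

    tp, fp, fn = Counter(), Counter(), Counter()
    for target, predicted in zip(targets, predictions):
        if target == predicted:
            tp[target] += 1
        else:
            fp[predicted] += 1  # classifier saw an emotion that was not requested
            fn[target] += 1     # requested emotion was not recognized

    per_label = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_label.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_label) / len(per_label)


# Hypothetical example: four generated texts, one misclassified.
requested = ["joy", "joy", "anger", "fear"]
classified = ["joy", "anger", "anger", "fear"]
score = macro_f1(requested, classified)
```

Running the same comparison once per quantization level would give one score per setting, which is the kind of figure a percentage-point difference in F1 refers to.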
Taxonomy
Topics: Advanced Text Analysis Techniques · Text and Document Classification Technologies · Sentiment Analysis and Opinion Mining
