Improving Quantized Model Performance in Qualitative Analysis with Multi-Pass Prompt Verification

Aisvarya Adeseye; Jouni Isoaho; Adeyemi Adeseye

arXiv:2605.20193·cs.CL·May 21, 2026

TL;DR

This paper investigates how different quantization levels and types affect LLaMA-3.1's performance in qualitative analysis and proposes a multi-pass prompt verification method to improve stability and accuracy of low-bit models.

Contribution

It introduces a quantization-aware multi-pass prompt verification approach that enhances low-bit LLMs' stability and accuracy in qualitative research tasks.

Findings

01 8-bit models closely match human ground truth.

02 4-bit models become stable with the proposed method.

03 3-bit and 2-bit models improve performance after verification.

Abstract

Quantized Large Language Models (LLMs) are increasingly used in qualitative analysis because they run fast and need fewer computing resources. This study examines how lower-bit quantization levels (8-bit, 4-bit, 3-bit, and 2-bit) and quantization types affect the performance of LLaMA-3.1 (8B) on qualitative analysis. The study uses expert and non-expert responses from 82 interview transcripts. Low-bit models often produce more hallucinations and less stable results, especially when reading non-expert language with unclear terms. To improve performance, we propose a quantization-aware multi-pass prompt verification method. This method guides the model through controlled steps that reduce hallucinations. It removes unreliable content and, after verification, passes the results on to the next transcript, improving accuracy. To validate performance, human coders analyzed…
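The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a hypothetical `extract` callable (standing in for a prompt to the quantized model that proposes candidate codes) and a hypothetical `verify` callable (standing in for a verification prompt). Both are replaced here by toy functions so the control flow can be run without a model; the number of passes and the keyword-matching check are illustrative assumptions.

```python
from typing import Callable, List

# Hypothetical signatures, not taken from the paper:
#   extract(transcript, verified_so_far) -> candidate codes
#   verify(transcript, code) -> True if the code is supported by the text
Extract = Callable[[str, List[str]], List[str]]
Verify = Callable[[str, str], bool]


def multi_pass_verify(
    transcripts: List[str],
    extract: Extract,
    verify: Verify,
    passes: int = 2,  # assumed value; the abstract does not state a count
) -> List[str]:
    """Propose codes per transcript, drop those that fail verification,
    and carry the verified codes forward to the next transcript."""
    verified: List[str] = []
    for text in transcripts:
        # Pass 0: propose candidates, given what has already been verified.
        candidates = extract(text, verified)
        # Verification passes: keep only candidates that pass every check.
        for _ in range(passes):
            candidates = [c for c in candidates if verify(text, c)]
        # Only verified content is passed on to the next transcript.
        for code in candidates:
            if code not in verified:
                verified.append(code)
    return verified


# Toy stand-ins for model calls (assumptions for illustration only).
def toy_extract(text: str, context: List[str]) -> List[str]:
    # Pretends the model always proposes these codes, including
    # ones that the transcript does not support.
    return ["cost", "privacy", "latency"]


def toy_verify(text: str, code: str) -> bool:
    # Crude grounding check: the code word must appear in the transcript.
    return code in text.lower()


transcripts = ["Cost is our main worry.", "We worry about privacy and cost."]
result = multi_pass_verify(transcripts, toy_extract, toy_verify)
print(result)
```

In a real setting the two toy functions would be replaced by prompts to the quantized model, and the verification step would need a stronger check than keyword matching.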
