TACOMORE: Leveraging the Potential of LLMs in Corpus-based Discourse Analysis with Prompt Engineering

Bingru Li; Han Wang

arXiv:2412.10139·cs.CL·December 16, 2024

TL;DR

TACOMORE is a prompting framework designed to enhance the performance, reproducibility, and ethicality of large language models in automated corpus-based discourse analysis, demonstrated on COVID-19 research articles.

Contribution

The paper introduces TACOMORE, a prompt engineering framework that improves LLM performance on corpus-based discourse analysis tasks while also improving the reproducibility and ethicality of the analysis.

Findings

01. Improved LLM performance on keyword, collocate, and concordance analysis

02. Enhanced reproducibility and ethicality in discourse analysis

03. Provides a structured prompting approach for qualitative research

Abstract

The capacity of LLMs to carry out automated qualitative analysis has been questioned by corpus linguists, and it has been argued that corpus-based discourse analysis incorporating LLMs is hindered by issues of unsatisfactory performance, hallucination, and irreproducibility. Our proposed method, TACOMORE, aims to address these concerns by serving as an effective prompting framework in this domain. The framework consists of four principles, i.e., Task, Context, Model and Reproducibility, and specifies five fundamental elements of a good prompt, i.e., Role Description, Task Definition, Task Procedures, Contextual Information and Output Format. We conduct experiments on three LLMs, i.e., GPT-4o, Gemini-1.5-Pro and Gemini-1.5-Flash, and find that TACOMORE helps improve LLM performance in three representative discourse analysis tasks, i.e., the analysis of keywords, collocates and…
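As a rough illustration of the five prompt elements named in the abstract, the sketch below assembles them into a single prompt string. Only the five element names come from the abstract. The `PromptSpec` class, the `build_prompt` function, the section layout, and all example wording are hypothetical and are not taken from the paper's actual prompts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptSpec:
    """Holds the five prompt elements named in the abstract.

    The field contents are illustrative assumptions, not the authors' prompts.
    """
    role_description: str
    task_definition: str
    task_procedures: List[str]
    contextual_information: str
    output_format: str


def build_prompt(spec: PromptSpec) -> str:
    """Join the five elements into one prompt with labeled sections."""
    # Number the procedure steps so the model sees an explicit order.
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(spec.task_procedures, start=1)
    )
    sections = [
        ("Role Description", spec.role_description),
        ("Task Definition", spec.task_definition),
        ("Task Procedures", steps),
        ("Contextual Information", spec.contextual_information),
        ("Output Format", spec.output_format),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


# Hypothetical example for a keyword analysis task.
example = PromptSpec(
    role_description="You are a corpus linguist experienced in discourse analysis.",
    task_definition="Group the supplied keywords into thematic categories.",
    task_procedures=[
        "Read the keyword list.",
        "Assign each keyword to one category.",
        "Name each category briefly.",
    ],
    contextual_information="The keywords come from a corpus of COVID-19 research articles.",
    output_format="Return a table with the columns: category, keywords.",
)

prompt = build_prompt(example)
print(prompt)
```

Keeping each element in its own labeled field makes it easy to hold the prompt fixed across runs and models, which is one plausible way to support the reproducibility goal the abstract describes.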
