Are Large Language Models the future crowd workers of Linguistics?
Iris Ferrazzo

TL;DR
This study investigates whether Large Language Models like GPT-4 can replace human participants in empirical linguistic research, showing they can outperform humans in certain tasks and highlighting the potential for broader application in humanities research.
Contribution
It demonstrates the effectiveness of LLMs in linguistic data elicitation tasks and explores advanced prompting techniques to improve alignment with human performance.
Findings
LLMs outperform humans in certain linguistic tasks
Chain-of-Thought prompting improves LLM performance
LLMs show high versatility in linguistic data collection
Abstract
Data elicitation from human participants is one of the core data collection strategies used in empirical linguistic research. The number of participants in such studies may vary considerably, ranging from a handful to crowdsourcing scale. Although both settings can provide extensive data, they come with many disadvantages, such as limited control over participants' attention during task completion, precarious working conditions in crowdsourcing environments, and time-consuming experimental designs. For these reasons, this research asks whether Large Language Models (LLMs) may overcome those obstacles if included in empirical linguistic pipelines. Two reproduction case studies are conducted to gain clarity on this matter: Cruz (2023) and Lombard et al. (2021). The two forced elicitation tasks, originally designed for human participants, are…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
