MacBehaviour: An R package for behavioural experimentation on large language models

Xufeng Duan; Shixuan Li; Zhenguang G. Cai

arXiv:2405.07495·cs.CL·May 14, 2024

TL;DR

MacBehaviour is an R package for running behavioural experiments on large language models, giving researchers one streamlined toolset for systematically studying responses across many LLMs.

Contribution

The paper introduces MacBehaviour, a comprehensive R package that simplifies and standardizes the process of conducting behavioral experiments on over 60 large language models.

Findings

01. LLMs exhibit human-like gender inference from names.

02. The package successfully replicates previous findings on LLM behavior.

03. Demonstrated utility across multiple LLMs in validation experiments.

Abstract

There has been increasing interest in investigating the behaviours of large language models (LLMs) and LLM-powered chatbots by treating an LLM as a participant in a psychological experiment. We therefore developed an R package called "MacBehaviour" that aims to interact with more than 60 language models in one package (e.g., OpenAI's GPT family, the Claude family, Gemini, the Llama family, and open-source models) and to streamline the process of running LLM behaviour experiments. The package offers a comprehensive set of functions designed for LLM experiments, covering experiment design, stimuli presentation, model behaviour manipulation, and the logging of responses and token probabilities. To demonstrate the utility and effectiveness of "MacBehaviour," we conducted three validation experiments on three LLMs (GPT-3.5, Llama-2 7B, and Vicuna-1.5 13B) to replicate sound-gender association in LLMs. The…
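The stages the abstract lists (experiment design, stimuli presentation, response logging) can be pictured with a minimal sketch. It is written in Python for illustration only and is not the package's R API: `design_trials`, `run_experiment`, `echo_model`, and the placeholder stimuli are all hypothetical names, and the "model" is a stub that echoes its prompt rather than calling a real LLM.

```python
import random
from typing import Callable, Dict, List

# Hypothetical placeholder stimuli; not taken from the paper.
STIMULI: List[Dict[str, str]] = [
    {"item": "A", "prompt": "Placeholder prompt for item A"},
    {"item": "B", "prompt": "Placeholder prompt for item B"},
    {"item": "C", "prompt": "Placeholder prompt for item C"},
]


def design_trials(stimuli: List[Dict[str, str]], n_sessions: int,
                  seed: int = 0) -> List[Dict[str, object]]:
    """Experiment design: every session sees each stimulus once, in shuffled order."""
    rng = random.Random(seed)  # seeded so the design is reproducible
    trials: List[Dict[str, object]] = []
    for session in range(1, n_sessions + 1):
        order = list(stimuli)
        rng.shuffle(order)
        for position, stim in enumerate(order, start=1):
            trials.append({"session": session, "trial": position, **stim})
    return trials


def run_experiment(trials: List[Dict[str, object]],
                   ask_model: Callable[[str], str]) -> List[Dict[str, object]]:
    """Stimuli presentation and logging: send each prompt, keep the reply with its trial."""
    log: List[Dict[str, object]] = []
    for trial in trials:
        response = ask_model(str(trial["prompt"]))
        log.append({**trial, "response": response})
    return log


def echo_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): it only echoes the prompt."""
    return "echo: " + prompt


if __name__ == "__main__":
    for row in run_experiment(design_trials(STIMULI, n_sessions=2), echo_model):
        print(row)
```

Passing the model in as a plain function keeps the design and logging steps independent of any particular provider, which is one simple way to run the same trial list against several models.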

Code & Models

Repositories

xufengduan/macbehaviour (official)

Taxonomy

Topics: Advanced Text Analysis Techniques · Language and cultural evolution · Opinion Dynamics and Social Influence

Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Discriminative Fine-Tuning · Multi-Head Attention · Layer Normalization · Dense Connections · Attention Dropout · Weight Decay