MacBehaviour: An R package for behavioural experimentation on large language models
Xufeng Duan, Shixuan Li, Zhenguang G. Cai

TL;DR
MacBehaviour is an R package for running behavioural experiments on large language models. It gives researchers a single, streamlined interface for systematically studying LLM responses across multiple models.
Contribution
The paper introduces MacBehaviour, a comprehensive R package that simplifies and standardizes the process of conducting behavioral experiments on over 60 large language models.
Findings
LLMs exhibit human-like gender inference from names.
The package successfully replicates previous findings on LLM behavior.
Validation experiments on three LLMs (GPT-3.5, Llama-2 7B, and Vicuna-1.5 13B) demonstrate the package's utility.
Abstract
There has been increasing interest in investigating the behaviours of large language models (LLMs) and LLM-powered chatbots by treating an LLM as a participant in a psychological experiment. We therefore developed an R package called "MacBehaviour" that aims to interact with more than 60 language models in a single package (e.g., OpenAI's GPT family, the Claude family, Gemini, the Llama family, and open-source models) and to streamline the process of running LLM behaviour experiments. The package offers a comprehensive set of functions designed for LLM experiments, covering experiment design, stimuli presentation, model behaviour manipulation, and logging of responses and token probabilities. To demonstrate the utility and effectiveness of "MacBehaviour," we conducted three validation experiments on three LLMs (GPT-3.5, Llama-2 7B, and Vicuna-1.5 13B) to replicate the sound-gender association in LLMs. The…
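The "LLM as participant" workflow the abstract describes (design, stimulus presentation, response logging) can be illustrated with a minimal sketch. This is not the MacBehaviour API: the package is written in R, and everything below (the Python language choice, the `STIMULI` items, `mock_model`, and `run_experiment`) is hypothetical and exists only to show the shape of such an experiment. The stub model applies a toy rule in place of a real LLM call.

```python
import random

# Hypothetical stimuli (not taken from the paper): each item appears in two conditions.
STIMULI = [
    {"item": 1, "condition": "vowel_ending", "prompt": "Is Pelcra more likely a female or male name?"},
    {"item": 1, "condition": "consonant_ending", "prompt": "Is Pelcrad more likely a female or male name?"},
    {"item": 2, "condition": "vowel_ending", "prompt": "Is Mibo more likely a female or male name?"},
    {"item": 2, "condition": "consonant_ending", "prompt": "Is Mibon more likely a female or male name?"},
]


def mock_model(prompt):
    """Stand-in for a real LLM call: a deterministic toy rule on the name's last letter."""
    name = prompt.split()[1]
    return "female" if name[-1].lower() in "aeiou" else "male"


def run_experiment(stimuli, model, n_sessions, seed=0):
    """Present every stimulus once per session, in a fresh random order, and log responses."""
    rng = random.Random(seed)  # seeded so the trial order is reproducible
    log = []
    for session in range(1, n_sessions + 1):
        trials = list(stimuli)
        rng.shuffle(trials)  # randomise presentation order within the session
        for trial_index, trial in enumerate(trials, start=1):
            log.append({
                "session": session,
                "trial": trial_index,
                "item": trial["item"],
                "condition": trial["condition"],
                "response": model(trial["prompt"]),
            })
    return log


log = run_experiment(STIMULI, mock_model, n_sessions=3)
```

Each session here plays the role of one simulated participant. Replacing `mock_model` with a function that queries an actual model would leave the design and logging logic unchanged.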
Taxonomy
Topics: Advanced Text Analysis Techniques · Language and cultural evolution · Opinion Dynamics and Social Influence
Methods: Attention Is All You Need · Sparse Evolutionary Training · Linear Layer · Discriminative Fine-Tuning · Multi-Head Attention · Layer Normalization · Dense Connections · Attention Dropout · Weight Decay
