RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms

Enyu Zhou; Rui Zheng; Zhiheng Xi; Songyang Gao; Xiaoran Fan; Zichu Fei; Jingting Ye; Tao Gui; Qi Zhang; Xuanjing Huang

arXiv:2310.11227 · cs.CL · October 18, 2023 · 1 citation

Open Access

TL;DR

RealBehavior is a framework designed to accurately characterize human-like behaviors in foundation models by assessing the faithfulness of behavioral measurements through reproducibility, consistency, and generalizability.

Contribution

The paper introduces a novel framework, RealBehavior, for faithful characterization of model behaviors, emphasizing the importance of verifying measurement faithfulness beyond traditional psychological tools.

Findings

01. Simple psychological tools may not faithfully characterize all behaviors.

02. Assessing reproducibility, consistency, and generalizability improves behavioral analysis.

03. Diversifying alignment objectives can prevent restricted model characteristics.

Abstract

Reports of human-like behaviors in foundation models are growing, with psychological theories providing enduring tools to investigate these behaviors. However, current research tends to directly apply these human-oriented tools without verifying the faithfulness of their outcomes. In this paper, we introduce a framework, RealBehavior, which is designed to characterize the humanoid behaviors of models faithfully. Beyond simply measuring behaviors, our framework assesses the faithfulness of results based on reproducibility, internal and external consistency, and generalizability. Our findings suggest that a simple application of psychological tools cannot faithfully characterize all human-like behaviors. Moreover, we discuss the impacts of aligning models with human and social values, arguing for the necessity of diversifying alignment objectives to prevent the creation of models with…
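
The abstract names reproducibility and internal consistency as faithfulness criteria but, as summarized here, does not say which statistics are used to assess them. As a rough sketch only, assuming two standard psychometric choices (test-retest Pearson correlation for reproducibility and Cronbach's alpha for internal consistency), the checks might look like the following. The function names and the input layout are hypothetical and are not taken from the paper.

```python
from statistics import mean, variance


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists.

    Assumed use: xs and ys are item scores from two separate runs of the
    same questionnaire on the same model (a test-retest check).
    """
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    Assumed layout: `scores` is a list of rows, one row per run (or
    respondent), each row holding the scores for the k items of a single
    scale. Needs at least two rows and two items.
    """
    k = len(scores[0])
    # Variance of each item across rows.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the per-row total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Under these assumptions, a test-retest correlation near 1 would suggest the measured behavior is reproducible across runs, and an alpha near 1 would suggest that the items of a scale agree with one another. Any acceptance threshold is a further choice that this summary does not specify.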

Taxonomy

Topics: Digital Mental Health Interventions · Computational and Text Analysis Methods · Ethics and Social Impacts of AI