Leveraging a Cognitive Model to Measure Subjective Similarity of Human and GPT-4 Written Content

Tailia Malloy; Maria José Ferreira; Fei Fang; Cleotilde Gonzalez

arXiv:2409.00269 · cs.CL · May 16, 2025

Open Access

TL;DR

This paper introduces the IBIS metric, combining cognitive models with LLM embeddings to measure subjective content similarity, accounting for individual biases, and demonstrating its effectiveness with a dataset of email safety categorizations.

Contribution

It presents a novel similarity metric integrating cognitive models with LLM embeddings to capture subjective human judgments and biases.

Findings

01. IBIS improves alignment with human categorizations

02. Cognitive integration enhances personalization of similarity measures

03. Dataset demonstrates effectiveness in educational email classification

Abstract

Cosine similarity between two documents can be computed using token embeddings formed by Large Language Models (LLMs) such as GPT-4, and used to categorize those documents across a range of uses. However, these similarities are ultimately dependent on the corpora used to train these LLMs, and may not reflect subjective similarity of individuals or how their biases and constraints impact similarity metrics. This lack of cognitively-aware personalization of similarity metrics can be particularly problematic in educational and recommendation settings where there is a limited number of individual judgements of category or preference, and biases can be particularly relevant. To address this, we rely on an integration of an Instance-Based Learning (IBL) cognitive model with LLM embeddings to develop the Instance-Based Individualized Similarity (IBIS) metric. This similarity metric is…
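The baseline the abstract starts from, cosine similarity between document embeddings used to categorize documents, can be sketched as follows. This is a minimal illustration only, not the IBIS metric and not the authors' code: the function names, the three-dimensional toy vectors, and the "safe"/"unsafe" labels are assumptions made for the example, and real LLM embeddings would have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def categorize(query, labeled_examples):
    """Return the label of the labeled embedding most similar to `query`.

    `labeled_examples` is a list of (embedding, label) pairs. This is a
    plain nearest-neighbour rule, chosen here only for illustration.
    """
    best_embedding, best_label = max(
        labeled_examples,
        key=lambda pair: cosine_similarity(query, pair[0]),
    )
    return best_label


# Hypothetical toy embeddings standing in for LLM document embeddings.
examples = [
    ([1.0, 0.0, 0.0], "safe"),
    ([0.0, 1.0, 0.0], "unsafe"),
]
new_document = [0.9, 0.1, 0.0]
print(categorize(new_document, examples))
```

Because this rule depends only on the embedding geometry, every reader gets the same category for a given document, which is the lack of personalization the abstract describes.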

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Artificial Intelligence in Healthcare and Education · Topic Modeling

Methods: Attention Is All You Need · Byte Pair Encoding · Absolute Position Encodings · Softmax · Label Smoothing · Dropout · Layer Normalization · Position-Wise Feed-Forward Layer · Linear Layer · Adam