Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory

Ankur Sikarwar; Mengmi Zhang

arXiv:2307.10768 · q-bio.NC · November 2, 2023 · 1 citation

TL;DR

This paper introduces WorM, a comprehensive benchmark dataset for working memory (WM), and uses it to compare AI models with humans across multiple tasks. The models reproduce some characteristics of human WM but not others, which highlights areas for improvement.

Contribution

The paper presents WorM, a large-scale, multifaceted benchmark dataset for working memory, and evaluates AI models against human benchmarks across diverse WM functionalities.

Findings

01. AI models replicate primacy and recency effects

02. Models show neural specialization for WM domains

03. AI models remain limited in their ability to fully emulate human WM

Abstract

Working memory (WM), a fundamental cognitive process facilitating the temporary storage, integration, manipulation, and retrieval of information, plays a vital role in reasoning and decision-making tasks. Robust benchmark datasets that capture the multifaceted nature of WM are crucial for the effective development and evaluation of AI WM models. Here, we introduce a comprehensive Working Memory (WorM) benchmark dataset for this purpose. WorM comprises 10 tasks and a total of 1 million trials, assessing 4 functionalities, 3 domains, and 11 behavioral and neural characteristics of WM. We jointly trained and tested state-of-the-art recurrent neural networks and transformers on all these tasks. We also include human behavioral benchmarks as an upper bound for comparison. Our results suggest that AI models replicate some characteristics of WM in the brain, most notably primacy and recency…
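The primacy and recency effects mentioned in the abstract are usually read off a serial position curve: recall accuracy as a function of an item's position in the presented list. The following is a minimal sketch of that analysis, not code from the WorM repository. The function names, the `(position, correct)` trial format, and the toy data are all assumptions made for illustration, and the "both ends beat the middle" check is a simple heuristic rather than the paper's criterion.

```python
from collections import defaultdict


def serial_position_curve(trials):
    """Return recall accuracy per list position.

    `trials` is an iterable of (position, correct) pairs, where `position`
    is the 0-indexed place of the probed item in the presented list and
    `correct` is a bool. This trial format is a hypothetical one chosen
    for the sketch.
    """
    hits = defaultdict(int)
    counts = defaultdict(int)
    for position, correct in trials:
        counts[position] += 1
        hits[position] += int(correct)
    # Accuracy for each position that appears, in ascending position order.
    return [hits[p] / counts[p] for p in sorted(counts)]


def has_primacy_and_recency(curve):
    """Heuristic: first and last positions both beat the mean of the middle."""
    if len(curve) < 3:
        return False
    middle = curve[1:-1]
    middle_mean = sum(middle) / len(middle)
    return curve[0] > middle_mean and curve[-1] > middle_mean


# Toy data, invented for illustration only (not results from the paper).
toy_trials = [
    (0, True), (0, True),
    (1, True), (1, False),
    (2, False), (2, False),
    (3, True), (3, False),
    (4, True), (4, True),
]
curve = serial_position_curve(toy_trials)
u_shaped = has_primacy_and_recency(curve)
```

The same two functions could be applied to model outputs and to human responses, so that the two curves can be compared position by position.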

Code & Models

Repositories

zhanglab-deepneurocoglab/worm (PyTorch, official)

Videos

Decoding the Enigma: Benchmarking Humans and AIs on the Many Facets of Working Memory · SlidesLive

Taxonomy

Topics: Cognitive Functions and Memory · Ferroelectric and Negative Capacitance Devices · Neural and Behavioral Psychology Studies