Development of Cognitive Intelligence in Pre-trained Language Models

Raj Sanjay Shah; Khushi Bhardwaj; Sashank Varma

arXiv:2407.01047 · cs.CL · July 15, 2024

TL;DR

This paper investigates how pre-trained language models (PLMs) develop cognitive abilities over the course of training. It reports a consistent window of training in which model performance aligns most closely with human cognitive development, a result that matters for using PLMs as models of human cognition.

Contribution

The paper introduces a developmental perspective for analyzing PLMs' cognitive abilities during training and identifies a specific window of maximal cognitive alignment that is consistent across models.

Findings

01

PLMs show a consistent developmental window of maximal cognitive alignment.

02

Training before this window builds the structure the model needs for learning; training after it reduces alignment.

03

Alignment with human cognition peaks during a specific training phase.
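
The idea of a window of maximal alignment can be illustrated with a small sketch. This is not the authors' method: the function name `peak_alignment_window`, the `tolerance` parameter, and the scores below are all hypothetical, chosen only to show how one might locate such a window given one alignment score per training checkpoint.

```python
from typing import List, Tuple


def peak_alignment_window(
    scores: List[Tuple[int, float]], tolerance: float = 0.05
) -> Tuple[int, int, int]:
    """Locate the peak checkpoint and the contiguous window around it.

    `scores` holds (training_step, alignment_score) pairs. The window is the
    contiguous run of checkpoints around the peak whose score is within
    `tolerance` of the peak score. Both the scoring scale and the tolerance
    rule are assumptions made for this illustration.
    Returns (peak_step, window_start_step, window_end_step).
    """
    if not scores:
        raise ValueError("scores must not be empty")

    ordered = sorted(scores)  # sort by training step
    peak_idx = max(range(len(ordered)), key=lambda i: ordered[i][1])
    threshold = ordered[peak_idx][1] - tolerance

    # Extend the window left and right while scores stay near the peak.
    start = peak_idx
    while start > 0 and ordered[start - 1][1] >= threshold:
        start -= 1
    end = peak_idx
    while end < len(ordered) - 1 and ordered[end + 1][1] >= threshold:
        end += 1

    return ordered[peak_idx][0], ordered[start][0], ordered[end][0]


# Toy, made-up scores (not results from the paper): alignment rises, peaks,
# then declines as training continues.
toy_scores = [
    (1000, 0.10),
    (2000, 0.35),
    (4000, 0.62),
    (8000, 0.60),
    (16000, 0.48),
    (32000, 0.41),
]

peak_step, window_start, window_end = peak_alignment_window(toy_scores)
print(peak_step, window_start, window_end)
```

With these toy numbers the peak falls at step 4000 and the window spans steps 4000 to 8000. A real analysis would replace the toy scores with task-specific alignment measures computed at each available checkpoint.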

Abstract

Recent studies show evidence for emergent cognitive abilities in Large Pre-trained Language Models (PLMs). The increasing cognitive alignment of these models has made them candidates for cognitive science theories. Prior research into the emergent cognitive abilities of PLMs has largely been path-independent to model training, i.e., has focused on the final model weights and not the intermediate steps. However, building plausible models of human cognition using PLMs would benefit from considering the developmental alignment of their performance during training to the trajectories of children's thinking. Guided by psychometric tests of human intelligence, we choose four sets of tasks to investigate the alignment of ten popular families of PLMs and evaluate their available intermediate and final training steps. These tasks are Numerical ability, Linguistic abilities, Conceptual…

Taxonomy

Topics: Topic Modeling