A Computational Acquisition Model for Multimodal Word Categorization

Uri Berger; Gabriel Stanovsky; Omri Abend; Lea Frermann

arXiv:2205.05974 · cs.CL · May 13, 2022 · 1 citation

TL;DR

This paper introduces a cognitively inspired multimodal model trained on naturalistic image-caption data and shows that it learns word categories and object recognition, with trends reminiscent of those reported for child language development.

Contribution

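The study presents a cross-modal self-supervised model trained on naturalistic image-caption pairs. Unlike prior work, it does not rely on vision models trained with a pre-defined set of object categories, which makes it more faithful to the information children receive and allows evaluation on category learning tasks.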

Findings

01. Model learns word categories effectively
02. Demonstrates object recognition abilities
03. Shows developmental trends similar to children

Abstract

Recent advances in self-supervised modeling of text and images open new opportunities for computational models of child language acquisition, which is believed to rely heavily on cross-modal signals. However, prior studies have been limited by their reliance on vision models trained on large image datasets annotated with a pre-defined set of depicted object categories. This is (a) not faithful to the information children receive and (b) prohibits the evaluation of such models with respect to category learning tasks, due to the pre-imposed category structure. We address this gap, and present a cognitively-inspired, multimodal acquisition model, trained from image-caption pairs on naturalistic data using cross-modal self-supervision. We show that the model learns word categories and object recognition abilities, and presents trends reminiscent of those reported in the developmental…

Code & Models

Repositories

slab-nlp/multimodal_clustering (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques