High-dimensional distributed semantic spaces for utterances

Jussi Karlgren; Pentti Kanerva

arXiv:2104.00424 · cs.CL · April 2, 2021

Open Access

TL;DR

This paper introduces a high-dimensional, mathematically principled model for representing diverse linguistic features of utterances and texts, bridging symbolic and continuous representations for improved language processing.

Contribution

It extends Random Indexing to create a unified, fixed-dimensional vector space for various linguistic features, enabling efficient integration of symbolic and machine learning methods.
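
To make the idea of a shared fixed-dimensional space concrete, here is a minimal sketch of plain Random Indexing applied to utterance features. It is not the authors' implementation: the dimensionality, the sparsity, the feature labels such as `word:dog` or `tense:past`, and the helper names `index_vector`, `utterance_vector` and `cosine` are all assumptions chosen for illustration. Each feature, lexical or otherwise, gets a sparse random ternary vector, and an utterance is represented as the sum of the vectors of its features.

```python
import numpy as np

DIM = 2000      # assumed dimensionality, illustrative only
NONZERO = 10    # assumed number of nonzero entries per index vector

_rng = np.random.default_rng(0)
_index_vectors = {}  # feature label -> its fixed random index vector


def index_vector(feature):
    """Return the sparse ternary index vector for a feature, creating it on first use."""
    if feature not in _index_vectors:
        v = np.zeros(DIM, dtype=np.int64)
        positions = _rng.choice(DIM, size=NONZERO, replace=False)
        v[positions[: NONZERO // 2]] = 1    # half of the chosen positions get +1
        v[positions[NONZERO // 2:]] = -1    # the other half get -1
        _index_vectors[feature] = v
    return _index_vectors[feature]


def utterance_vector(features):
    """Represent an utterance as the sum of the index vectors of its features."""
    total = np.zeros(DIM, dtype=np.int64)
    for feature in features:
        total += index_vector(feature)
    return total


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


# Hypothetical feature sets mixing lexical items with non-lexical features.
u1 = utterance_vector(["word:dog", "word:barks", "tense:present"])
u2 = utterance_vector(["word:dog", "word:barked", "tense:past"])
u3 = utterance_vector(["word:rain", "word:fell", "negation"])

print(cosine(u1, u2), cosine(u1, u3))
```

Because every feature type lives in the same `DIM`-dimensional space, adding a new kind of feature only requires a new label, not a change to the vector size. Utterances that share features (here `u1` and `u2` share `word:dog`) end up closer than utterances that share none.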

Findings

01 Successfully represents a broad range of linguistic features in a common vector space

02 Demonstrates the model's applicability to different linguistic data types

03 Provides a computationally feasible approach for linguistic feature integration

Abstract

High-dimensional distributed semantic spaces have proven useful and effective for aggregating and processing visual, auditory, and lexical information for many tasks related to human-generated data. Human language makes use of a large and varying number of features, lexical and constructional items as well as contextual and discourse-specific data of various types, which all interact to represent various aspects of communicative information. Some of these features are mostly local and useful for the organisation of e.g. argument structure of a predication; others are persistent over the course of a discourse and necessary for achieving a reasonable level of understanding of the content. This paper describes a model for high-dimensional representation for utterance and text level data including features such as constructions or contextual data, based on a mathematically principled and…

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Natural Language Processing Techniques