Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Sentence Embeddings

Stephen Fitz

arXiv:2309.09397·cs.CL·September 19, 2023

TL;DR

This study investigates whether GPT-3.5's sentence embeddings encode moral dimensions by analyzing their topological structure with respect to a fairness metric, revealing a separation into fair and unfair submanifolds.

Contribution

The paper introduces a novel topological visualization method to analyze moral dimensions in GPT embeddings, demonstrating that these models develop an understanding of fairness during training.

Findings

01. GPT-3.5 embeddings form two submanifolds for fair and unfair judgments

02. The topological analysis reveals an emergent moral dimension in language representations

03. The fairness metric effectively captures moral distinctions in sentence embeddings

Abstract

As Large Language Models are deployed within Artificial Intelligence systems that are increasingly integrated with human society, it becomes more important than ever to study their internal structures. Higher-level abilities of LLMs such as GPT-3.5 emerge in large part due to the informative language representations they induce from raw text data during pre-training on trillions of words. These embeddings exist in vector spaces of several thousand dimensions, and their processing involves mapping between multiple vector spaces, with a total number of parameters on the order of trillions. Furthermore, these language representations are induced by gradient optimization, resulting in a black-box system that is hard to interpret. In this paper, we take a look at the topological structure of neuronal activity in the "brain" of Chat-GPT's foundation language model, and analyze it with respect to a…

Taxonomy

Topics: Topological and Geometric Data Analysis · Clusterin in disease pathology

Methods: Attention Is All You Need · Softmax · Dense Connections · Linear Layer · Attention Dropout · Residual Connection · Adam