SHARE: Social-Humanities AI for Research and Education

João Gonçalves; Sonia de Jager; Petr Knoth; David Pride; Nick Jelicic

arXiv:2604.11152·cs.CL·April 14, 2026

TL;DR

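This report presents the SHARE family of causal language models, pretrained for the social sciences and humanities (SSH), and the MIRROR interface, which reviews SSH text inputs without generating any text. On the custom SSH Cloze benchmark, SHARE performs close to general-purpose models such as Phi-4 that use 100 times more tokens.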

Contribution

Introduction of the first SSH-specific causal language models and a novel interface that enables ethical review without text generation.

Findings

01

SHARE models perform close to larger general-purpose models on SSH texts

02

MIRROR interface allows review of inputs while maintaining SSH norms

03

Models are pretrained specifically for social sciences and humanities texts

Abstract

This intermediate technical report introduces the SHARE family of base models and the MIRROR user interface. The SHARE models are the first causal language models fully pretrained by and for the social sciences and humanities (SSH). Their performance in modelling SSH texts is close to that of general-purpose models (Phi-4), which use 100 times more tokens, as shown by our custom SSH Cloze benchmark. The MIRROR user interface is designed for reviewing text inputs from the SSH disciplines while preserving critical engagement. By prototyping a generative AI interface that does not generate any text, we propose a way to harness the capabilities of the SHARE models without compromising the integrity of SSH principles and norms.
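
The abstract does not describe the format of the SSH Cloze benchmark, so the following is only a minimal sketch of how cloze-style scoring generally works: each candidate is filled into the blank and the model's log-probabilities are compared. It assumes a multiple-choice item format, and a toy add-one-smoothed bigram model stands in for a causal language model. All names (`train_bigram`, `answer_cloze`, `cloze_accuracy`, the `___` blank marker) are hypothetical and are not taken from the SHARE repository.

```python
import math
from collections import Counter

BLANK = "___"  # hypothetical blank marker, not the benchmark's actual format


def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace tokens (toy stand-in for a causal LM)."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a whole sentence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split()
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
        for prev, cur in zip(tokens, tokens[1:])
    )


def answer_cloze(template, candidates, unigrams, bigrams):
    """Return the candidate whose filled-in sentence the model scores highest."""
    return max(
        candidates,
        key=lambda c: log_prob(template.replace(BLANK, c), unigrams, bigrams),
    )


def cloze_accuracy(items, unigrams, bigrams):
    """Fraction of items where the top-scoring candidate equals the gold answer."""
    correct = sum(
        answer_cloze(item["template"], item["candidates"], unigrams, bigrams)
        == item["answer"]
        for item in items
    )
    return correct / len(items)


# Illustrative data only; these are not items from the Cloze-SSH dataset.
corpus = [
    "discourse shapes social reality",
    "social norms shape behaviour",
    "power shapes discourse",
]
items = [
    {"template": "discourse shapes ___ reality",
     "candidates": ["social", "banana"], "answer": "social"},
    {"template": "power shapes ___",
     "candidates": ["discourse", "behaviour"], "answer": "discourse"},
]
unigrams, bigrams = train_bigram(corpus)
accuracy = cloze_accuracy(items, unigrams, bigrams)
```

With a real causal language model, `log_prob` would be replaced by the summed token log-likelihoods the model assigns to each filled-in sentence; the comparison logic stays the same.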

Code & Models

Repositories

joaoffg/SHARE
github

Models

Datasets

Joaoffg/Cloze-SSH
dataset· 24 dl
