AI Safety and Reproducibility: Establishing Robust Foundations for the Neuropsychology of Human Values

Gopal P. Sarma, Nick J. Hay, and Adam Safron

arXiv:1712.04307·cs.AI·September 11, 2018

TL;DR

This paper advocates for a systematic effort to validate neuropsychological findings related to human values to strengthen the scientific foundation for AI safety and value alignment.

Contribution

It introduces a plan to identify and replicate key neuropsychological findings to support AI safety research.

Findings

01

Proposes a systematic replication initiative for neuropsychology findings.

02

Highlights the importance of validated science for AI value alignment.

03

Aims to improve the reliability of research informing AI safety.

Abstract

We propose the creation of a systematic effort to identify and replicate key findings in neuropsychology and allied fields related to understanding human values. Our aim is to ensure that research underpinning the value alignment problem of artificial intelligence has been sufficiently validated to play a role in the design of AI systems.
