AI Safety and Reproducibility: Establishing Robust Foundations for the Neuropsychology of Human Values
Gopal P. Sarma, Nick J. Hay, and Adam Safron

TL;DR
This paper advocates for a systematic effort to validate neuropsychological findings related to human values to strengthen the scientific foundation for AI safety and value alignment.
Contribution
It introduces a plan to identify and replicate key neuropsychological findings to support AI safety research.
Findings
Proposes a systematic effort to identify and replicate key findings in neuropsychology and allied fields related to human values.
Highlights the importance of validated science for AI value alignment.
Aims to improve the reliability of research informing AI safety.
Abstract
We propose the creation of a systematic effort to identify and replicate key findings in neuropsychology and allied fields related to understanding human values. Our aim is to ensure that research underpinning the value alignment problem of artificial intelligence has been sufficiently validated to play a role in the design of AI systems.
