Filling gaps in trustworthy development of AI

Shahar Avin; Haydn Belfield; Miles Brundage; Gretchen Krueger; Jasmine Wang; Adrian Weller; Markus Anderljung; Igor Krawczuk; David Krueger; Jonathan Lebensold; Tegan Maharaj; Noa Zilberman

arXiv:2112.07773·cs.AI·December 16, 2021

TL;DR

This paper argues that AI ethics principles often leave a gap between the "what" and the "how" of trustworthy AI development, and explores mechanisms that help developers prevent harm and demonstrate their trustworthiness through verifiable behavior.

Contribution

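It explores concrete mechanisms, drawn from arXiv:2004.07213, that AI developers can use to prevent harm and to demonstrate their trustworthiness through verifiable behavior, addressing the gap that published ethics principles leave open.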

Findings

01 Proposes mechanisms for trustworthy AI development

02 Highlights the importance of verifiable trustworthiness

03 Suggests ecosystem approaches for trust assessment

Abstract

The range of application of artificial intelligence (AI) is vast, as is the potential for harm. Growing awareness of potential risks from AI systems has spurred action to address those risks, while eroding confidence in AI systems and the organizations that develop them. A 2019 study found over 80 organizations that published and adopted "AI ethics principles", and more have joined since. But the principles often leave a gap between the "what" and the "how" of trustworthy AI development. Such gaps have enabled questionable or ethically dubious behavior, which casts doubts on the trustworthiness of specific organizations, and the field more broadly. There is thus an urgent need for concrete methods that both enable AI developers to prevent harm and allow them to demonstrate their trustworthiness through verifiable behavior. Below, we explore mechanisms (drawn from arXiv:2004.07213) for…
