Should attention be all we need? The epistemic and ethical implications of unification in machine learning

Nic Fishman; Leif Hancox-Li

arXiv:2205.08377·cs.LG·May 18, 2022

TL;DR

This paper critically examines the widespread adoption of attention-based models such as transformers in machine learning, and weighs the epistemic and ethical implications of unifying diverse problem domains around a single architecture.

Contribution

The paper analyzes the epistemic and ethical risks that come with the dominance of attention mechanisms in machine learning, and challenges the assumption that they apply universally.

Findings

01. Unification may limit methodological diversity and increase black-boxing.

02. Transformers' success does not necessarily transfer across all domains.

03. Centralization of power raises ethical concerns about marginalization.

Abstract

"Attention is all you need" has become a fundamental precept in machine learning research. Originally designed for machine translation, transformers and the attention mechanisms that underpin them now find success across many problem domains. With the apparent domain-agnostic success of transformers, many researchers are excited that similar model architectures can be successfully deployed across diverse applications in vision, language and beyond. We consider the benefits and risks of these waves of unification on both epistemic and ethical fronts. On the epistemic side, we argue that many of the arguments in favor of unification in the natural sciences fail to transfer over to the machine learning case, or transfer over only under assumptions that might not hold. Unification also introduces epistemic risks related to portability, path dependency, methodological diversity, and…
