Quantum Statistics-Inspired Neural Attention

Aristotelis Charalampous; Sotirios Chatzis

arXiv:1809.06205 · cs.AI · October 31, 2018

TL;DR

This paper introduces a quantum-statistics-inspired extension of neural attention that models higher-order dependencies in sequence data, with the aim of improving performance on tasks such as machine translation.

Contribution

The paper broadens neural attention by modeling attention as a density matrix, capturing complex dependencies beyond point-wise selection.
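
For readers unfamiliar with the term, a density matrix is a positive semidefinite matrix with unit trace, and it generalizes a probability vector. The sketch below is a generic illustration of that object only, not the paper's construction; the helper names (`normalize`, `density_matrix`, `trace`) and the toy vectors are assumptions made for this example. It builds a density matrix as a weighted mixture of outer products of unit vectors. With orthogonal basis vectors the result is diagonal and equivalent to an ordinary weight vector, while non-orthogonal vectors produce off-diagonal terms that a plain weight vector cannot express.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def density_matrix(weights, vectors):
    """Mixture of outer products: rho = sum_k w_k * u_k u_k^T.

    Assumes the weights are non-negative and sum to one, so that
    rho is positive semidefinite with trace one.
    """
    dim = len(vectors[0])
    rho = [[0.0] * dim for _ in range(dim)]
    for w, v in zip(weights, vectors):
        u = normalize(v)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += w * u[i] * u[j]
    return rho


def trace(matrix):
    """Sum of the diagonal entries."""
    return sum(matrix[i][i] for i in range(len(matrix)))


# Orthogonal basis vectors: the matrix is diagonal and its diagonal
# is exactly the weight vector (the classical, point-wise case).
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rho_classical = density_matrix([0.5, 0.3, 0.2], basis)

# Non-orthogonal vectors: off-diagonal entries become non-zero.
rho_mixed = density_matrix([0.5, 0.5], [[1.0, 0.0], [1.0, 1.0]])
```

Both matrices have unit trace; only `rho_mixed` carries non-zero off-diagonal entries.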

Findings

01 Improved performance on benchmark machine translation datasets

02 Effective modeling of higher-order temporal dependencies

03 Favorable evaluation metrics compared to traditional attention models

Abstract

Sequence-to-sequence (encoder-decoder) models with attention constitute a cornerstone of deep learning research, as they have enabled unprecedented sequential data modeling capabilities. This effectiveness largely stems from the capacity of these models to infer salient temporal dynamics over long horizons; these are encoded into the obtained neural attention (NA) distributions. However, existing NA formulations essentially constitute point-wise selection mechanisms over the observed source sequences; that is, attention weights computation relies on the assumption that each source sequence element is independent of the rest. Unfortunately, although convenient, this assumption fails to account for higher-order dependencies which might be prevalent in real-world data. This paper addresses these limitations by leveraging Quantum-Statistical modeling arguments. Specifically, our work…
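
As a point of reference for the point-wise selection the abstract describes, here is a minimal sketch of conventional dot-product attention in pure Python. It is a generic textbook formulation, not the mechanism this paper proposes; the function names `softmax` and `pointwise_attention` and the toy vectors are illustrative assumptions. Each score is computed from the query and a single source element, and the softmax only normalizes those scores into weights.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pointwise_attention(query, sources):
    """Dot-product attention over a list of source vectors.

    The score for position i uses only `query` and `sources[i]`;
    no term couples two different source positions.
    """
    scores = [sum(q * h for q, h in zip(query, src)) for src in sources]
    weights = softmax(scores)
    dim = len(sources[0])
    # Context vector: weighted average of the source vectors.
    context = [
        sum(w * src[d] for w, src in zip(weights, sources))
        for d in range(dim)
    ]
    return weights, context


# Toy example (assumed values): one decoder query, two encoder states.
query = [1.0, 0.0]
sources = [[1.0, 0.0], [0.0, 1.0]]
weights, context = pointwise_attention(query, sources)
```

The weights form an ordinary probability vector over source positions, which is the object a density-matrix formulation would generalize.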

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning