Unbiasing Truncated Backpropagation Through Time

Corentin Tallec; Yann Ollivier

arXiv:1705.08209 · cs.NE · May 24, 2017 · 53 cites

TL;DR

This paper introduces ARTBP, an unbiased variant of truncated BPTT that keeps the computational efficiency of truncation. It converges reliably on synthetic tasks and slightly outperforms truncated BPTT on Penn Treebank language modeling.

Contribution

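The paper proposes Anticipated Reweighted Truncated Backpropagation (ARTBP), an algorithm that removes the bias of truncated BPTT by using variable truncation lengths together with compensation factors in the backpropagation equation, so that the resulting gradient estimates are unbiased.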
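The idea of pairing random truncation with a compensation factor can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a scalar linear recurrence s_t = a * s_{t-1} + x_t with loss equal to the sum of the states, and it assumes each backward edge is cut independently with a constant probability c, with surviving edges rescaled by 1 / (1 - c). The specific truncation probabilities and factors used in the paper are not reproduced here, and all function names (`forward`, `backward`, `reweighted_sample`, and so on) are hypothetical.

```python
import random


def forward(a, xs, s0=0.0):
    """Run s_t = a * s_{t-1} + x_t and return the states [s_0, ..., s_T]."""
    states = [s0]
    for x in xs:
        states.append(a * states[-1] + x)
    return states


def backward(a, states, cut, scale):
    """Gradient of loss = sum_{t>=1} s_t with respect to a.

    cut[t] and scale[t] describe the backward edge from step t+1 to step t:
    if cut[t] is True the incoming gradient is dropped, otherwise it is
    multiplied by scale[t].
    """
    grad_a = 0.0
    future = 0.0  # (possibly reweighted) d(loss)/d(s_{t+1}); zero past the end
    for t in range(len(states) - 1, 0, -1):
        from_future = 0.0 if cut[t] else scale[t] * a * future
        delta = 1.0 + from_future        # the loss contributes 1 at every step
        grad_a += delta * states[t - 1]  # direct derivative d(s_t)/d(a) = s_{t-1}
        future = delta
    return grad_a


def full_bptt(a, states):
    """Exact gradient: no edge is cut, no reweighting."""
    n = len(states)
    return backward(a, states, [False] * n, [1.0] * n)


def truncated_bptt(a, states, k):
    """Fixed truncation every k steps, without compensation (biased)."""
    n = len(states)
    cut = [t % k == 0 for t in range(n)]
    return backward(a, states, cut, [1.0] * n)


def reweighted_sample(a, states, c, rng):
    """One sample with random cuts (probability c) and 1 / (1 - c) compensation.

    Assumption for this sketch: c is constant and cuts are independent.
    """
    n = len(states)
    cut = [rng.random() < c for _ in range(n)]
    return backward(a, states, cut, [1.0 / (1.0 - c)] * n)


a, c, k = 0.5, 0.2, 5  # mean random truncation length 1 / c matches k
xs = [1.0] * 20
states = forward(a, xs)
rng = random.Random(0)
n_samples = 20000

exact = full_bptt(a, states)
truncated = truncated_bptt(a, states, k)
averaged = sum(reweighted_sample(a, states, c, rng) for _ in range(n_samples)) / n_samples

print("full BPTT gradient:        ", exact)
print("fixed truncation (biased): ", truncated)
print("reweighted random average: ", averaged)
```

In this toy setting every dropped term is positive, so fixed truncation systematically underestimates the gradient, while the average of the reweighted samples approaches the full BPTT value. Each individual reweighted sample is noisier than the fixed-truncation estimate. This sketch does not address how to choose the truncation probabilities so that this variance stays under control.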

Findings

01. ARTBP converges reliably on synthetic tasks with complex dependencies.

02. ARTBP slightly outperforms truncated BPTT on Penn Treebank language modeling.

03. ARTBP maintains computational benefits of truncated BPTT while providing unbiased gradients.

Abstract

Truncated Backpropagation Through Time (truncated BPTT) is a widespread method for learning recurrent computational graphs. Truncated BPTT keeps the computational benefits of Backpropagation Through Time (BPTT) while relieving the need for a complete backtrack through the whole data sequence at every step. However, truncation favors short-term dependencies: the gradient estimate of truncated BPTT is biased, so that it does not benefit from the convergence guarantees from stochastic gradient theory. We introduce Anticipated Reweighted Truncated Backpropagation (ARTBP), an algorithm that keeps the computational benefits of truncated BPTT, while providing unbiasedness. ARTBP works by using variable truncation lengths together with carefully chosen compensation factors in the backpropagation equation. We check the viability of ARTBP on two tasks. First, a simple synthetic task where careful…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks