Contextual BERT: Conditioning the Language Model Using a Global State

Timo I. Denk; Ana Peleteiro Ramallo

arXiv:2010.15778 · cs.CL · October 30, 2020

TL;DR

This paper enhances BERT by adding a global state for conditioning on context, improving personalized predictions in fashion outfit completion tasks.

Contribution

Introduces two novel methods to incorporate a global context into BERT, enabling better personalization in industry applications.

Findings

01 Improved accuracy in fashion outfit completion

02 Significant enhancement in personalization capabilities

03 Effective conditioning on fixed-sized context

Abstract

BERT is a popular language model whose main pre-training task is to fill in the blank, i.e., predicting a word that was masked out of a sentence, based on the remaining words. In some applications, however, having an additional context can help the model make the right prediction, e.g., by taking the domain or the time of writing into account. This motivates us to advance the BERT architecture by adding a global state for conditioning on a fixed-sized context. We present our two novel approaches and apply them to an industry use-case, where we complete fashion outfits with missing articles, conditioned on a specific customer. An experimental comparison to other methods from the literature shows that our methods improve personalization significantly.
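
To make the idea of "a global state for conditioning on a fixed-sized context" concrete, here is a minimal sketch of one generic way to do it: project the context vector to the model width and prepend it as an extra position, so that every item can attend to it. This is an illustration under stated assumptions and not necessarily either of the two approaches the paper proposes. All names (`prepend_global_state`, `self_attention`, `customer`, `projection`) are hypothetical, the weights are random rather than learned, and the attention layer omits the learned query/key/value projections a real BERT layer has.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix, given as a list of rows, by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prepend_global_state(token_embeddings, context, projection):
    """Project the fixed-size context to the model width and place it at position 0."""
    global_state = matvec(projection, context)
    return [global_state] + token_embeddings


def self_attention(sequence):
    """Single-head scaled dot-product attention with identity Q/K/V (assumption)."""
    dim = len(sequence[0])
    outputs = []
    for query in sequence:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in sequence
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * value[i] for w, value in zip(weights, sequence)) for i in range(dim)]
        )
    return outputs


rng = random.Random(0)
d_model, d_context, num_items = 4, 3, 5

# Hypothetical inputs: embeddings of the articles in one outfit and one customer vector.
outfit = [[rng.uniform(-1, 1) for _ in range(d_model)] for _ in range(num_items)]
customer = [rng.uniform(-1, 1) for _ in range(d_context)]
projection = [[rng.uniform(-1, 1) for _ in range(d_context)] for _ in range(d_model)]

conditioned = prepend_global_state(outfit, customer, projection)
hidden = self_attention(conditioned)  # one output vector per position, global state included
```

Because the global state occupies a position of its own, changing the customer vector changes the representation of every article in the outfit, which is the property a personalized fill-in-the-blank prediction would rely on.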

Taxonomy

Methods: Linear Layer · Multi-Head Attention · Layer Normalization · WordPiece · Adam · Softmax · Dense Connections · Dropout · Weight Decay