Contextual BERT: Conditioning the Language Model Using a Global State
Timo I. Denk, Ana Peleteiro Ramallo

TL;DR
This paper enhances BERT by adding a global state for conditioning on context, improving personalized predictions in fashion outfit completion tasks.
Contribution
Introduces two novel methods to incorporate a global context into BERT, enabling better personalization in industry applications.
Findings
Improved accuracy in fashion outfit completion
Significantly better personalization than other methods from the literature
Effective conditioning on a fixed-sized context through a global state
Abstract
BERT is a popular language model whose main pre-training task is to fill in the blank, i.e., predicting a word that was masked out of a sentence, based on the remaining words. In some applications, however, having an additional context can help the model make the right prediction, e.g., by taking the domain or the time of writing into account. This motivates us to advance the BERT architecture by adding a global state for conditioning on a fixed-sized context. We present our two novel approaches and apply them to an industry use-case, where we complete fashion outfits with missing articles, conditioned on a specific customer. An experimental comparison to other methods from the literature shows that our methods improve personalization significantly.
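The abstract does not spell out the two approaches, so the following is only a minimal sketch of one plausible way to condition attention on a fixed-sized context: project the context vector to the hidden size and prepend it as an extra position that every token can attend to. The function names (`condition_on_context`, `attend`), the single unprojected attention head, and the prepend strategy are all assumptions made for illustration, not the authors' method.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def attend(queries, keys, values):
    """Single-head scaled dot-product attention without learned projections."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        dim = len(values[0])
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs


def condition_on_context(token_states, context, projection):
    """Hypothetical conditioning step (an assumption, not the paper's method).

    The fixed-sized context vector is projected to the hidden size and
    prepended as a global-state position. Token positions act as queries;
    the global state plus all tokens act as keys and values.
    """
    global_state = matvec(projection, context)  # context_dim -> hidden_dim
    memory = [global_state] + token_states
    return attend(token_states, memory, memory)


# One zero-valued token, a 1-d context, and a 2x1 projection.
# The token attends equally to the global state [2, 4] and to itself [0, 0].
example = condition_on_context([[0.0, 0.0]], [1.0], [[2.0], [4.0]])
# example == [[1.0, 2.0]]
```

In this sketch the context (for example, a customer embedding) influences every token's representation through attention alone, which keeps the token sequence and the masked-prediction objective unchanged.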
Taxonomy
Methods: Linear Layer · Multi-Head Attention · Layer Normalization · WordPiece · Adam · Softmax · Dense Connections · Dropout · Weight Decay
