TL;DR
This paper investigates what pre-trained BERT models know about recommendation items such as books, movies, and music, analyzing their content-based and collaborative-based knowledge and their performance on conversational recommendation tasks.
Contribution
The study provides a detailed probing analysis of BERT's stored knowledge about recommendation items and evaluates its effectiveness in conversational recommendation scenarios.
Findings
BERT contains knowledge about item content such as genres.
BERT has more content-based than collaborative-based knowledge.
BERT struggles with adversarial data in conversational recommendation.
Abstract
Heavily pre-trained transformer models such as BERT have recently been shown to be remarkably powerful at language modelling, achieving impressive results on numerous downstream tasks. It has also been shown that they are able to implicitly store factual knowledge in their parameters after pre-training. Understanding what the pre-training procedure of LMs actually learns is a crucial step for using and improving them for Conversational Recommender Systems (CRS). We first study how much off-the-shelf pre-trained BERT "knows" about recommendation items such as books, movies and music. In order to analyze the knowledge stored in BERT's parameters, we use different probes that require different types of knowledge to solve, namely content-based and collaborative-based. Content-based knowledge is knowledge that requires the model to match the titles of items with their content information, such…
Taxonomy
Methods: Linear Layer · Multi-Head Attention · Dense Connections · WordPiece · Residual Connection · Attention Is All You Need · Adam · Layer Normalization · Dropout
