Cross-genre Document Retrieval: Matching between Conversational and Formal Writings
Tomasz Jurczyk, Jinho D. Choi

TL;DR
This paper addresses cross-genre document retrieval between formal queries and conversational transcripts, introducing a structure reranking method that improves retrieval accuracy by over 4%.
Contribution
It presents a novel structure reranking approach utilizing syntactic and semantic features to enhance cross-genre retrieval performance.
Findings
Over 4% improvement with structure reranking
Effective use of syntactic and semantic structures
Baseline established with state-of-the-art search engine
Abstract
This paper tackles a cross-genre document retrieval task, where the queries are in formal writing and the target documents are in conversational writing. In this task, a query is a sentence extracted from either a summary or a plot of an episode in a TV show, and the target document consists of transcripts from the corresponding episode. To establish a strong baseline, we employ the current state-of-the-art search engine to perform document retrieval on the dataset collected for this work. We then introduce a structure reranking approach to improve the initial ranking by utilizing syntactic and semantic structures generated by NLP tools. Our evaluation shows an improvement of more than 4% when the structure reranking is applied, which is very promising.
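The two-stage idea in the abstract (retrieve with a search engine, then rerank using structural evidence) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features or scoring function, so the token-overlap retriever, the hand-written (subject, predicate, object) triples standing in for NLP tool output, the weight `alpha`, and all function names below are assumptions made purely for illustration.

```python
def tokenize(text):
    # Naive whitespace tokenizer; a real system would use an NLP toolkit.
    return text.lower().split()

def lexical_score(query, doc_text):
    # Stand-in for a search engine score: fraction of query tokens found in the document.
    q_tokens = tokenize(query)
    d_tokens = set(tokenize(doc_text))
    if not q_tokens:
        return 0.0
    return sum(1 for t in q_tokens if t in d_tokens) / len(q_tokens)

def structure_score(query_triples, doc_triples):
    # Hypothetical structural feature: fraction of query triples matched exactly in the document.
    if not query_triples:
        return 0.0
    doc_set = set(doc_triples)
    return sum(1 for t in query_triples if t in doc_set) / len(query_triples)

def rerank(query, query_triples, docs, top_k=2, alpha=0.5):
    # Stage 1: keep the top_k documents by lexical score (the initial ranking).
    first_pass = sorted(
        docs, key=lambda d: lexical_score(query, docs[d][0]), reverse=True
    )[:top_k]

    # Stage 2: reorder those candidates by a weighted mix of lexical and structural scores.
    def combined(doc_id):
        text, triples = docs[doc_id]
        return (1 - alpha) * lexical_score(query, text) + alpha * structure_score(
            query_triples, triples
        )

    return sorted(first_pass, key=combined, reverse=True)

# Toy data: a formal query and conversational "transcripts" with hand-written triples.
query = "ross kisses rachel"
query_triples = [("ross", "kiss", "rachel")]
docs = {
    "ep1": ("ross kisses rachel hello", [("rachel", "kiss", "ross")]),
    "ep2": ("ross i cannot believe you kissed rachel", [("ross", "kiss", "rachel")]),
    "ep3": ("we were on a break", []),
}

initial = rerank(query, query_triples, docs, alpha=0.0)   # lexical order only
reranked = rerank(query, query_triples, docs, alpha=0.5)  # with structural evidence
print(initial, reranked)
```

In this toy setup, the transcript that merely shares more surface tokens with the query ranks first initially, while the one whose predicate-argument structure matches the query moves up once the structural score is mixed in.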
