Neural Retriever and Go Beyond: A Thesis Proposal
Man Luo

TL;DR
This thesis proposal aims to improve neural information retrieval by developing new models, pretraining tasks, and data generation methods that address current limitations such as scarce training data and the inability to handle multi-modality queries.
Contribution
It introduces novel architectures, pretraining strategies, and data generation techniques to enhance neural retrievers and explores future research directions.
Findings
Proposed new model architectures for neural retrieval.
Developed IR-oriented pretraining tasks.
Generated large-scale training data for neural retrievers.
Abstract
Information retrieval (IR) aims to find the documents (e.g. snippets, passages, and articles) relevant to a given query at large scale. IR plays an important role in many tasks, such as open-domain question answering and dialogue systems, where external knowledge is needed. In the past, search algorithms based on term matching were widely used. Recently, neural algorithms (termed neural retrievers), which can mitigate the limitations of traditional methods, have gained more attention. Despite their success, neural retrievers still face many challenges, e.g. suffering from a small amount of training data and failing to answer simple entity-centric questions. Furthermore, most existing neural retrievers are developed for pure-text queries. This prevents them from handling multi-modality queries (i.e. the query is composed of textual description and…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
