Chain-of-Skills: A Configurable Model for Open-domain Question Answering

Kaixin Ma; Hao Cheng; Yu Zhang; Xiaodong Liu; Eric Nyberg; Jianfeng Gao

arXiv:2305.03130 · cs.CL · May 29, 2023 · 1 citation

TL;DR

This paper introduces a modular, skill-based retrieval model for open-domain question answering that enhances transferability, scalability, and performance through flexible configurations and self-supervised pretraining.

Contribution

It proposes a modular retriever whose skill modules are reused across datasets, with a modularization parameterization inspired by sparse Transformers, improving zero-shot and fine-tuned ODQA performance across multiple datasets.

Findings

01 Outperforms recent self-supervised retrievers in zero-shot settings.

02 Achieves state-of-the-art results on NQ, HotpotQA, and OTT-QA.

03 Supports flexible skill configurations for different domains.

Abstract

The retrieval model is an indispensable component for real-world knowledge-intensive tasks, e.g., open-domain question answering (ODQA). As separate retrieval skills are annotated for different datasets, recent work focuses on customized methods, limiting the model transferability and scalability. In this work, we propose a modular retriever where individual modules correspond to key skills that can be reused across datasets. Our approach supports flexible skill configurations based on the target domain to boost performance. To mitigate task interference, we design a novel modularization parameterization inspired by sparse Transformer. We demonstrate that our model can benefit from self-supervised pretraining on Wikipedia and fine-tuning using multiple ODQA datasets, both in a multi-task fashion. Our approach outperforms recent self-supervised retrievers in zero-shot evaluations and…
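To make the idea of "flexible skill configurations" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the skill names (`retrieve`, `rerank`), the toy corpus, the word-overlap scoring, and the `run_chain` helper are all hypothetical and chosen only for illustration. It shows how reusable modules could be chained in a different order or subset depending on the target dataset.

```python
from typing import Callable, Dict, List

# Hypothetical toy corpus; a real system would index Wikipedia passages.
CORPUS: List[str] = [
    "The Eiffel Tower is located in Paris.",
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
]

def _overlap(question: str, passage: str) -> int:
    """Count shared lowercase words (a stand-in for a learned relevance score)."""
    q = set(question.lower().replace("?", "").split())
    p = set(passage.lower().replace(".", "").split())
    return len(q & p)

def retrieve(question: str, candidates: List[str]) -> List[str]:
    """Toy retrieval skill: return corpus passages sharing a word with the question."""
    return [p for p in CORPUS if _overlap(question, p) > 0]

def rerank(question: str, candidates: List[str]) -> List[str]:
    """Toy reranking skill: reorder existing candidates by overlap, best first."""
    return sorted(candidates, key=lambda p: _overlap(question, p), reverse=True)

# Registry of reusable skills (names are assumptions, not taken from the paper).
SKILLS: Dict[str, Callable[[str, List[str]], List[str]]] = {
    "retrieve": retrieve,
    "rerank": rerank,
}

def run_chain(question: str, config: List[str]) -> List[str]:
    """Apply the configured skills in order, passing candidates along the chain."""
    candidates: List[str] = []
    for name in config:
        candidates = SKILLS[name](question, candidates)
    return candidates

if __name__ == "__main__":
    q = "What is the capital of France?"
    print(run_chain(q, ["retrieve"]))            # retrieval only
    print(run_chain(q, ["retrieve", "rerank"]))  # retrieval followed by reranking
```

In this sketch the configuration is just a list of skill names, so switching a target domain from retrieval-only to retrieval plus reranking changes one argument rather than the model code.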

Code & Models

Repositories

mayer123/udt-qa (PyTorch)

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Layer Normalization · Linear Layer · Label Smoothing · Dropout · Byte Pair Encoding · Multi-Head Attention · Absolute Position Encodings · Dense Connections · Adam