WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences
Xiao Liu, Hanyu Lai, Hao Yu, Yifan Xu, Aohan Zeng, Zhengxiao Du, Peng Zhang, Yuxiao Dong, Jie Tang

TL;DR
WebGLM is an efficient web-enhanced question-answering system that integrates web search with a large language model. Human evaluations and ablation studies suggest that it improves on existing systems such as WebGPT in accuracy, efficiency, and cost-effectiveness.
Contribution
We introduce WebGLM, a web-enhanced QA system that combines web retrieval with a large language model. It addresses the limitations of WebGPT and, with a 10B-parameter model, outperforms the similarly sized WebGPT (13B) in human evaluation.
Findings
WebGLM outperforms WebGPT (13B) in human evaluation.
WebGLM with 10B parameters rivals WebGPT (175B) in accuracy.
Systematic evaluation criteria for web-enhanced QA systems are proposed.
Abstract
We present WebGLM, a web-enhanced question-answering system based on the General Language Model (GLM). Its goal is to augment a pre-trained large language model (LLM) with web search and retrieval capabilities while being efficient for real-world deployments. To achieve this, we develop WebGLM with strategies for the LLM-augmented retriever, bootstrapped generator, and human preference-aware scorer. Specifically, we identify and address the limitations of WebGPT (OpenAI), through which WebGLM is enabled with accuracy, efficiency, and cost-effectiveness advantages. In addition, we propose systematic criteria for evaluating web-enhanced QA systems. We conduct multi-dimensional human evaluation and quantitative ablation studies, which suggest the outperformance of the proposed WebGLM designs over existing systems. WebGLM with the 10-billion-parameter GLM (10B) is shown to perform better…
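The abstract names three components: a retriever, a generator, and a preference-aware scorer. As a rough illustration of how such a pipeline might be wired together, the sketch below uses toy stand-ins for each stage. Every name in it (`Reference`, `retrieve`, `generate`, `score`, `answer_question`) is hypothetical, and the word-overlap heuristics merely take the place of the learned models; none of this is taken from the WebGLM code.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Reference:
    """A retrieved web passage (hypothetical structure, not from the paper)."""
    url: str
    text: str


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokenization, used by the toy heuristics below."""
    return set(text.lower().split())


def retrieve(question: str, corpus: List[Reference], top_k: int = 2) -> List[Reference]:
    """Toy retriever: rank passages by word overlap with the question.
    A real system would use web search plus a learned retriever."""
    q = tokenize(question)
    ranked = sorted(corpus, key=lambda r: len(q & tokenize(r.text)), reverse=True)
    return ranked[:top_k]


def generate(question: str, refs: List[Reference]) -> List[str]:
    """Toy generator: one candidate answer per reference, with a citation mark.
    A real system would prompt an LLM with the question and references."""
    return [f"{ref.text} [{i + 1}]" for i, ref in enumerate(refs)]


def score(question: str, answer: str) -> float:
    """Toy scorer: word overlap as a stand-in for a learned preference model."""
    return float(len(tokenize(question) & tokenize(answer)))


def answer_question(question: str, corpus: List[Reference]) -> str:
    """Retrieve references, generate candidates, return the best-scored one."""
    refs = retrieve(question, corpus)
    candidates = generate(question, refs)
    return max(candidates, key=lambda a: score(question, a))


if __name__ == "__main__":
    demo_corpus = [
        Reference("https://example.org/a", "paris is the capital of france"),
        Reference("https://example.org/b", "berlin is the capital of germany"),
        Reference("https://example.org/c", "bananas are yellow"),
    ]
    print(answer_question("what is the capital of france", demo_corpus))
```

The point of the sketch is only the division of labor: retrieval narrows the evidence, generation proposes cited answers, and scoring selects among the candidates.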
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: GLM
