A Common Evaluation Setting for Just.Ask, Open Ephyra and Aranea QA systems

Ricardo Pires

arXiv:1205.1779·cs.IR·May 9, 2012

TL;DR

This paper proposes a unified evaluation framework for comparing QA systems like Just.Ask, Open Ephyra, and Aranea, addressing inconsistencies in testing conditions and analyzing the impact of different pipeline stages.

Contribution

It introduces a common evaluation setting for multiple QA systems, enabling fair comparison and analysis of their components and techniques.

Findings

01 Standardized evaluation setting facilitates fair comparison

02 Analysis of pipeline stage impact on QA performance

03 Insights into technique transferability between systems

Abstract

Question Answering (QA) is not a new research field in Natural Language Processing (NLP), but in recent years it has been a subject of growing study. Most QA systems now share a similar pipelined architecture, and each system uses its own set of techniques to achieve state-of-the-art results. However, several aspects of QA processing remain unclear. It is not clear to what extent tasks performed in earlier stages of the pipeline affect the stages that follow. It is not clear whether techniques used in one QA system can be used in another to improve its results. Finally, it is not clear in what setting these systems should be tested so that their results can be properly analyzed.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Bayesian Modeling and Causal Inference