Iterative Hierarchical Attention for Answering Complex Questions over Long Documents
Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

TL;DR
The paper introduces DocHopper, a hierarchical attention model that navigates long documents to answer complex, multi-hop questions efficiently, achieving state-of-the-art results on multiple datasets.
Contribution
It presents a novel hierarchical attention mechanism that enables multi-step navigation through long documents for question answering.
Findings
Achieves state-of-the-art results on three of the four QA datasets evaluated.
Runs 3-10 times faster than the baselines at inference time.
Effectively handles long, complex documents for multi-hop reasoning.
Abstract
We propose a new model, DocHopper, that iteratively attends to different parts of long, hierarchically structured documents to answer complex questions. Similar to multi-hop question-answering (QA) systems, at each step, DocHopper uses a query q to attend to information from a document, and combines this "retrieved" information with q to produce the next query. However, in contrast to most previous multi-hop QA systems, DocHopper is able to "retrieve" either short passages or long sections of the document, thus emulating a multi-step process of "navigating" through a long document to answer a question. To enable this novel behavior, DocHopper does not combine document information with q by concatenating text to the text of q, but by combining a compact neural representation of q with a compact neural representation of a hierarchical part of the document, which can…
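The iterative attend-and-update loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the embeddings are toy vectors, a section is represented by the mean of its sentence embeddings, and the query update is a plain vector sum. All three choices, along with every function name below, are assumptions made for illustration only. The sketch shows just the general idea that one attention step can score both short units (sentences) and long units (whole sections), and that the retrieved representation is mixed into the query in vector space rather than by concatenating text.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_candidates(sections):
    """Flatten a two-level document into (label, embedding) candidates.

    Each section yields one candidate for the section as a whole (assumed
    here to be the mean of its sentence embeddings) plus one candidate per
    sentence, so a single attention step can pick either granularity.
    """
    candidates = []
    for s_idx, sentences in enumerate(sections):
        candidates.append((f"section {s_idx}", mean_vector(sentences)))
        for t_idx, emb in enumerate(sentences):
            candidates.append((f"section {s_idx} / sentence {t_idx}", emb))
    return candidates


def hop(query, candidates):
    """One step: attend over candidates, retrieve, and form the next query."""
    weights = softmax([dot(query, emb) for _, emb in candidates])
    # Attention-weighted sum of candidate embeddings ("retrieved" vector).
    retrieved = [
        sum(w * emb[i] for w, (_, emb) in zip(weights, candidates))
        for i in range(len(query))
    ]
    # Hypothetical mixing rule: add the retrieved vector to the query.
    next_query = [q + r for q, r in zip(query, retrieved)]
    best = max(range(len(candidates)), key=lambda i: weights[i])
    return next_query, candidates[best][0], weights


def iterate(query, sections, num_hops=2):
    """Run several hops and record the top-scoring candidate at each one."""
    candidates = build_candidates(sections)
    trace = []
    for _ in range(num_hops):
        query, label, _ = hop(query, candidates)
        trace.append(label)
    return query, trace
```

Calling `iterate(query, sections)` with `sections` given as a list of sections, each a list of sentence vectors, returns the final query vector and the label of the highest-weighted candidate at each hop. Because the query stays a fixed-size vector across hops, the cost of a hop does not grow with the length of the text retrieved so far, which is the property the abstract contrasts with text concatenation.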
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Adam · Residual Connection · Dense Connections · Linear Warmup With Linear Decay · Weight Decay
