Loading paper
NovelHopQA: Diagnosing Multi-Hop Reasoning Failures in Long Narrative Contexts | Tomesphere