ResearchArcade: Graph Interface for Academic Tasks
Jingjun Xu, Chongshan Lin, Haofei Yu, Tao Feng, Jiaxuan You

TL;DR
ResearchArcade introduces a graph-based interface that unifies diverse academic data sources and tasks, supporting multiple models and modalities to enhance research workflows and accelerate knowledge discovery.
Contribution
It presents a novel unified graph interface for academic data, enabling multi-task, multi-modal research support with improved performance over baselines.
Findings
Graph structures improve task performance.
Multi-source, multi-modal data enhances research capabilities.
Unified interface supports diverse academic tasks.
Abstract
Academic research generates diverse data sources, and as researchers increasingly use machine learning to assist research tasks, a crucial question arises: Can we build a unified data interface to support the development of machine learning models for various academic tasks? Models trained on such a unified interface can better support human researchers throughout the research process, eventually accelerating knowledge discovery. In this work, we introduce ResearchArcade, a graph-based interface that connects multiple academic data sources, unifies task definitions, and supports a wide range of base models to address key academic challenges. ResearchArcade utilizes a coherent multi-table format with graph structures to organize data from different sources, including academic corpora from ArXiv and peer reviews from OpenReview, while capturing information with multiple modalities, such…
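To make the abstract's "multi-table format with graph structures" more concrete, here is a minimal sketch of how such an interface could look. It also illustrates the two-step task scheme (target entity + neighborhood) that one reviewer mentions. All names here (AcademicGraph, build_task, the table names, and the relation labels) are hypothetical and chosen for illustration; they are not ResearchArcade's actual schema or API.

```python
from dataclasses import dataclass, field

# Hypothetical schema: table names, relation labels, and fields are
# illustrative assumptions, not taken from ResearchArcade itself.
@dataclass
class AcademicGraph:
    # One "table" per entity type: table name -> {node id: attribute dict}
    nodes: dict = field(default_factory=dict)
    # Edges stored as (src_table, src_id, relation, dst_table, dst_id)
    edges: list = field(default_factory=list)

    def add_node(self, table, node_id, **attrs):
        self.nodes.setdefault(table, {})[node_id] = attrs

    def add_edge(self, src, relation, dst):
        self.edges.append((src[0], src[1], relation, dst[0], dst[1]))

    def neighborhood(self, target, hops=1):
        """Nodes within `hops` undirected hops of `target`, excluding it."""
        seen = {target}
        frontier = {target}
        for _ in range(hops):
            reached = set()
            for st, si, _rel, dt, di in self.edges:
                a, b = (st, si), (dt, di)
                if a in frontier and b not in seen:
                    reached.add(b)
                if b in frontier and a not in seen:
                    reached.add(a)
            seen |= reached
            frontier = reached
        return seen - {target}


def build_task(graph, target, hops=1):
    """Two-step task definition: (1) pick a target entity,
    (2) collect its graph neighborhood as model context."""
    return {
        "target": target,
        "context": sorted(graph.neighborhood(target, hops)),
    }


# Toy data: one paper with a paragraph, a figure, and a review.
g = AcademicGraph()
g.add_node("paper", "p1", title="Example paper")
g.add_node("paragraph", "p1-s1", text="Introduction text")
g.add_node("figure", "p1-f1", caption="Overview figure")
g.add_node("review", "r1", text="Example review")
g.add_edge(("paper", "p1"), "has_paragraph", ("paragraph", "p1-s1"))
g.add_edge(("paper", "p1"), "has_figure", ("figure", "p1-f1"))
g.add_edge(("review", "r1"), "reviews", ("paper", "p1"))

task = build_task(g, ("paragraph", "p1-s1"), hops=2)
print(task["context"])
```

In this toy setup, a paragraph-level task sees its parent paper at one hop and the paper's figure and review at two hops. A predictive or generative model would then consume the target together with that context; how the real system encodes or samples neighborhoods is not specified in the text above.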
Peer Reviews
Decision · Submitted to ICLR 2026
S1. This submission accurately identifies key pain points in academic AI. From a data perspective, it targets the complexity of academic data, sourced from diverse platforms (ArXiv’s computer science papers, OpenReview’s ICLR submissions) and spanning multiple modalities (text, figures, tables). From a task perspective, it decreases the unnecessary effort required for data. By focusing on these gaps, the work directly responds to the demand for a unified data interface, as proposed in its core d
W1. Data used in the experiment is restricted to the computer science (CS) field (ArXiv data) and ICLR conferences (OpenReview data). No testing was conducted in other domains such as biology, chemistry, or materials science.
W2. Critical preprocessing steps are not explained. For example, how to extract paragraphs and figures from ArXiv’s LaTeX and how to align OpenReview reviews with paper paragraphs: these details remain unaddressed.
W3. Novel academic-related contributions are limited
S1: This paper addresses a real need for unified academic data interfaces to support ML models across diverse research tasks.
S2: This paper integrates ArXiv and OpenReview with text, figures, tables, and temporal evolution, offering a holistic view of academic knowledge.
S3: The two-step scheme (target entity + neighborhood) of this paper is simple yet general, enabling both predictive and generative tasks.
S4: This paper evaluates 6 tasks across 4 model types (EMB, GNN, LLM, GWM), showing c
W1: The graph construction and task definition are engineering-heavy; no new modeling techniques or architectures are proposed.
W2: The paper's tasks, such as figure insertion or paragraph generation, are reconstruction-style and may not reflect real-world academic needs (e.g., idea quality, scientific discovery).
W3: Best accuracy is only 0.55, barely above random, which raises questions about whether the graph is rich enough for high-level reasoning.
W4: All metrics are automatic (SBERT, BLEU, etc
- Compared to previous datasets, ResearchArcade covers multiple sources and modalities. It also provides a unified interface for tasks defined on academic graphs.
- The writing is clear and easy to follow.
- As an academic graph, the dataset’s coverage of papers is still limited. It includes around 45k papers from arXiv and about 28k from ICLR, which may restrict the generalizability of conclusions derived from it.
- For the paragraph generation and revision generation tasks, the authors rely solely on semantic similarity metrics. However, such metrics may not capture aspects like the clarity or appropriateness of the generated paragraphs and revisions. Using LLM-as-a-judge ev
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Graph Theory and Algorithms
