AtomR: Atomic Operator-Empowered Large Language Models for Heterogeneous Knowledge Reasoning

Amy Xin; Jinxin Liu; Zijun Yao; Zhicheng Lee; Shulin Cao; Lei Hou; Juanzi Li

arXiv:2411.16495·cs.CL·September 29, 2025

Open Access · 1 Repo · 1 Dataset

TL;DR

AtomR introduces atomic knowledge operators and a fine-grained reasoning framework for large language models, significantly improving heterogeneous knowledge reasoning accuracy on challenging benchmarks.

Contribution

The paper proposes AtomR, a novel framework with atomic operators for precise heterogeneous knowledge reasoning in LLMs, addressing prior limitations in reasoning planning and knowledge integration.

Findings

01 AtomR outperforms baselines with 9.4% and 9.5% F1 score improvements.

02 Introduces BlendQA, a new benchmark for heterogeneous knowledge reasoning.

03 Demonstrates effective question decomposition and knowledge manipulation.

Abstract

Despite the outstanding capabilities of large language models (LLMs), knowledge-intensive reasoning still remains a challenging task due to LLMs' limitations in compositional reasoning and the hallucination problem. A prevalent solution is to employ chain-of-thought (CoT) with retrieval-augmented generation (RAG), which first formulates a reasoning plan by decomposing complex questions into simpler sub-questions, and then applies iterative RAG at each sub-question. However, prior works exhibit two crucial problems: inadequate reasoning planning and poor incorporation of heterogeneous knowledge. In this paper, we introduce AtomR, a framework for LLMs to conduct accurate heterogeneous knowledge reasoning at the atomic level. Inspired by how knowledge graph query languages model compositional reasoning through combining predefined operations, we propose three atomic knowledge operators, a…
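
The prior approach the abstract describes (decompose a complex question into sub-questions, then run retrieval at each one) can be sketched roughly as below. This is a minimal illustration of that generic CoT-plus-iterative-RAG loop, not AtomR's atomic operators and not code from the THU-KEG/AtomR repository. Every name here (`iterative_rag`, `toy_decompose`, `toy_retrieve`, `toy_read`, the `{prev}` placeholder) is hypothetical, and the three toy functions are simple stand-ins for what would be LLM and retriever calls in a real system.

```python
import re
from typing import Callable

STOPWORDS = {"the", "is", "of", "a"}

# Toy passages standing in for a real retrieval index (illustrative only).
PASSAGES = [
    "Paris is the capital of France.",
    "The Seine flows through Paris.",
]


def tokens(text: str) -> list[str]:
    """Lowercase word tokens with punctuation stripped."""
    return re.findall(r"\w+", text.lower())


def toy_decompose(question: str) -> list[str]:
    """Stand-in for an LLM planning call: returns a fixed two-step plan.
    '{prev}' marks where the previous step's answer is substituted."""
    return ["What is the capital of France?", "Which river flows through {prev}?"]


def toy_retrieve(query: str, k: int = 1) -> list[str]:
    """Stand-in retriever: rank passages by word overlap with the query."""
    q = set(tokens(query))
    return sorted(PASSAGES, key=lambda p: -len(q & set(tokens(p))))[:k]


def toy_read(sub_q: str, passages: list[str]) -> str:
    """Stand-in for an LLM reader: first passage word that is neither
    in the sub-question nor a stopword."""
    seen = set(tokens(sub_q)) | STOPWORDS
    for word in re.findall(r"\w+", passages[0]):
        if word.lower() not in seen:
            return word
    return ""


def iterative_rag(
    question: str,
    decompose: Callable[[str], list[str]],
    retrieve: Callable[[str], list[str]],
    read: Callable[[str, list[str]], str],
) -> list[tuple[str, str]]:
    """Answer sub-questions in order, retrieving fresh evidence at each step."""
    steps: list[tuple[str, str]] = []
    for template in decompose(question):
        prev = steps[-1][1] if steps else ""
        sub_q = template.format(prev=prev)  # plug in the previous answer
        steps.append((sub_q, read(sub_q, retrieve(sub_q))))
    return steps


if __name__ == "__main__":
    q = "Which river flows through the capital of France?"
    for sub_q, ans in iterative_rag(q, toy_decompose, toy_retrieve, toy_read):
        print(sub_q, "->", ans)
```

Passing the planner, retriever, and reader in as callables keeps the loop itself independent of any particular model or index, so each stand-in could be swapped for a real component without changing `iterative_rag`.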

Code & Models

Repositories

THU-KEG/AtomR
PyTorch · Official

Datasets

THU-KEG/BlendQA
dataset · 6 downloads

Taxonomy

Topics: Topic Modeling · Data Quality and Management · Machine Learning in Materials Science

Methods: Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Layer Normalization · Byte Pair Encoding · Adam · Residual Connection · Weight Decay · Softmax