Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance?

Fumiya Uchiyama; Takeshi Kojima; Andrew Gambardella; Qi Cao; Yusuke Iwasawa; Yutaka Matsuo

arXiv:2410.06735 · cs.CL · July 1, 2025

TL;DR

This study tests how pre-training on programming languages, as opposed to natural languages, affects the logical inference performance of large language models, and which features of the training code (such as syntax structure) account for the difference.

Contribution

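The paper pre-trains decoder-based language models from scratch on ten programming languages and three natural language datasets under identical conditions, then compares their few-shot performance on logical reasoning tasks to identify which languages and syntax features affect inference ability.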

Findings

01

Models trained on programming languages outperform those trained on natural languages in logical tasks.

02

Models pre-trained on programming languages follow instructions better than models pre-trained on natural languages.

03

Syntax tree depth influences the logical reasoning performance of models.
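
To make "syntax tree depth" concrete, here is a minimal sketch that measures the maximum depth of a Python program's abstract syntax tree with the standard `ast` module. This excerpt does not say how the paper defines or measures depth, so the function `syntax_tree_depth` and its depth convention (root node counts as depth 1) are assumptions made for illustration only.

```python
import ast


def syntax_tree_depth(source: str) -> int:
    """Return the maximum depth of the Python AST for `source`.

    Assumed convention (not taken from the paper): the root Module
    node has depth 1, and each level of child nodes adds 1.
    """
    def depth(node: ast.AST) -> int:
        children = list(ast.iter_child_nodes(node))
        if not children:
            return 1
        return 1 + max(depth(child) for child in children)

    return depth(ast.parse(source))


# Two hypothetical snippets: flat assignments versus nested control flow.
flat = "x = 1\ny = 2\n"
nested = "for i in range(3):\n    if i:\n        print(i)\n"

print(syntax_tree_depth(flat), syntax_tree_depth(nested))
```

Nested control flow yields a deeper tree than a sequence of flat statements, which is the kind of structural property the finding refers to.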

Abstract

Recent large language models (LLMs) have demonstrated remarkable generalization abilities in mathematics and logical reasoning tasks. Prior research indicates that LLMs pre-trained with programming language data exhibit high mathematical and reasoning abilities; however, this causal relationship has not been rigorously tested. Our research aims to verify which programming languages and features during pre-training affect logical inference performance. Specifically, we pre-trained decoder-based language models from scratch using datasets from ten programming languages (e.g., Python, C, Java) and three natural language datasets (Wikipedia, Fineweb, C4) under identical conditions. Thereafter, we evaluated the trained models in a few-shot in-context learning setting on logical reasoning tasks: FLD and bAbi, which do not require commonsense or world knowledge. The results demonstrate that…
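
The few-shot in-context evaluation described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' evaluation code: the `Question:`/`Answer:` prompt template, the exact-match scoring, and the names `build_few_shot_prompt`, `few_shot_accuracy`, and `generate` are all hypothetical.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[str, str]  # (question, gold answer)


def build_few_shot_prompt(demos: Sequence[Example], question: str) -> str:
    """Concatenate solved demonstrations, then the unanswered query.

    The "Question:/Answer:" template is an assumption for illustration.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in demos]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def few_shot_accuracy(
    generate: Callable[[str], str],
    demos: Sequence[Example],
    test_set: Sequence[Example],
) -> float:
    """Exact-match accuracy of `generate` (any prompt -> text function)."""
    if not test_set:
        return 0.0
    correct = 0
    for question, gold in test_set:
        prediction = generate(build_few_shot_prompt(demos, question)).strip()
        correct += prediction == gold
    return correct / len(test_set)
```

Here `generate` stands in for whichever pre-trained model is being evaluated; because no weights are updated, any difference in accuracy between models reflects what they learned during pre-training.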

Code & Models

Repositories

fumiyauchiyama/code_pretraining
PyTorch · Official

Videos

Which Programming Language and What Features at Pre-training Stage Affect Downstream Logical Inference Performance? · Underline

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Intelligent Tutoring Systems and Adaptive Learning · Online Learning and Analytics