# CrystalICL: Enabling In-Context Learning for Crystal Generation

**Authors:** Ruobing Wang, Qiaoyu Tan, Yili Wang, Ying Wang, Xin Wang

arXiv: 2508.20143 · 2025-08-29

## TL;DR

CrystalICL is an LLM-based model for few-shot crystal generation. It combines a space-group based crystal tokenization method with a condition-structure aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, and it outperforms the leading baselines on conditional and unconditional generation across four benchmarks.

## Contribution

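The paper introduces CrystalICL, which extends LLM-based crystal generation from the zero-shot setting to few-shot in-context learning. Its components are a space-group based crystal tokenization method that reduces the complexity of modeling crystal symmetry, a condition-structure aware hybrid instruction tuning framework, and a multi-task instruction tuning strategy.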

## Key findings

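- Outperforms the leading baseline methods on four crystal generation benchmarks
- The improvement holds for both conditional and unconditional generation tasks
- Instruction tuning lets the model exploit in-context examples, capturing structure-property relationships from limited data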

## Abstract

Designing crystal materials with desired physicochemical properties remains a fundamental challenge in materials science. While large language models (LLMs) have demonstrated strong in-context learning (ICL) capabilities, existing LLM-based crystal generation approaches are limited to zero-shot scenarios and are unable to benefit from few-shot scenarios. In contrast, human experts typically design new materials by modifying relevant known structures, which aligns closely with the few-shot ICL paradigm. Motivated by this, we propose CrystalICL, a novel model designed for few-shot crystal generation. Specifically, we introduce a space-group based crystal tokenization method, which effectively reduces the complexity of modeling crystal symmetry in LLMs. We further introduce a condition-structure aware hybrid instruction tuning framework and a multi-task instruction tuning strategy, enabling the model to better exploit ICL by capturing structure-property relationships from limited data. Extensive experiments on four crystal generation benchmarks demonstrate the superiority of CrystalICL over the leading baseline methods on conditional and unconditional generation tasks.
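
To make the few-shot setting in the abstract concrete, here is a minimal sketch of how known structures could be serialized as text and concatenated into an in-context prompt. It is not the paper's tokenization or prompt template, which this summary does not give: the `Crystal` record, the `SG <number>` prefix, the line layout, and the function names are all assumptions made for illustration. The only idea carried over from the abstract is that the space group is stated explicitly, so the text can list one representative site per symmetry-equivalent set instead of every atom in the cell.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Crystal:
    """Toy crystal record; field names are hypothetical, not from the paper."""
    formula: str
    space_group: int  # international space-group number, 1..230
    lattice: Tuple[float, float, float, float, float, float]  # a, b, c, alpha, beta, gamma
    sites: List[Tuple[str, float, float, float]]  # (element, x, y, z), fractional coordinates


def serialize(c: Crystal) -> str:
    """Render a crystal as text with the space group first (assumed layout)."""
    if not 1 <= c.space_group <= 230:
        raise ValueError("space group number must be in 1..230")
    lines = [f"SG {c.space_group}"]
    lines.append(" ".join(f"{v:.2f}" for v in c.lattice))
    for element, x, y, z in c.sites:
        lines.append(f"{element} {x:.3f} {y:.3f} {z:.3f}")
    return "\n".join(lines)


def build_few_shot_prompt(examples: List[Crystal], target_formula: str) -> str:
    """Concatenate the demonstrations, then leave an open slot for the model."""
    blocks = [f"Formula: {ex.formula}\nStructure:\n{serialize(ex)}" for ex in examples]
    blocks.append(f"Formula: {target_formula}\nStructure:\n")
    return "\n\n".join(blocks)


# Rock-salt NaCl (space group 225), listing one representative site per element.
nacl = Crystal(
    formula="NaCl",
    space_group=225,
    lattice=(5.64, 5.64, 5.64, 90.0, 90.0, 90.0),
    sites=[("Na", 0.0, 0.0, 0.0), ("Cl", 0.5, 0.5, 0.5)],
)
prompt = build_few_shot_prompt([nacl], "KCl")
print(prompt)
```

In this sketch the prompt ends on an empty `Structure:` slot, which a generative model would be expected to complete in the same format as the demonstrations. A conditional variant could add a property line to each block, but the form such conditions take in CrystalICL is not specified in this summary.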

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.20143/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/2508.20143/full.md

## References

27 references, with the full list in the complete paper: https://tomesphere.com/paper/2508.20143/full.md

---
Source: https://tomesphere.com/paper/2508.20143