# Data-driven Discovery of 3D and 2D Thermoelectric Materials

**Authors:** Kamal Choudhary, Kevin Garrity, Francesca Tavazza

arXiv: 1906.06024 · 2020-10-28

## TL;DR

This study combines DFT-based semiclassical transport screening with machine learning to identify promising 3D and 2D thermoelectric materials in the JARVIS-DFT database, and trains models that pre-screen candidates for further DFT calculations.

## Contribution

The paper introduces a systematic workflow that integrates DFT, semiclassical transport calculations, and ML models to discover and pre-screen thermoelectric materials efficiently.

## Key findings

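- Identified 2932 promising 3D and 148 promising 2D thermoelectric materials out of 36000 3D and 900 2D entries in the JARVIS-DFT database.
- Predicted several classes of efficient 3D and 2D thermoelectric materials, including Ba(MgX)2, NbCu3X4, TaCu3X4, and XYN families.
- Trained GBDT models with CFID descriptors for n- and p-type Seebeck coefficients and power factors, for rapid pre-screening ahead of DFT.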

## Abstract

In this work, we first perform a systematic search for high-efficiency three-dimensional (3D) and two-dimensional (2D) thermoelectric materials by combining semiclassical transport techniques with density functional theory (DFT) calculations and then train machine-learning models on the thermoelectric data. Out of 36000 three-dimensional and 900 two-dimensional materials currently in the publicly available JARVIS-DFT database, we identify 2932 3D and 148 2D promising thermoelectric materials using a multi-step screening procedure, where specific thresholds are chosen for key quantities like bandgaps, Seebeck coefficients and power factors. We compute the Seebeck coefficients for all the materials currently in the database and validate our calculations by comparing our results, for a subset of materials, to experimental and existing computational datasets. We also investigate the effect of chemical, structural, crystallographic and dimensionality trends on thermoelectric performance. We predict several classes of efficient 3D and 2D materials such as Ba(MgX)2 (X=P, As, Bi), X2YZ6 (X=K, Rb, Y=Pd, Pt, Z=Cl, Br), K2PtX2 (X=S, Se), NbCu3X4 (X=S, Se, Te), Sr2XYO6 (X=Ta, Zn, Y=Ga, Mo), TaCu3X4 (X=S, Se, Te), and XYN (X=Ti, Zr, Y=Cl, Br). Finally, as high-throughput DFT is computationally expensive, we train machine learning models using gradient boosting decision trees (GBDT) and classical force-field inspired descriptors (CFID) for n- and p-type Seebeck coefficients and power factors, to quickly pre-screen materials for guiding the next set of DFT calculations. The dataset and tools are made publicly available at the websites: https://www.ctcms.nist.gov/~knc6/JVASP.html, https://www.ctcms.nist.gov/jarvisml/ and https://jarvis.nist.gov/.
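
The multi-step screening described in the abstract can be pictured as a chain of threshold filters on bandgap, Seebeck coefficient, and power factor. The sketch below is illustrative only: the threshold values, the `Material` record, the `screen` function, and the toy entries are all hypothetical, since this summary does not state the cutoffs or data layout the authors used. It is not the JARVIS tooling.

```python
from dataclasses import dataclass

# Hypothetical cutoffs for illustration; the paper's actual thresholds
# are not given in this summary.
MIN_BANDGAP_EV = 0.05
MIN_ABS_SEEBECK_UV_PER_K = 100.0
MIN_POWER_FACTOR = 1000.0  # arbitrary units in this sketch


@dataclass
class Material:
    """Hypothetical record holding the three screened quantities."""
    label: str
    bandgap_ev: float
    seebeck_uv_per_k: float  # signed: negative for n-type, positive for p-type
    power_factor: float


def screen(materials):
    """Apply the three threshold filters one after another."""
    # Step 1: keep materials with a bandgap at or above the cutoff.
    step1 = [m for m in materials if m.bandgap_ev >= MIN_BANDGAP_EV]
    # Step 2: keep large Seebeck magnitudes, regardless of carrier type.
    step2 = [m for m in step1
             if abs(m.seebeck_uv_per_k) >= MIN_ABS_SEEBECK_UV_PER_K]
    # Step 3: keep materials whose power factor clears the cutoff.
    return [m for m in step2 if m.power_factor >= MIN_POWER_FACTOR]


# Toy entries with made-up numbers, not values from the database.
TOY_MATERIALS = [
    Material("toy-A", bandgap_ev=0.0, seebeck_uv_per_k=10.0, power_factor=50.0),
    Material("toy-B", bandgap_ev=1.2, seebeck_uv_per_k=-40.0, power_factor=200.0),
    Material("toy-C", bandgap_ev=0.8, seebeck_uv_per_k=-250.0, power_factor=1500.0),
    Material("toy-D", bandgap_ev=0.6, seebeck_uv_per_k=180.0, power_factor=400.0),
]

selected = [m.label for m in screen(TOY_MATERIALS)]
print(selected)
```

Applying the cheapest, most selective filter first keeps later steps small, which matters when each quantity comes from an expensive calculation. The same idea motivates the ML models in the abstract: a fast approximate prediction decides which materials get the full DFT treatment.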

---
Source: https://tomesphere.com/paper/1906.06024