How to Efficiently Design 2D Materials for Electrochemical Applications Using Machine Learning
Pawin Iamprasertkun

TL;DR
This paper explores how machine learning can help design 2D materials for electrochemical uses more efficiently.
Contribution
The paper introduces a new approach combining machine learning and electrochemistry for 2D material design.
Findings
Traditional trial-and-error methods are reaching their limits in 2D material design.
Machine learning and generative AI offer a promising path for efficient material design.
Abstract
Two-dimensional (2D) materials have moved from laboratory findings toward potential applications. Starting with the isolation of graphene, the field has rapidly expanded to encompass a broad spectrum of materials, including transition metal dichalcogenides, MXenes, and others, each offering unique structural, electronic, optical, and electrochemical properties. These materials have been recognized as candidates for applications in energy storage and conversion, including electrocatalysis. As we approach the limits of traditional “trial-and-error” methods, the integration of statistical analysis, machine learning (ML), live (real-time) electrochemistry, and generative AI presents a compelling path forward. These tools are no longer aspirational; they are becoming essential to navigating the vast and complex design space of 2D materials for electrochemical applications.
Funding: Thammasat University (funder ID 10.13039/501100005790)
Taxonomy
Topics: Machine Learning in Materials Science · Advanced Memory and Neural Computing · Fuel Cells and Related Materials
Introduction
Over the past two decades, 2D materials have been synthesized by a variety of techniques for diverse electrochemical applications, and a large body of data has been generated as a result. A promising starting point for future research is therefore to make use of the literature data held in databases. Even simple statistical analysis in electrochemistry can provide fundamental insights into the behavior and design of 2D materials, because statistical tools such as regression, ANOVA, curve fitting, and Pearson/Spearman correlation serve as an essential bridge between raw data and meaningful interpretation. For example, Pearson correlation can indicate which material design features influence electrochemical metrics such as capacitance and overpotential. Fitting analyses can also yield insights into the dominant electrochemical mechanisms at play. More advanced methods, such as principal component analysis (PCA), can reduce dimensionality, highlight dominant trends, and identify redundant descriptors in large data sets. This approach provides useful electrochemical insights without requiring laboratory resources and can guide newcomers toward more precise synthesis and measurement. However, data sparsity, inconsistent units, and varying test conditions limit reproducibility, so relying solely on statistical analysis may lead to misinterpretation. It nevertheless remains a valuable practice for understanding the data distribution and ensuring that only clean, structured, and interpretable data are fed into machine learning models, thereby building trust in downstream predictions. This topic has not been widely emphasized, as most researchers focus on experimental work; after the COVID-19 pandemic, however, it became central to both the 2D materials and electrochemistry communities, where many materials scientists began using artificial neural networks (ANNs) to predict electrochemical features.
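The correlation step described above can be illustrated with a minimal sketch in plain Python. The data values, the variable names `surface_area` and `capacitance`, and the saturating relationship between them are assumptions made purely for illustration; they are not taken from any data set in the paper. The sketch shows why both coefficients are worth computing: Pearson measures linear association, while Spearman (Pearson applied to ranks) captures any monotonic trend.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Illustrative (made-up) values: capacitance saturates as surface area grows.
surface_area = [200, 400, 600, 800, 1000, 1200]                      # m^2/g
capacitance = [300 * (1 - math.exp(-s / 300)) for s in surface_area]  # F/g

print(f"Pearson:  {pearson(surface_area, capacitance):.3f}")
print(f"Spearman: {spearman(surface_area, capacitance):.3f}")
```

A Spearman coefficient that is clearly higher than the Pearson coefficient, as in this toy case, hints at a monotonic but non-linear dependence, which is the kind of observation that can motivate a follow-up curve-fitting analysis.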
While black-box models such as ANNs can provide strong predictive performance when their hyperparameters are finely tuned, they are also prone to overfitting. Another major concern is the validity of the data: if incorrect data are fed into the model, the output will inevitably be flawed, no matter how precise the model is. To address these issues, classical machine learning approaches, such as regression or tree-based methods and even more complex structured models, may improve the accuracy of electrochemical predictions. Interpretability tools such as Shapley Additive Explanations (SHAP), Partial Dependence Plots (PDP), and Feature Importance (FPI) can then be applied to enable so-called “inverse design”. It should be noted, however, that the current approach to predicting electrochemical performance may soon reach its limits, particularly the practice of simply collecting data and applying conventional models such as ANNs without deeper physical insight.
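The overfitting concern raised above can be made concrete with a small, self-contained sketch. It does not use an ANN; as a stand-in, it uses a nearest-neighbour regressor, where `k=1` memorizes the training set in much the same way that an over-parameterized network can. The synthetic data, the noise level, and the function names are all assumptions for illustration.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(target for _, target in nearest) / k

def mse(train, data, k):
    """Mean squared error of a k-NN model built on `train`, scored on `data`."""
    return sum((knn_predict(train, x, k) - t) ** 2 for x, t in data) / len(data)

rng = random.Random(0)
# Synthetic data: a linear trend plus Gaussian "measurement" noise.
data = [(x, 2.0 * x + rng.gauss(0.0, 5.0)) for x in range(60)]
rng.shuffle(data)
train, held_out = data[:40], data[40:]

for k in (1, 5):
    print(f"k={k}: train MSE = {mse(train, train, k):.2f}, "
          f"held-out MSE = {mse(train, held_out, k):.2f}")
```

With `k=1` the training error is exactly zero because every training point is its own nearest neighbour, yet the held-out error is not. That gap between training and held-out error is the signature of overfitting, and it is only visible when part of the data is withheld from fitting.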
Recent research has shifted toward the use of generative AI, led by teams such as Microsoft Research and Google DeepMind, to discover materials with tailored properties for specific applications such as energy storage and electrocatalysis. With established tools such as high-throughput screening, open materials databases, and machine learning models for property prediction (for example, crystal graph convolutional neural networks), it is now possible to screen hundreds of thousands of materials and identify the most promising candidates, followed by computational and experimental validation. This approach greatly accelerates the discovery of improved materials by overcoming the limitations of human intuition, restricted candidate spaces, and lengthy iterative cycles, ultimately enabling the discovery of new high-performance materials for many electrochemical applications.
Features Prediction of 2D Materials
From an electrochemical perspective, ML has been applied to predict the capacitance, energy and power density, catalytic activity, and adsorption energies of 2D materials, while also reducing the computational cost of first-principles simulations. Mining experimental data and applying supervised learning methods also accelerate discovery and broaden the design space. Recent studies show that ML can predict electrocatalytic performance, provide mechanistic insights, and guide the design of efficient electrodes. Interpretability tools such as Shapley Additive Explanations (SHAP) can quantify the contribution of each design feature (e.g., surface area, doping, defects, or pore size) to electrochemical performance metrics such as capacitance and catalytic activity. Partial Dependence Plots (PDPs) show the model's average predicted response as one specific feature, such as capacitance or charge–discharge rate, is varied, with the influence of the other features averaged out; this helps researchers identify the key synthesis parameters (the focus here) or operating parameters for optimizing electrochemical behavior. “Inverse material design” can be achieved through Feature Importance (FPI) analysis, which guides experimental efforts toward the most relevant features, streamlines the material design process, and reduces reliance on trial-and-error approaches. The key challenge is translating ML “recipes” into practical synthesis, and the need to conduct experiments or rely on data from previous literature still limits the pace of materials discovery. By integrating computational and experimental workflows, ML could propose optimal synthesis conditions, including the optimum surface area, doping, heterostructure, or composition, drastically reducing time and cost compared with traditional experimental methods.
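Two of the interpretability ideas above, feature importance and partial dependence, can be sketched in a model-agnostic way with plain Python. The sketch uses permutation importance (shuffle one feature and measure how much the error grows), which is one common way to compute a feature importance and is not necessarily the method used in the cited studies; SHAP is not shown. The `toy_model` function is a hypothetical stand-in for a trained regressor, and the feature names, ranges, and coefficients are assumptions for illustration only.

```python
import random

def toy_model(row):
    """Hypothetical stand-in for a trained regressor predicting capacitance."""
    surface_area, doping, pore_size = row
    return 0.1 * surface_area + 8.0 * doping  # pore_size is deliberately ignored

def mse(model, X, y):
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + (v,) + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, shuffled, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

def partial_dependence(model, X, j, grid):
    """Average prediction when feature j is forced to each grid value."""
    return [sum(model(row[:j] + (g,) + row[j + 1:]) for row in X) / len(X)
            for g in grid]

# Synthetic samples: (surface area, doping level, pore size), all made up.
rng = random.Random(42)
X = [(rng.uniform(200, 1500), rng.uniform(0, 10), rng.uniform(1, 50))
     for _ in range(200)]
y = [toy_model(row) + rng.gauss(0.0, 2.0) for row in X]

importances = permutation_importance(toy_model, X, y)
doping_pdp = partial_dependence(toy_model, X, 1, [0.0, 5.0, 10.0])
for name, value in zip(("surface_area", "doping", "pore_size"), importances):
    print(f"{name:>12}: {value:.1f}")
print("PDP over doping:", [round(v, 1) for v in doping_pdp])
```

Because the stand-in model never reads `pore_size`, shuffling that column leaves the error unchanged and its importance is zero, which is the behaviour one would hope to see for a redundant descriptor. In practice the same two functions could wrap any fitted model that exposes a prediction call.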
Materials Discovery through Generative Models
Machine learning can accelerate materials discovery by predicting properties at low computational cost, but most methods still depend on experimental data or crystal structure descriptors, which restricts their use to well-characterized materials. Roost (Representation Learning from Stoichiometry), developed by Goodall and Lee, overcomes this limitation by treating a stoichiometric formula as a dense weighted graph between its elements and learning descriptors directly from data. This approach achieves lower errors with less data than existing structure-agnostic methods. The idea has since expanded into state-of-the-art graph neural networks, such as those from Google DeepMind, for materials exploration and candidate structure filtering; using this approach, the DeepMind team discovered more than a million stable structures. With generative models, such as score-based approaches, it is possible to reconstruct known or even novel stable structures using only atomic numbers as input. Developing more precise generative AI clearly remains a major focus in industry. Microsoft Research recently introduced “MatterGen”, a framework trained on crystallographic data and designed for high structural validity and broad exploration of chemical space. By implicitly learning symmetry and bonding rules, it offers a more general-purpose and physics-aligned approach, representing a significant improvement over earlier generative models. Result validation, however, remains a critical issue. While deep learning and automation hold great promise for accelerating materials discovery, their outputs must be carefully verified, as methodological errors can lead to overstated claims.
For instance, incorrect or poor diffraction fits that overlook strong peaks can make it impossible to confirm the reported structures or purities. Yet reliable validation often requires density functional theory (DFT), which can be computationally expensive.
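The stoichiometry-as-graph idea mentioned above for Roost can be illustrated with a short sketch of the input representation only. It is not the Roost implementation: the learned element embeddings and the message-passing network are omitted, the formula parser is a simplified assumption that handles only flat formulas without parentheses, and the function names are hypothetical. Each element becomes a node weighted by its fractional abundance, and every ordered pair of distinct elements is connected, which is what makes the graph dense.

```python
import re
from itertools import permutations

def parse_formula(formula):
    """Parse a flat formula such as 'Ti3C2' into {element: amount}."""
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts

def stoichiometry_graph(formula):
    """Return (nodes, edges) for a dense weighted graph over the elements.

    nodes: {element: fractional abundance}
    edges: {(source, neighbour): fractional abundance of the neighbour}
    """
    counts = parse_formula(formula)
    total = sum(counts.values())
    nodes = {element: amount / total for element, amount in counts.items()}
    edges = {(a, b): nodes[b] for a, b in permutations(nodes, 2)}
    return nodes, edges

for formula in ("Ti3C2", "WSe2", "MoS2"):
    nodes, edges = stoichiometry_graph(formula)
    print(formula, nodes, edges)
```

Since this representation needs nothing beyond the chemical formula, it can be built for hypothetical compositions whose crystal structure is unknown, which is the property the text highlights.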
Autonomous Electrochemical Systems
The integration of live experimentation is beginning to transform how 2D materials are synthesized and characterized. In such systems, synthesis protocols are continuously refined on the basis of real-time data, which significantly reduces the time required to optimize material properties and helps ensure high-quality material production. This approach is particularly beneficial for the synthesis of 2D materials for electrochemical applications, where minor variations in experimental conditions such as the synthesis environment, solvent, exfoliation time and power, or substrate can dramatically affect electrochemical performance. For example, in the development of WSe₂, small changes in synthesis conditions can influence the capacitance, catalytic activity, and cycling stability of the material. By integrating autonomous synthesis with real-time spectral analysis and ML algorithms, it is possible to rapidly map the synthesis-property relationships of materials, accelerating optimization and enabling the development of high-performance materials for energy storage applications. Szymanski et al. demonstrated the A-Lab (autonomous laboratory) platform for synthesizing novel materials, which produced 41 new compounds in just 17 days by leveraging data from the Materials Project and Google DeepMind. While ML and autonomous experimentation have shown significant promise in predicting the intrinsic properties of 2D materials, a critical challenge remains in translating these properties into device-level performance. A closed-loop design approach, such as electrolyte design for industrial-scale electrochemical systems, is an interesting direction to consider. However, there is currently a shortage of chemists and materials scientists with programming skills, as well as of computer engineers with a strong foundational understanding of chemistry. As a result, communication between experts in these fields may still be limited, posing a challenge to conducting truly novel interdisciplinary research.
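The closed-loop idea above, in which each round of measurements decides the next round of synthesis conditions, can be sketched as a simple propose, measure, and refine loop. Everything here is an assumption for illustration: `run_experiment` is a hypothetical, noise-free simulated response with a peak at 60 minutes, standing in for a real synthesis and measurement step, and the coarse-to-fine grid refinement stands in for the Bayesian optimization or active learning that a real autonomous platform would more likely use.

```python
def run_experiment(exfoliation_time_min):
    """Simulated capacitance measurement (hypothetical response surface)."""
    return 150.0 - 0.05 * (exfoliation_time_min - 60.0) ** 2

def closed_loop(lo=5.0, hi=125.0, rounds=5, points_per_round=5):
    """Measure a batch across [lo, hi], then narrow the window around the best."""
    history = []
    best_time, best_value = None, float("-inf")
    for _ in range(rounds):
        step = (hi - lo) / (points_per_round - 1)
        batch = [lo + i * step for i in range(points_per_round)]   # propose
        results = [(t, run_experiment(t)) for t in batch]          # measure
        history.extend(results)
        round_time, round_value = max(results, key=lambda r: r[1])
        if round_value > best_value:
            best_time, best_value = round_time, round_value
        # Refine: the next window spans one grid step either side of the best.
        lo, hi = max(lo, round_time - step), min(hi, round_time + step)
    return best_time, best_value, history

best_time, best_value, history = closed_loop()
print(f"Best exfoliation time: {best_time} min after {len(history)} experiments")
print(f"Best simulated capacitance: {best_value:.2f}")
```

With real instruments the measurement would be noisy and expensive, so the refinement rule would need to account for uncertainty, but the loop structure (propose, measure, update, repeat) is the same.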
Conclusion and Outlook
A particularly underexplored yet promising direction is the integration of machine learning with in situ or operando materials and electrochemical characterization. Most current applications still rely on static data sets derived from ex situ measurements or DFT simulations, which fail to capture the dynamic nature of material behavior under real operating conditions. Real-time data, such as changes in structural integrity, charge transfer resistance, or ion diffusion, can offer valuable insights into the transient processes that govern long-term performance and degradation. By embedding ML frameworks into platforms coupled with in situ Raman spectroscopy, electrochemical impedance spectroscopy (EIS), or synchrotron-based X-ray absorption techniques, researchers can reveal temporal correlations and early signs of failure mechanisms. This approach not only enhances predictive accuracy but also paves the way for more resilient and application-tailored material designs. Looking further ahead, the future of 2D materials synthesis will be driven by the seamless integration of machine learning, autonomous synthesis, and advanced data analytics, including the use of generative AI to discover novel materials. These data-centric strategies, combined with high-throughput experimentation, will accelerate the rediscovery of 2D materials. Once researchers become familiar with machine learning tools (also referred to as AI), their use will resemble the way scientific calculators revolutionized the handling of complex equations. AI is poised to become an indispensable tool in materials innovation, significantly reducing development time while enhancing precision in the chemistry of 2D materials.
References
- 1. Grima-Carmena, L.; Oyonarte-Andrés, S.; Giner-Sanz, J. J.; García-Gabaldón, M.; Bosch-Mossi, F.; Pérez-Herranz, V. Statistical analysis of the effect of the electrochemical treatment and the acid concentration on the leaching of NMC cathodes from spent Li-ion batteries. Journal of Environmental Chemical Engineering 2023, 11 (5), 110423. doi:10.1016/j.jece.2023.110423
- 2. Jitapunkul, K.; Deshsorn, K.; Payakkachon, K.; Chaisrithong, T.; Lawtrakul, L.; Iamprasertkun, P. Insights into Heteroatom-Doped Graphene Supercapacitor Data through Manual Data Separation and Statistical Analysis. J. Phys. Chem. C 2023, 127 (37), 18316–18326. doi:10.1021/acs.jpcc.3c04956
- 3. Chenwittayakhachon, A.; Jitapunkul, K.; Nakpalad, B.; Worrayotkovit, P.; Namuangruk, S.; Sirisinudomkit, P.; Iamprasertkun, P. Machine learning approach to understanding the ‘synergistic’ pseudocapacitive effects of heteroatom doped graphene. 2D Materials 2023, 10 (2), 025003. doi:10.1088/2053-1583/acaf8d
- 4. Zhou, M.; Vassallo, A.; Wu, J. Data-Driven Approach to Understanding the In-Operando Performance of Heteroatom-Doped Carbon Electrodes. ACS Applied Energy Materials 2020, 3 (6), 5993–6000. doi:10.1021/acsaem.0c01059
- 5. Deshsorn, K.; Lawtrakul, L.; Iamprasertkun, P. How false data affects machine learning models in electrochemistry. J. Power Sources 2024, 597, 234127. doi:10.1016/j.jpowsour.2024.234127
- 6. Deshsorn, K.; Payakkachon, K.; Chaisrithong, T.; Jitapunkul, K.; Lawtrakul, L.; Iamprasertkun, P. Unlocking the Full Potential of Heteroatom-Doped Graphene-Based Supercapacitors through Stacking Models and SHAP-Guided Optimization. J. Chem. Inf. Model. 2023, 63 (16), 5077–5088. doi:10.1021/acs.jcim.3c00670. PMID: 37635637
- 7. Zeni, C.; Pinsler, R.; Zügner, D.; Fowler, A.; Horton, M.; Fu, X.; Wang, Z.; Shysheya, A.; Crabbé, J.; Ueda, S.; Sordillo, R.; Sun, L.; Smith, J.; Nguyen, B.; Schulz, H.; Lewis, S.; Huang, C.-W.; Lu, Z.; Zhou, Y.; Yang, H.; Hao, H.; Li, J.; Yang, C.; Li, W.; Tomioka, R.; Xie, T. A generative model for inorganic materials design. Nature 2025, 639 (8055), 624–632. doi:10.1038/s41586-025-08628-5. PMID: 39821164. PMCID: PMC11922738
- 8. Merchant, A.; Batzner, S.; Schoenholz, S. S.; Aykol, M.; Cheon, G.; Cubuk, E. D. Scaling deep learning for materials discovery. Nature 2023, 624 (7990), 80–85. doi:10.1038/s41586-023-06735-9. PMID: 38030720. PMCID: PMC10700131
