Machine learning models with different cheminformatics data sets to forecast the power conversion efficiency of organic solar cells
Omar A. Alvarez-Gonzaga, Ulises A. Vergara-Beltran, Juan I. Rodriguez

TL;DR
This study employs machine learning models with cheminformatics descriptors to predict and improve the power conversion efficiency of organic solar cells, considering donor and acceptor types for better accuracy and forecasting new OSC performance.
Contribution
It introduces an ML approach that incorporates donor and acceptor features, significantly enhancing PCE prediction accuracy and enabling theoretical forecasting of OSC efficiency.
Findings
Adding four one-hot-encoded features for donor and acceptor type improves prediction performance by up to 34%.
RF with RDKit descriptors achieves a correlation of 0.96 on the training set and 0.62 on the test set.
Over 50% of exchanged-acceptor OSCs are predicted to have higher PCE.
Abstract
Random Forest (RF) and Gradient Boosting Regression Trees (GBRT) regression models, along with three cheminformatics data sets (RDKit, Mordred, Morgan), have been used to predict the power conversion efficiency (PCE) of organic solar cells (OSCs). The data consist of cheminformatics descriptors of the electron donor used in 433 OSCs for which the experimental PCE (target variable) is reported in the literature. The donor is either a polymer or a small organic molecule, and the acceptor is one of the fullerene derivatives PCBM or PC71BM. Unlike previous methods, our ML approach considers the type of donor and acceptor by adding four extra donor features using a one-hot encoder. It is demonstrated that this additional information improves the prediction performance by up to 34%. We have also exploited this feature to theoretically forecast the PCE of new OSCs by evaluating the ML model for…
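The one-hot step described in the abstract can be illustrated with a minimal sketch. It assumes the four extra features are two donor-type columns (polymer, small molecule) and two acceptor columns (PCBM, PC71BM); the abstract does not spell out the exact layout, so the category labels, the helper names (`one_hot`, `build_features`, `swap_acceptor`) and the toy descriptor values below are all hypothetical.

```python
# Hypothetical category labels; the exact encoding used in the paper is an assumption here.
DONOR_TYPES = ("polymer", "small_molecule")
ACCEPTOR_TYPES = ("PCBM", "PC71BM")


def one_hot(value, categories):
    """Return a 0/1 list with a single 1.0 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1.0 if value == c else 0.0 for c in categories]


def build_features(descriptors, donor_type, acceptor):
    """Append the four one-hot columns to a donor descriptor vector."""
    return (
        list(descriptors)
        + one_hot(donor_type, DONOR_TYPES)
        + one_hot(acceptor, ACCEPTOR_TYPES)
    )


def swap_acceptor(features):
    """Return a copy of the feature vector with the two acceptor columns exchanged."""
    n = len(ACCEPTOR_TYPES)
    return features[:-n] + features[-n:][::-1]


# Toy descriptor values for illustration only, not real RDKit output.
x = build_features([0.12, 3.4, 1.0], "polymer", "PCBM")
x_swapped = swap_acceptor(x)  # same donor, acceptor switched to PC71BM
```

Under these assumptions, forecasting an exchanged-acceptor OSC amounts to evaluating an already trained regressor (for example a scikit-learn `RandomForestRegressor`) on `x_swapped` and comparing the result with its prediction for `x`.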
Taxonomy
Topics: Machine Learning in Materials Science
