# Predicting Compressive Strength of Consolidated Molecular Solids Using Computer Vision and Deep Learning

**Authors:** Brian Gallagher, Matthew Rever, Donald Loveland, T. Nathan Mundhenk, Brock Beauchamp, Emily Robertson, Golam G. Jaman, Anna M. Hiszpanski, and T. Yong-Jin Han

arXiv: 1906.02130 · 2020-03-02

## TL;DR

This study applies computer vision and machine learning to SEM images to predict the compressive strength of consolidated molecular solids, reducing prediction error relative to expert-driven baselines and making use of crystal attributes that domain experts had underutilized.

## Contribution

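It compares a traditional ML approach (random forest on computer vision features) with end-to-end deep learning for predicting material strength from SEM images, showing that the two approaches are complementary and that the trained models exploit informative crystal attributes underutilized by domain experts.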

## Key findings

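- Image-based ML models reduce mean absolute percent error (MAPE) by an average of 24% over baselines representative of the current state of the practice (domain-expert analysis and correlation).
- RF performs best in the "small data" regime (up to 24% lower RMSE than DL); DL outpaces RF in the "big data" regime (up to 24% lower RMSE than RF).
- Trained models discover and use informative crystal attributes previously underutilized by domain experts.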

## Abstract

We explore the application of computer vision and machine learning (ML) techniques to predict material properties (e.g. compressive strength) based on SEM images. We show that it's possible to train ML models to predict materials performance based on SEM images alone, demonstrating this capability on the real-world problem of predicting uniaxially compressed peak stress of consolidated molecular solids samples. Our image-based ML approach reduces mean absolute percent error (MAPE) by an average of 24% over baselines representative of the current state-of-the-practice (i.e., domain-expert's analysis and correlation). We compared two complementary approaches to this problem: (1) a traditional ML approach, random forest (RF), using state-of-the-art computer vision features and (2) an end-to-end deep learning (DL) approach, where features are learned automatically from raw images. We demonstrate the complementarity of these approaches, showing that RF performs best in the "small data" regime in which many real-world scientific applications reside (up to 24% lower RMSE than DL), whereas DL outpaces RF in the "big data" regime, where abundant training samples are available (up to 24% lower RMSE than RF). Finally, we demonstrate that models trained using machine learning techniques are capable of discovering and utilizing informative crystal attributes previously underutilized by domain experts.
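The abstract reports its results as relative reductions in MAPE and RMSE. As a minimal sketch of how such figures are typically computed, the following uses the standard definitions of both metrics; the function names and the peak-stress values are hypothetical and chosen only for illustration, and none of the numbers come from the paper.

```python
import math


def mape(y_true, y_pred):
    """Mean absolute percent error, in percent (standard definition)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error (standard definition)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def relative_reduction(baseline_error, model_error):
    """Percent by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical peak-stress measurements and predictions (illustrative only,
# not data from the paper).
peak_stress_true = [100.0, 200.0, 400.0]
baseline_pred = [110.0, 180.0, 440.0]  # stand-in for an expert-driven baseline
model_pred = [105.0, 190.0, 420.0]     # stand-in for an image-based ML model

baseline_mape = mape(peak_stress_true, baseline_pred)
model_mape = mape(peak_stress_true, model_pred)

print("baseline MAPE (%):", baseline_mape)
print("model MAPE (%):", model_mape)
print("MAPE reduction (%):", relative_reduction(baseline_mape, model_mape))
print("model RMSE:", rmse(peak_stress_true, model_pred))
```

Read this way, a statement such as "up to 24% lower RMSE than DL" corresponds to `relative_reduction(rmse_dl, rmse_rf)` evaluating to 24 for some training-set size, under the assumption that the paper uses the same relative-difference convention.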

---
Source: https://tomesphere.com/paper/1906.02130