# Optimal design of experiments by combining coarse and fine measurements

**Authors:** Alpha A. Lee, Michael P. Brenner, Lucy J. Colwell

arXiv: 1702.06001 · 2017-11-22

## TL;DR

This paper introduces a strategy that combines many inexpensive coarse (above or below a threshold) measurements with a much smaller number of fine (high-resolution) measurements to build accurate predictive models, reducing the number of costly high-resolution experiments required.

## Contribution

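The paper presents an approach inspired by statistical physics: coarse measurements identify the salient features of the data, fine measurements determine the relative importance of those features, and the resulting linear model is augmented by a quadratic term that captures the correlation structure of the coarse data.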

## Key findings

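- Many coarse measurements combined with a small number of fine measurements can yield accurate predictive models.
- The strategy is illustrated by predicting the antimalarial potency and aqueous solubility of small organic molecules from their 2D molecular structure.
- Abundant threshold-type measurements reduce the number of costly high-resolution measurements needed.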

## Abstract

In many contexts it is extremely costly to perform enough high quality experimental measurements to accurately parameterize a predictive quantitative model. However, it is often much easier to carry out large numbers of experiments that indicate whether each sample is above or below a given threshold. Can many such categorical or "coarse" measurements be combined with a much smaller number of high resolution or "fine" measurements to yield accurate models? Here, we demonstrate an intuitive strategy, inspired by statistical physics, wherein the coarse measurements are used to identify the salient features of the data, while the fine measurements determine the relative importance of these features. A linear model is inferred from the fine measurements, augmented by a quadratic term that captures the correlation structure of the coarse data. We illustrate our strategy by considering the problems of predicting the antimalarial potency and aqueous solubility of small organic molecules from their 2D molecular structure.
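The model structure described in the abstract (a linear term fitted on fine data, plus a quadratic term built from the correlation structure of the coarse data) can be sketched as follows. This is one possible reading of that description, not the authors' implementation: the function names, the use of the top-`k` eigenvectors of the covariance of above-threshold samples, the choice of `k`, and the ridge penalty are all assumptions made for illustration.

```python
import numpy as np


def coarse_directions(X_coarse_pos, k):
    """Top-k principal directions of the samples that passed the coarse threshold.

    Assumption: 'correlation structure of the coarse data' is approximated here
    by the leading eigenvectors of the sample covariance matrix.
    """
    mean = X_coarse_pos.mean(axis=0)
    cov = np.cov(X_coarse_pos - mean, rowvar=False)  # (d, d), requires d >= 2
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]              # indices of the k largest
    return mean, eigvecs[:, top]


def build_features(X, mean, V):
    """Intercept, linear terms, and squared projections onto the coarse directions."""
    proj = (X - mean) @ V
    return np.hstack([np.ones((len(X), 1)), X, proj ** 2])


def fit_fine(X_fine, y_fine, mean, V, ridge=1e-3):
    """Ridge-regularised least squares on the (few) fine measurements.

    Assumption: ridge regression stands in for whatever fitting procedure
    the paper uses to weight the features.
    """
    F = build_features(X_fine, mean, V)
    A = F.T @ F + ridge * np.eye(F.shape[1])
    return np.linalg.solve(A, F.T @ y_fine)


def predict(X, w, mean, V):
    """Predict the fine-grained property for new samples."""
    return build_features(X, mean, V) @ w
```

In this sketch the abundant coarse data enter only through `coarse_directions`, so the number of parameters fitted on the scarce fine data stays at `1 + d + k` rather than growing with the full set of pairwise terms.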

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.06001/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1702.06001/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/1702.06001/full.md

---
Source: https://tomesphere.com/paper/1702.06001