# A Multi-Armed Bandit to Smartly Select a Training Set from Big Medical Data

**Authors:** Benjamín Gutiérrez, Loïc Peter, Tassilo Klein, and Christian Wachinger

arXiv: 1705.08111 · 2017-05-30

## TL;DR

This paper introduces a multi-armed bandit approach using Thompson sampling to efficiently select training data from large medical image datasets, improving prediction accuracy while reducing data requirements.

## Contribution

The paper presents a method for training set selection in medical imaging that uses a bandit model relying solely on meta information associated with the images, not on image features.

## Key findings

- Achieved higher age prediction accuracy from brain MRI data.
- Required only a fraction of the total data for training.
- Validated on 7,250 subjects across 10 datasets.

## Abstract

With the availability of big medical image data, the selection of an adequate training set is becoming more important to address the heterogeneity of different datasets. Simply including all the data not only incurs high processing costs but can even harm the prediction. We formulate the smart and efficient selection of a training dataset from big medical image data as a multi-armed bandit problem, solved by Thompson sampling. Our method assumes that image features are not available at the time of the selection of the samples, and therefore relies only on meta information associated with the images. Our strategy simultaneously exploits data sources with high chances of yielding useful samples and explores new data regions. For our evaluation, we focus on the application of estimating the age from a brain MRI. Our results on 7,250 subjects from 10 datasets show that our approach leads to higher accuracy while only requiring a fraction of the training data.
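
The exploit-and-explore behavior described in the abstract can be illustrated with a minimal Thompson sampling sketch. This is not the authors' implementation: it assumes each arm is a data source (or a group of samples sharing meta information), that pulling an arm yields a binary reward indicating whether the drawn sample was useful, and that each arm's success rate has a Beta posterior. The names `bernoulli_arm`, `thompson_select`, and the arm labels are hypothetical, and the simulated success probabilities stand in for a real usefulness signal that the paper defines in its full text.

```python
import random


def bernoulli_arm(p):
    """Hypothetical simulated data source: yields a useful sample with probability p."""
    def pull(rng):
        return 1 if rng.random() < p else 0
    return pull


def thompson_select(arms, n_rounds, seed=0):
    """Beta-Bernoulli Thompson sampling over a dict of arm name -> pull function.

    Assumption: rewards are binary (1 = useful sample, 0 = not useful).
    Returns the number of pulls per arm and the Beta posterior parameters.
    """
    rng = random.Random(seed)
    # Start every arm from a uniform Beta(1, 1) prior.
    alpha = {name: 1.0 for name in arms}
    beta = {name: 1.0 for name in arms}
    pulls = {name: 0 for name in arms}

    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its current posterior.
        draws = {name: rng.betavariate(alpha[name], beta[name]) for name in arms}
        # Pull the arm whose sampled rate is highest. Uncertain arms still get
        # chosen occasionally, which provides the exploration.
        chosen = max(draws, key=draws.get)
        reward = arms[chosen](rng)
        # Update the posterior of the pulled arm only.
        alpha[chosen] += reward
        beta[chosen] += 1 - reward
        pulls[chosen] += 1

    return pulls, alpha, beta


if __name__ == "__main__":
    sources = {
        "source_a": bernoulli_arm(0.8),
        "source_b": bernoulli_arm(0.3),
        "source_c": bernoulli_arm(0.1),
    }
    pulls, alpha, beta = thompson_select(sources, n_rounds=500)
    print(pulls)
```

In this toy setup most pulls concentrate on the source with the highest simulated success rate, while the other sources are still sampled a few times. Note that the selection step never inspects the samples themselves, only the arm identity, which mirrors the paper's constraint that image features are unavailable at selection time.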

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.08111/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1705.08111/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1705.08111/full.md

---
Source: https://tomesphere.com/paper/1705.08111