# Multi-Task Classification for Improved Health Outcome Prediction Based on Environmental Indicators

**Authors:** MITRA ALIREZAEI, QUYNH C. NGUYEN, ROSS WHITAKER, TOLGA TASDIZEN

PMC · DOI: 10.1109/access.2023.3295777 · 2024-02-23

## TL;DR

This paper improves health outcome predictions by using multi-task learning on neighborhood environment data from Google Street View and Flickr images.

## Contribution

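The paper trains a multi-task classifier that treats classification of inexpensive Flickr images as a related task. This improves the accuracy of Google Street View (GSV) image classification when labeled GSV data is limited, which in turn supports health outcome prediction.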

## Key findings

- Multi-task learning raises the accuracy, F1 score, and balanced accuracy of GSV image classification by up to 6% compared to single-task learning.
- Regression on indicators detected with multi-task learning improves R² values for health outcomes by up to 4%.
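
The metrics named in these findings can be illustrated with a short sketch. It assumes the standard textbook definitions of accuracy, F1 score, balanced accuracy, and R² for binary labels and real-valued targets; this summary does not state the exact definitions the authors used, and the function names and toy data below are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the positive class and recall on the negative class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: an imbalanced indicator (3 positives out of 10 images).
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
preds = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

if __name__ == "__main__":
    print(accuracy(labels, preds), f1_score(labels, preds),
          balanced_accuracy(labels, preds))
```

On imbalanced indicators such as this toy example, plain accuracy can look reasonable while F1 and balanced accuracy stay low, which is one plausible reason to report all three.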

## Abstract

This paper aims to address the challenges associated with evaluating the impact of neighborhood environments on health outcomes. Google Street View (GSV) images provide a valuable tool for assessing neighborhood environments on a large scale. By annotating the GSV images with labels indicating the presence or absence of specific neighborhood features, we can develop classifiers capable of automatically analyzing and evaluating the environment. However, labeling GSV images for this purpose is time-consuming and labor-intensive. To overcome this challenge, we propose using a multi-task classifier to enhance the training of classifiers with limited supervised GSV data. Our multi-task classifier utilizes readily available, inexpensive online images collected from Flickr as a related classification task. The hypothesis is that a classifier trained on multiple related tasks is less likely to overfit to small amounts of training data and generalizes better to unseen data. We leverage the power of multiple related tasks to improve the classifier’s overall performance and generalization capability. Here we show that, with the proposed learning paradigm, predicted labels for GSV test images are more accurate. Across different environment indicators, the accuracy, F1 score, and balanced accuracy increase by up to 6% in the multi-task learning framework compared to its single-task learning counterpart. The enhanced accuracy of the predicted labels obtained through the multi-task classifier contributes to a more reliable and precise regression analysis determining the correlation between predicted built environment indicators and health outcomes. The R² values calculated for different health outcomes improve by up to 4% using indicators detected with multi-task learning.
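
The multi-task idea in the abstract (a shared representation trained on a small labeled GSV set together with a larger, cheaper Flickr set) can be sketched on toy data. This is a minimal illustration under stated assumptions, not the authors' model: the paper works on images, while this sketch uses 2-D synthetic feature vectors, one shared tanh layer, and one logistic head per task. The class `MultiTaskNet`, the task names, the `aux_weight` parameter, and the synthetic data are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class MultiTaskNet:
    """Shared hidden layer with one logistic output head per task (toy sketch)."""

    def __init__(self, n_in, n_hidden, tasks, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [0.0] * n_hidden
        self.heads = {t: {"v": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
                          "c": 0.0} for t in tasks}

    def hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bj)
                for row, bj in zip(self.W, self.b)]

    def predict(self, x, task):
        head = self.heads[task]
        h = self.hidden(x)
        return sigmoid(sum(v * hj for v, hj in zip(head["v"], h)) + head["c"])

    def sgd_step(self, x, y, task, lr=0.1, weight=1.0):
        """One SGD step on the weighted cross-entropy loss of a single task."""
        head = self.heads[task]
        h = self.hidden(x)
        p = sigmoid(sum(v * hj for v, hj in zip(head["v"], h)) + head["c"])
        g = weight * (p - y)  # gradient of the weighted loss w.r.t. the logit
        for j, hj in enumerate(h):
            g_pre = g * head["v"][j] * (1.0 - hj * hj)  # backprop through tanh
            head["v"][j] -= lr * g * hj      # task-specific head update
            self.b[j] -= lr * g_pre          # shared layer update
            for i, xi in enumerate(x):
                self.W[j][i] -= lr * g_pre * xi
        head["c"] -= lr * g


def make_task(n, rng):
    """Synthetic stand-in for image features: label is 1 when x0 + x1 > 0."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1.0 if x[0] + x[1] > 0 else 0.0))
    return data


def mean_bce(net, data, task):
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(net.predict(x, task), eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


def train_demo(epochs=200, aux_weight=0.5, seed=0):
    """Train on a small 'gsv' set and a larger 'flickr' set; return GSV loss before/after."""
    rng = random.Random(seed)
    gsv = make_task(8, rng)       # scarce labeled data (main task)
    flickr = make_task(40, rng)   # plentiful related data (auxiliary task)
    net = MultiTaskNet(2, 4, ["gsv", "flickr"], seed=seed)
    before = mean_bce(net, gsv, "gsv")
    for _ in range(epochs):
        for x, y in gsv:
            net.sgd_step(x, y, "gsv", weight=1.0)
        for x, y in flickr:
            net.sgd_step(x, y, "flickr", weight=aux_weight)
    return before, mean_bce(net, gsv, "gsv")


if __name__ == "__main__":
    print(train_demo())
```

Both tasks update the shared layer while each keeps its own output head, so the auxiliary Flickr examples shape the representation that the GSV head relies on. The down-weighting of the auxiliary loss through `aux_weight` is a design choice of this sketch; the summary does not say how the paper balances the two tasks.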

## Full-text entities

- **Diseases:** CLASSIFICATION MODEL (MESH:D008310), SINGLE (MESH:D012640), neglect (MESH:D058069), MTL (MESH:D007859), smoking (MESH:D015208), MULTI-TASK (MESH:D015161), chronic disease (MESH:D002908), obesity (MESH:D009765), blood pressure (MESH:D006973), stroke (MESH:D020521), cancer (MESH:D009369), MODELS (MESH:D004195), depression (MESH:D003866), REGRESSION MODELS (MESH:C537770), Physical disorder (MESH:D059445), diabetes (MESH:D003920), HEALTH (OMIM:603663), arthritis (MESH:D001168), IMAGES (MESH:C564543)
- **Chemicals:** fences (-), cholesterol (MESH:D002784)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10888441/full.md

---
Source: https://tomesphere.com/paper/PMC10888441