# Idea density for predicting Alzheimer's disease from transcribed speech

**Authors:** Kairit Sirts, Olivier Piguet, Mark Johnson

arXiv: 1706.04473 · 2017-06-15

## TL;DR

This paper introduces DEPID and DEPID-R, dependency-based methods for computing propositional idea density, and compares them with semantic idea density for classifying Alzheimer's disease from transcribed speech on both closed-topic and free-recall datasets.

## Contribution

Develops DEPID and DEPID-R, novel methods for automatic propositional idea density calculation, and compares their effectiveness with semantic idea density in AD diagnosis.

## Key findings

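- On the normative (closed-topic) dataset, SID performs better than PID, but adding PID yields a small but significant improvement of 1.7 F-score points.
- On the free-topic dataset, PID outperforms SID (77.6 vs. 72.3 F-score).
- On the free-topic dataset, adding features derived from the word embedding clustering that underlies the automatic SID raises the F-score to 84.8.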

## Abstract

Idea Density (ID) measures the rate at which ideas or elementary predications are expressed in an utterance or in a text. Lower ID is found to be associated with an increased risk of developing Alzheimer's disease (AD) (Snowdon et al., 1996; Engelman et al., 2010). ID has been used in two different versions: propositional idea density (PID) counts the expressed ideas and can be applied to any text, while semantic idea density (SID) counts pre-defined information content units and is naturally more applicable to normative domains, such as picture description tasks. In this paper, we develop DEPID, a novel dependency-based method for computing PID, and its version DEPID-R, which can exclude repeated ideas, a feature characteristic of AD speech. We conduct the first comparison of automatically extracted PID and SID in the diagnostic classification task on two different AD datasets covering both closed-topic and free-recall domains. While SID performs better on the normative dataset, adding PID leads to a small but significant improvement (+1.7 F-score). On the free-topic dataset, PID performs better than SID as expected (77.6 vs 72.3 in F-score), but adding the features derived from the word embedding clustering underlying the automatic SID increases the results considerably, leading to an F-score of 84.8.
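
As a rough illustration of the definition above, propositional idea density can be sketched as the number of counted propositions divided by the number of words. The sketch below assumes the text has already been dependency-parsed into tokens. The `Token` type, the `PROPOSITION_LABELS` set, and the `idea_density` function are hypothetical names chosen for this example; the label set in particular is an assumption and is not the rule set used by DEPID. The `unique_only` option only mimics the idea behind DEPID-R by counting each distinct proposition once.

```python
from typing import NamedTuple, Sequence


class Token(NamedTuple):
    word: str  # surface form of the token
    dep: str   # dependency label of the arc to its head
    head: str  # surface form of the head word


# Hypothetical label set, for illustration only (NOT the paper's DEPID rules).
PROPOSITION_LABELS = frozenset({"nsubj", "amod", "advmod", "prep"})


def idea_density(tokens: Sequence[Token], unique_only: bool = False) -> float:
    """Return counted propositions divided by the number of tokens.

    With unique_only=True, a repeated (label, word, head) triple is counted
    once, a simplified stand-in for excluding repeated ideas.
    """
    if not tokens:
        return 0.0
    props = [
        (t.dep, t.word.lower(), t.head.lower())
        for t in tokens
        if t.dep in PROPOSITION_LABELS
    ]
    count = len(set(props)) if unique_only else len(props)
    return count / len(tokens)


# Toy input: the clause "the old dog barks" uttered twice.
clause = [
    Token("the", "det", "dog"),
    Token("old", "amod", "dog"),
    Token("dog", "nsubj", "barks"),
    Token("barks", "root", "barks"),
]
sample = clause + clause

print(idea_density(sample))                    # 4 propositions / 8 tokens
print(idea_density(sample, unique_only=True))  # 2 distinct propositions / 8 tokens
```

In this toy input the repetition halves the density once duplicates are excluded, which is the kind of effect a repetition-aware measure is meant to capture.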

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1706.04473/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1706.04473/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1706.04473/full.md

---
Source: https://tomesphere.com/paper/1706.04473