# Capturing The Pain Experience: Creation and Testing of an AI-Driven Pain Assessment App

**Authors:** Marcia Shade, Mrinal Rawool, Stephen Scott

PMC · DOI: 10.1093/geroni/igaf122.3724 · Innovation in Aging · 2025-12-31

## TL;DR

This paper introduces an AI-driven app to assess pain in older adults. In early testing, the model generated pain assessment questions with ROUGE scores of 0.5–0.6 and a BLEU score of 0.62, and responded in about 4 seconds per request.

## Contribution

The novel contribution is the creation and testing of an AI model integrated into a mobile app for remote pain assessment in older adults.

## Key findings

- The model achieved ROUGE scores of 0.5–0.6 and a BLEU score of 0.62 in synthetic conversation testing.
- The average response time was 4 seconds per request and was only slightly affected by the length of the user's spoken message.
- One app crash was documented, and the grammatical error rate in transcribed user utterances was 1.2%.
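
For readers unfamiliar with the overlap metrics in the first finding, the sketch below shows how unigram-level ROUGE and BLEU scores can be computed for one generated question against a reference. It is a minimal illustration, not the authors' evaluation pipeline: the source does not state which ROUGE or BLEU variants were used, so ROUGE-1 F1 and a unigram-only BLEU with brevity penalty are assumptions, as are the simple tokenizer and the `candidate` sentence. Only the `reference` question is taken from the abstract.

```python
from collections import Counter
import math


def tokenize(text: str) -> list[str]:
    # Simplifying assumption: lowercase, split on whitespace, strip edge punctuation.
    tokens = (t.strip(".,?!;:").lower() for t in text.split())
    return [t for t in tokens if t]


def rouge1_f1(reference: str, candidate: str) -> float:
    # ROUGE-1 F1: harmonic mean of unigram precision and recall.
    ref, cand = Counter(tokenize(reference)), Counter(tokenize(candidate))
    overlap = sum((ref & cand).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def bleu1(reference: str, candidate: str) -> float:
    # Unigram-only BLEU: clipped precision scaled by the brevity penalty.
    ref_tokens, cand_tokens = tokenize(reference), tokenize(candidate)
    if not cand_tokens:
        return 0.0
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    precision = overlap / len(cand_tokens)
    if len(cand_tokens) >= len(ref_tokens):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref_tokens) / len(cand_tokens))
    return brevity_penalty * precision


# Reference question quoted in the abstract.
reference = "Is the pain worse at certain times of the day, like mornings or evenings?"
# Hypothetical model output, invented for illustration.
candidate = "Is the pain worse in the mornings or evenings?"

print(f"ROUGE-1 F1: {rouge1_f1(reference, candidate):.2f}")
print(f"BLEU-1:     {bleu1(reference, candidate):.2f}")
```

Both scores range from 0 to 1, with higher values indicating more word overlap with the reference. Standard BLEU also combines higher-order n-gram precisions, so a production evaluation would normally rely on an established library rather than this unigram simplification.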

## Abstract

Pain is undertreated and poorly managed in older adults. Pain symptoms may not be reported or identified, and time is limited in clinical practice to address pain among multiple comorbid conditions. Can artificial intelligence assist? The purpose of this project was to create and test a machine learning model (MLM) that remotely captures the biopsychosocial pain experience.

Mock interaction data were used to fine-tune a pre-trained model. The model was integrated into a mobile software application’s architecture and deployed for testing. Metrics were collected on the MLM’s response time, app crashes, and the grammatical accuracy of transcription.

The MLM generated pain assessment questions such as "Is the pain worse at certain times of the day, like mornings or evenings?" During the initial model creation, synthetic conversation testing yielded ROUGE scores of 0.5–0.6 and a BLEU score of 0.62. During testing, the MLM’s average response time was 4 seconds per request. Response time was affected slightly by the length of the user's spoken message. One app crash was documented. The MLM's output was grammatically accurate; however, the grammatical error rate in the transcription of user utterances was 1.2%.

These findings demonstrate the promising metrics from iterative development, deployment, and testing of an MLM designed to engage with older adults with chronic pain. The response time underscores the preliminary efficiency of this integrated system. Continued refinement of the model with performance and user experience testing in real-world settings will help optimize this solution.
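
The response-time result (a mean of about 4 seconds, with a slight dependence on spoken message length) could be examined with a summary like the one sketched below. This is an illustrative assumption about how such request logs might be analyzed, not the authors' procedure. The function name `summarize_latency`, the log format, and the sample values are all hypothetical and are not data from the study.

```python
from statistics import mean


def summarize_latency(records: list[tuple[int, float]]) -> dict[str, float]:
    """Summarize (message length in words, response time in seconds) pairs."""
    lengths = [n for n, _ in records]
    times = [t for _, t in records]
    mean_len, mean_time = mean(lengths), mean(times)
    # Ordinary least-squares slope: extra seconds per additional spoken word.
    var_len = sum((n - mean_len) ** 2 for n in lengths)
    cov = sum((n - mean_len) * (t - mean_time) for n, t in records)
    slope = cov / var_len if var_len else 0.0
    return {"mean_seconds": mean_time, "seconds_per_word": slope}


# Hypothetical log entries, invented for illustration; not data from the study.
sample_log = [(5, 3.8), (10, 3.9), (20, 4.1), (30, 4.2)]
summary = summarize_latency(sample_log)
print(summary)
```

A small positive `seconds_per_word` alongside a stable mean would correspond to the pattern the abstract describes, where longer utterances add only slightly to the response time.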

---
Source: https://tomesphere.com/paper/PMC12763418