# Longitudinal and Multimodal Recording System to Capture Real-World Patient-Clinician Conversations for AI and Encounter Research: Protocol for an Observational Study

**Authors:** Misk Al Zahidy, Kerly Guevara Maldonado, Luis Vilatuna Andrango, Ana Cristina Proano, Ana Gabriela Claros, Maria Lizarazo Jimenez, Esteban Gomez-Alvarez, David Toro-Tobon, Victor M Montori, Oscar J Ponce-Ponte, Juan P Brito

PMC · DOI: 10.2196/84688 · JMIR Research Protocols · 2026-03-24

## TL;DR

This protocol describes a system that records patient-clinician encounters on 360° video or audio and links the recordings with postvisit surveys and EHR data, creating a dataset for AI research in health care.

## Contribution

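A longitudinal, multimodal system that captures patient-clinician encounters and links the recordings with EHR data and postvisit surveys is designed, implemented, and evaluated for feasibility at a single academic outpatient endocrinology clinic.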

## Key findings

- Consent was obtained from 35 of 36 (97%) eligible clinicians and 212 of 281 (75%) approached eligible patients.
- 162 of 212 (76%) consented encounters resulted in a complete 360° video recording.
- Postvisit surveys were completed for 204 of 212 (96%) consented encounters.

## Abstract

The promise of artificial intelligence (AI) in medicine depends on its ability to learn from data that reflect what matters to patients and clinicians in the care process. Most existing models are trained on electronic health records (EHRs), which primarily capture biological measures but rarely the interactions and relationships between patients and clinicians. These relationships, central to how care is understood, negotiated, and delivered, unfold across multiple modalities, including voice, text, and video, yet remain largely absent from current datasets. As a result, AI systems trained solely on EHRs risk perpetuating a narrow biomedical view of medicine and overlooking the lived exchanges that define clinical encounters.

This study aims to design, implement, and evaluate the feasibility of a longitudinal, multimodal system for capturing patient-clinician encounters, linking 360° video or audio recordings with postvisit surveys and EHR data, to create a foundational dataset for downstream AI research.

This single-site study was conducted in an academic outpatient specialty clinic (Division of Endocrinology, Mayo Clinic, Rochester, Minnesota, United States). Adult patients attending in-person visits with participating clinicians were invited to enroll. Encounters were recorded using a 360° 2D monocular video camera and dual-channel audio. After each visit, patients completed a brief survey assessing relational empathy, satisfaction, visit pace, and treatment burden. Demographic and clinical data were extracted from the EHR. Feasibility was assessed using 5 prespecified end points: clinician consent, patient consent, recording success, survey completion, and data linkage across modalities.
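The fifth end point, data linkage across modalities, can be pictured as a completeness check per encounter. The following is only a minimal sketch under stated assumptions: the record layout, the `encounter_id` key, and the modality labels are hypothetical and are not taken from the study's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncounterRecord:
    encounter_id: str  # hypothetical study identifier shared across data sources
    modality: str      # hypothetical label: "recording", "survey", or "ehr"


# Assumption: an encounter counts as linked when all three sources are present.
REQUIRED_MODALITIES = frozenset({"recording", "survey", "ehr"})


def linkage_status(records):
    """Map each encounter_id to the set of required modalities it is missing."""
    seen = {}
    for rec in records:
        seen.setdefault(rec.encounter_id, set()).add(rec.modality)
    return {eid: REQUIRED_MODALITIES - mods for eid, mods in seen.items()}


def fully_linked(records):
    """Return the sorted encounter_ids that have every required modality."""
    status = linkage_status(records)
    return sorted(eid for eid, missing in status.items() if not missing)


# Made-up example rows, for illustration only (not study data).
records = [
    EncounterRecord("E001", "recording"),
    EncounterRecord("E001", "survey"),
    EncounterRecord("E001", "ehr"),
    EncounterRecord("E002", "recording"),
    EncounterRecord("E002", "survey"),
    EncounterRecord("E003", "survey"),
]

if __name__ == "__main__":
    print(fully_linked(records))
    print(linkage_status(records))
```

Keeping the missing modalities per encounter, rather than a single yes/no flag, makes it easier to see whether gaps come from failed recordings, incomplete surveys, or EHR extraction.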

Recruitment began in January 2025. By August 2025, 35 of 36 (97%) eligible clinicians and 212 of 281 (75%) approached eligible patients had consented. Of the 212 consented encounters, 162 (76%) resulted in a complete 360° video recording, and postvisit surveys were completed for 204 (96%), with 1 survey per encounter. Data collection is ongoing as of December 2025, and further analyses will be reported in subsequent publications.
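The reported percentages can be checked directly from the counts given above. This is a small arithmetic sketch, not the authors' analysis code; the dictionary keys are illustrative names, and the video denominator is assumed to be the 212 consented encounters, which is consistent with the reported 76%.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return numerator/denominator as a percentage rounded to a whole number."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return round(100 * numerator / denominator)


# Counts as reported in the abstract; key names are illustrative only.
FEASIBILITY_COUNTS = {
    "clinician_consent": (35, 36),
    "patient_consent": (212, 281),
    "complete_360_video": (162, 212),  # assumed denominator: consented encounters
    "survey_completion": (204, 212),
}

percentages = {name: percent(n, d) for name, (n, d) in FEASIBILITY_COUNTS.items()}

if __name__ == "__main__":
    for name, value in percentages.items():
        n, d = FEASIBILITY_COUNTS[name]
        print(f"{name}: {n}/{d} = {value}%")
```

Data linkage, the fifth feasibility end point, has no counts in this summary and is therefore left out of the calculation.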

This protocol describes a longitudinal, multimodal encounter capture system that links 360° video or audio recordings with postvisit surveys and EHR data. The study specifies operational definitions, workflows, feasibility end points, and governance procedures to support implementation and replication in other clinical settings.

## Full-text entities

- **Diseases:** disease (MESH:D004194), cognitive impairment (MESH:D003072), diabetes (MESH:D003920)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13012220/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13012220/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/PMC13012220/full.md

---
Source: https://tomesphere.com/paper/PMC13012220