# Decoding Clinician Authorial Style: A Style-Informed Pipeline for Clinical Document Summary Generation with Large Language Models

**Authors:** Scott Zhao, Abbas Alili, Usman Afzaal, Muhammet F. Demir, Hao Lu, Padageshwar Sunkara, Metin N. Gurcan

PMC · DOI: 10.21203/rs.3.rs-9054955/v1 · Research Square · 2026-03-26

## TL;DR

This paper introduces a pipeline that personalizes clinical summaries generated by large language models to match individual clinicians' writing styles, reducing the need for post-editing.

## Contribution

A style-informed generation framework that extracts clinician-specific stylistic features through LLM feedback, then applies a Train→Generate paradigm to produce summaries personalized to each clinician.

## Key findings

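- LLM-guided feature extraction, focused on rhythm, narration, and sentence or list structure, raised authorship classification accuracy to as high as 73%.
- In blinded clinician A/B tests, the Gemini 2.5 Pro pipeline produced drafts preferred at rates comparable to, and in some cases exceeding, clinician-authored summaries; GPT-4-generated drafts were preferred less often than the original notes.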
- High-fidelity prompt engineering mitigated hallucination risks while adhering to source-only data constraints.
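
The summary does not reproduce the authors' prompts, so the following is only a minimal sketch of how a style profile and a source-only constraint might be combined into a single generation prompt. The function name `build_style_prompt`, the prompt wording, and the example profile values are all hypothetical; only the feature categories (rhythm, narration, sentence or list structure) come from the abstract.

```python
def build_style_prompt(style_profile: dict[str, str], source_text: str) -> str:
    """Assemble a generation prompt from a clinician style profile.

    Hypothetical helper: the paper's actual prompts are not shown in this
    summary, so the wording below is an illustrative assumption.
    """
    # One bullet per extracted style feature, e.g. "- rhythm: short sentences".
    style_lines = "\n".join(
        f"- {name}: {description}" for name, description in style_profile.items()
    )
    return (
        "You are drafting a clinical summary.\n"
        "Match this clinician's writing style:\n"
        f"{style_lines}\n"
        # Source-only constraint, intended to limit hallucinated content.
        "Use only facts stated in the SOURCE below. "
        "If a detail is not in the SOURCE, leave it out.\n"
        "SOURCE:\n"
        f"{source_text}\n"
    )


# Illustrative profile; the values are invented for the example.
profile = {
    "rhythm": "short declarative sentences",
    "narration": "third person, past tense",
    "structure": "bulleted problem list",
}
prompt = build_style_prompt(profile, "Patient admitted with chest pain.")
```

Keeping the style profile and the source text in separate, labeled parts of the prompt makes it easy to swap in a different clinician's profile without touching the source-only instruction.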

## Abstract

Large language models (LLMs) can automate clinical document summary generation. However, even clinically accurate outputs often fail to reflect individual clinicians’ writing styles, leading to substantial post-editing. We examine this stylistic gap using a multi-author corpus of de-identified clinical summaries. We propose a style-informed generation framework that extracts clinician-specific stylistic features through LLM feedback and applies a Train→Generate paradigm to produce personalized clinical summaries. Conventional metrics (ROUGE, BERTScore, cosine similarity) largely failed to distinguish intra-author from inter-author writing patterns, while Jaro-Winkler and BLEU demonstrated limited sensitivity. Targeted LLM-guided feature extraction (emphasizing rhythm, narration, and sentence or list structure) raised authorship classification accuracy to as high as 73%. In blinded clinician A/B testing, GPT-4-generated drafts were preferred less often than original notes, whereas the Gemini 2.5 Pro pipeline produced drafts preferred at rates comparable to, and in some cases exceeding, clinician-authored summaries. Hallucination remained an inherent risk, but it was mitigated through high-fidelity prompt engineering and explicit adherence to source-only data constraints. These results suggest that style-informed generation can narrow the style gap and produce clinically acceptable summaries that better align with the clinician’s voice.
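
The intra-author versus inter-author comparison described above can be illustrated with a small sketch. It assumes a simple bag-of-words cosine similarity as the metric and a hypothetical `notes_by_author` mapping; the paper's actual metric implementations and corpus are not shown in this summary. If the mean intra-author score is not clearly higher than the mean inter-author score, the metric is not capturing authorial style.

```python
from collections import Counter
from itertools import combinations
import math
import re


def tokens(text: str) -> list[str]:
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def style_gap(notes_by_author: dict[str, list[str]], metric=cosine):
    """Return (mean intra-author score, mean inter-author score)."""
    labelled = [
        (author, note)
        for author, notes in notes_by_author.items()
        for note in notes
    ]
    intra, inter = [], []
    # Score every unordered pair of notes and bucket it by authorship.
    for (a1, n1), (a2, n2) in combinations(labelled, 2):
        (intra if a1 == a2 else inter).append(metric(n1, n2))

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return mean(intra), mean(inter)


# Toy corpus, invented for illustration only.
toy = {"A": ["x y", "x y"], "B": ["z w", "z w"]}
intra_mean, inter_mean = style_gap(toy)
```

Any other pairwise metric with the same `(str, str) -> float` shape can be passed as `metric`, which makes it straightforward to compare how well different metrics separate the two groups.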

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13042165/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13042165/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/PMC13042165/full.md

---
Source: https://tomesphere.com/paper/PMC13042165