Maximizing the clinical utility and performance of cytology samples for comprehensive genetic profiling – A report on the impact of process optimization through the analysis of 4,871 cytology samples profiled by MSK-IMPACT
David Kim, Chad Vanderbilt, Soo-Ryum Yang, Subhiksha Nandakumar, Khedoudja Nafa, Rusmir Feratovic, Natasha Rekhtman, Ivelise Rijo, Jacklyn Casanova, Anita Yun, Angela Rose Brannon, Michael Berger, Marc Ladanyi, Oscar Lin, Maria Arcila

TL;DR
This study shows how optimizing processes can improve genetic profiling from small cytology samples, achieving high success rates comparable to surgical samples.
Contribution
The study introduces optimized strategies for maximizing genetic profiling success from cytology samples, particularly using residual supernatant cell-free DNA.
Findings
Cytology samples achieved up to 93% success rates in identifying genomic alterations with full optimization.
Residual supernatant cell-free DNA (ScfDNA) provided negligible contamination and successful results in 71% of depleted cases.
Cell block samples showed low-level cross-contamination in 4.7% of cases, suggesting the need for improved quality control.
Abstract
Comprehensive molecular profiling by next generation sequencing (NGS) has revolutionized tumor classification and biomarker evaluation. However, routine implementation is challenged by the scant nature of diagnostic material obtained through minimally invasive procedures. Here, we describe our long-term experience in profiling cytology samples with an in-depth assessment of the performance, quality metrics, biomarker identification capabilities, and potential pitfalls. We highlight the impact of several optimization strategies to maximize performance with 4,871 prospectively sequenced clinical cytology samples tested by MSK-IMPACT™. Special emphasis is given to the use of residual supernatant cell free DNA (ScfDNA) as a valuable source of tumor DNA. Overall, cytology samples were similar in performance to surgical samples in identifying clinically relevant genomic alterations, achieving…
Introduction
Comprehensive tumor molecular profiling using next generation sequencing (NGS) technology is steadily increasing in routine oncologic practice, in order to guide precise disease classification and the selection of targeted therapies^1,2^. Concurrently, minimally invasive procedures have also steadily and systematically become a dominant tumor sampling modality. Despite indisputable patient benefits, the amount of tissue procured through such procedures is limited, raising concerns on their sufficiency and suitability for comprehensive downstream analysis.
Cytologic specimens are among the most limited tissue samples obtained through minimally invasive procedures. Because they are often the only source of tumor for both diagnostic and biomarker evaluation, judicious protocols for tissue testing have been explored to maximize the material available for genetic studies. In this context, the performance of NGS across various preparations, including cell blocks (CB), smears, and liquid-based suspensions, has been studied and described in the literature, primarily focusing on small gene panels (< 100 genes)^3–11^. However, with the increasing need for the assessment of a wider range of genetic alterations, the sufficiency and robust performance for comprehensive NGS assays have remained a major concern.
To address this, our pathology department embarked on a comprehensive performance improvement project which involved several years of sequential process optimization to strategically improve the use of cytologic tissue samples for molecular profiling. This encompassed coordinated changes by the cytology lab, including the use of a modified HistoGel-based cell block processing method to improve pellet density^12,13^, as well as changes in tissue processing in the diagnostic molecular lab, such as deparaffinization with mineral oil^14–16^, improved bead-based extraction techniques, implementation of dual index sequencing, and adjustments of minimum DNA input requirements (Supplementary Table 1). Throughout this time, we also implemented the use of residual cytology supernatant fluids as an additional source of tumor DNA for NGS applications. Commonly discarded in routine cytology practice, supernatant fluids contain variable amounts of DNA from fragmented cells as well as whole cells, hereafter denoted supernatant cell-free DNA (ScfDNA)^16^. The strategic use of this DNA enables the preservation of cellular tissue for other ancillary studies that rely on visual assessment of intact cells, such as immunohistochemistry and cytogenetics.
This study presents our overall clinical experience using cytologic material for comprehensive NGS, integrating the use of ScfDNA as a rescue sample when other material is unavailable. To our knowledge, this is the largest cytology cohort to date, including the largest cohort of residual supernatant fluid, across a wide range of tumor types. We re-analyzed all cytology sample data collected from our institution-wide prospective sequencing effort using the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT^™^) assay, an FDA-cleared, paired tumor-normal hybridization-capture based NGS test designed to comprehensively assess mutations, copy number alterations, and select rearrangements^17^. A summary of the performance characteristics across years of process optimization is presented, describing the utility and potential pitfalls of cytology samples for the identification of clinically relevant biomarkers, with comparisons to existing sequencing data from biopsies and resections from the same patients.
Materials and methods
Patient consent and cohort selection
The prospectively maintained database of samples submitted for sequencing using our institution’s large-panel NGS assay (MSK-IMPACT^™^) between 2014 and August 2022 was queried to identify all cytology samples; this included all requests on samples deemed to be malignant by morphologic assessment prior to their assessment of suitability for sequencing. MSK-IMPACT^™^ testing was ordered by the treating physician to identify clinically significant genomic alterations for the clinical care of patients with cancer. Patients receiving testing signed a clinical consent form and were enrolled on an institutional IRB-approved research protocol (MSKCC; NCT01775072). Additionally, following consent, a patient blood draw was obtained as a source of normal (germline) DNA. Basic demographic data (age and sex) and any existing pre-analytic information, including the type of preparation, tissue source, tumor type, tumor content, and DNA yield, were collected. All sequencing data, encompassing QC metrics, sequencing coverage, somatic variants identified, and variant allele frequency (VAF), were gathered, as well as sequencing qualification (pass, fail, reason for failure) as established at the time of clinical signout. When available, the same sequencing metrics and information described above were also collected for corresponding biopsy and resection samples from the same patient tumor, to compare results side-by-side. All samples were collected with informed consent and testing was performed in our CLIA-certified laboratory. This study was approved by the MSKCC Institutional Review Board/Privacy Board.
Cytology sample processing
Cytologic samples were received as formalin fixed paraffin embedded (FFPE) tissue sections from cell block (CB) preparations or as supernatants. Samples were collected either in CytoLyt fixative (Hologic, Marlborough, MA, USA) or in 10% neutral buffered formalin fixative and were paraffin embedded. CB preparation for MSKCC procured samples followed a modified HistoGel-based cell block preparation method as previously described^12,13^. Procedural details of externally procured samples (cases submitted for review at MSKCC for diagnosis confirmation and IMPACT testing) were not available. For each case, 20 unstained sections (5 µm thick) were submitted mounted on glass slides, along with a hematoxylin and eosin–stained (H&E) section to assess adequacy and tumor fraction. Macro-dissection was performed to enrich for tumor, when possible and necessary, aiming for > 50% tumor cell content. Samples were rejected (failed) if the tumor proportion was < 10% and the sample was not amenable to manual enrichment.
For MSKCC samples that were collected in CytoLyt fluid, residual material was saved after the ThinPrep^®^ and CB were prepared. The corresponding ThinPrep^®^ slide was assessed as a surrogate for tumor presence and content. If tumor cells were present at ≥ 10% based on visual inspection, the supernatant was considered suitable and DNA was extracted. Further details of the processing of supernatants for ScfDNA extraction are described in a previous publication^16^.
Extraction procedures:
FFPE material was deparaffinized using Citrasolv (2014 to March 2016) or mineral oil (March 2016 to 2022) and DNA was extracted using the Chemagic STAR DNATissue-10 Kit (Perkin Elmer, Waltham, MA) with the magnetic-bead method automated on a Chemagic STAR Standard Solutions Workstation (Hamilton, Bonaduz, GR, Switzerland), following manufacturer’s protocols. DNA from supernatants was extracted using the same kit and automated system, omitting deparaffinization and replacing the overnight lysis incubation at 56°C with a 1-hour lysis incubation at 56°C.
Extracted DNA was eluted and quantified using a Qubit DNA high-sensitivity assay kit (Life Technologies, Carlsbad, CA). FFPE samples with a DNA concentration of < 0.9 ng/µL were deemed insufficient for further testing until 09/2021, when the threshold to proceed with sequencing was lowered to 0.54 ng/µL; these thresholds translate to minimal total inputs of approximately 50 and 30 ng, respectively, based on the maximal input volume for the assay of 55 µL. ScfDNA samples were sequenced below these thresholds, aiming to spare the patient from a future biopsy and to further evaluate performance characteristics.
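The relationship between the concentration thresholds and the total input can be checked with a short calculation. The sketch below is illustrative only; the function and constant names are hypothetical and not part of the assay documentation.

```python
# Illustrative arithmetic only; names are hypothetical, values are from the text.
MAX_INPUT_VOLUME_UL = 55.0  # maximal input volume for the assay (µL)


def total_input_ng(concentration_ng_per_ul, volume_ul=MAX_INPUT_VOLUME_UL):
    """Total DNA mass (ng) available at a given concentration and volume."""
    return concentration_ng_per_ul * volume_ul


def is_sufficient(concentration_ng_per_ul, threshold_ng_per_ul):
    """A sample is insufficient when its concentration is below the threshold."""
    return concentration_ng_per_ul >= threshold_ng_per_ul


# Threshold before 09/2021 (0.9 ng/µL) and after (0.54 ng/µL)
print(round(total_input_ng(0.9), 1))    # about 50 ng
print(round(total_input_ng(0.54), 1))   # about 30 ng
print(is_sufficient(0.7, 0.9), is_sufficient(0.7, 0.54))
```

A sample at 0.7 ng/µL, for instance, would have been rejected under the earlier threshold but would proceed to sequencing under the later one.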
Next Generation Sequencing
DNA was sheared and processed (along with matched DNA from blood as normal control) to generate bar-coded libraries which were pooled and subjected to targeted capture using custom-designed probes as previously detailed^17^. All samples in this study underwent testing by MSK-IMPACT^™^, targeting all coding regions of up to 505 genes, select introns, and over 1,000 custom intergenic and intronic regions throughout the genome (centered on common SNPs). Updates to the panel sequentially increased the number of genes captured from 341 (year 2014) to 410 (years 2015–2016), 468 (years 2017–2020), and 505 (years 2021–2022). Captured DNA fragments were sequenced on an Illumina HiSeq2500 or NovaSeq 6000 system, before being submitted to the bioinformatics analysis pipeline for calling of somatic alterations. Clinical actionability and treatment associations of the genomic alterations detected were assessed and annotated using OncoKB^18^, MSK’s precision oncology knowledge base. Levels of evidence are assigned to each alteration based upon therapeutic levels of evidence specific to the tumor type profiled, including alterations predictive of resistance to a therapy. Results were compared to the previously published data for corresponding tumor types in the AACR Project GENIE Pan-Cancer Cohort^19^.
Tumor mutational burden (TMB) was calculated for each sample as the total number of nonsynonymous mutations, including driver mutations in oncogenes, normalized to the exonic coverage of the respective MSK-IMPACT panel in megabases (Mb).
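As a minimal sketch of this normalization, TMB can be written as a count divided by a panel footprint. The panel size used in the example (1.2 Mb) is a placeholder assumption for illustration, not a value reported in this study, and the function name is hypothetical.

```python
def tumor_mutational_burden(nonsynonymous_count, exonic_coverage_mb):
    """Nonsynonymous mutations per megabase of exonic panel coverage."""
    if exonic_coverage_mb <= 0:
        raise ValueError("exonic coverage must be a positive number of megabases")
    if nonsynonymous_count < 0:
        raise ValueError("mutation count cannot be negative")
    return nonsynonymous_count / exonic_coverage_mb


# Hypothetical sample: 9 nonsynonymous mutations on an assumed 1.2 Mb panel
ASSUMED_PANEL_MB = 1.2  # placeholder; the actual footprint depends on panel version
print(round(tumor_mutational_burden(9, ASSUMED_PANEL_MB), 2))
```

Because the panel grew from 341 to 505 genes over the study period, the denominator would need to match the panel version used for each sample.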
Samples were deemed clinically successful if they passed all quality control metrics defined for our assay (e.g. adequate tumor quantity, coverage, base quality, etc.) and were formally reviewed by a board-certified molecular pathologist before the report was released clinically.
Next generation sequencing quality metrics and contamination assessment
For QC purposes, in addition to standard NGS quality metrics, assessment for potential sample contamination was a critical component of our evaluation. Our established analysis pipelines compute pairwise genotype concordance across all SNP sites included in the panel. This genotype analysis, enabled by our paired tumor:normal sequencing approach, allows us to identify potential sample swaps and contamination, either due to the presence of DNA from another individual or contamination among different barcode adapters, which could lead to erroneous mutation calling. Contamination levels are defined by the analysis of SNP sites at which the patient is homozygous (based on the normal control profile). Because a homozygous site is defined by 2 identical alleles at the particular genetic locus, any allelic discrepancy where the variant is not expected indicates contamination. A cutoff of ≥ 2% is used to denote clinically significant contamination (the threshold for mutation calling). Samples with a contamination rate at or above this cutoff were evaluated in the context of the tumor content and mutation profile; samples with low-level contamination and very high tumor content remained partially evaluable by filtering variants within the range of contamination.
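The homozygous-site logic can be sketched as follows. This is a simplified illustration, not the pipeline used in the study: it assumes the contamination level is summarized as the mean fraction of unexpected reads across homozygous sites, and all names and read counts are hypothetical.

```python
from statistics import mean

CONTAMINATION_CUTOFF = 0.02  # the 2% cutoff stated in the text


def is_homozygous(genotype):
    """True when the normal-control genotype carries two identical alleles."""
    return len(genotype) == 2 and genotype[0] == genotype[1]


def contamination_fraction(sites):
    """Mean fraction of tumor reads carrying an unexpected allele at sites
    where the matched normal is homozygous.

    `sites` is a list of (normal_genotype, tumor_allele_counts) pairs.
    Returns None when no informative site is available.
    """
    fractions = []
    for genotype, tumor_counts in sites:
        if not is_homozygous(genotype):
            continue  # heterozygous sites are not informative here
        depth = sum(tumor_counts.values())
        if depth == 0:
            continue
        unexpected = depth - tumor_counts.get(genotype[0], 0)
        fractions.append(unexpected / depth)
    return mean(fractions) if fractions else None


# Hypothetical read counts for four SNP sites
example_sites = [
    ("AA", {"A": 490, "G": 10}),   # 2% unexpected reads
    ("CC", {"C": 285, "T": 15}),   # 5% unexpected reads
    ("AG", {"A": 200, "G": 210}),  # heterozygous in normal, skipped
    ("TT", {"T": 400}),            # no unexpected reads
]
level = contamination_fraction(example_sites)
print(level is not None and level >= CONTAMINATION_CUTOFF)
```

A production estimator would also need to account for sequencing error and site-level depth, which this sketch ignores.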
Sample type comparisons and concordance analysis
To further assess the performance of cytologic samples compared to larger tissue samples, existing MSK-IMPACT sequencing data from corresponding tissue core biopsies or subsequent resection samples from the same tumor were obtained. Once matched, metrics including sequencing coverage, genomic alterations identified, mutation VAFs, and OncoKB levels were compared to the cytologic counterpart. In a subset of cytology cases, results from CB deemed adequate for testing were also compared to the corresponding ScfDNA. This data was analyzed separately to avoid duplication.
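One way to express this pairwise comparison is as set operations over the alterations called in each sample. The sketch below is an assumption about how such a comparison could be organized, not the authors' implementation; the alteration lists are hypothetical, and the shared fraction is computed over the union of events in the pair.

```python
def compare_alterations(cytology, surgical):
    """Summarize shared and private alterations for one cytology:surgical pair."""
    cytology, surgical = set(cytology), set(surgical)
    shared = cytology & surgical
    union = cytology | surgical
    return {
        "shared": shared,
        "cytology_only": cytology - surgical,
        "surgical_only": surgical - cytology,
        # Fraction of all events in the pair that were seen in both samples
        "fraction_shared": len(shared) / len(union) if union else None,
        # True when every surgical alteration was also found in cytology
        "all_surgical_found": surgical <= cytology,
    }


# Hypothetical pair, for illustration only
cytology_calls = {("EGFR", "L858R"), ("TP53", "R273H"), ("EGFR", "T790M")}
surgical_calls = {("EGFR", "L858R"), ("TP53", "R273H"), ("PIK3CA", "E545K")}

result = compare_alterations(cytology_calls, surgical_calls)
print(sorted(result["shared"]))
print(result["fraction_shared"], result["all_surgical_found"])
```

Keying each alteration on a (gene, change) pair keeps the comparison independent of VAF, which can then be compared separately for the shared events.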
Statistical Analysis
Statistical analyses for group comparisons of continuous data were performed using a two-tailed Mann-Whitney test. A Pearson’s Chi-squared test was performed for comparing three or more categorical groups. A Fisher’s exact test was performed for comparing two categorical groups. Statistical significance was set at p < 0.05. Cases with missing values were removed from the analyses and only complete cases were considered. All statistics and graphical representations were performed using R project.
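For readers unfamiliar with the two-group categorical comparison, the following is a minimal sketch of a two-sided Fisher's exact test on a 2×2 table. The study itself used R; this Python version, built only on the standard library, is a swapped-in illustration, and the table values are a toy example rather than study data.

```python
from math import exp, lgamma


def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def log_p(x):
        return _log_comb(row1, x) + _log_comb(row2, col1 - x) - _log_comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    log_observed = log_p(a)
    tolerance = 1e-7  # guards against floating-point ties
    p_value = sum(
        exp(log_p(x))
        for x in range(lowest, highest + 1)
        if log_p(x) <= log_observed + tolerance
    )
    return min(p_value, 1.0)


# Toy table: group 1 has 3 successes / 1 failure, group 2 has 1 / 3
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

Working in log space keeps the calculation stable for tables with counts in the thousands, where the binomial coefficients themselves would overflow a float.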
Results
Clinicopathologic characteristics of cytology sequencing cohort
In total, 4,871 cytology tumor samples from 4,633 patients were received for MSK-IMPACT testing with patient demographics detailed in Table 1. Most samples were from CB preparations, 94.2% (4,588/4,871) while 5.8% (283/4,871) were received as ScfDNA. Of note, ScfDNA testing was requested only when no other material was suitable or available. The majority, 63%, were procured at MSKCC and processed internally while 37% were submitted from outside institutions (Table 1). Testing was cancelled on 3% (146/4,871) prior to sample processing for logistical considerations, including the lack of a submitted normal control for matched testing or testing no longer relevant for patient management. A diverse array of tissue sites and sample types were profiled as detailed in Fig. 1a and 1b. The cohort encompassed 181 unique tumor types with lung and pancreatic adenocarcinomas being the most common. The number of samples and relative frequencies of the profiled tumor types are further summarized in Supplementary Table 2.
Success rate of NGS testing on Cytology Samples
Overall, 81% (3,806/4,725) of all samples were successfully tested. The success rate was higher for CB (81%; 3,616/4,457) samples compared to ScfDNA (71%; 190/268), noting that ScfDNA samples encompassed only cases for which the CB had already been deemed unsuitable for any analysis. Across the study period, the use of ScfDNA as a rescue sample boosted the overall success rate of the cytologic procedures from 77% to 81%. Causes of failure, in descending order of frequency, included DNA yield below the minimum cutoffs established for sequencing (11.3%), very scant tumor tissue (< 10% tumor) seen on manual review (4.6%), sequencing coverage below a median of 50x (1.8%), high sample-level DNA contamination (1.6%), and low DNA quality (0.1%), the latter including samples with adequate coverage but high background noise and low base quality scores (Fig. 1c).
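The rescue effect can be reproduced from the reported counts. The sketch below assumes that the pre-rescue figure corresponds to CB successes divided by all non-cancelled requests; that denominator is an inference from the numbers, not something stated explicitly in the text.

```python
# Counts reported in the text; variable names are illustrative.
cb_success = 3616
scf_success = 190
total_requests = 4725  # all requests, excluding cancellations


def percent(numerator, denominator):
    """Percentage of `numerator` over `denominator`."""
    return 100.0 * numerator / denominator


without_rescue = percent(cb_success, total_requests)
with_rescue = percent(cb_success + scf_success, total_requests)

print(f"CB only:          {without_rescue:.1f}%")
print(f"CB plus ScfDNA:   {with_rescue:.1f}%")
print(f"Gain from rescue: {with_rescue - without_rescue:.1f} percentage points")
```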
To assess the impact of optimization efforts implemented across the study period, all requests (excluding cancellations) were stratified by year and testing success status. Sequential and statistically significant improvements were observed, even in the context of increased number of genes tested by panel updates, reaching 89% in the last year of assessment (Fig. 2a). For samples deemed sufficient for sequencing (following qualification for tumor content and DNA yield), success rates were consistently high across all years (range 96–98%) (Fig. 2b).
Among CB preparations, success rates were significantly higher for internal samples compared to those from outside laboratories, with highest success rates of 92–93% (internal) and 79–82% (external) in the last 2 years, in accordance with the full optimization efforts (p < 0.01; Fig. 2c, Supplementary Table 3).
On average, successful cytology samples had higher tumor purities compared to samples that failed testing, across both preparations (CB: p = 0.0003, ScfDNA: p = 0.63; Fig. 2c). Median total DNA yields were 427.5 ng and 182.2 ng (p = 2.2 × 10^−16^) for CB and ScfDNA, respectively (Fig. 2d). The lower DNA yield in ScfDNA was expected as these constituted rescue samples when the corresponding cytology tissue was too scant or exhausted.
Sequencing performance: total coverage and sample quality metrics
Among the 4,725 samples sequenced, the total median coverage was 586x. Coverages were significantly higher for CB samples compared to rescue ScfDNA samples, at 595x and 263x, respectively (p = 2.2 × 10^−16^) (Fig. 3a). Despite the lower coverage, most rescue samples retained coverages above 200x, which is above our established requirement to maintain sensitivity for variant calling at 2%.
Notably, in 2021 the minimum DNA input requirement for MSK-IMPACT was lowered from 50 ng to 30 ng for cell blocks. Following this change, we saw no significant differences in sequencing coverage (Fig. 3b) and the sequencing success rate remained steady at 98% across sequenced cases.
Among all samples sequenced, contamination checks revealed clinically relevant non-patient DNA contamination (≥ 2%) in 5.2% of cases (246/4725) (Fig. 3c). Excluding those that failed due to very low coverage (< 50X), the overall rate was 4.8% (227/4725), with a significantly higher rate for CB samples compared to ScfDNA, at 4.7% (226/4725) and 0.3% (1/4725) respectively. Notably, in the context of optimal coverage (> 200X), no ScfDNA samples exhibited clinically significant sample contamination (Fig. 3d). Also, no contamination was identified, even for samples with low coverage, following the implementation of dual indexing. By contrast, CB samples showed variable and significantly higher levels of contamination (range: 2–32%), which remained present despite adequate coverage and after implementation of dual indexing. Among samples with optimal coverage, 4% (189) of the CB samples exhibited contamination rates above 2%. In 34% (65) of these samples, for which sufficient material for re-extraction and STR analysis was available, contamination could be tracked to foreign tissue material embedded in the tissue blocks. Representative cases are included in Supplementary Fig. 1.
Biomarker/mutation identification and therapeutic actionability
A total of 30,149 somatic alterations were detected across the 3,806 successfully sequenced cases. Overall, 93.8% of cases (3,570/3,806) harbored at least 1 somatic alteration, including 3,394 (93.9%) of the CB and 176 (92.6%) of the ScfDNA samples. No significant differences in sample coverage were observed between samples with and those without alterations (CB: p = 0.19; ScfDNA: p = 0.91, Supplementary Fig. 2). However, tumor purity estimations were significantly lower for the subset without alterations, with a median tumor content of 10% vs 30% for those with detected alterations.
When stratified, the median number of alterations was similar for both sample types, 9 for CB (range 1–170; 95% CI: 11.9–12.7) and 10 for ScfDNA (range 1–63; 95% CI: 8.5–11.9). The average TMB was 7.64 mutations/Mb and 7.43 mutations/Mb for CB and ScfDNA samples, respectively. Overall, the mutational profiles recapitulated the expected landscape and frequency of driver and common alterations for the tumor type. Stratified by level of actionability, 65% (n = 2487) had at least one targetable alteration as defined by the presence of an OncoKB level 1, 2, 3A, or 3B alteration and 2% (n = 93) had a standard care resistance mutation (OncoKB level R1). The highest frequency of level 1 OncoKB alterations was observed in thyroid, breast, non-small cell lung (NSCLC), and bladder cancer patients at 58%, 58%, 45%, and 29% respectively. For resistance mutations, 87 CB and 6 ScfDNA samples identified an OncoKB level R1 alteration. To ensure that significant alterations were being identified at similar rates to non-cytology samples, results were compared to those published in the AACR GENIE cohort. Across the different histologic tumor types, similar rates of OncoKB alterations were identified (Fig. 4a). Representative oncoplots of the most frequent, clinically actionable alterations detected in NSCLC, Bladder Cancer, and Breast Cancer (most common tumor types in our cohort) are presented in Fig. 4b and 4c which demonstrate the expected distributions across both CB and ScfDNA. OncoKB level 1 alterations were commonly seen in EGFR, KRAS, PIK3CA, and ERBB2 genes. ALK, BRAF, RET, and ROS1 level 1 alterations were also seen at lower frequencies.
Comparison with surgical core biopsies/resections
To further assess the general performance of cytology samples, we identified 526 cases (CB: n = 482; ScfDNA: n = 44) of patients who had a corresponding surgical sample of the same tumor assessed by MSK-IMPACT. While the same tumor was profiled across each surgical:cytology pair, it should be noted that there were variations in the time of collection across their treatment course as the samples were profiled clinically. Thus, many of the cytologic samples were collected at the time of disease progression or development of resistance. Overall, cytologic samples demonstrated similar sequencing metrics compared to their surgical biopsy/excision counterparts with adequate average coverages of 584x for cytology samples and 628x for corresponding surgical samples (p = 0.00028).
Comparing detected alterations, a large proportion of the cytology samples identified all the alterations detected on the corresponding surgical sample: 266 (55%) CB (Fig. 5a) and 19 (43%) ScfDNA samples (Fig. 5e). The median VAFs for shared alterations were slightly higher for cytology samples compared to the surgical pair in both CB and ScfDNA samples (CB: p = 4.9 × 10^−11^; ScfDNA: p = 0.13) (Fig. 5b and Fig. 5f).
In all, a total of 5,593 mutations were identified in the CB:surgical paired set, of which 2789 events (49.8%) were shared (Fig. 5c). For the ScfDNA:surgical paired set, 692 mutations were detected with slightly lower overlap (34.8%; Fig. 5g). Importantly, when alterations were stratified by level of actionability, the overwhelming majority of driver alterations with OncoKB Level 1 actionability were shared events, at 93% and 83% for the CB and ScfDNA sets, respectively. Non-detection of OncoKB Level 1 alterations in the surgical or the cytology sample was related to low coverage or low tumor content in all cases. Events categorized as Level R1 or No level showed the lowest overlap, with 27% and 42% shared events, respectively, likely reflecting differential passenger events, the acquisition of additional mutations in the time interval of the two samples, or the heterogeneous nature of resistance mechanisms in the samples. Further details are provided in Fig. 5d and 5h and Supplementary Table 4.
Review of contamination check data for surgical samples revealed that < 1% of surgical samples had clinically relevant contamination (0.81%; 5/619; Supplementary Fig. 3a). Of note, all 5 surgical samples with contamination were minute biopsy samples with low tumor purity (Supplementary Fig. 3b), with contamination below 4%.
Comparison of successful CB preparations with corresponding ScfDNA
Among CB cytology samples with adequate tumor and successful sequencing, 24 had the corresponding ScfDNA samples tested to allow direct comparisons. Both DNA concentration and sequencing coverages were significantly lower for the ScfDNA samples. Total DNA yields averaged 505ng (range: 76–5453) and 1200ng (range 83.4–3340) for ScfDNA and CB preparations, respectively. Accordingly, ScfDNA had resulting lower sequencing coverage averaging 387x (range 3x – 1335x) compared to 669x (74x – 1193x) for the corresponding CB. Of the 24 ScfDNA samples, 7 (27%) failed sequencing due to low sample coverage. Detection of clinically relevant alterations and the VAF were the same across both sample preparations based on comparison of successfully sequenced sets.
Discussion
Comprehensive NGS is becoming a common approach for upfront assessment of a broad range of genetic biomarkers that are pivotal for diagnostic, prognostic, and therapeutic decisions in cancer patients. While molecular testing is greatly facilitated when large tumor samples are available (i.e. resections or excisional biopsies), the reality of clinical practice is that a very large proportion of testing must be performed on scant material obtained through minimally invasive procedures. Historically, this has presented distinct challenges, prompting the adoption of alternate approaches, such as liquid biopsies, which attempt to circumvent tumoral cell assessment altogether. At present, while arguments can be made for the superiority or inferiority of each modality over another, cytologic samples stand as the one middle approach that unites the most desirable attributes of both worlds. Namely, they retain the key morphologic correlates required for tumor diagnosis, while still sparing the patient from the more invasive procedures. One fact remains constant, however: small samples require very high optimization of the entire process to maximize the genomic yield.
In this study, we have outlined our institutional approach and longitudinal experience in comprehensive profiling of cytology samples in routine clinical care. To our knowledge, this represents the largest prospective clinical cohort reported to date, demonstrating that molecular testing can be performed on routinely procured cytology samples with high success rates, similar to surgical samples. Proportions of clinically actionable genomic alterations, specifically OncoKB Level 1–3B, as well as R1 alterations, recapitulated the expected patterns across all tumor types when compared to those published in the AACR GENIE cohort. For immediately actionable alterations (OncoKB Level 1), the concordance of cytology with corresponding surgical samples from the same patient was very high (93%). Notably, rescue ScfDNA from supernatant CytoLyt fluid material, utilized for our internal cases, proved highly valuable and enabled the detection of a level 1 alteration in 83% of the successfully sequenced cases.
Our review of data compiled across 8 years highlighted the central roles of optimized sample handling and processing. In our hands, 2 critical early steps enabled higher DNA recovery which, consequently, promoted increased utilization of cytologic material for molecular testing. The first was the optimization of cell block preparation, which incorporated pretreatment of pelleted cells with 95% ethanol before addition of HistoGel^12,13^. This enhanced the density of cell pellets to deliver a higher amount of cellular material in fewer sections of the paraffin block. The second was the transition to mineral oil deparaffinization, which markedly reduced tube transfers, centrifugation, and decanting steps, all key vulnerabilities responsible for major nucleic acid losses in the processing of scant FFPE material^14–16^. It should be noted that, with the implementation of mineral oil extraction, requests for testing on cytologic material vs needle biopsies markedly increased at our institution. Details of this transition have been previously published by our group^16,20^. Notably, among lung cancer patients undergoing endobronchial ultrasound transbronchial needle aspiration (EBUS-TBNA), this change alone significantly improved sequencing success rates from 76.3% to 93%. Moreover, these success rates corresponded to NGS testing that was performed after standard rapid testing for EGFR on the same samples^21^, further supporting the high suitability and sufficiency of the DNA recovered.
An important, and often underreported, consideration in molecular testing of cytology samples is the potential for diagnostic challenges and inaccuracies arising from sample cross-contamination. While sample-to-sample contamination may happen at any point, highly vulnerable points lie in processes that involve batching and pooling of multiple samples in a single run. In particular, established histopathology practices of tissue processing (i.e. carry over from microtome blades, common water baths, pooled tissue processors, etc.) pose distinct risks of contamination for small tissue samples, as processes are primarily optimized to enhance microscopic diagnostic analysis but not downstream molecular applications. Common holding of numerous specimens in single chambers in automated tissue processors and the use of common equipment for embedding, cutting, and tissue mounting all increase the potential for low-level cross-contamination. While this may remain inconsequential for morphologic assessment or the molecular analysis of large tissue samples, it can distinctly impact small samples, where similar contamination levels become proportionally higher. Cytology samples may be even more vulnerable due to processing of cell blocks with paper wrapping and HistoGels, which may promote trapping of cellular impurities from other samples. Indeed, in our analysis of cytologic samples, contamination was significantly higher across CB samples compared to all other samples. This held true when analyzing samples with adequate coverage (> 200x), with 4% of CB samples exhibiting contamination. ScfDNA samples, by contrast, which are processed individually in a closed system and not batched, had negligible levels, with contamination patterns exclusively associated with sequencing failures or borderline coverage and more likely related to artifact rather than true contamination.
Despite the presence of higher contamination levels in cell blocks, the overall rate was low (4.8%) among successfully sequenced samples, which encompassed samples procured and processed across numerous laboratories across the country. These rates are in keeping with sequencing data on surgical samples published by Sehn et al^22^ but are significantly lower than what is reported by the ASC Clinical Practice Committee/Workgroup for Cross-Contamination in a recent survey for general cytopathology practice, quoting rates as high as 56% for cell-block preparations^23^. This high rate may be related to the reporting of contamination per case, affecting some but not all unstained sections, and which may not be high enough to be detectable in the sequencing of DNA recovered from a set of several slides. Importantly, while contamination was detectable in several cases in our cohort, most were sufficiently low in comparison to the overall tumor content of the sample, allowing informed filtering of low–variant allele fraction events without compromising all mutation calling. In all, only 1.3% of the samples were failed due to contamination, while others could be reported with modification. Within the molecular laboratory, a notable source of cross-contamination may arise from index-hopping during multiplexing. This, however, is generally lower level (well below 2%) and more prone to affect higher sensitivity applications. Nonetheless, in the process of improvement for our MSK-IMPACT assay we have incorporated several strategies to mitigate this phenomenon, including optimization of PCR conditions and the implementation of dual indexing to facilitate the removal of misaligned reads. These fine-tuning steps facilitated our decision to reduce the assay input requirements, which markedly reduced failures due to insufficient DNA. No significant changes in coverage or contamination rates were seen with this change.
Finally, a pivotal component of our optimization process was the implementation of testing of ScfDNA recovered from liquid cytology preparations. Although this sample type was generally not submitted when the cell block was deemed suitable for testing, it became an important rescue sample for avoiding re-biopsy procedures. The use of this material also relieved some of the challenges in triaging very small biopsy samples for other ancillary studies. The success rate of the ScfDNA samples was approximately 71%, which is below what is seen across tissue biopsies and CB samples. However, these samples were specifically tested after the corresponding cytologic material was deemed unsuitable, so their use boosted the overall success for the individual aspirate procedures by approximately 3%. An important observation from the comparison of ScfDNA and corresponding cell blocks is that the VAFs of detected alterations were similar for both preparations, supporting the view that assessment of the block or cytoprep is a suitable surrogate for estimating the proportion of tumor-derived DNA that may be present in the ScfDNA sample. Additionally, given the high integrity of the DNA in these non-formalinized samples, lower DNA inputs still delivered excellent results, provided that the tumor proportion was suitable. Confirmatory testing with higher-sensitivity methods may also be implemented for low-tumor samples, without concern for false positivity due to artifacts imparted by formalin fixation.
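The relationship between the roughly 71% rescue success rate and the roughly 3% overall gain can be checked with back-of-the-envelope arithmetic. The sketch below assumes a simple model that is not stated in the source: the gain is measured in absolute percentage points, and every ScfDNA test replaced a procedure that would otherwise have failed. The function names are hypothetical.

```python
def overall_success_gain(rescue_fraction: float, rescue_success_rate: float) -> float:
    """Gain in overall success when a fraction of procedures is rescued.

    rescue_fraction: share of all procedures tested via the rescue sample.
    rescue_success_rate: share of those rescue tests that succeed.
    """
    return rescue_fraction * rescue_success_rate


def implied_rescue_fraction(gain: float, rescue_success_rate: float) -> float:
    """Invert the model: share of procedures that must have gone to rescue."""
    if not 0.0 < rescue_success_rate <= 1.0:
        raise ValueError("rescue_success_rate must be in (0, 1]")
    return gain / rescue_success_rate


if __name__ == "__main__":
    gain = 0.03          # approximate overall boost reported in the text
    success_rate = 0.71  # approximate ScfDNA success rate reported in the text
    fraction = implied_rescue_fraction(gain, success_rate)
    print(f"implied share of procedures rescued via ScfDNA: {fraction:.1%}")
```

Under this simplified model, a modest share of procedures routed to the rescue sample is enough to account for the reported gain; the study's actual denominators may differ.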
In conclusion, this study confirms that the routine use of cytologic samples for molecular testing constitutes a robust approach that can deliver the same results as larger biopsy samples. Process optimization and the implementation of robust quality control processes, including contamination checks, are pivotal to maximizing the yield and utility of these samples. A reassessment of how tissue blocks are processed and prepared, including the adoption of specialized instrumentation for processing small samples without risk of cross-contamination, would be an important step for cytology practice as a whole. ScfDNA recovered from supernatants is an invaluable source of tumor-derived DNA that circumvents the processing steps where most contamination is bound to happen in current practice. While failure rates due to limited nucleic acid recovery are higher than for tissue blocks, its use could rescue the majority of cases where high tumor content is identified but FFPE material is insufficient for sequencing.
References
- 1. Garraway LA (2013) Genomics-driven oncology: framework for an emerging paradigm. J Clin Oncol 31:1806–1814. doi:10.1200/JCO.2012.46.8934. PMID: 23589557
- 2. Wakai T (2019) Next-generation sequencing-based clinical sequencing: toward precision medicine in solid tumors. Int J Clin Oncol 24:115–122. doi:10.1007/s10147-018-1375-3. PMID: 30515675
- 3. Turner SR (2018) Feasibility of endobronchial ultrasound transbronchial needle aspiration for massively parallel next-generation sequencing in thoracic cancer patients. Lung Cancer 119:85–90. doi:10.1016/j.lungcan.2018.03.003. PMID: 29656758. PMCID: PMC5905717
- 4. Ramani NS (2021) Utilization of cytology smears improves success rates of RNA-based next-generation sequencing gene fusion assays for clinically relevant predictive biomarkers. Cancer Cytopathol 129:374–382. doi:10.1002/cncy.22381. PMID: 33119213. PMCID: PMC12002355
- 5. Baum JE (2017) Accuracy of next-generation sequencing for the identification of clinically relevant variants in cytology smears in lung adenocarcinoma. Cancer Cytopathol 125:398–406. doi:10.1002/cncy.21844. PMID: 28272845
- 6. Roy-Chowdhuri S (2018) Salvaging the supernatant: next generation cytopathology for solid tumor mutation profiling. Mod Pathol 31:1036–1045. doi:10.1038/s41379-018-0006-x. PMID: 29463880
- 7. Kanagal-Shamanna R (2014) Next-generation sequencing-based multi-gene mutation profiling of solid tumors using fine needle aspiration samples: promises and challenges for routine clinical diagnostics. Mod Pathol 27:314–327. doi:10.1038/modpathol.2013.122. PMID: 23907151
- 8. Gan Q, Roy-Chowdhuri S (2020) Small but powerful: the promising role of small specimens for biomarker testing. J Am Soc Cytopathol 9:450–460. doi:10.1016/j.jasc.2020.05.001. PMID: 32507626
