# Gene duplication is associated with gene diversification and potential neofunctionalization in lung cancer evolution

**Authors:** Paul Ashford, Alexander M. Frankell, Zofia Piszka, Camilla S.M. Pang, Mahnaz Abbasian, Maise Al Bakir, Mariam Jamal-Hanjani, Nicholas McGranahan, Charles Swanton, Christine A. Orengo

PMC · DOI: 10.1101/gr.278663.123 · Genome Research · 2026-03-01

## TL;DR

This study examines how gene duplication in lung tumors is associated with gene diversification and potential neofunctionalization, which may widen the scope for tumor adaptation.

## Contribution

A novel computational protocol identifies preduplication and postduplication mutations in lung tumors and predicts their impact on protein function.
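This summary does not say how mutations are timed relative to a duplication. As a rough illustration only, the sketch below uses a heuristic common in tumor genomics: a mutation carried on two or more copies of a gained allele probably arose before the gain, while one carried on a single copy probably arose afterwards. The class `CopyState`, its fields, and `classify_timing` are hypothetical names, and this should not be read as the authors' protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyState:
    """Hypothetical per-mutation copy-number summary (not from the paper)."""
    major_copies: int    # copies of the more abundant parental allele
    mutated_copies: int  # estimated number of copies carrying the mutation


def classify_timing(state: CopyState) -> str:
    """Label a mutation relative to duplication using a multiplicity heuristic.

    Assumption: a mutation present on >= 2 copies was copied along with the
    allele, so it predates the duplication. A mutation on a single copy of a
    duplicated allele is treated as postduplication, although in practice it
    could also sit on the other, non-duplicated allele.
    """
    if state.major_copies < 2:
        return "no_duplication"
    if state.mutated_copies >= 2:
        return "preduplication"
    return "postduplication"


if __name__ == "__main__":
    # Toy values for illustration only.
    for s in (CopyState(1, 1), CopyState(2, 2), CopyState(3, 1)):
        print(s, "->", classify_timing(s))
```

Real pipelines estimate `mutated_copies` from variant allele frequency, tumor purity, and local copy number, which adds uncertainty that this sketch ignores.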

## Key findings

- 355 functional impact events (FIEs) were identified from mutations that lie close to, or cluster near, functional sites in paralogs grouped into functional families.
- Postduplication diversification of driver genes and functions suggests selection for somatic copy number changes in lung tumors.
- Some metabolic enzymes show potential neofunctionalization following gene duplication in lung adenocarcinomas.

## Abstract

Tumors evolve through a process of selection on somatic mutations, driving cell division and tissue growth through aberrations in cell-cycle control. In non-small-cell lung cancer (NSCLC), genome instability occurs early in tumor growth, resulting in pronounced intratumor heterogeneity, including changes in gene copy number, and whole-genome doubling (WGD) in ∼75% of tumors. Gene duplication, genetic drift, and selection mediate functional diversification during evolution. In this study, we seek to identify the diversification and potential gene neofunctionalization of lung tumors in the TRACERx cohort. We develop a novel computational protocol to identify preduplication and postduplication mutations predicted to affect protein function. Mutations are analyzed using paralogs grouped into functional families with highly similar functions, identifying 355 functional impact events (FIEs) through their proximity and clustering near to functional sites. The use of functional family paralogs to map mutations to protein structures from the PDB helps predict putative rare driver events in lung tumors. By extending the analysis with high-quality structural models from AlphaFold using The Encyclopedia of Domains (TED), we find a significant increase in the diversity of both genes and functional families with postduplication FIEs in lung adenocarcinomas, including some metabolic enzymes with the potential to be neofunctional. The postduplication diversification of driver genes and functions may indicate selection for somatic copy number changes in lung tumors and an increased scope for tumor adaptations.
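The abstract describes flagging mutations by their proximity to functional sites on protein structures. A minimal sketch of that idea follows, assuming C-alpha coordinates keyed by residue number and a distance cutoff of 8 Å. The function name, the input layout, and the cutoff are all assumptions made for illustration, and the coordinates are toy values, not data from the study.

```python
import math
from typing import Dict, Iterable, List, Tuple

Coord = Tuple[float, float, float]


def mutations_near_sites(
    ca_coords: Dict[int, Coord],
    functional_sites: Iterable[int],
    mutated_residues: Iterable[int],
    cutoff: float = 8.0,  # assumed threshold in angstroms, not from the paper
) -> List[int]:
    """Return mutated residues whose C-alpha atom lies within `cutoff`
    of the C-alpha atom of any functional-site residue."""
    sites = [ca_coords[s] for s in functional_sites]
    hits = []
    for res in mutated_residues:
        # math.dist computes the Euclidean distance between two points.
        if any(math.dist(ca_coords[res], site) <= cutoff for site in sites):
            hits.append(res)
    return hits


if __name__ == "__main__":
    # Toy structure: residue number -> C-alpha coordinate.
    coords = {10: (0.0, 0.0, 0.0), 11: (3.8, 0.0, 0.0), 50: (30.0, 0.0, 0.0)}
    print(mutations_near_sites(coords, functional_sites=[10], mutated_residues=[11, 50]))
```

A clustering test would go one step further and ask whether mutations pooled across paralogs in a functional family concentrate around the same site more than expected by chance. That step is omitted here because the summary does not describe the statistic used.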

## Linked entities

- **Diseases:** lung cancer (MONDO:0005138), non-small-cell lung cancer (MONDO:0005233), adenocarcinomas (MONDO:0004970)

## Full-text entities

- **Genes:** TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, CDKN2A (cyclin dependent kinase inhibitor 2A) [NCBI Gene 1029] {aka ARF, CAI2, CDK4I, CDKN2, CMM2, INK4}, PTEN (phosphatase and tensin homolog) [NCBI Gene 5728] {aka 10q23del, BZS, CWS1, DEC, GLM2, MHAM}, BRAF (B-Raf proto-oncogene, serine/threonine kinase) [NCBI Gene 673] {aka B-RAF1, B-raf, BRAF-1, BRAF1, NS7, RAFB1}, CECR (cat eye syndrome chromosome region) [NCBI Gene 1055] {aka CES}, HLA-A (major histocompatibility complex, class I, A) [NCBI Gene 3105] {aka HLAA}, IDH1 (isocitrate dehydrogenase (NADP(+)) 1) [NCBI Gene 3417] {aka HEL-216, HEL-S-26, IDCD, IDH, IDP, IDPC}, RRAS2 (RAS related 2) [NCBI Gene 22800] {aka NS12, TC21}, ERBB2 (erb-b2 receptor tyrosine kinase 2) [NCBI Gene 2064] {aka CD340, HER-2, HER-2/neu, HER2, MLN 19, MLN-19}, SMARCA4 (SWI/SNF related BAF chromatin remodeling complex subunit ATPase 4) [NCBI Gene 6597] {aka BAF190, BAF190A, BRG1, CSS4, MRD16, OTSC12}, GAPDH (glyceraldehyde-3-phosphate dehydrogenase) [NCBI Gene 2597] {aka G3PD, GAPD, HEL-S-162eP}, PTPRD (protein tyrosine phosphatase receptor type D) [NCBI Gene 5789] {aka HPTP, HPTPD, HPTPDELTA, PTPD, R-PTP-delta, RPTPDELTA}, RAC2 (Rac family small GTPase 2) [NCBI Gene 5880] {aka EN-7, Gx, HSPC022, IMD73A, IMD73B, IMD73C}, GOT1 (glutamic-oxaloacetic transaminase 1) [NCBI Gene 2805] {aka AST, AST1, ASTQTL1, GIG18, SGOT, cAspAT}, EGFR (epidermal growth factor receptor) [NCBI Gene 1956] {aka ERBB, ERBB1, ERRP, HER1, NISBD2, NNCIS}, ERBB4 (erb-b2 receptor tyrosine kinase 4) [NCBI Gene 2066] {aka ALS19, HER4, p180erbB4}, LRP1B (LDL receptor related protein 1B) [NCBI Gene 53353] {aka LRP-1B, LRP-DIT, LRPDIT}, PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha) [NCBI Gene 5290] {aka CCM4, CLAPO, CLOVE, CWS5, HMH, MCAP}, TALDO1 (transaldolase 1) [NCBI Gene 6888] {aka TAL, TAL-H, TALDOR, TALH}, KRAS (KRAS proto-oncogene, GTPase) [NCBI Gene 3845] {aka 'C-K-RAS, C-K-RAS, CFC2, K-RAS2A, K-RAS2B, K-RAS4A}, STK11 (serine/threonine kinase 11) [NCBI Gene 6794] {aka LKB1, PJS, hLKB1}, PRDX3 (peroxiredoxin 3) [NCBI Gene 10935] {aka AOP-1, AOP1, HBC189, MER5, PPPCD, PRO1748}, Gpi1 (glucose-6-phosphate isomerase 1) [NCBI Gene 14751] {aka Amf, Gpi, Gpi-1, Gpi-1r, Gpi-1s, Gpi-1t}, RAC1 (Rac family small GTPase 1) [NCBI Gene 5879] {aka MIG5, MRD48, Rac-1, TC-25, p21-Rac1}, PRDX6 (peroxiredoxin 6) [NCBI Gene 9588] {aka 1-Cys, AOP2, HEL-S-128m, LPCAT-5, NSGPx, PRX}, MAP2K1 (mitogen-activated protein kinase kinase 1) [NCBI Gene 5604] {aka CFC3, MAPKK1, MEK1, MEL, MKK1, PRKMK1}, VHL (von Hippel-Lindau tumor suppressor) [NCBI Gene 7428] {aka HRCA1, RCA1, VHL1, pVHL}, NALF2 (NALCN channel auxiliary factor 2) [NCBI Gene 27112] {aka CXorf63, FAM155B, TED, TMEM28, bB57D9.1}, RIT1 (Ras like without CAAX 1) [NCBI Gene 6016] {aka NS8, RIBB, RIT, ROC1}
- **Diseases:** hypoxia (MESH:D000860), Infectious Diseases (MESH:D003141), colon adenocarcinoma (MESH:D003110), NSCLC (MESH:D002289), aneuploidy (MESH:D000782), LUSC (MESH:D002294), bladder, uterine, and pancreatic cancers (MESH:D001749), LUADs (MESH:D000077192), hypoxic (MESH:D002534), Allergy (MESH:D004342), Cancer (MESH:D009369), lung cancer (MESH:D008175), metastases (MESH:D009362), prostate cancer (MESH:D011471), melanoma (MESH:D008545), gliomas (MESH:D005910), FIEs (MESH:D014095)
- **Chemicals:** Hydrogen (MESH:D006859), cysteine (MESH:D003545), amino acid (MESH:D000596), peroxide (MESH:D010545), dNdS (MESH:C022306), FunVar (-), D-2-hydroxyglutarate (MESH:C019417), GTP (MESH:D006160), methionine (MESH:D008715), H2O2 (MESH:D006861), disulfide (MESH:D004220)
- **Species:** Mus musculus (house mouse, species) [taxon 10090], Homo sapiens (human, species) [taxon 9606]
- **Mutations:** V600E, L858R, L861Q, cysteine-cysteine, arginine to leucine, R273L, R132C

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12951968/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12951968/full.md

## References

114 references — full list in the complete paper: https://tomesphere.com/paper/PMC12951968/full.md

---
Source: https://tomesphere.com/paper/PMC12951968