# PAL-AI reveals genetic determinants that control poly(A)-tail length during oocyte maturation, with relevance to human fertility

**Authors:** Kehui Xiang, David P. Bartel

PMC · DOI: 10.1038/s41467-025-62171-5 · Nature Communications · 2025-08-01

## TL;DR

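PAL-AI, a neural-network model, predicts poly(A)-tail-length changes in maturing oocytes and shows that human genetic variants predicted to disrupt tail lengthening are under negative selection, linking tail-length control to female fertility.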

## Contribution

PAL-AI is a neural-network model that predicts poly(A)-tail-length changes in maturing oocytes of frogs and mammals, and identifies the sequence elements and 3′-UTR variants that affect those changes.

## Key findings

- PAL-AI identified known and new sequence elements that control poly(A)-tail length in oocytes.
- Genetic variants predicted to disrupt tail lengthening are under negative selection in human populations.
- Comparing the model's predictions with All of Us and gnomAD variant data links mRNA tail lengthening to human female fertility.

## Abstract

In oocytes of mammals and other animals, gene regulation is mediated primarily through changes in poly(A)-tail length. Here, we introduce PAL-AI, an integrated neural network machine-learning model that accurately predicts tail-length changes in maturing oocytes of frogs and mammals. We show that PAL-AI learned known and previously unknown sequence elements and their contextual features that control poly(A)-tail length, enabling it to predict tail-length changes resulting from 3′-untranslated region single-nucleotide substitutions. It also predicted tail-length-mediated translational changes, allowing us to nominate genes important for oocyte maturation. When comparing predicted tail-length changes in human oocytes with genomic datasets of the All of Us Research Program and gnomAD, we found that genetic variants predicted to disrupt tail lengthening have been under negative selection in the human population, thereby linking mRNA tail lengthening to human female fertility.

Gene regulation in oocytes relies heavily on poly(A) tail-length changes. Here, the authors develop PAL-AI, a neural network model that predicts tail-length changes, identifies regulatory motifs, and links disruptive genetic variants to negative selection in humans, implicating tail-length control in female fertility.
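
The abstract's idea of predicting tail-length changes from 3′-UTR single-nucleotide substitutions can be illustrated with a generic in silico mutagenesis loop. The sketch below is not PAL-AI: `toy_score` is a hypothetical stand-in that only counts two illustrative motifs with arbitrary weights, and the function names, the example sequence, and the convention of numbering positions negatively from the 3′ end are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

BASES = "ACGU"


def toy_score(utr: str) -> float:
    """Hypothetical stand-in for a trained predictor (NOT PAL-AI).

    Counts occurrences of two illustrative motifs and weights them
    arbitrarily; a real model would be a trained neural network.
    """
    def count(motif: str) -> int:
        # Count (possibly overlapping) occurrences of the motif.
        return sum(utr.startswith(motif, i)
                   for i in range(len(utr) - len(motif) + 1))

    return 1.0 * count("UUUUA") + 2.0 * count("AAUAAA")


def single_nucleotide_effects(
    utr: str, score: Callable[[str], float]
) -> List[Tuple[int, str, str, float]]:
    """Score every single-nucleotide substitution of a 3'-UTR sequence.

    Returns (position, ref_base, alt_base, delta) tuples, where position
    is counted from the 3' end (-1 is the last nucleotide; an assumed
    convention) and delta is score(mutant) - score(reference).
    """
    reference = score(utr)
    n = len(utr)
    effects = []
    for i, ref_base in enumerate(utr):
        for alt in BASES:
            if alt == ref_base:
                continue
            mutant = utr[:i] + alt + utr[i + 1:]
            effects.append((i - n, ref_base, alt, score(mutant) - reference))
    return effects


if __name__ == "__main__":
    example_utr = "GCUUUUAUGCAAUAAAGC"  # made-up sequence, not from the paper
    effects = single_nucleotide_effects(example_utr, toy_score)
    # List the substitutions with the most negative predicted effect first.
    for pos, ref, alt, delta in sorted(effects, key=lambda e: e[3])[:5]:
        print(f"{ref}-to-{alt} at position {pos}: delta = {delta:+.1f}")
```

Passing the scorer in as a callable keeps the mutagenesis loop independent of any particular model, so a trained predictor could in principle replace `toy_score` without changing the loop.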

## Linked entities

- **Species:** Homo sapiens (taxon 9606)

## Full-text entities

- **Genes:** CPSF1 (cleavage and polyadenylation specific factor 1) [NCBI Gene 29894] {aka CPSF160, HSU37012, MYP27, P/cl.18}, SHCBP1 (SHC binding and spindle associated 1) [NCBI Gene 79801] {aka PAL}, MAGOH (mago homolog, exon junction complex subunit) [NCBI Gene 4116] {aka MAGOH1, MAGOHA}, MOS (MOS proto-oncogene, serine/threonine kinase) [NCBI Gene 4342] {aka MSV, OZEMA20}, CPE (carboxypeptidase E) [NCBI Gene 1363] {aka BDVS, CPH, IDDHH}, CPEB1 (cytoplasmic polyadenylation element binding protein 1) [NCBI Gene 64506] {aka CPE-BP1, CPEB, CPEB-1, h-CPEB, hCPEB-1}, PNKP (polynucleotide kinase 3'-phosphatase) [NCBI Gene 11284] {aka AOA4, CMT2B2, EIEE10, MCSZ, PNK}, DAZL (deleted in azoospermia like) [NCBI Gene 1618] {aka DAZH, DAZL1, DAZLA, SPGYLA}, FIP1L1 (factor interacting with PAPOLA and CPSF1) [NCBI Gene 81608] {aka FIP1, Rhe, hFip1}, NEB (nebulin) [NCBI Gene 4703] {aka AMC6, NEB177D, NEM2}, SND1 (staphylococcal nuclease and tudor domain containing 1) [NCBI Gene 27044] {aka TDRD11, TSN, Tudor-SN, p100}, CEBPD (CCAAT enhancer binding protein delta) [NCBI Gene 1052] {aka C/EBP-delta, CELF, CRP3, NF-IL6-beta}, CCNB1 (cyclin B1) [NCBI Gene 891] {aka CCNB}, Shcbp1 (Shc SH2-domain binding protein 1) [NCBI Gene 20419] {aka mPAL}
- **Diseases:** Us (MESH:D019966), female infertility (MESH:D007247), developmental arrest (MESH:D006323)
- **Chemicals:** B&W buffer (-), HCl (MESH:D006851), formaldehyde (MESH:D005557), spermidine (MESH:D013095), sodium citrate (MESH:D000077559), HEPES (MESH:D006531), EDTA (MESH:D004492), NaCl (MESH:D012965), progesterone (MESH:D011374), NaOH (MESH:D012972), isopropanol (MESH:D019840), TRIzol (MESH:C411644), Poly (MESH:D011061), dithiothreitol (MESH:D004229), phosphate (MESH:D010710), MgCl2 (MESH:D015636), Gentamicin (MESH:D005839), ethanol (MESH:D000431), Triton X-100 (MESH:D017830), (A) (MESH:D001151), acrylamide (MESH:D020106), water (MESH:D014867), cycloheximide (MESH:D003513), chloroform (MESH:D002725), CaCl2 (MESH:D002122), urea (MESH:D014508), KCl (MESH:D011189), agarose (MESH:D012685), phenol (MESH:D019800)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]
- **Mutations:** M2080S, M2403L, C-to-G substitution at position -12, C in 10, C-to-G at position -54, M0303S, C-to-G at position -16, M0201S, A-to-G at position -35, M0204S

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12316995/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12316995/full.md

## References

11 references — full list in the complete paper: https://tomesphere.com/paper/PMC12316995/full.md

---
Source: https://tomesphere.com/paper/PMC12316995