Gene model for the ortholog of wrd in Drosophila ananassae
Megan E. Lawson, Samantha Hoffman, Mikhail Sanu, Daniel Morris, Evan Merkhofer, Stephanie Toering Peters, Nikolaos Tsotakos, Chinmay P. Rele, Laura K. Reed

TL;DR
This paper presents a gene model for the wrd ortholog in Drosophila ananassae, built to study the evolution of the Insulin/insulin-like growth factor signaling (IIS) pathway across the genus Drosophila.
Contribution
The paper provides a new gene model for studying the evolution of the IIS pathway in Drosophila.
Findings
A gene model for the wrd ortholog was characterized in Drosophila ananassae.
The model is part of a dataset for studying the evolution of the IIS pathway.
It follows a gene annotation protocol for undergraduate research experiences.
Abstract
Gene model for the ortholog of well-rounded (wrd) in the May 2011 (Agencourt dana_caf1/DanaCAF1) Genome Assembly (GenBank Accession: GCA_000005115.1) of Drosophila ananassae. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.
Figure 1. [Image not included. Panel A is cited for synteny, panel B for the isoform structure, and panel C for the protein alignment.]
- National Institutes of Health (United States), https://ror.org/01cwqze88
- National Science Foundation (United States), https://ror.org/021nxhr62
Description
The model presented here is the ortholog of wrd in the May 2011 (Agencourt dana_caf1/DanaCAF1) assembly of D. ananassae (GCA_000005115.1; Drosophila 12 Genomes Consortium et al., 2007) and corresponds to the Gnomon predicted model (Peptide ID XP_032305638.1) in D. ananassae (LOC6499555). This gene model is based on RNA-Seq data from D. ananassae (Graveley et al., 2011; SRP006203, PRJNA257286, SRP007906, PRJNA388952) and on the wrd gene model in D. melanogaster (GCA_000001215.4) from FlyBase release FB2022_03 (Larkin et al., 2021; Gramates et al., 2022; Jenkins et al., 2022).
The well-rounded (wrd) gene was first identified in a gain-of-function screen for molecules that regulate synaptic development in the neuromuscular junction (NMJ) in Drosophila melanogaster (Viquez et al., 2006). Its gene product is one of the two B' regulatory subunits of protein phosphatase 2A (PP2A), which interacts with liprin-α to stabilize specification of the active zone, a site in the axonal plasma membrane that regulates endo- and exocytosis of synaptic vesicles (Li et al., 2014). Another interacting partner of wrd is S6K, an important kinase in the insulin/TOR signaling pathway that is targeted for dephosphorylation by the PP2A holoenzyme through its molecular interaction with wrd (Hahn et al., 2010). Loss-of-function wrd mutants are viable and have fewer synaptic boutons that are larger and have a smooth round contour, a phenotype which gave the gene its name (Viquez et al., 2006). Knockout flies for the wrd B' subunit of PP2A are lean and have a reduced lifespan and elevated insulin signaling (Hahn et al., 2010), with tissue specificity fine-tuned by cyclin G (Fischer et al., 2016).
**Synteny**
wrd occurs on chromosome 3L in D. melanogaster and is flanked upstream by CG7215 and Peroxiredoxin 5 (Prx5), and downstream by cadmus (cdm) and CG7208. We determined that the putative ortholog of wrd is found on scaffold 13340 (CH902617.1) in D. ananassae at LOC6499555 (XP_032305638.1), via a tblastn search with an e-value of 0.0 and a percent identity of 88.81%. The wrd ortholog is flanked upstream by LOC6501022 (XP_001954393.1) and LOC6501023 (XP_001954394.2), which correspond to CG7215 and Prx5 in D. melanogaster with e-values of 7e-78 and 3e-127 and percent identities of 82.31% and 93.68%, respectively, as determined by blastp. The wrd ortholog is flanked downstream by LOC6501021 (XP_001954390.2) and LOC6499556 (XP_001954389.1), which correspond to cdm and CG6142 in D. melanogaster with e-values of 0.0 and 0.0 and percent identities of 87.44% and 88.64%, respectively, as determined by blastp (Figure 1A; Altschul et al., 1990). We believe this is the correct ortholog assignment for wrd in D. ananassae because synteny is well conserved, with only one gene in the genomic neighborhood being non-orthologous (CG6142 in place of CG7208), and because all of the BLAST searches used to determine orthology returned strong hits (low e-values and high percent identities).
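The synteny comparison above can be restated as a small bookkeeping exercise. The following Python sketch is an illustration only and is not part of the GEP protocol: `NeighborHit` and `compare_synteny` are hypothetical names, and the positional pairing assumes the flanking genes are listed in the same order for both species, as they are in the text. The loci, e-values, and percent identities are the ones reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeighborHit:
    target_locus: str        # D. ananassae locus flanking the wrd ortholog
    best_dmel_hit: str       # D. melanogaster gene matched by blastp
    evalue: float
    percent_identity: float


# Flanking genes of wrd in D. melanogaster: two upstream, then two downstream.
DMEL_NEIGHBORS = ["CG7215", "Prx5", "cdm", "CG7208"]

# Flanking loci in D. ananassae, in the same order, with their reported blastp hits.
DANA_NEIGHBORS = [
    NeighborHit("LOC6501022", "CG7215", 7e-78, 82.31),
    NeighborHit("LOC6501023", "Prx5", 3e-127, 93.68),
    NeighborHit("LOC6501021", "cdm", 0.0, 87.44),
    NeighborHit("LOC6499556", "CG6142", 0.0, 88.64),
]


def compare_synteny(reference_genes, hits):
    """Pair each reference neighbor with the hit at the same position."""
    conserved, mismatched = [], []
    for ref_gene, hit in zip(reference_genes, hits):
        if hit.best_dmel_hit == ref_gene:
            conserved.append(hit)
        else:
            mismatched.append((ref_gene, hit))
    return conserved, mismatched


conserved, mismatched = compare_synteny(DMEL_NEIGHBORS, DANA_NEIGHBORS)
print(f"{len(conserved)} of {len(DMEL_NEIGHBORS)} flanking genes conserved")
for ref_gene, hit in mismatched:
    print(f"{hit.target_locus}: best hit {hit.best_dmel_hit}, expected {ref_gene}")
```

Under these assumptions, three of the four flanking genes match and the single mismatch is the CG7208/CG6142 position, which is the one non-orthologous neighbor noted in the text.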
**Protein Model**
wrd in D. ananassae has nine unique protein-coding isoforms (Figure 1B). The protein isoform encoded by mRNAs wrd-RD, wrd-RE, wrd-RM, wrd-RL, and wrd-RN (which differ in their UTRs) contains seven CDSs. The protein isoform encoded by mRNAs wrd-RK and wrd-RO contains eight CDSs. Isoform wrd-PJ is encoded by wrd-RJ, which contains eight CDSs. Protein isoform wrd-PI is encoded by five CDSs. Isoform wrd-PQ is encoded by nine CDSs. Isoform wrd-PP is encoded by eight CDSs. Isoform wrd-PC is encoded by six CDSs. Isoform wrd-PB is encoded by eight CDSs. Isoform wrd-PA is encoded by seven CDSs. This isoform structure matches that of the ortholog in D. melanogaster, which also has nine unique protein-coding isoforms. All of the D. melanogaster isoforms have the same number of CDSs as their corresponding orthologs in D. ananassae. The amino acid sequence of wrd in D. ananassae has 83.0% identity with that of wrd in D. melanogaster, as determined by blastp (Figure 1C). The coordinates of the curated gene models can be found in NCBI at GenBank/BankIt using the accessions BK064428, BK064429, BK064430, BK064431, BK064432, BK064433, BK064434, BK064435, BK064436, BK064437, BK064438, BK064439, BK064440, and BK064441. These data are also available in Extended Data files below, which are archived in CaltechData.
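The isoform inventory above is easier to check when tabulated. The sketch below is a hedged illustration, not the authors' data format. It assumes that each of wrd-PI, wrd-PQ, wrd-PP, wrd-PC, wrd-PB, and wrd-PA is encoded by a single mRNA named with the matching letter (wrd-RI and so on); the text states this pairing only for wrd-PJ and wrd-RJ. The CDS counts and accession range are taken from the paragraph above.

```python
from collections import Counter

# One entry per unique protein isoform: the mRNAs that encode it and its CDS count.
ISOFORMS = [
    {"mrnas": ("wrd-RD", "wrd-RE", "wrd-RM", "wrd-RL", "wrd-RN"), "cds_count": 7},
    {"mrnas": ("wrd-RK", "wrd-RO"), "cds_count": 8},
    {"mrnas": ("wrd-RJ",), "cds_count": 8},  # wrd-PJ, pairing stated in the text
    {"mrnas": ("wrd-RI",), "cds_count": 5},  # wrd-PI, mRNA name assumed
    {"mrnas": ("wrd-RQ",), "cds_count": 9},  # wrd-PQ, mRNA name assumed
    {"mrnas": ("wrd-RP",), "cds_count": 8},  # wrd-PP, mRNA name assumed
    {"mrnas": ("wrd-RC",), "cds_count": 6},  # wrd-PC, mRNA name assumed
    {"mrnas": ("wrd-RB",), "cds_count": 8},  # wrd-PB, mRNA name assumed
    {"mrnas": ("wrd-RA",), "cds_count": 7},  # wrd-PA, mRNA name assumed
]

# The accessions listed in the text run consecutively from BK064428 to BK064441.
ACCESSIONS = [f"BK{number:06d}" for number in range(64428, 64442)]

unique_proteins = len(ISOFORMS)
total_mrnas = sum(len(isoform["mrnas"]) for isoform in ISOFORMS)
cds_distribution = Counter(isoform["cds_count"] for isoform in ISOFORMS)

print(f"{unique_proteins} unique protein isoforms from {total_mrnas} mRNAs")
print(f"{len(ACCESSIONS)} accessions listed")
for cds_count, n_isoforms in sorted(cds_distribution.items()):
    print(f"{cds_count} CDSs: {n_isoforms} isoform(s)")
```

Under the naming assumption, the nine protein isoforms arise from 14 mRNAs, which is consistent with the 14 accessions listed; the text itself does not state which accession belongs to which transcript.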
Methods
Detailed methods including algorithms, database versions, and citations for the complete annotation process can be found in Rele et al. (2023). Briefly, students use the GEP instance of the UCSC Genome Browser v.435 (https://gander.wustl.edu; Kent et al., 2002; Navarro Gonzalez et al., 2021) to examine the genomic neighborhood of their reference IIS gene in the D. melanogaster genome assembly (Aug. 2014; BDGP Release 6 + ISO1 MT/dm6). Students then retrieve the protein sequence for the D. melanogaster reference gene for a given isoform and run it using tblastn against their target Drosophila species genome assembly on the NCBI BLAST server (https://blast.ncbi.nlm.nih.gov/Blast.cgi; Altschul et al., 1990) to identify potential orthologs. To validate the potential ortholog, students compare the local genomic neighborhood of their potential ortholog with the genomic neighborhood of their reference gene in D. melanogaster. This local synteny analysis includes at minimum the two upstream and downstream genes relative to their putative ortholog. They also explore other sets of genomic evidence using multiple alignment tracks in the Genome Browser, including BLAT alignments of RefSeq Genes, Spaln alignment of D. melanogaster proteins, multiple gene prediction tracks (e.g., GeMoMa, Geneid, Augustus), and modENCODE RNA-Seq from the target species. A detailed explanation of how these lines of genomic evidence are leveraged by students in gene model development is provided in Rele et al. (2023). Genomic structure information (e.g., CDSs, intron-exon number and boundaries, number of isoforms) for the D. melanogaster reference gene is retrieved through the Gene Record Finder (https://gander.wustl.edu/~wilson/dmelgenerecord/index.html; Rele et al., 2023). Approximate splice sites within the target gene are determined using tblastn with the CDSs from the D. melanogaster reference gene. Coordinates of CDSs are then refined by examining aligned modENCODE RNA-Seq data, and by applying paradigms of molecular biology such as identifying canonical splice site sequences and ensuring the maintenance of an open reading frame across hypothesized splice sites. Students then confirm the biological validity of their target gene model using the Gene Model Checker (https://gander.wustl.edu/~wilson/dmelgenerecord/index.html; Rele et al., 2023), which compares the structure and translated sequence from their hypothesized target gene model against the D. melanogaster reference gene model. At least two independent models for a gene are generated by students under mentorship of their faculty course instructors. Those models are then reconciled by a third independent researcher mentored by the project leaders to produce the final model. Note: comparison of 5' and 3' UTR sequence information is not included in this GEP CURE protocol.
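The coordinate-refinement step (canonical splice sites and an open reading frame maintained across splice junctions) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Gene Model Checker's implementation: `check_gene_model` is a hypothetical function, the gene is assumed to lie on the plus strand, coordinates are 0-based and half-open, the last CDS is assumed to include the stop codon, and only GT-AG introns are treated as canonical. The sequence is a synthetic toy example, not D. ananassae data.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_gene_model(genome, cds_intervals):
    """Return a list of problems found in a plus-strand gene model.

    cds_intervals holds 0-based, half-open (start, end) pairs in genomic
    order; the last interval is assumed to include the stop codon.
    """
    problems = []

    # Each intron is the gap between two consecutive CDS intervals.
    for (_, donor_pos), (acceptor_pos, _) in zip(cds_intervals, cds_intervals[1:]):
        intron = genome[donor_pos:acceptor_pos]
        if not intron.startswith("GT"):
            problems.append(f"non-canonical donor site at {donor_pos}: {intron[:2]}")
        if not intron.endswith("AG"):
            problems.append(f"non-canonical acceptor site at {acceptor_pos}: {intron[-2:]}")

    # Splice the CDSs together and check the reading frame.
    coding = "".join(genome[start:end] for start, end in cds_intervals)
    if len(coding) % 3 != 0:
        problems.append(f"coding length {len(coding)} is not a multiple of 3")

    codons = [coding[i:i + 3] for i in range(0, len(coding) - len(coding) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("coding sequence does not begin with ATG")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("in-frame stop codon before the end of the coding sequence")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("coding sequence does not end with a stop codon")
    return problems


# Synthetic two-exon toy gene (not real genomic sequence).
exon1, intron1, exon2 = "ATGGCC", "GTAAGTTTCAG", "AAATAA"
genome = "CC" + exon1 + intron1 + exon2 + "GG"

good_model = [(2, 8), (19, 25)]      # exon boundaries placed exactly
shifted_model = [(2, 9), (19, 25)]   # first exon extended by one base

good_problems = check_gene_model(genome, good_model)
shifted_problems = check_gene_model(genome, shifted_model)
print(good_problems)
for problem in shifted_problems:
    print(problem)
```

Shifting a single boundary by one base breaks both the donor site and the reading frame in this toy case, which is why the two checks are applied together. A real annotation would also need to handle minus-strand genes and the less common non-GT-AG introns.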
References
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990 Oct 5. Basic local alignment search tool. J Mol Biol 215(3): 403-410. DOI: 10.1016/S0022-2836(05)80360-2. PMID: 2231712.
- Drosophila 12 Genomes Consortium, Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W, Iyer VN, Pollard DA, Sackton TB, Larracuente AM, Singh ND, Abad JP, Abt DN, Adryan B, Aguade M, Akashi H, Anderson WW, Aquadro CF, Ardell DH, Arguello R, Artieri CG, Barbash DA, Barker D, Barsanti P, Batterham P, Batzoglou S, Begun D, Bhutkar A, Blanco E, Bosak SA, Bradley RK, Brand AD, Brent MR, Brooks AN, Brown RH, Butlin RK, Caggese C, Calvi BR, Bernardo de Carvalho A, Caspi A, Castrezana S, Celniker SE, Chang JL, Chapple C, Chatterji S, Chinwalla A, Civetta A, C…
- Fischer P, Preiss A, Nagel AC. 2016 Mar 16. A triangular connection between Cyclin G, PP2A and Akt1 in the regulation of growth and metabolism in Drosophila. Fly (Austin) 10(1): 11-18. DOI: 10.1080/19336934.2016.1162362. PMID: 26980713. PMCID: PMC4934794.
- Gramates LS, Agapite J, Attrill H, Calvi BR, Crosby MA, Dos Santos G, et al., the FlyBase Consortium. 2022. FlyBase: a guided tour of highlighted features. Genetics 220: iyac035. DOI: 10.1093/genetics/iyac035. PMID: 35266522. PMCID: PMC8982030.
- Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van Baren MJ, Boley N, Booth BW, Brown JB, Cherbas L, Davis CA, Dobin A, Li R, Lin W, Malone JH, Mattiuzzo NR, Miller D, Sturgill D, Tuch BB, Zaleski C, Zhang D, Blanchette M, Dudoit S, Eads B, Green RE, Hammonds A, Jiang L, Kapranov P, Langton L, Perrimon N, Sandler JE, Wan KH, Willingham A, Zhang Y, Zou Y, Andrews J, Bickel PJ, Brenner SE, Brent MR, Cherbas P, Gingeras TR, Hoskins RA, Kaufman TC, Oliver B, Celniker SE. 2010 Dec 22. The developmental transcriptome of Drosophila melanogaster. Nature 471(7339): 47…
- Grewal SS. 2008 Oct 18. Insulin/TOR signaling in growth and homeostasis: a view from the fly world. Int J Biochem Cell Biol 41(5): 1006-1010. DOI: 10.1016/j.biocel.2008.10.010. PMID: 18992839.
- Hahn K, Miranda M, Francis VA, Vendrell J, Zorzano A, Teleman AA. 2010 May 5. PP2A regulatory subunit PP2A-B' counteracts S6K phosphorylation. Cell Metab 11(5): 438-444. DOI: 10.1016/j.cmet.2010.03.015. PMID: 20444422.
- Hietakangas V, Cohen SM. 2009. Regulation of tissue growth through nutrient sensing. Annu Rev Genet 43: 389-410. DOI: 10.1146/annurev-genet-102108-134815. PMID: 19694515.
