Gene model for the ortholog of Ilp3 in Drosophila ananassae
Madeline L. Gruys, James O’Brien, Alyssa C. Koehler, Alejandro Almazan, Katheryn Opperman, Rachel Sterne-Marr, Zeynep Ozsoy, Maire Kate Sustacek, Jacqueline Wittke-Thompson, Andrew M Arsham, Stephanie Toering Peters, Chinmay P. Rele, Laura K Reed

TL;DR
This paper describes a gene model for the Ilp3 ortholog in Drosophila ananassae, part of a study on the evolution of insulin signaling in fruit flies.
Contribution
The paper provides a new gene model for Ilp3 in Drosophila ananassae, contributing to the study of insulin signaling evolution.
Findings
The Ilp3 ortholog was identified in the Drosophila ananassae genome assembly.
The gene model is part of a dataset for studying the evolution of the IIS pathway in Drosophila.
Abstract
Gene model for the ortholog of Insulin-like peptide 3 (Ilp3) in the May 2011 (Agencourt dana_caf1/DanaCAF1) Genome Assembly (GenBank Accession: GCA_000005115.1) of Drosophila ananassae. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.
Figure 1. [Image not reproduced; panels A, B and C are cited in the text.]
- National Institutes of Health (United States), https://ror.org/01cwqze88
- U.S. National Science Foundation (United States), https://ror.org/021nxhr62
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research · Protein Degradation and Inhibitors · Ubiquitin and proteasome pathways
Description
We propose a gene model for the D. ananassae ortholog of the D. melanogaster *Insulin-like peptide 3* (*Ilp3*) gene. The genomic region of the ortholog corresponds to the uncharacterized protein XP_001956273.1 (Locus ID LOC6507752) in the D. ananassae May 2011 (Agencourt dana_caf1/DanaCAF1) Genome Assembly (GCA_000005115.1; Drosophila 12 Genomes Consortium et al., 2007). This model is based on RNA-Seq data from D. ananassae (SRP006203, SRP007906; PRJNA257286, PRJNA388952; Graveley et al., 2011) and *Ilp3* in *D. melanogaster* using FlyBase release FB2022_04 (GCA_000001215.4; Larkin et al., 2021; Gramates et al., 2022; Jenkins et al., 2022).
Invertebrate insulins function similarly to metazoan insulin-like growth factors and play a role in cell and organ growth (Jin Chan and Steiner 2000). In Drosophila, seven insulin-like peptides (Ilp1-Ilp7) have a two-chain structure similar to vertebrate insulin and interact with the sole insulin-like receptor, InR, to initiate the insulin signaling cascade (Brogiolo et al., 2001; Nassel and Broeck 2016). Like the *Ilp2* and *Ilp5* genes, the *Ilp3* gene is expressed in median neurosecretory cells (MNCs) in the brain (Ikeya et al., 2002). While the seven Ilps act redundantly with respect to promoting growth, they also have unique expression patterns and functions (Ikeya et al., 2002; Grönke et al., 2010). Ilp3 may act with the transcription factor dFOXO in a positive feedback loop to regulate Ilp2 and Ilp5 secretion from MNCs (Grönke et al., 2010). In female Drosophila, ablation of MNCs or knockout of *Ilp3* has been shown to reduce fecundity and remating rates (Grönke et al., 2010; Wigby et al., 2011). Knockout of *Ilp3* also results in sleep defects (Yamaguchi et al., 2022).
**Synteny**
The reference gene, *Ilp3*, occurs on chromosome 3L in *D. melanogaster* and is nested within *CG32052*, alongside Insulin-like peptide 4 (*Ilp4*) (upstream) and Insulin-like peptide 2 (*Ilp2*) (downstream). *Ilp3* is flanked further upstream by L-2-hydroxyglutarate dehydrogenase (*L2HGDH*) and *CG43897* and further downstream by *Insulin-like peptide 1* (*Ilp1*) and Z band alternatively spliced PDZ-motif protein 67 (*Zasp67*). The tblastn search of D. melanogaster Ilp3-PA (query) against the D. ananassae (GenBank Accession: GCA_000005115.1) Genome Assembly (database) placed the putative ortholog of *Ilp3* within scaffold scaffold_13337 (CH902618.1) at locus LOC6507752 (XP_001956273.1), with an E-value of 5e-08 and a percent identity of 31.13%. Furthermore, the putative ortholog is nested within LOC6507310 (XP_014765147.1) alongside LOC6507751 (XP_032309882.1; upstream) and LOC6507309 (XP_001956274.1; downstream), which correspond to *CG32052*, *Ilp4* and *Ilp2* in D. melanogaster (E-value: 0.0, 3e-31 and 2e-27; identity: 86.67%, 51.52% and 46.79%, respectively, as determined by blastp; Figure 1A, Altschul et al., 1990). The putative ortholog is flanked further upstream by LOC6507750 (XP_001956268.3) and LOC6507311 (XP_032310388.1), which nests LOC6502822 (XP_001956270.2); these correspond to *L2HGDH*, *CG43897* and *Ilp5* in *D. melanogaster* (E-value: 2e-111, 0.0 and 9e-09; identity: 68.06%, 69.28% and 39.51%, respectively, as determined by blastp). The putative ortholog of *Ilp3* is flanked downstream by LOC6507308 (XP_001956275.2) and LOC6507753 (XP_014765448.1), which correspond to *Ilp1* and *Zasp67* in D. melanogaster (E-value: 6e-35 and 0.0; identity: 52.46% and 71.34%, respectively, as determined by blastp). The putative ortholog assignment for *Ilp3* in D. ananassae is supported by the following evidence: the genes surrounding the *Ilp3* ortholog are orthologous to the genes at the same locus in D. melanogaster, and local synteny is completely conserved, as supported by the E-values and percent identities. We therefore conclude that LOC6507752 contains the correct ortholog of *Ilp3* in D. ananassae (Figure 1A).
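The synteny argument above can be illustrated with a minimal Python sketch. It is not part of the GEP protocol. The gene identifiers and best-hit assignments are taken from the text, but the left-to-right order of *L2HGDH* and *CG43897*, the omission of the nesting genes (*CG32052*, *Ilp5*), and the function name `synteny_conserved` are assumptions made for illustration only.

```python
from typing import Dict, List

# Linear gene order around Ilp3 in D. melanogaster (upstream to downstream).
# Assumption: the nesting gene CG32052 is left out and L2HGDH is placed
# before CG43897; the text does not state their relative order.
DMEL_ORDER: List[str] = ["L2HGDH", "CG43897", "Ilp4", "Ilp3", "Ilp2", "Ilp1", "Zasp67"]

# Corresponding loci in D. ananassae, listed in the same assumed order.
DANA_ORDER: List[str] = [
    "LOC6507750", "LOC6507311", "LOC6507751", "LOC6507752",
    "LOC6507309", "LOC6507308", "LOC6507753",
]

# Best blastp assignment of each flanking D. ananassae locus, as reported in the text.
BEST_HIT: Dict[str, str] = {
    "LOC6507750": "L2HGDH",
    "LOC6507311": "CG43897",
    "LOC6507751": "Ilp4",
    "LOC6507309": "Ilp2",
    "LOC6507308": "Ilp1",
    "LOC6507753": "Zasp67",
}


def synteny_conserved(ref_order: List[str], target_order: List[str],
                      best_hit: Dict[str, str], ref_gene: str, candidate: str) -> bool:
    """Return True if the candidate sits at the same relative position as the
    reference gene and every flanking locus maps to the matching reference gene."""
    if len(ref_order) != len(target_order):
        return False
    if ref_order.index(ref_gene) != target_order.index(candidate):
        return False
    for ref, tgt in zip(ref_order, target_order):
        if tgt == candidate:
            continue  # the candidate itself is what we are trying to validate
        if best_hit.get(tgt) != ref:
            return False
    return True


result = synteny_conserved(DMEL_ORDER, DANA_ORDER, BEST_HIT, "Ilp3", "LOC6507752")
print(result)
```

In this simplified view, a single neighbor out of place or without the expected best hit would make the check fail, which mirrors the requirement that the surrounding genes are orthologous and in conserved order.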
**Protein Model**
Consistent with the blastp search result, which shows 51.52% identity between D. melanogaster Ilp3-PA and the *D. ananassae* gene model, the dot plot features a few minor gaps along the diagonal, indicating significant conservation between the two protein sequences. *Ilp3* in *D. ananassae* has one protein-coding isoform (Ilp3-PA; Figure 1B). Isoform Ilp3-PA contains two CDSs. Relative to the ortholog in D. melanogaster, the CDS number and isoform count are conserved. The sequence of Ilp3-PA in *D. ananassae* has 54.74% identity (E-value: 8e-31) with the protein-coding isoform Ilp3-PA in D. melanogaster, as determined by *blastp* (Figure 1C). Coordinates of this curated gene model are stored by NCBI at GenBank/BankIt (accession BK064566). These data are also archived in the CaltechDATA repository (see “Extended Data” section below).
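As a rough illustration of what a percent identity figure such as those above represents, the sketch below counts identical residues over the columns of a pairwise alignment. The two aligned strings are hypothetical toy sequences, not the Ilp3 proteins, and `percent_identity` is an illustrative helper rather than anything provided by BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be the two rows of one pairwise alignment, with '-' marking
    gaps, so they have equal length. Gap columns count toward the alignment
    length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 4 identical columns out of 7.
toy_identity = percent_identity("MKT-LLA", "MRTALLG")
print(round(toy_identity, 2))
```

Because the denominator is the alignment length, identity values for the same protein pair can differ between searches that align different regions, for example a tblastn hit against genomic sequence versus a blastp comparison of full-length proteins.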
Methods
Detailed methods including algorithms, database versions, and citations for the complete annotation process can be found in Rele et al. (2023). Briefly, students use the GEP instance of the UCSC Genome Browser v.435 (https://gander.wustl.edu; Kent WJ et al., 2002; Navarro Gonzalez et al., 2021) to examine the genomic neighborhood of their reference IIS gene in the D. melanogaster genome assembly (Aug. 2014; BDGP Release 6 + ISO1 MT/dm6). Students then retrieve the protein sequence for the D. melanogaster reference gene for a given isoform and run it using tblastn against their target *Drosophila* species genome assembly on the NCBI BLAST server (https://blast.ncbi.nlm.nih.gov/Blast.cgi; Altschul et al., 1990) to identify potential orthologs. To validate the potential ortholog, students compare the local genomic neighborhood of their potential ortholog with the genomic neighborhood of their reference gene in D. melanogaster. This local synteny analysis includes at minimum the two upstream and downstream genes relative to their putative ortholog. They also explore other sets of genomic evidence using multiple alignment tracks in the Genome Browser, including BLAT alignments of RefSeq Genes, Spaln alignment of *D. melanogaster* proteins, multiple gene prediction tracks (e.g., GeMoMa, Geneid, Augustus), and modENCODE RNA-Seq from the target species. A detailed explanation of how these lines of genomic evidence are leveraged by students in gene model development is provided in Rele et al. (2023). Genomic structure information (e.g., CDSs, intron-exon number and boundaries, number of isoforms) for the D. melanogaster reference gene is retrieved through the Gene Record Finder (https://gander.wustl.edu/~wilson/dmelgenerecord/index.html; Rele et al., 2023). Approximate splice sites within the target gene are determined using tblastn with the CDSs from the D. melanogaster reference gene as queries. Coordinates of CDSs are then refined by examining aligned modENCODE RNA-Seq data, and by applying paradigms of molecular biology such as identifying canonical splice site sequences and ensuring the maintenance of an open reading frame across hypothesized splice sites. Students then confirm the biological validity of their target gene model using the Gene Model Checker (https://gander.wustl.edu/~wilson/dmelgenerecord/index.html; Rele et al., 2023), which compares the structure and translated sequence from their hypothesized target gene model against the *D. melanogaster* reference gene model. At least two independent models for a gene are generated by students under mentorship of their faculty course instructors. Those models are then reconciled by a third independent researcher mentored by the project leaders to produce the final model. Note: comparison of 5' and 3' UTR sequence information is not included in this GEP CURE protocol.
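The refinement step described above (canonical splice sites and an open reading frame maintained across splice junctions) can be sketched in a few lines of Python. This is an illustrative check only, not the Gene Model Checker: the function name `check_gene_model`, the 0-based half-open plus-strand coordinate convention, and the toy genomic sequence are all assumptions introduced here.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_gene_model(genome: str, cds: List[Tuple[int, int]]) -> List[str]:
    """Return a list of problems; an empty list means every check passed.

    `cds` holds plus-strand CDS intervals, 0-based and half-open, sorted by start.
    """
    problems: List[str] = []

    # Each intron lies between consecutive CDSs and should begin with the
    # canonical donor GT and end with the canonical acceptor AG.
    for (_, end), (next_start, _) in zip(cds, cds[1:]):
        intron = genome[end:next_start]
        if not intron.startswith("GT"):
            problems.append(f"intron at {end} lacks a GT donor site")
        if not intron.endswith("AG"):
            problems.append(f"intron ending at {next_start} lacks an AG acceptor site")

    # The joined CDSs must form one open reading frame.
    coding = "".join(genome[start:end] for start, end in cds)
    if len(coding) % 3 != 0:
        problems.append("coding length is not a multiple of three")
    codons = [coding[i:i + 3] for i in range(0, len(coding) - 2, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("model does not begin with a start codon")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("model does not end with a stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("in-frame stop codon inside the coding sequence")
    return problems


# Hypothetical toy locus: 2 nt of flank, CDS1, a GT...AG intron, CDS2, 2 nt of flank.
GENOME = "CC" + "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA" + "GG"
CDS = [(2, 8), (20, 26)]

issues = check_gene_model(GENOME, CDS)
print(issues)
```

Shifting the second CDS start by one base, for example `[(2, 8), (21, 26)]`, breaks both the acceptor site and the reading frame, which is the kind of error that refinement against RNA-Seq evidence is meant to catch.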
References
1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215(3): 403-410. DOI: 10.1016/S0022-2836(05)80360-2. PubMed: 2231712.
2. Brogiolo W, Stocker H, Ikeya T, Rintelen F, Fernandez R, Hafen E. 2001. An evolutionarily conserved function of the Drosophila insulin receptor and insulin-like peptides in growth control. Curr Biol 11(4): 213-221. DOI: 10.1016/s0960-9822(01)00068-9. PubMed: 11250149.
3. Jin Chan Shu, Steiner Donald F. 2000. Insulin Through the Ages: Phylogeny of a Growth Promoting and Metabolic Regulatory Hormone. American Zoologist 40(2): 213-222. DOI: 10.1093/icb/40.2.213.
4. Drosophila 12 Genomes Consortium, Clark AG, Eisen MB, Smith DR, Bergman CM, Oliver B, Markow TA, Kaufman TC, Kellis M, Gelbart W, Iyer VN, Pollard DA, Sackton TB, Larracuente AM, Singh ND, Abad JP, Abt DN, Adryan B, Aguade M, Akashi H, Anderson WW, Aquadro CF, Ardell DH, Arguello R, Artieri CG, Barbash DA, Barker D, Barsanti P, Batterham P, Batzoglou S, Begun D, Bhutkar A, Blanco E, Bosak SA, Bradley RK, Brand AD, Brent MR, Brooks AN, Brown RH, Butlin RK, Caggese C, Calvi BR, Bernardo de Carvalho A, Caspi A, Castrezana S, Celniker SE, Chang JL, Chapple C, Chatterji S, Chinwalla A, Civetta A, C
5. Gonnet GH, Cohen MA, Benner SA. 1992. Exhaustive matching of the entire protein sequence database. Science 256(5062): 1443-1445. DOI: 10.1126/science.1604319. PubMed: 1604319.
6. Gramates L Sian, Agapite Julie, Attrill Helen, Calvi Brian R, Crosby Madeline A, dos Santos Gilberto, Goodman Joshua L, Goutte-Gattat Damien, Jenkins Victoria K, Kaufman Thomas, Larkin Aoife, Matthews Beverley B, Millburn Gillian, Strelets Victor B, Perrimon Norbert, Gelbart Susan Russo, Agapite Julie, Broll Kris, Crosby Lynn, dos Santos Gil, Falls Kathleen, Gramates L Sian, Jenkins Victoria, Longden Ian, Matthews Beverley, Seme Jolene, Tabone Christopher J, Zhou Pinglei, Zytkovicz Mark, Brown Nick, Antonazzo Giulia, Attrill Helen, Garapati Phani, Goutte-Gatta
7. Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van Baren MJ, Boley N, Booth BW, Brown JB, Cherbas L, Davis CA, Dobin A, Li R, Lin W, Malone JH, Mattiuzzo NR, Miller D, Sturgill D, Tuch BB, Zaleski C, Zhang D, Blanchette M, Dudoit S, Eads B, Green RE, Hammonds A, Jiang L, Kapranov P, Langton L, Perrimon N, Sandler JE, Wan KH, Willingham A, Zhang Y, Zou Y, Andrews J, Bickel PJ, Brenner SE, Brent MR, Cherbas P, Gingeras TR, Hoskins RA, Kaufman TC, Oliver B, Celniker SE. 2010. The developmental transcriptome of Drosophila melanogaster. Nature 471(7339).
8. Grewal SS. 2008. Insulin/TOR signaling in growth and homeostasis: a view from the fly world. Int J Biochem Cell Biol 41(5): 1006-1010. DOI: 10.1016/j.biocel.2008.10.010. PubMed: 18992839.
