# TSS seq based core promoter architecture in blood feeding Tsetse fly (Glossina morsitans morsitans) vector of Trypanosomiasis

**Authors:** Sarah Mwangi, Geoffrey Attardo, Yutaka Suzuki, Serap Aksoy, Alan Christoffels

PMC · DOI: 10.1186/s12864-015-1921-6 · 2015-09-22

## TL;DR

This study explores how the Tsetse fly regulates gene transcription, identifying unique promoter architectures and motif combinations in a blood-feeding insect.

## Contribution

The study identifies novel motif combinations in Tsetse fly promoters, revealing how they compensate for the absence of TATA motifs.

## Key findings

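- Tsetse fly core promoters are AT-rich; at the −1/+1 positions they favor the AA dinucleotide, whereas D. melanogaster favors CA.
- The MTE-DPE motif pair is over-represented in broad promoters, and TATA-less promoters carry the MTE and INR motifs more often than in Drosophila, where the DPE motif is the more frequent one in TATA-less promoters.
- Gene ontology terms associated with developmental processes are over-represented among genes with narrow and broad-with-peak promoters.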
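
The motif-pair finding above rests on a two-way co-occurrence analysis. A minimal sketch of that idea follows, assuming each promoter has already been reduced to the set of core motifs detected in it. The toy `promoters` list, the function name `pair_enrichment`, and the observed/expected ratio under independence are all illustrative assumptions; this summary does not state which statistic the authors used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data (NOT from the paper): one set of detected
# core motifs per promoter.
promoters = [
    {"INR", "MTE", "DPE"},
    {"MTE", "DPE"},
    {"TATA", "INR"},
    {"MTE", "DPE"},
    {"INR"},
]


def pair_enrichment(promoters):
    """Return {(motif_a, motif_b): (observed, observed/expected)}.

    'expected' assumes the two motifs occur independently, which is a
    simplifying assumption made for this sketch.
    """
    n = len(promoters)
    # Number of promoters containing each single motif.
    single = Counter(m for p in promoters for m in p)
    # Number of promoters containing each unordered motif pair.
    pairs = Counter(
        pair for p in promoters for pair in combinations(sorted(p), 2)
    )
    result = {}
    for (a, b), observed in pairs.items():
        expected = single[a] * single[b] / n
        result[(a, b)] = (observed, observed / expected)
    return result


enrichment = pair_enrichment(promoters)
for pair, (observed, ratio) in sorted(enrichment.items()):
    print(pair, observed, round(ratio, 2))
```

A ratio above 1 means the pair appears together more often than independence would predict. A real analysis would also need a significance test and a separate tally per promoter class.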

## Abstract

Regulation of transcription initiation is mediated by sequence-specific interactions between DNA-binding proteins (transcription factors) and cis-elements, where the BRE, TATA, INR, DPE and MTE motifs constitute the canonical core motifs for basal transcription initiation of genes. Accurate identification of transcription start sites (TSSs) and their corresponding promoter regions is critical for delineating these motifs. To date, genome-scale analysis of core promoter architecture in insects has been confined to Drosophila. The recently sequenced Tsetse fly genome provides a unique opportunity to analyze the machinery that regulates transcription initiation in blood-feeding insects.

A computational method for identifying TSSs in the newly sequenced Tsetse fly genome was evaluated using TSS seq tags sampled from two developmental stages, larvae and pupae. Within a threshold of 100 tags per cluster, there were 3134 tag clusters, of which 45.4 % (1424) mapped to first coding exons or their proximal predicted 5′UTR regions and 1.0 % (31) mapped to transposons. The 1393 non-transposon-derived core promoters (1424 − 31) had a propensity for AT nucleotides. At the −1/+1 positions, D. melanogaster and G. m. morsitans had a propensity for CA and AA dinucleotides, respectively. The 1393 tag clusters comprised narrow promoters (5 %), broad with peak promoters (23 %) and broad without peak promoters (72 %). Two-way motif co-occurrence analysis showed that the MTE-DPE pair is over-represented in broad core promoters. The most frequent motif triplets in all promoter classes are INR-MTE-DPE, TATA-MTE-DPE and TATA-INR-DPE. Promoters without the TATA motif had a higher frequency of the MTE and INR motifs than observed in Drosophila, where the DPE motif occurs more frequently in promoters without the TATA motif. Gene ontology terms associated with developmental processes were over-represented in the narrow and broad with peak promoters.
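
The counts and percentages in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the per-class counts are approximations derived from the rounded percentages, not values reported in this summary.

```python
# Figures quoted in the abstract.
total_clusters = 3134      # all tag clusters
mapped = 1424              # mapped to first coding exons or proximal 5'UTRs
transposon = 31            # mapped to transposons

# Non-transposon-derived core promoters.
non_transposon = mapped - transposon

# Shares of all tag clusters, rounded to one decimal place.
pct_mapped = round(100 * mapped / total_clusters, 1)
pct_transposon = round(100 * transposon / total_clusters, 1)

# Promoter shape classes as percentages of the non-transposon clusters.
class_shares = {
    "narrow": 5,
    "broad with peak": 23,
    "broad without peak": 72,
}

# Approximate counts only: the published percentages are rounded, so the
# true per-class counts may differ by a few clusters.
approx_counts = {
    name: round(non_transposon * pct / 100)
    for name, pct in class_shares.items()
}

print(non_transposon, pct_mapped, pct_transposon)
print(approx_counts)
```

The subtraction reproduces the 1393 promoters, and the two ratios round to the stated 45.4 % and 1.0 %, so the reported figures are internally consistent.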

The study identified distinct motif combinations associated with broad promoters in a blood-feeding insect. In TATA-less core promoters, G. m. morsitans uses the MTE to compensate for the lack of a TATA motif. The increasing availability of TSS seq data allows existing gene annotation datasets to be revised, with the potential to identify new transcriptional units.

The online version of this article (doi:10.1186/s12864-015-1921-6) contains supplementary material, which is available to authorized users.

## Linked entities

- **Species:** Glossina morsitans morsitans (taxon 37546), Drosophila melanogaster (taxon 7227)

## Full-text entities

- **Genes:** dachs (dachs) [NCBI Gene 34179] {aka 29C3-D1, 29CD, AAF52683, CG10595, CG13087, CG31610}, Miro (Mitochondrial Rho) [NCBI Gene 42845] {aka B682, CG5410, DMiro, DmMiro, Dmel\CG5410, Q8IMX7-1}, Hsp60A (Heat shock protein 60A) [NCBI Gene 32045] {aka 12, BP5, CG12101, Cpn60, Dm10A, DmHsp60}, bsk (basket) [NCBI Gene 44801] {aka Basket, CG5680, D-JNK, D-junk, DBSK/JNK, DJNK}, GTF2B (general transcription factor IIB) [NCBI Gene 2959] {aka TF2B, TFIIB}, Cdc42 (Cell division cycle 42) [NCBI Gene 32981] {aka CDC-42, CG12530, Cdc 42, Cdc-42, Cdc42Dm, D-CDC42}, Act42A (Actin 42A) [NCBI Gene 35526] {aka 42A, A, ACT2, ACT2_DROME, AFFX-Dros-ACTIN_M_r_at, Act}, Tbp (TATA binding protein) [NCBI Gene 37476] {aka CG9874, Dmel\CG9874, RBP, TBP38, TFIID, TFIIDtau}, Polr2A (RNA polymerase II subunit A) [NCBI Gene 32100] {aka 5, 8WG16, CG1554, CTD, DmCTD, Dmel\CG1554}
- **Diseases:** DDBJ (MESH:D004266), blood (MESH:D006402), Trypanosomiasis (MESH:D014352), HAT (MESH:D014353), TSS (MESH:D020922), Ewing (MESH:D012512)
- **Chemicals:** dry ice (MESH:D004367), dinucleotide (MESH:D015226), CA (MESH:D002118), AT (MESH:D001246), Chitin (MESH:D002686), Nucleotide (MESH:D009711), ATP (MESH:D000255), TRIzol (MESH:C411644), AA dinucleotide (-)
- **Species:** Glossina morsitans (tsetse fly, species) [taxon 7394], Glossina (tsetse flies, genus) [taxon 7393], Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090], Aedes aegypti (yellow fever mosquito, species) [taxon 7159], Glossina morsitans morsitans (subspecies) [taxon 37546], Drosophila melanogaster (fruit fly, species) [taxon 7227]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC4578606/full.md

---
Source: https://tomesphere.com/paper/PMC4578606