ORFannotate: reproducible coding sequence annotation of transcriptome assemblies
Sonia García-Ruiz, Hannah Macpherson, Laura Caton, Mina Ryten, Emil K Gustavsson

TL;DR
ORFannotate is a tool that improves transcriptome annotations by adding precise coding sequence and translational features directly into GTF/GFF files.
Contribution
ORFannotate introduces a GTF-native tool that reintegrates ORF and CDS annotations into transcript models, enhancing long-read sequencing workflows.
Findings
ORFannotate accurately predicts and inserts CDS and UTR features into GTF/GFF files.
The tool annotates Kozak sequences, uORFs, and NMD susceptibility for biological context.
ORFannotate is fast, scalable, and integrates well with visualization and analysis tools.
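The Kozak and NMD annotations listed above can be illustrated with a minimal sketch. The text here does not state ORFannotate's actual scoring rules, so this uses two common conventions as assumptions: Kozak strength judged by the -3 (A/G) and +4 (G) positions, and the 50-nucleotide rule for NMD. The function names `kozak_strength` and `nmd_candidate` and the `threshold` parameter are hypothetical.

```python
def kozak_strength(seq: str, atg_index: int) -> str:
    """Classify the Kozak context of the ATG starting at atg_index (0-based).

    Assumed convention (not taken from ORFannotate): a purine at -3 and a G
    at +4 each count as one match; 2 = strong, 1 = moderate, 0 = weak.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    # -3 position: three bases upstream of the A of the ATG
    minus3 = seq[atg_index - 3] if atg_index >= 3 else ""
    # +4 position: the base immediately after the ATG
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else ""
    hits = (minus3 in ("A", "G")) + (plus4 == "G")
    return ("weak", "moderate", "strong")[hits]


def nmd_candidate(exon_lengths, stop_end, threshold=50):
    """Apply the 50-nt rule in transcript coordinates.

    exon_lengths: exon lengths in 5'->3' transcript order.
    stop_end: 0-based exclusive transcript position where the stop codon ends.
    Returns True when the stop codon ends more than `threshold` nt upstream
    of the last exon-exon junction.
    """
    if len(exon_lengths) < 2:
        return False  # no junction, so the rule cannot apply
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end > threshold
```

For example, `kozak_strength("GCCACCATGGCG", 6)` sees an A at -3 and a G at +4, and a stop codon ending at position 120 of a transcript with three 100-nt exons lies 80 nt upstream of the last junction.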
Abstract
Accurate annotation of coding sequences and translational features within transcript models is essential for interpreting assembled transcriptomes and their functional potential. Existing open reading frame (ORF) prediction tools typically operate on transcript FASTA files and do not reintegrate coding sequence (CDS) information back into transcript models, limiting their utility in long-read sequencing workflows where GTF/GFF annotations are the primary output. We present ORFannotate, a lightweight, GTF-native Python command-line tool that predicts ORFs from transcript annotations and reinserts precise, exon-aware CDS and UTR features into the original GTF/GFF file. In addition, ORFannotate provides biologically informative translational context by annotating Kozak sequence strength, detecting non-overlapping upstream ORFs (uORFs) with coding probabilities, characterising 5′ and 3′…
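To make the idea of "exon-aware" CDS reinsertion concrete, the sketch below projects an ORF given in transcript coordinates back onto a transcript's exons to produce genomic CDS segments. It is an illustration under stated assumptions, not ORFannotate's implementation: the function name `orf_to_cds` and its signature are hypothetical, exons are taken as 1-based inclusive intervals as in GTF, and the ORF is given as a 0-based half-open interval on the spliced transcript.

```python
def orf_to_cds(exons, strand, orf_start, orf_end):
    """Project a transcript-relative ORF onto genomic exons.

    exons: list of (start, end) genomic intervals, 1-based inclusive (GTF style).
    strand: "+" or "-".
    orf_start, orf_end: 0-based half-open coordinates on the spliced transcript.
    Returns CDS segments as (start, end), 1-based inclusive, in transcript order.
    """
    # Walk exons in 5'->3' transcript order: ascending on +, descending on -.
    ordered = sorted(exons, reverse=(strand == "-"))
    segments = []
    offset = 0  # transcript coordinate of the first base of the current exon
    for start, end in ordered:
        length = end - start + 1
        # Overlap between the ORF and this exon, in transcript coordinates.
        lo = max(orf_start, offset)
        hi = min(orf_end, offset + length)
        if lo < hi:
            if strand == "+":
                segments.append((start + lo - offset, start + hi - offset - 1))
            else:
                # On the minus strand, transcript position 0 is the exon's end.
                segments.append((end - (hi - offset) + 1, end - (lo - offset)))
        offset += length
    return segments


cds = orf_to_cds([(101, 200), (301, 400)], "+", 50, 150)
```

Whether `orf_end` includes the stop codon is left to the caller; the GTF convention is to exclude the stop codon from CDS features, so a caller following it would pass the position just before the stop codon.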
Taxonomy
Topics: RNA and protein synthesis mechanisms · RNA modifications and cancer · Bacterial Genetics and Biotechnology
