The combinatorics of overlapping genes
Sophie Lebre, Olivier Gascuel

TL;DR
This paper investigates the constraints that the genetic code imposes on the amino acid and polypeptide content of overlapping genes in each possible reading frame, revealing new formal constraints and providing tools for studying their evolution and detection.
Contribution
It introduces a formal framework and graph algorithm to characterize amino acid and polypeptide constraints in overlapping genes, including novel constraints involving specific amino acid words.
Findings
Linear amino acid composition constraints identified for overlapping genes.
Novel constraints involving amino acid words like YY in certain frames.
The framework supports the study of overlapping-gene evolution and can inform detection methods.
Abstract
Overlapping genes exist in all domains of life and are much more abundant than expected at their first discovery in the late 1970s. Assuming that the reference gene is read in frame +0, an overlapping gene can be encoded in two reading frames in the sense strand, denoted by +1 and +2, and in three reading frames in the opposite strand, denoted by -0, -1 and -2. This motivated numerous researchers to study the constraints induced by the genetic code on the various overlapping frames, mostly based on information theory. Our focus in this paper is on the constraints induced on two overlapping genes in terms of amino acids, as well as polypeptides. We show that simple linear constraints bind the amino acid composition of two proteins encoded by overlapping genes. Novel constraints are revealed when polypeptides are considered, and not just single amino acids. For example, in double-coding…
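The frame notation used in the abstract can be illustrated with a minimal Python sketch that translates one DNA sequence in the reference frame +0, the two shifted sense-strand frames, and three frames on the reverse complement. This is an illustration, not the authors' code: the function names are hypothetical, the standard genetic code is assumed, and labelling the reverse-complement offsets 0, 1 and 2 as -0, -1 and -2 is an assumption, since conventions for numbering antisense frames differ between papers.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, offset=0):
    """Translate complete codons of `dna`, starting at position `offset`."""
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def frame_translations(dna):
    """Translate `dna` in three sense frames and three antisense frames.

    Assumption: '-k' here means the reverse complement read at offset k;
    this labelling may not match the convention of a given paper.
    """
    rc = reverse_complement(dna)
    frames = {f"+{k}": translate(dna, k) for k in range(3)}
    frames.update({f"-{k}": translate(rc, k) for k in range(3)})
    return frames


if __name__ == "__main__":
    for frame, protein in frame_translations("ATGGCCTAA").items():
        print(frame, protein)
```

Each nucleotide triplet read in +0 fixes two of the three nucleotides of a codon in a shifted frame, which is why the amino acids that can face each other in two overlapping genes are restricted.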
Taxonomy
Topics: RNA and protein synthesis mechanisms · Antimicrobial Peptides and Activities · interferon and immune responses
