The fitness landscape of overlapping genes
Orson Kirsch, Nicole Wood, Steven A Redford, Kabir Husain

TL;DR
This study explores the feasibility, design principles, and fitness landscape of overlapping genes, revealing widespread compatibility between protein families, a trade-off between the two encoded proteins, and the unique suitability of the natural genetic code.
Contribution
The paper computationally designs overlapping gene sequences, analyzes their joint fitness landscape, and identifies principles governing their compatibility and evolution.
Findings
Protein families are widely compatible as overlapping genes.
The natural genetic code is uniquely suited to support overlaps.
Overlapped genes can be connected through networks of near-neutral mutations.
Abstract
Natural genomes sometimes encode two different proteins in staggered reading frames of the same DNA sequence. Despite the prevalence of these 'overlapping genes' across the tree of life, it remains unknown whether arbitrary protein pairs can overlap, to what extent such overlaps are feasible, or what design principles govern them. Here, we study compatibility, frustration, and connectivity in the fitness landscape of overlapping genes. We computationally design sequences de novo that satisfy the dual functional constraints of two distinct protein families. The joint fitness landscape, inferred via Potts models from multiple sequence alignments, reveals a fundamental trade-off between the two proteins and provides a simple criterion for when overlap is feasible. We find widespread compatibility between protein families, with one class of reading frames markedly more permissible than…
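To make the idea of "staggered reading frames" concrete, here is a minimal sketch that translates one DNA string in two frames using the standard genetic code. It is an illustration only, not the authors' design method: the helper `translate`, the variable names, and the toy sequence are all hypothetical, and only same-strand frames are shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, offset: int = 0) -> str:
    """Translate `dna` starting at `offset`, dropping any trailing partial codon.

    Hypothetical helper written for this illustration.
    """
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Toy sequence (made up for the example, not taken from the paper).
dna = "ATGGCCAAATGA"
frame0 = translate(dna, offset=0)  # codons: ATG GCC AAA TGA
frame1 = translate(dna, offset=1)  # codons: TGG CCA AAT

print(frame0)  # MAK*
print(frame1)  # WPN
```

Because every nucleotide contributes to a codon in both frames, a single substitution can change both encoded proteins at once, which is the source of the dual constraint described in the abstract.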
