MAP Format for Representing Chemical Modifications, Annotations, and Mutations in Protein Sequences: An Extension of the FASTA Format
Akshay Shendre, Naman Kumar Mehta, Anand Singh Rathore, Nishant Kumar, Sumeet Patiyal, and Gajendra P. S. Raghava

TL;DR
The paper introduces MAP, a format that extends FASTA to represent chemical modifications, annotations, and mutations in protein sequences for bioinformatics applications.
Contribution
The paper proposes the MAP format, which uses meta tags and inline annotations to represent complex protein modifications and variants that existing formats handle poorly.
Findings
MAP enables detailed annotation of protein modifications.
It supports residue-specific features like phosphorylation and non-natural residues.
The format is demonstrated in bioinformatics and protein therapeutics contexts.
Abstract
Several formats, including FASTA, PIR, GenBank, EMBL, and GCG, have been developed for representing protein sequences composed of natural amino acids. Among these, FASTA remains the most widely used due to its simplicity and human readability. However, FASTA lacks the capability to represent chemically modified or non-natural residues, as well as structural annotations and mutations in protein variants. To address some of these limitations, the PEFF format was recently introduced as an extension of FASTA. Additionally, formats such as HELM and BILN have been proposed to represent amino acids and their modifications at the atomic level. Despite their advancements, these formats have not achieved widespread adoption within the bioinformatics community due to their complexity. To complement existing formats and overcome current challenges, we propose a new format called MAP (Modification…
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research
