CysDuF database: annotation and characterization of cysteine residues in domain of unknown function proteins based on cysteine post-translational modifications, their protein microenvironments, biochemical pathways, taxonomy, and diseases
Devarakonda Himaja, Debashree Bandyopadhyay

TL;DR
The CysDuF database annotates cysteine residues in domain of unknown function (DUF) proteins by post-translational modification, protein microenvironment, biochemical pathway, taxonomy, and disease.
Contribution
The first comprehensive annotation of cysteine post-translational modifications in DUF proteins across multiple pathways and species, including SARS-CoV-2.
Findings
Cysteine residues in DUF proteins are mainly buried and hydrophobic, except in SARS-CoV-2 where they are surface-exposed and hydrophilic.
Cysteine PTMs were predicted with 79% accuracy using the DeepCys server and validated against experimental data.
The database includes annotations for seven biochemical pathways and is accessible via DUF, PFAM, or PDB IDs.
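As a rough illustration of what access "via DUF, PFAM, or PDB IDs" could look like, the sketch below indexes a few cysteine records under all three identifier types. It does not reflect the actual CysDuF schema or interface: the `CysteineRecord` fields, the `CysteineIndex` class, and every identifier and value are hypothetical placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CysteineRecord:
    """One annotated cysteine; the field names are assumptions, not the CysDuF schema."""
    duf_id: str
    pfam_id: str
    pdb_id: str
    residue_number: int
    predicted_ptm: str
    pathway: str


class CysteineIndex:
    """In-memory index that resolves a DUF, PFAM, or PDB ID to matching records."""

    def __init__(self, records):
        self._by_id = defaultdict(list)
        for rec in records:
            # Register each record under all three identifier types.
            for key in (rec.duf_id, rec.pfam_id, rec.pdb_id):
                self._by_id[key.upper()].append(rec)

    def lookup(self, identifier):
        # Case-insensitive lookup; unknown identifiers return an empty list.
        return list(self._by_id.get(identifier.strip().upper(), []))


# Placeholder data only: none of these identifiers or annotations are real entries.
records = [
    CysteineRecord("DUF0000", "PF00000", "0XXX", 42, "placeholder-ptm", "placeholder-pathway"),
    CysteineRecord("DUF0000", "PF00000", "0YYY", 17, "placeholder-ptm", "placeholder-pathway"),
]
index = CysteineIndex(records)
```

Calling `index.lookup("DUF0000")` returns both placeholder records, while `index.lookup("0xxx")` returns only the one tied to that structure ID. A single dictionary keyed by every identifier type keeps the lookup path the same regardless of which ID the user starts from.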
Abstract
Experimental characterization and annotation of amino acids belonging to domains of unknown function (DUF) proteins are expensive and time-consuming, which could be complemented by computational methods. Cysteine, being the second most reactive amino acid at the catalytic sites of enzymes, was selected for functional annotation and characterization on DUF proteins. Earlier, we reported functional annotation of cysteine on DUF proteins belonging to the COX-II family. However, holistic characterization of cysteine functions on DUF proteins was not known, to the best of our knowledge. Here, we annotated and characterized cysteine residues based on post-translational modifications (PTMs), biochemical pathways, diseases, taxonomy, and protein microenvironment. The information on uncharacterized DUF proteins was initially obtained from the literature, and the sequence, structure, pathways,…
Taxonomy
Topics: Machine Learning in Bioinformatics · Advanced Proteomics Techniques and Applications · Bioinformatics and Genomic Networks
