An in-silico analysis of OGT gene association with diabetes mellitus
Abigail O. Ayodele, Brenda Udosen, Olugbenga O. Oluwagbemi, Elijah K. Oladipo, Idowu Omotuyi, Itunuoluwa Isewon, Oyekanmi Nash, Opeyemi Soremekun, Segun Fatumo

TL;DR
This study explores how mutations in the OGT gene may contribute to diabetes and identifies potential drug targets based on their effects on protein function.
Contribution
The study identifies 7 deleterious SNPs in the OGT gene and evaluates their structural and functional impact for diabetes treatment.
Findings
Seven SNPs in the OGT gene were predicted to have deleterious effects using multiple tools.
Mutated OGT proteins showed strong binding affinities with the inhibitor OSMI-1 in molecular docking.
These mutations may serve as potential drug targets for diabetes mellitus.
Abstract
O-GlcNAcylation is a nutrient-sensing post-translational modification process. This cycling process involves two primary proteins: the O-linked N-acetylglucosamine transferase (OGT) catalysing the addition, and the glycoside hydrolase OGA (O-GlcNAcase) catalysing the removal of the O-GlCNAc moiety on nucleocytoplasmic proteins. This process is necessary for various critical cellular functions. The O-linked N-acetylglucosamine transferase (OGT) gene produces the OGT protein. Several studies have shown the overexpression of this protein to have biological implications in metabolic diseases like cancer and diabetes mellitus (DM). This study retrieved 159 SNPs with clinical significance from the SNPs database. We probed the functional effects, stability profile, and evolutionary conservation of these to determine their fit for this research. We then identified 7 SNPs (G103R, N196K, Y228H,…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2- —Wellcome Trust
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGlycosylation and Glycoproteins Research · Biochemical and Molecular Research · Carbohydrate Chemistry and Synthesis
Introduction
The human O-linked N-acetylglucosamine transferase (OGT) gene is ∼43 kb long. Located at the Xq13.1 genomic locus, it is alternatively spliced to generate nucleocytoplasmic (nc), mitochondrial (m), and short (s) isoforms. The varying number of tetratricopeptide repeats (TPRs) in their N-terminal domains distinguishes these isoforms. The full-length human nucleocytoplasmic OGT isoform (∼110 kDa) contains 13 TPRs, while mitochondrial OGT (∼103 kDa) and short OGT (∼75 kDa) contain 9 and 3 TPRs, respectively [1, 2]. The OGT gene encodes the OGT protein.
Protein O-GlcNAc transferase (OGT) adds the GlcNAc moiety to cytoplasmic and nuclear proteins’ threonine and serine residues. Because it is involved in cell signalling, glucose homeostasis in the liver, and regulating the clock genes’ circadian oscillation, its absence is lethal in mice [3, 4]. Torres and Hart discovered it about 30 years ago [5], and it is linked to x-linked intellectual disability and insulin resistance in muscle and adipocyte cells when mutated [6, 7]. Its contribution to glucose metabolism via the Hexosamine Biosynthesis Pathway directly links it to diabetes mellitus [8, 9].
Diabetes mellitus (DM) is a metabolic disorder that comes in two forms: T1DM and T2DM. The defective secretion of insulin causes T1DM, while T2DM is caused by a defect in insulin action [10]. Diabetes is caused by a variety of factors, including but not limited to lifestyle, genetics, and diet. Diabetes is estimated to kill 6.7 million people worldwide in 2021, with 537 million adults living with the disease, a figure that is expected to rise to 783 million by 2045 [11].
Non-synonymous single nucleotide polymorphisms (nsSNPs) are protein amino acid substitutions [12]. As a result, this study aims to identify disease-causing and deleterious SNPs within the OGT gene and druggable targets to discover therapeutic drugs for diabetes mellitus via this gene. To obtain an unbiased outcome, it is sensible to evaluate the detrimental prediction of various sequence-and structure-based tools, many of which have different methodologies for variant classification. The likelihood of a SNP being harmful is high if it is projected to be so by the several different predictive tools that use different methodologies. However, the performance, precision, and accuracy of the in-silico biological and clinical predictions can be improved by combining different in-silico methods or tools.
Materials and methods
Data retrieval for single nucleotide polymorphisms
The OGT variants and SNPs were retrieved from the National Centre for Biotechnology Information’s (NCBI) dbSNPs server [14]. The SNPs were chosen based on their clinical significance, as reported by ClinVar [15].
Investigating the functional effects of coding nsSNPs
The deleterious potential of the OGT nsSNPs was assessed using four significant tools: Predictor of Human Deleterious Single Nucleotide Polymorphism (PhD-SNP) [12], SNPs&Go [16], PROVEAN v1.1 [17], and Polymorphism Phenotyping v2 (Polyphen) [18]. SNPs&GO is an algorithm that predicts deleterious nsSNPs based on protein functional annotation. PHD-SNP is an online tool for predicting point mutations in protein sequences and determining the impact of these mutations [19]. The program predicts how the single-point amino acid change will cause disease. PROVEAN predicts changes in a protein’s biological functions caused by single amino acid substitutions, and a score of less than − 2.5 is predicted to be harmful.
Analysis of protein stability of predicted OGT nsSNPs
The i-Stable 2.0 server, which includes tools such as iPTREE-STAB, I-Mutant 2.0, and MUpro, was used to predict the structure-function relationship of the SNPs [20]. The i-Mutant tool calculates the Gibbs free energy for the wild-type protein and subtracts it from the mutant form to estimate the free energy changes. The predicted values of all OGT mutant types may alter protein stability with associated free energy. Positive DDG values indicate that the mutated proteins are highly stable, whereas negative scores indicate less stable [21].
Analysis of the evolutionary conservation of amino acids
The Consurf program investigates the evolutionary conservation of OGT amino acids. It uses a Bayesian method to determine the conserved amino acids to identify the structural and functional residues in the conserved regions [22]. The prediction of the amino acids is into a variable (range between 1 and 4), intermediate (range between 5 and 6), and conserved (range between 7 and 9) based on their scores and colour indications [23].
Protein modelling and molecular docking
Using the protein sequence retrieved from the UniProt database, we used the ROBETTA homology modelling tool to predict the 3D structure of the OGT apo-protein [24]. The predicted structure was viewed using the Schrodinger Maestro v11.1 workspace and validated using the Verify-3D and ERRAT programs available in the SAVES server [25]. Schrodinger-Maestro v11.1’s Protein Preparation Wizard module was used to preprocess, optimise, and minimise the crystal structure of OGT. While keeping the pH at 7, structural water molecules were kept to ensure protein stability, while redundant water molecules were removed to facilitate protein-ligand binding. Hydrogens were also added to fill the gaps and mediate hydrogen bridges and electrostatic forces [26]. We used the SiteMap feature of the Schrodinger Maestro software to identify potential binding pockets on the OGT protein [27]. The generation of receptor grids was expedient to limit ligand docking to only the identified binding pockets [28]. The grid box had dimensions of x = -32.724, y = 51.454, and z = 83.332. The PubChem database was used to retrieve the 2D structure of OSMI-1, a small molecule inhibitor of OGT [29]. The OSMI-1 was prepared and converted to its 3D geometry prior to molecular docking using the LigPrep module of Maestro v.11.1 [30].
Results
nsSNPs obtained from the dbSNPs database
The discovery of disease-causing nsSNPs helps develop candidate drug therapy because they are biological markers involved in disease occurrence or progression [31, 32]. The NCBI server yielded 159 nsSNPs [33]. According to ClinVar, the retrieval favoured only SNPs with clinical significance [15].
Identification of damaging nsSNPs in OGT
We used four (4) tools to predict the potential deleteriousness of 25 nsSNPs, with at least three (3) of the four (4) tools predicting a negative effect (Table 1). PROVEAN predicted seven (7) nsSNPs to be harmful, and using the PolyPhen-2 tool, all seven (7) nsSNPs were probably harmful, with scores ranging from 0.932 to 1.000. SNPs&GO and PhD-SNP both predicted diseased SNPs. The total number of deleterious SNPs was reduced to 7 based on their detrimental effect across all four tools (Table 2).
Table 1. Damaging nsSNPs from OGTS/NrsIDAA Change/positionPROVEANPhD-SNPsSNPs&GOPolyPhen21rs766646613R627C-4.677 DeleteriousDisease RI-2Neutral0.999 probably damaging2rs131705060R117C-4.194 DeleteriousDisease RI-1Disease RI-10.932 probably damaging3rs204042438P879L-9.041 DeleteriousDisease RI-6Neutral0.942 probably damaging4rs204039392P685Q-7.872 DeleteriousNeutralDisease RI-40.994 probably damaging5rs204042400R867C-5.399 DeleteriousDisease RI- 1Neutral0.994 probably damaging6rs204034593A380V-3.790 DeleteriousNeutralDisease RI-40.938 probably damaging7rs766646613R627C-4.677 DeleteriousDisease RI-2Neutral0.999 probably damaging8rs2040347448M401T-4.727 DeleteriousDisease RI-2Neutral0.998 probably damaging9rs2040347668C417Y-8.596 DeleteriousNeutralDisease RI-10.989 probably damaging10rs2040350890D481G-4.340 DeleteriousDisease 1Disease RI-2benign11rs2040368778H611N-5.952 DeleteriousDisease 2Neutral0.55 probably damaging12rs2040387073P657L-9.335 DeleteriousNeutralDisease RI-60.924 probably damaging13rs2040191136Y112S-7.489 DeleteriousNeutralDisease RI-70.973 probably damaging14rs2040329106Y228H-2.680 DeleteriousDisease 5Disease RI-00.997 probably damaging15rs2040334939R250C-6.093 DeleteriousDisease 3Disease RI-31.000 probably damaging16rs2040341169G341V-7.294 DeleteriousDisease RI-3Disease RI-30.991 probably damaging17rs2040345810L367F-3.717 DeleteriousDisease 1Disease RI-10.999 probably damaging18rs2040190682R102G-4.116 DeleteriousNeutralDisease RI-10.930 PROBABLY DAMAGING19rs2040334968R250L-5.405 DeleteriousDisease 2Disease RI-01.000 probably damaging20rs772525369R899C-6.682 DeleteriousDisease RI-0Disease RI-21.000 probably damaging21rs1114167891R284P-4.060 DeleteriousDisease 6Disease RI-60.951 probably damaging22rs1556046834G103R-5.717 DeleteriousDisease 5Disease RI-60.993 probably damaging23rs1602152230N648Y-7.605 DeleteriousDisease 6Disease RI-00.998 probably damaging24rs2040405196C845S-7.654 DeleteriousDisease 3Disease RI-30.930 probably damaging25rs200109331N196K-4.599 DeleteriousDisease 5Disease RI-51.000 probably damaging
Table 2. Predicted deleterious nsSNPs across the four toolsS/Nrs IDAA Change/PositionPROVEANPhD-SNPsSNPs&GOPolyPhen21rs2040329106Y228H-2.680 DeleteriousDisease 5Disease RI-00.997 PROBABLY DAMAGING2rs2040334939R250C-6.093 DeleteriousDisease 3Disease RI-31.000 PROBABLY DAMAGING3rs2040341169G341V-7.294 DeleteriousDisease RI-3Disease RI-30.991 PROBABLY DAMAGING4rs2040345810L367F-3.717 DeleteriousDisease 1Disease RI-10.999 PROBABLY DAMAGING5rs2040405196C845S-7.654 DeleteriousDisease 3Disease RI-30.930 probably damaging6rs1556046834G103R-5.717 DeleteriousDisease 5Disease RI-60.993 PROBABLY DAMAGING7rs200109331N196K-4.599 DeleteriousDisease 5Disease RI-51.000 PROBABLY DAMAGING
Protein stability profile prediction for nsSNPs in OGT
The iStable 2.0 tool predicted protein stability [34]. All seven highly deleterious SNPs were also predicted to reduce OGT protein stability. The results of MUpro SVM, MUpro MM, I-Mutant 2.0, and iPTREE-STAB are shown in Table 3.
Table 3nsSNPs stability profilingS/NSNPsAA ChangeI-Mutant2.0 SEQMUpro_SVMMUpro_NNiPTREE-STAB1rs2040329106Y228HDecreaseDecreaseDecreaseDecrease2rs2040334939R250CDecreaseIncreaseIncreaseDecrease3rs2040341169G341VDecreaseDecreaseIncreaseDecrease4rs2040345810L367FDecreaseDecreaseDecreaseDecrease5rs2040405196C845SDecreaseDecreaseDecreaseDecrease6rs1556046834G103RDecreaseIncreaseIncreaseDecrease7rs200109331N196KDecreaseDecreaseDecreaseDecrease
Conservation prediction of damaging nsSNPs in OGT
Consurf predicted that Y228H, C845S, and L367F would be buried and conserved, whereas G103R, N196K, R250C, and G341V would be exposed and conserved (Table 4).
Table 4. ConSurf result outputS/NAmino acid changePosSeqScoreColourConfidence intervalConfidence interval coloursB/eFunctionMsa dataResidue variety1G103R103G-0.1486-0.417, 0.0417,5e122/150G, N, A, V2N196K196N-0.8059-0.861, -0.7779,9ef127/150N, Y, S3Y228H228Y-0.2186-0.417, -0.0807,5b126/150Y, L, H, F4R250C250R-0.1046-0.350, 0.0417,5e143/150K, R, E, S, T, N, H, Q, A5C845S845C-0.0779-0.272, 0.0419,9bS147/150Q, H, Y, T, E, S, K, R6G341V341G-0.7039-0.799, -0.6609,8ef147/150S, G, C, N7L367F367L-0.789-0.849, -0.7529,9bS146/150I, Y, L
OGT structural characterisation of wild and mutant types in comparison
ERRAT and Verify-3D were used to validate the protein structure (Fig. 1). According to the Verify-3D results, 94.39% of the residues have an average 3D-ID score of 0.2. (Fig. 2a). The Ramachandran plot, which is available in PROCHECK, was used to assess the quality of the 3D protein structure (Fig. 2b). According to the plot, 91.3%, 8.0%, 0.3%, and 0.3% of the residues are in the favoured, allowed, generously allowed, and disallowed regions, respectively (Fig. 2c). This confirms the protein structure’s high quality. ERRAT also demonstrated an overall quality factor of 98.7161 (Fig. 2d), implying that the results obtained from the tools, as mentioned earlier, indicated that our modelled protein is of high quality and can be used for further investigation.
Fig. 1. The Hexosamine Biosynthesis pathway promotes protein O-GlcNAcylation by supplying the O-GlcNAc moiety for addition and removal on nuclear and cytoplasmic proteins [13]
Fig. 2A Verify the 3D plot for the modelled protein, B Ramachandran plot showing the majority of the modelled protein’s residues in the favoured region, C The Ramachandran plot statistics provide values for the residues, D the ERRAT overall quality factor is 98.716
OGT Mutant type as a potential drug target
The Glide module of the Schrödinger Maestro Suite was used to investigate the protein-ligand binding affinity of OSMI-1 and the OGT protein. OSMI-1 interacted well with the active site residues of OGT, and the docking scores for each interaction are shown in Table 5. These predictions can be validated using additional downstream analysis.
Table 5. Molecular docking results of mutant type OGT against OSMI-1S/NAmino acid changeDocking scoresInteracting residues1G103R-4.646GLU649, LYS534, ARG338, and ASN6212N196K-5.183LYS644, GLY645, and ASN6483Y228H-5.069LYS534, ASN621, ALA646, and TYR6424R250C-4.775HIS508, LYS852, THR932, PHE878, LYS908, HIS568, and LYS6445C845S-4.571GLU6496G341V-5.145ASN567, SER594, and LYS6447L367F-5.563GLU649, ALA646, TYR642, and LYS534
Discussion
OGT gene has emerged as the candidate gene associated with diabetes mellitus [35]. However, the relationship is complex and requires consideration of various factors. Several important functional regulatory factors, including SNPs, may significantly impact disease metabolism. Utilising publicly available data, we discovered seven deleterious SNPs associated with the OGT gene. Additionally, we examined the functional consequences of these SNPs, conservation analysis, protein-protein interaction network studies, and protein stability. The OGT gene is crucial in diverse cellular processes, including metabolism, insulin signalling, and stress response. Due to their potential effects on protein structure and function and, eventually, cellular processes involved in glucose metabolism and insulin signalling, deleterious single nucleotide polymorphisms (SNPs) in the OGT gene may have a major impact on diabetes. Our study shows that only the mutation points in G103R, Y228H, R250C, C845S, G341V, N196K, and L367F were found to be harmful across all four tools used, out of the 25 deleterious nsSNPs identified.
Furthermore, we characterised the identified SNPs based on their stability. Protein stability is essential for maintaining these functions. Meanwhile, unstable proteins are more susceptible to degradation by cellular machinery, reducing OGT levels and activity. A protein’s function is determined by changes in its conformational structure, which is influenced by changes in protein stability [36]. Our study shows that the protein stability of the OGT gene is impacted by the identified nsSNPs, which may negatively impact the protein’s structure and function. Decreased protein stability can alter how proteins fold, leading to abnormal protein aggregation or increased degradation [37].
Based on similarity and homology data, Consurf calculates the evolutionary profile of proteins and the effects of amino acid substitutions [23]. The evolutionary profiling of the OGT SNPs predicted all seven to be located in the conserved region. Y228H, G103R, N196K, R250C, G341V, L367F, and C845S amino acids substitute for rs2040329106, rs1556046834, rs200109331, rs2040334939, rs2040341169, rs2040345810 and rs2040405196 (Table 4). SNPs in these areas can significantly alter protein structure and function, potentially leading to disease or altered phenotype [38]. It emphasises its potential significance for understanding disease mechanisms and developing novel therapeutic strategies. Conserved regions often encode crucial parts of proteins, like active sites or binding pockets. Because the nsSNPs were found in a conserved region, a change in the amino acid sequence in those regions will affect the structural and functional profile of the OGT protein.
Our molecular docking analysis indicated that all docking scores vary between the mutants, ranging from − 4.546 to -5.563, suggesting differential binding strengths. The higher the score, the stronger the predicted binding affinity (Table 5) [39]. Overall, our docking results provide valuable insight into the potential impact of OGT mutations on OSMI-1 binding. Further experimental validation and functional analysis are crucial for conclusively understanding their effects on OGT activity and biological significance.
The current study’s strength lies in using various algorithms to obtain precise prediction results for the identified nsSNPs. These could be used as druggable reference points to discover drugs to treat diabetes mellitus. There is a need to investigate more reliable in-vitro and in-vivo investigations to corroborate these results. A significant limitation of this work, like other in-silico studies, is that all of the processes employed to predict the impact of the SNPs are computer-based.
Conclusions
The OGT protein has been linked to the progression of diabetes mellitus because it catalyses the addition of the o-GlcNAc sugar moiety on nucleocytoplasmic proteins, a substrate of the hexosamine biosynthesis pathway, increasing the amount of intracellular glucose content. In this study, 159 OGT nsSNPs in coding regions were chosen, and structural analysis of the seven nsSNPs predicted a negative impact on protein function and stability. The findings indicated that nsSNPs could be used in drug development for diabetes mellitus.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Hanover JA Mitochondrial and nucleocytoplasmic isoforms of O-linked Glc N Ac transferase encoded by a single mammalian gene Arch Biochem Biophys 200340922879710.1016/S 0003-9861(02)00578-712504895 · doi ↗ · pubmed ↗
- 2Love DC Kochran J Cathey RL Shin S-H Hanover JA Mitochondrial and nucleocytoplasmic targeting of O-linked Glc N Ac transferase J Cell Sci 200311646475410.1242/jcs.0024612538765 · doi ↗ · pubmed ↗
- 3Essawy A, Jo S, Beetch M, Lockridge A, Gustafson E, Alejandro EU. O-linked N-acetylglucosamine transferase (OGT) regulates pancreatic α-cell function in mice. J Biol Chem. Jun. 2021;296:100297. 10.1016/j.jbc.2021.100297.10.1016/j.jbc.2021.100297 PMC 794909833460647 · doi ↗ · pubmed ↗
- 4Li M-DO-Glc N Ac signaling entrains the circadian clock by inhibiting BMAL 1/CLOCK ubiquitination Cell Metab 20131723031010.1016/j.cmet.2012.12.01523395176 PMC 3647362 · doi ↗ · pubmed ↗
- 5Torres C-R Hart GW Topography and polypeptide distribution of terminal N-acetylglucosamine residues on the surfaces of intact lymphocytes. Evidence for O-linked Glc N Ac J Biol Chem 1984259533081710.1016/S 0021-9258(17)43295-96421821 · doi ↗ · pubmed ↗
- 6Pravata VM et al. ‘Catalytic deficiency of O-Glc N Ac transferase leads to X-linked intellectual disability’, Proc. Natl. Acad. Sci, vol. 116, no. 30, pp. 14961–14970, 2019.10.1073/pnas.1900065116 PMC 666075031296563 · doi ↗ · pubmed ↗
- 7Yi W et al. Aug., ‘Phosphofructokinase 1 glycosylation regulates cell growth and metabolism’, Science, vol. 337, no. 6097, pp. 975–980, 2012, 10.1126/science.1222278.10.1126/science.1222278 PMC 353496222923583 · doi ↗ · pubmed ↗
- 8Runager K Bektas M Berkowitz P Rubenstein DS Targeting O-glycosyltransferase (OGT) to promote healing of diabetic skin wounds J Biol Chem 201428995462610.1074/jbc.M 113.51395224398691 PMC 3937622 · doi ↗ · pubmed ↗
