Identification of key candidates associated with chronic hepatitis E viral infection
Zoya Shafat, Anam Farooqui, Naaila Tamkeen, Nazim Khan, Asimul Islam, Shama Parveen

TL;DR
This study identifies key genes linked to chronic hepatitis E infection, offering potential targets for better treatment strategies.
Contribution
The study discovers eight up-regulated genes associated with persistent hepatitis E infection and highlights six as consistently involved across infection stages.
Findings
Eight genes are up-regulated in persistent hepatitis E viral infection.
Six of these genes are consistently present in protein-protein interaction networks across infection stages.
The findings suggest potential therapeutic targets for chronic hepatitis E.
Abstract
An in-depth understanding of chronic hepatitis E viral infection is of interest as the underlying molecular mechanisms remain unexplored. An analysis of the mRNA expression profile revealed a total of 69, 157 and 411 Differentially Expressed Genes (DEGs) for mild, moderate and severe hepatitis E viral infection, respectively. We found 8 up-regulated genes (BATF2, OASL, IFI44L, IFIT3, RSAD2, IFIT1, RASGRP3 and IFI27) associated with persistent hepatitis E viral infection. Of these genes, 6 (OASL, IFI27, IFIT1, IFIT3, RSAD2 and IFI44L) were present in the protein-protein interaction network at each stage of infection. Thus, these data provide insights into key genes and linked pathways which could be targeted to offer better interventions for chronic hepatitis E viral infection.
Taxonomy
Topics: Hepatitis Viruses Studies and Epidemiology · Hepatitis C virus research · Liver Diseases and Immunity
Background:
Hepatitis E virus (HEV) is a quasi-enveloped virus with a single-stranded, linear, positive-sense RNA genome [1]. HEV is an important cause of waterborne acute hepatitis in adults in developing countries [2, 3]. HEV was previously reported to cause only acute infections that resolve with clearance of the virus. However, Kamar et al. reported failure of viral clearance in immunosuppressed patients, i.e., HEV-infected solid-organ transplant (SOT) recipients, leading to chronic hepatitis [4]. Chronic HEV infection is considered a persistent viral infection because HEV RNA persists in the patient for more than six months. Chronic HEV infections were first reported in 2008 [5, 6] and have since been reported by numerous European teams [7, 8-9]. After kidney transplantation, the incidence of HEV infection in patients was estimated to be 2.7 cases/100 person-years [10]. Further, it was recognized that half of the kidney transplant recipients infected with HEV develop chronic HEV infection [11]. Moreover, no effective antiviral drug against HEV infection is yet available. Because chronic HEV infection has been well documented in SOT recipients, HEV-induced chronic hepatitis should be studied intensively. The evolution of HEV infection toward chronic infection seems to depend on the patient's immunological status.
In SOT recipients, the development of chronic HEV infection has been linked to the type or dose of immunosuppressive drugs received [12], and one-third of patients with chronic HEV infection clear the virus after the dose of immunosuppressive drugs is reduced [11]. The mechanisms linked with the development of chronic HEV infection are poorly understood. In organ transplant recipients, chronic HEV infection is associated with impaired HEV-specific T-cell responses [13]. An earlier microarray study also showed enhanced expression of interferon-stimulated genes (ISGs) in the livers of HEV-infected chimpanzees, suggesting that HEV activates the interferon response [14]. With the continuous development of bioinformatics and molecular biology, microarray technology has been widely used to explore the molecular mechanisms of various diseases [15, 16-17]. Since limited information is available about the prevalence and impact of HEV infection in kidney transplant patients, a more detailed study is needed. It is therefore of interest to provide insights into the influence of chronic HEV infection in kidney transplantation. Herein, the original human microarray dataset GSE36539 was accessed from the NCBI Gene Expression Omnibus database (NCBI-GEO) and analyzed following the bioinformatics approach suggested in previous reports [18, 19]. Accordingly, we report key candidates associated with chronic HEV infection using sequence, structure and function data.
Materials and Methods:
Methods flowchart:
The workflow of the integrative network-based method used for the present analysis is illustrated in (Figure 1 - see PDF).
Microarray data retrieval:
The gene expression dataset GSE36539, deposited by Moal et al., was retrieved from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) of NCBI [20]. The dataset was generated on the GPL6480 (Agilent-014850 Whole Human Genome Microarray 4x44K G4112F) platform. The experiment contained 8 control samples (kidney transplant recipients without HEV) and 8 infected samples (kidney transplant recipients with HEV) from whole blood. These samples were categorized into three stages of infection: mild (6 samples), moderate (6 samples) and severe (4 samples).
Data pre-processing:
After GSE36539 was downloaded, probe identification numbers were transformed into gene symbols. For multiple probes corresponding to one gene, the significant expression value was taken as the gene expression value.
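The probe-to-gene collapsing step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say how the "significant" value was chosen, so the sketch assumes the probe with the highest mean expression is kept. The function name `collapse_probes`, the probe IDs and the values are all hypothetical.

```python
from statistics import mean


def collapse_probes(probe_values, probe_to_gene):
    """Collapse probe-level expression to one entry per gene symbol.

    ASSUMPTION: when several probes map to one gene, keep the probe with
    the highest mean expression across samples. The source text does not
    specify the exact selection rule.
    """
    best = {}
    for probe_id, values in probe_values.items():
        gene = probe_to_gene.get(probe_id)
        if not gene:  # skip probes that have no gene symbol
            continue
        if gene not in best or mean(values) > mean(best[gene]):
            best[gene] = values
    return best


# Hypothetical probe IDs, gene names and values, for illustration only.
probe_values = {
    "P1": [1.0, 2.0, 3.0],
    "P2": [4.0, 5.0, 6.0],
    "P3": [0.5, 0.5, 0.5],
    "P4": [9.0, 9.0, 9.0],
}
probe_to_gene = {"P1": "GENE_A", "P2": "GENE_A", "P3": "GENE_B", "P4": ""}

collapsed = collapse_probes(probe_values, probe_to_gene)
```

Other rules (for example, keeping the probe with the smallest adjusted P-value) would only change the comparison inside the loop.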
Identification of differentially expressed genes (DEGs):
The GEO2R online tool [21] was used to examine the differential expression of genes. GEO2R is an interactive web tool that compares two groups of samples under the same experimental conditions. The expression profiles of control and infected patients were compared to identify the DEGs. An adjusted P-value < 0.05 and $|\log_2(\text{fold-change})| > 1$ were set as the inclusion criteria for the genes from each group. To obtain the list of overlapping DEGs, we used Venny 2.1.0, an online tool that calculates the intersection(s) of listed elements.
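The inclusion criteria and the overlap step can be expressed in a few lines. The sketch below assumes each GEO2R result row has been reduced to a `(gene, adjusted_p, log2_fc)` tuple, and it uses a plain set intersection in place of Venny. The gene labels `G1` to `G5` and all numbers are placeholders, not values from GSE36539.

```python
def select_degs(rows, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split rows into up- and down-regulated gene sets.

    A gene is kept when adjusted P < p_cutoff and |log2 FC| > lfc_cutoff.
    """
    up, down = set(), set()
    for gene, adj_p, log_fc in rows:
        if adj_p < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).add(gene)
    return up, down


# Placeholder rows: (gene, adjusted P-value, log2 fold-change).
mild = [("G1", 0.01, 1.5), ("G2", 0.20, 2.0), ("G3", 0.03, -1.2)]
moderate = [("G1", 0.001, 2.1), ("G4", 0.01, 1.4), ("G3", 0.04, -0.8)]
severe = [("G1", 0.0001, 3.0), ("G4", 0.002, 2.2), ("G5", 0.01, -2.5)]

stage_sets = [select_degs(rows) for rows in (mild, moderate, severe)]

# Genes shared by all three comparisons (the role Venny plays in the text).
common_up = set.intersection(*(up for up, _ in stage_sets))
common_down = set.intersection(*(down for _, down in stage_sets))
```

With these placeholder rows only `G1` passes both cutoffs as up-regulated in all three comparisons, and no down-regulated gene is shared.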
Gene transition among different stages:
To understand how gene expression is perturbed relative to the normal state, we identified the genes associated with the transition from one stage to the next. We compiled a list of upregulated and downregulated genes for each stage of infection (i.e., mild, moderate and severe). Studying the genes that are differentially expressed at each stage of infection gives insight into the ongoing biological mechanisms inside the cell. In this study, we compared the gene expression profiles among the healthy, mild, moderate and severe stages of infection. These transitions are briefly described as follows.
[1] Prodromal phase (mild infection): This comparison covers the DEGs between the mildly infected patient group and healthy controls. The prodromal phase includes early disease symptoms such as fever, joint pain or arthritis, rash and edema.
[2] Preicteric phase (moderate infection): This comparison covers the DEGs between the moderately infected patient group and healthy controls. The preicteric phase includes symptoms such as myalgia, anorexia, fever, nausea, dark urine and diarrhea.
[3] Icteric phase (severe infection): This comparison covers the DEGs between the severely infected patient group and healthy controls. The icteric phase includes symptoms such as jaundice, anorexia and skin lesions, while other symptoms may subside.
Protein-Protein Interaction (PPI) network analysis:
To further evaluate the functional interactions among the DEGs, a protein-protein interaction (PPI) network was constructed from them using STRING (Search Tool for the Retrieval of Interacting Genes) [22], an online database of known and predicted protein-protein interactions. These interactions include physical and functional associations, and the data are mainly derived from computational predictions, high-throughput experiments, automated text mining and co-expression networks. The DEGs were mapped onto the PPI network, and only interaction pairs with a combined score of >0.7 were considered significant. Subsequently, Cytoscape 3.4 [23] was used to construct and visualize the PPI network. Nodes with the greatest numbers of interactions with neighboring nodes were considered hub nodes (high-degree nodes).
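The thresholding and hub-ranking logic described above can be sketched without any network library. This is an illustration under stated assumptions: edges are given as `(protein_a, protein_b, combined_score)` tuples with scores on a 0 to 1 scale, and the node labels and scores are placeholders rather than STRING output.

```python
def build_network(edges, min_score=0.7):
    """Return an adjacency dict keeping only edges with score > min_score."""
    adjacency = {}
    for a, b, score in edges:
        if score > min_score and a != b:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def rank_by_degree(adjacency):
    """List (node, degree) pairs, highest degree first, ties by name."""
    pairs = [(node, len(neigh)) for node, neigh in adjacency.items()]
    return sorted(pairs, key=lambda item: (-item[1], item[0]))


# Placeholder interactions: (protein A, protein B, combined score).
edges = [
    ("A", "B", 0.90),
    ("A", "C", 0.80),
    ("A", "D", 0.75),
    ("B", "C", 0.95),
    ("C", "D", 0.40),  # below the threshold, dropped
    ("D", "E", 0.70),  # not strictly above 0.7, dropped
]

network = build_network(edges)
ranked = rank_by_degree(network)
hubs = [node for node, degree in ranked if degree >= 3]
```

The hub cutoff of 3 is arbitrary for this toy network; in practice the cutoff would be chosen from the degree distribution of the real network.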
Functional and pathway enrichment analysis of DEGs:
We used the DAVID (Database for Annotation, Visualization and Integrated Discovery) online server [24] to perform the functional enrichment analysis of the DEGs; this analysis included the functional categories, Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [25, 26]. The GO analysis included 3 categories, namely, biological process (BP), cellular component (CC) and molecular function (MF), which were used to predict protein functions (15). KEGG pathway analysis was used to assign sets of DEGs to specific pathways to enable the construction of the molecular interaction, reaction and relationship networks. A Benjamini-adjusted P < 0.05 and an enriched gene count >5 were chosen as the criteria for significance.
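The significance filter can be sketched as a Benjamini-Hochberg adjustment followed by the two cutoffs. This is a generic illustration: DAVID computes its own enrichment P-values (EASE scores) before adjustment, which is not reproduced here, and the term labels `T1` to `T4`, P-values and gene counts are placeholders.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Placeholder enrichment results: (term, raw P-value, gene count).
terms = [("T1", 0.001, 12), ("T2", 0.01, 4), ("T3", 0.03, 8), ("T4", 0.5, 20)]

adj = benjamini_hochberg([p for _, p, _ in terms])

# Keep terms with adjusted P < 0.05 and more than 5 enriched genes.
significant = [
    name
    for (name, _, count), q in zip(terms, adj)
    if q < 0.05 and count > 5
]
```

Here `T2` is dropped by the gene-count cutoff even though its adjusted P-value passes, which shows that the two criteria act independently.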
Results:
Identification of DEGs:
The microarray expression dataset GSE36539 was downloaded from the GEO database. The DEGs between controls and the disease samples were analyzed using the GEO2R tool. Upregulated and downregulated genes between HEV-infected patients and control individuals were identified using an adjusted P-value < 0.05 and $|\log_2 \text{FC}| > 1$. The HEV patients were distributed into three separate datasets based on the severity of infection: mild, moderate and severe. For each of the datasets, the DEGs, including the upregulated and downregulated genes, were identified (Table 1). Comparison of the mild group with the control samples identified 34 upregulated and 35 downregulated genes. Similarly, a comparison of the moderate group with the control samples identified 138 upregulated and 19 downregulated genes. A comparison of the severe group with the controls identified 326 upregulated and 85 downregulated genes. Thus, a total of 69, 157 and 411 specific DEGs were identified for mild, moderate and severe HEV infections, respectively.
The interrelationship between stages of infection:
The chronic HEV infection patients were sorted into 3 groups: the first category involved patients who experienced early HEV clearance (mild infection), i.e., within 6 months after inclusion (median time, 4 months; range, 1.3-4.6 months); the second category involved patients who experienced delayed HEV clearance (moderate infection), i.e., >6 months after inclusion (median time, 11.5 months; range, 8.9-17.4 months); and the final category involved patients who did not experience HEV clearance (severe infection) during the data analysis time (>17.4 months after inclusion). It is noteworthy in Table 1 that the number of differentially expressed genes increased with the severity of infection. From Table 1, we looked for the genes common to these transitional stages. Nine upregulated genes were differentially expressed in all stages of chronic HEV infection (mild vs. control, moderate vs. control and severe vs. control) (Table 2). However, no downregulated gene was common to all stages (Figure 2 - see PDF). The fold change of these genes increases with the severity of infection. This suggests that these 9 genes are key genes that act from the onset of infection and that their increasing upregulation accompanies the worsening of infection.
The expression of the genes BATF2, OASL, IFI44L, IFIT3, RSAD2, IFIT1, RASGRP3 and IFI27 was significantly higher in patients who did not clear the HEV infection than in patients who cleared it in the early or later phases. However, the gene OCLN followed a different pattern, with the highest expression levels in patients who cleared HEV infection at a later stage. Its expression level in patients with no HEV clearance was only slightly higher than in patients who cleared the HEV infection in the early phase. The upper panels of Figure 3 (see PDF) show the volcano plot of the upregulated and downregulated chronic HEV infection genes (Figure 3A - see PDF) and the heat map of the expression pattern of the 8 key genes (BATF2, OASL, IFI44L, IFIT3, RSAD2, IFIT1, RASGRP3 and IFI27) associated with chronic HEV infection in the different stages (mild, moderate and severe) (Figure 3B - see PDF). The lower panels show the chord plot (Figure 3C - see PDF) depicting the relation between the key genes and the stages of HEV infection, and the correlation heatmap (Figure 3D - see PDF) showing the pairwise correlation values of the 8 key genes.
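The statement that fold change rises with severity amounts to a simple monotonicity check per gene. The sketch below assumes a mapping from each gene to its log2 fold-change in the three comparisons; `GENE_X`, `GENE_Y` and their values are placeholders, not values from Table 2.

```python
STAGES = ("mild", "moderate", "severe")


def increases_with_severity(log_fc_by_stage, stages=STAGES):
    """True when the fold-change strictly increases from stage to stage."""
    values = [log_fc_by_stage[stage] for stage in stages]
    return all(a < b for a, b in zip(values, values[1:]))


# Placeholder log2 fold-changes versus control, for illustration only.
fold_changes = {
    "GENE_X": {"mild": 1.2, "moderate": 1.9, "severe": 2.7},
    "GENE_Y": {"mild": 1.1, "moderate": 2.4, "severe": 1.3},
}

trend = {gene: increases_with_severity(fc) for gene, fc in fold_changes.items()}
```

A gene whose expression peaks in the intermediate group, like the placeholder `GENE_Y`, fails the check.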
Gene term enrichment analysis of DEGs:
Further, the identified DEGs were systematically characterized to explore their functions and pathways. The DEGs were classified into 3 functional categories: biological process (BP), cellular component (CC) and molecular function (MF). The DEGs that showed significant enrichment are listed in Table 3, Table 4 and Table 5. In the molecular function group, the genes were mainly linked to binding and catalytic activities. The functions of the moderate and severe genes included RNA binding, oligoadenylate synthetase activity, helicase activity, transferase activity, receptor activity, GTP binding and nucleotidyltransferase activity. The mild genes were involved in antigen binding, immunoglobulin receptor binding and RNA polymerase II activating transcription factor binding. In the biological process group, the top GO terms enriched among the moderate and severe genes included defense response to viruses, negative regulation of viral genome replication, interferon signaling pathways and immune response. The mild genes were associated with complement activation (classical pathway), receptor-mediated endocytosis, response to estradiol and response to lithium ion. In the cellular component group, the mild genes were related to the blood microparticle and the mitochondrial outer membrane. The moderate and severe genes were associated with the clathrin-coated endocytic vesicle membrane, cytosol, cytoplasm, MHC class II protein complex and the integral component of the lumenal side of the endoplasmic reticulum membrane.
Kyoto Encyclopedia of Genes and Genomes pathway analysis:
No significantly enriched pathways were found for the mild genes group. The moderate and severe genes were mainly enriched in influenza A, Herpes simplex infection, measles, Hepatitis C, RIG-I-like receptor signaling pathway, cytosolic DNA-sensing pathway and Staphylococcus aureus infection (Table 6). Additionally, the moderate genes were enriched in asthma, rheumatoid arthritis and cytokine-cytokine receptor interaction, and the severe genes in mineral absorption, antigen processing and presentation and graft-versus-host disease.
PPI network for transitional stages of infection:
The PPI network with a 0.7 confidence level and no interactors in the 1st and 2nd shells was constructed for the different stages of infection: (A) mild infection (Figure 4A - see PDF), (B) moderate infection (Figure 4B - see PDF) and (C) severe infection (Figure 4C - see PDF). We identified 6 genes common to all three levels of infection whose expression level increases with the level of infection (OASL, IFI27, IFIT1, IFIT3, RSAD2, IFI44L). Although a total of 9 genes were identified above as common to all three stages, only 6 of them appeared in the PPI network. It is noteworthy that these six genes form a motif with each other, which suggests that they work together in combination. Surprisingly, these genes have a low degree, yet they form a motif together. Furthermore, the top nodes were extracted using Origin 8.0. On the basis of degree centrality, we extracted the top nodes with their degree score (a degree of at least 40) from the main network (Figure 5 - see PDF). Figure 5 (see PDF) shows the seed genes in decreasing order of degree, with UBA52 having the highest degree and GBP1 the lowest.
Discussion:
Kidney transplant patients have increasingly been reported with chronic HEV infection since 2008 [5, 6], but the underlying molecular mechanisms leading to the development of this disease remain unexplored. In this study, transcriptional profiles of kidney transplant recipients with chronic HEV infection were compared with those of matched kidney transplant recipients without HEV infection for the selection of DEGs. The patients with chronic HEV infection were separated into 3 groups according to the time of HEV clearance (early, late, or no HEV clearance at the time of the analysis). Our analysis revealed a total of 69, 157 and 411 specific DEGs, which included 34 upregulated and 35 downregulated genes, 138 upregulated and 19 downregulated genes and 326 upregulated and 85 downregulated genes for mild, moderate and severe chronic HEV infection, respectively. The gene term enrichment analysis showed that the DEGs of the mild, moderate and severe stages were mostly associated with signaling processes. More specifically, the top ten DEGs were mainly involved in interferon and signaling pathways, which is consistent with previous observations made in chronic HEV infection patients [27]. The HEV genome is organized into three open-reading frames (ORFs), i.e., ORF1, ORF2 and ORF3. The ORF1 polyprotein is further subdivided into multiple domains and is mainly responsible for viral replication. The domains include the methyltransferase domain (Met), Y-domain (Y), papain-like cysteine protease domain (PCP), hypervariable region domain (HVR), X-domain (X), helicase domain (Hel) and RNA-dependent RNA polymerase domain (RdRp) [28].
Previous investigations of some of these domains have suggested functional implications such as transferase activity, RNA binding, helicase activity, zinc-ion binding and protein binding as major molecular functions. The Met domain has been suggested to be a putative methyltransferase [29]. A highly conserved α-helix counterpart 'LYSWLFE' (aa 410-416) has been predicted in the Y-domain and is required for its cytoplasmic membrane binding and efficient replication [30]. The presence of a potential Appr-1"-pase active site (Asn806, Asn809, His812, Gly815, Gly816 and Gly817) in the X-domain points to a significant catalytic/regulatory function in HEV replication [31, 32]. The Pro domain has been shown to be involved in the disorder-to-order state transition required for binding to various substrates. It has also been suggested that the Pro domain is not required for the replication of the virus and its infectivity, but plays a role in replication efficiency [33, 34]. The Hel domain consists of two conserved motifs, Walker A (GVPGSGKS; aa 975-982) and Walker B (DEAP; aa 1029-1032), which have been demonstrated to participate in purine nucleoside triphosphate (NTP)-binding activity [35, 36]. The RdRp protein found in ORF1 of HEV (positive sense) is necessary for its genome replication [37, 38]. ORF2 encodes the capsid protein, which forms the major structural component of the HEV virion [39]. The N-terminal 110 amino acids of ORF2 interact with the 5' region (encapsidation signal) of HEV genomic RNA [40].
Additionally, the ORF2 N-terminus contains a signal peptide followed by an arginine-rich domain, which has been shown to be involved in viral RNA encapsidation during the assembly process [41]. ORF3 is a phosphoprotein of about 113 to 114 amino acid residues that performs a crucial function in viral egress or release from infected cells [42, 43]. The ORF3 protein has been demonstrated to interact with several host proteins in addition to the ORF2 protein [44]. Additionally, the proline-rich motif (PSAP) in the C-terminal region has been reported to play a role in the activity of the ESCRT machinery by interacting with Tsg101 [45]. Thus, these predicted functions further substantiate our findings. Interestingly, in our study, we found upregulated expression of 8 genes: BATF2, OASL, IFI44L, IFIT3, RSAD2, IFIT1, RASGRP3 and IFI27. The expression of these genes was significantly higher in chronic HEV infection patients who did not clear the infection than in those with early HEV clearance. The gene expression level in patients with late HEV clearance was intermediate between that of patients with early clearance and that of patients with no clearance. BATF2 is a transcription factor involved in the T cell receptor signaling pathway [46, 47]. BATF2 regulates the expression of target genes by binding as a heterodimer to the target DNA sequence. BATF2 (SARI) was found to be upregulated in chronic HEV infection patients in an earlier study [27]. OASL encodes proteins known to be antiviral effectors and is mainly linked with signaling pathways [48]. Upregulated expression of OASL has previously been reported in chronic HEV infection patients [27].
IFI44L, a paralog of IFI44, has been reported to encode a protein that induces interferons. IFI44L is a 44-kD protein originally identified as the HCV-associated microtubular aggregate in the hepatocytes of chimpanzees infected with HCV [49]. IFI44L has also been shown to have high antiviral activity against HCV [50]. RSAD2 (viperin) has been reported as one of the most highly induced interferon effector proteins [51]. A previous investigation showed upregulated expression of RSAD2 in chronic HEV infection patients [27]. RASGRP3 is known to encode a guanine nucleotide exchange factor protein. The IFIT1 and IFIT3 genes, which belong to the IFIT complex, are known to antagonize viruses by sequestering their nucleic acids. Upregulated expression of the IFIT complex has previously been documented in chronic HEV infection patients [27]. IFI27 codes for an interferon-inducing protein. It functions in TNFSF10-induced apoptosis and is also involved in the interferon-induced negative regulation of transcriptional activity. However, the OCLN gene had an expression pattern distinct from that of the other upregulated genes. OCLN is an essential HCV cell entry factor; it encodes a membrane protein and is involved in cytokine-induced regulation [52]. Taken together, the findings of the present study suggest that upregulation of the genes BATF2, OASL, IFI44L, IFIT3, RSAD2, IFIT1, RASGRP3 and IFI27 is associated with persistent HEV infection. It is also important to note that although our study identified a total of 9 genes common to the three stages, only 6 of them (OASL, IFI27, IFIT1, IFIT3, RSAD2 and IFI44L) appeared in the PPI network, and their expression level increases with the level of infection. Thus, these 6 genes were found to be common at each stage of infection.
It is noteworthy that these six genes formed a motif with each other and thus work together in combination. To sum up our results, we can conclude that patients with no HEV clearance had significantly higher HEV loads than patients who cleared the HEV infection, irrespective of early or late stages. This indicates that HEV loads are associated with the timing of HEV clearance and is in agreement with the previous investigation carried out in chronic HEV infection patients [27]. The previous microarray-based report offered significant insight into the pathogenesis of persistent HEV infection. It is worth mentioning that some of the upregulated genes discovered in our analysis were also found in the previous report, which demonstrated the upregulation of EPSTI1, ISG15, IFIT1, IFI44L and RSAD2 [27]. However, our study captured the additional genes BATF2, OASL, IFIT3, RASGRP3 and IFI27 and explored another important aspect, namely the genes specific to the stage transitions of HEV infection. Further studies are envisaged to explicate the functional role of these upregulated genes in chronic HEV infection. These findings further support the hypothesis proposed by the current study.
Conclusions:
Our understanding of chronic hepatitis E viral infection pathophysiology has improved through this analysis. However, exploration of possible mechanisms for chronic hepatitis E viral infection is warranted.
Declarations:
Availability of data and material:
Not applicable
Funding:
Not applicable
References:
- 1. Reyes G.R. Science. 1990;247:1335. PMID: 2107574. doi: 10.1126/science.2107574
- 2. Dalton H.R. Lancet Infect Dis. 2008;8:698. PMID: 18992406. doi: 10.1016/S1473-3099(08)70255-X
- 3. Purcell R.H, Emerson S.U. J Hepatol. 2008;48:494. PMID: 18192058. doi: 10.1016/j.jhep.2007.12.008
- 4. Kamar N. J Clin Exp Hepatol. 2013;3:134. PMID: 25755487. PMCID: PMC3940092. doi: 10.1016/j.jceh.2013.05.003
- 5. Kamar N. N Engl J Med. 2008;358:811. PMID: 18287603. doi: 10.1056/NEJMoa0706992
- 6. Gerolami R. N Engl J Med. 2008;358:859. PMID: 18287615. doi: 10.1056/NEJMc0708687
- 7. Haagsma E.B. Liver Transpl. 2008;14:547. PMID: 18383084. doi: 10.1002/lt.21480
- 8. Haagsma E.B. Liver Transpl. 2009;15:1225. PMID: 19790147. doi: 10.1002/lt.21819
