Genomic Insights into Bacterioruberin and Halorhodopsin Biosynthetic Genes in 12 Halophilic Archaea Isolated from Korean Solar Salterns
Dongseok Lee, Chi Young Hwang, Eui-Sang Cho, Young-Hyun You, Myung-Ji Seo

TL;DR
This study analyzes the genomes of 12 halophilic archaea from Korean solar salterns to understand their adaptation to high-salt environments through bacterioruberin and halorhodopsin biosynthesis.
Contribution
The study reveals the genetic diversity and conservation of bacterioruberin and halorhodopsin biosynthetic genes in halophilic archaea.
Findings
Core bacterioruberin biosynthesis genes are conserved across all strains, but crtI is missing in some.
Halorhodopsin gene (hop) is present in only 7 of the 12 strains.
crtD may compensate for the absence of crtI in certain strains.
Abstract
Halophilic archaea are extremophilic microorganisms uniquely adapted to thrive in hypersaline environments such as solar salterns, saline lakes, and brines. Their ability to survive under high-salt conditions is closely associated with the production of unique compounds specifically synthesized by haloarchaea, including bacterioruberin and halorhodopsin. Bacterioruberin is a carotenoid pigment that protects cells from oxidative stress and contributes to osmotic stress resistance. Halorhodopsin is a light-driven Cl− pump that helps maintain ionic homeostasis. These functional molecules play crucial roles in osmotic stress resistance and energy conversion under extreme conditions. Therefore, understanding their genomic information is essential to uncover the molecular mechanisms underlying their remarkable adaptation and survival in high-salt conditions. In this study, we performed…
Funding
- National Research Foundation of Korea (http://dx.doi.org/10.13039/501100003725)
- Ministry of Science and ICT, South Korea (http://dx.doi.org/10.13039/501100014188)
- National Institute of Biological Resources (http://dx.doi.org/10.13039/501100005880)
- Ministry of Environment (http://dx.doi.org/10.13039/501100003562)
Taxonomy
Topics: Microbial Community Ecology and Physiology · Photosynthetic Processes and Mechanisms · Origins and Evolution of Life
Introduction
Halophilic archaea are microorganisms uniquely adapted to thrive in hypersaline environments, including saline lakes, solar salterns, seawater, and salted fermented foods. According to the current taxonomic classification within the domain Archaea, these organisms belong to the class Halobacteria within the phylum Methanobacteriota [1]. They were first identified over a century ago in diverse contexts, including cured fish, animal products, and aquatic environments, where their colonies appeared red, pink, or purple [2, 3]. Biomolecules produced by haloarchaea often exhibit unusual properties that reflect their adaptation to high-salt environments [4]. For example, haloarchaeal enzymes such as proteases, amylases, and lipases demonstrate remarkable stability and activity in high-salt, aqueous, and non-aqueous conditions [5, 6]. Haloarchaea produce polyhydroxyalkanoates as intracellular carbon and energy stores under harsh, high-salinity conditions [7, 8]. In addition, they adapt to high-salinity environments with reduced dissolved oxygen by producing gas vesicles, which allow them to float on brine surfaces and access atmospheric oxygen for respiration [4].
Among these adaptation strategies, bacterioruberin (BR) and halorhodopsin (HR) are compounds characteristic of haloarchaea that play crucial roles in their adaptation and survival in hypersaline environments. In particular, BR is a C_50_ carotenoid pigment predominantly found in haloarchaea and characterized by 13 conjugated double bonds, which confer superior antioxidant and radical scavenging properties [9-11]. This carotenoid stabilizes the cellular membrane under hypersaline conditions, acts as a water barrier, allows selective ion and oxygen transport, and provides protection against UV radiation, oxidative stress, and DNA damage, thereby enabling survival in extreme environments [12]. Research on haloarchaeal carotenoids began in the 1960s [13], with major discoveries such as bacterioruberin and its derivatives reported in the 1970s [14], but detailed studies of their biosynthetic pathways only emerged in 2015 [15]. Genomic and pathway-related studies of C_50_ carotenoid biosynthesis have been conducted in only a few genera, such as Haloferax [16], Halorubrum [17], Haloarcula [15, 16], and Halobellus [18]. Comparing the genes associated with BR biosynthesis across a wider range of haloarchaeal genera is therefore needed to better understand their physiological traits.
Rhodopsins are seven-transmembrane proteins consisting of a protein component (opsin) and a light-absorbing chromophore (retinal) [19]. In 1980, microbial type-1 rhodopsins were first identified in extremely halophilic archaea, including HR, an inward Cl^-^-pumping rhodopsin [20]. This Cl^-^ pumping contributes to increasing the electrochemical potential of the proton gradient, which is essential for energy production [21]. In addition, Cl^-^ transport in a high-salt environment directly supports the cell's ability to manage ionic and osmotic stress as it adapts to hypersaline conditions [22]. As a light-driven pump, HR enables haloarchaea to maintain osmotic balance without relying on ATP, conserving metabolic energy [22, 23]. This is a critical adaptation strategy for haloarchaea to survive in energy-limited hypersaline environments. HR genes also exhibit distinct evolutionary patterns that reflect their functional differences and critical role in environmental adaptation, particularly through mechanisms such as lateral gene transfer and gene loss observed across diverse haloarchaeal lineages [21]. Therefore, discovery and comparative analysis of HR gene sequences across diverse haloarchaeal species are essential for understanding their evolutionary adaptations.
In this study, whole-genome sequencing and comparative genomic analyses were performed on 12 haloarchaea isolated from Korean solar salterns to investigate the biosynthetic genes of bacterioruberin and halorhodopsin. We focused on the diversity and evolutionary patterns of these genes across haloarchaeal lineages to better understand their adaptation to hypersaline environments.
Materials and Methods
Haloarchaeal Strain Cultivation and Genomic DNA Extraction
Twelve haloarchaeal strains were isolated from solar salterns in Sorae (37°24'08.6"N 126°45'22.7"E), Republic of Korea. The DB Characterization medium No. 2 used for strain cultivation was prepared according to a previously described composition (Table S1) [24].
Genomic DNA extraction from archaeal samples was performed using a nitrogen-powered homogenization method that effectively processes tissue samples while preserving DNA integrity. For each ground sample, a lysis buffer was added, consisting of 50 mM Tris-HCl, 20 mM EDTA, 1.4 M NaCl, 2% CTAB, and 1% PVP. The homogenates were incubated for 30 min at 65°C. Following lysis, the samples were extracted with an equal volume of phenol:chloroform:isoamyl alcohol (25:24:1) and mixed thoroughly to form a complete emulsion. The mixture was centrifuged at 14,000 ×g in a microfuge for 5 min to separate the phases, and the upper phase containing the DNA was carefully transferred to a new tube. The samples were then re-extracted with an equal volume of chloroform:isoamyl alcohol (24:1), centrifuged at 14,000 ×g for 5 min, and the upper phase was transferred to a new tube. The aqueous phase was transferred to new tubes containing 2/3 of the volume of isopropanol, mixed by inversion, and allowed to precipitate at room temperature for 30 min before centrifugation at 12,000 ×g for 15 min at 4°C. The resulting DNA pellet was washed twice in ice-cold 70% (v/v) ethanol and resuspended in Tris-HCl buffer (pH 8.0).
Whole Genome Sequencing and De Novo Assembly
DNA libraries were prepared using the TruSeq Nano DNA Library Prep Kit (Illumina, USA) following the manufacturer's instructions. For library preparation with a 550 bp insert size, high-molecular-weight genomic DNA was randomly sheared into fragments using a Covaris S2 ultrasonicator (Covaris Inc., USA). The fragments were blunt ended and phosphorylated, and a single A nucleotide was added to the 3' ends of the fragments in preparation for ligation to an adapter with a single-base T overhang. Adapter ligation at both ends of the genomic DNA fragment conferred different sequences at the 5' and 3' ends. Ligated DNA was PCR amplified, and fragment size distribution was assessed using a TapeStation electrophoresis system (Agilent Technologies, USA). Library preparation and sequencing were performed by DNA Link (Republic of Korea) and Phyzen (Republic of Korea). After quantitative PCR (qPCR) using KAPA SYBR FAST qPCR Master Mix (Kapa Biosystems, USA), libraries with index tags were combined in equimolar amounts into a pool. Sequencing was performed using an Illumina NovaSeq 6000 system following the provided protocols for 2 × 150 bp sequencing.
For PacBio sequencing, genomic DNA was mechanically sheared into 10 kb fragments using the Megaruptor v3 system (Diagenode, USA), and small fragments were removed using the AMPure XP bead purification system. A total of 500 ng of each sample was used as input for library preparation. The SMRTbell library was constructed using the SMRTbell Express Template Preparation Kit v2.0 (101-685-400). Samples were pooled according to the volumes provided by the Microbial Multiplexing Calculator, and fragments smaller than 3 kb were removed with the AMPure XP bead purification system to obtain a large-insert library. After a sequencing primer was annealed to the SMRTbell template, DNA polymerase was bound to the complex (Sequel II Binding Kit 3.2). The complex was purified using AMPure purification to remove excess primer and polymerase prior to sequencing. The SMRTbell library was sequenced on SMRT cells (Pacific Biosciences, USA) using the Sequel II Sequencing Kit v2.0, and 1 × 15 h movies were captured for each SMRT Cell 8M on the Sequel II sequencing platform (Pacific Biosciences) [25]. For de novo assembly, the Hifiasm assembler (v0.16.1-r375) was used with default parameters [26]. HiFi data quality was assessed based on average quality scores ranging from Q31 to Q21 and average pass counts ranging from 15 to 17 for each sample.
Phylogeny Analysis and Genome Comparison
Full-length 16S rRNA gene sequences of the 12 strains were obtained from the whole-genome sequences described above and compared with those of related species using EzBioCloud (https://www.ezbiocloud.net/). Multiple sequence alignments (MSA) were performed using the ClustalW tool in BioEdit version 7.2.5 to evaluate sequence similarity between each strain and its closely related species [27]. Phylogenetic trees were constructed with three different algorithms, Maximum Likelihood (ML), Neighbor-Joining (NJ), and Maximum Parsimony (MP), using Molecular Evolutionary Genetics Analysis (MEGA) version 12, with 1,000 bootstrap replicates applied to assess the robustness of each tree [28-31]. The Kimura two-parameter substitution model was employed to calculate evolutionary distances between species [32].
To conduct genome-based analyses, genome sequences of 33 closely related species were obtained from the NCBI database (https://www.ncbi.nlm.nih.gov/) [33]. Approximately five closely related reference species were selected for each of the 12 isolates based on 16S rRNA gene similarity; because the selected taxa overlapped, a total of 33 reference species were included. The whole-genome-based phylogenomic tree was constructed using the bacterial genome tool provided by BV-BRC version 3.50.5 (https://www.bv-brc.org/), with all parameters set to the platform's default values [34]. To strengthen the taxonomic resolution, a phylogenomic tree was reconstructed using the Up-to-date Bacterial Core Gene (UBCG) pipeline [35], based on a concatenated alignment of conserved core genes from the genomes. The final nucleotide alignment was used to infer the phylogenetic tree under the GTR model with discrete Gamma, and bootstrap analysis was carried out with 1,000 replicates. The core gene-based phylogenomic tree was constructed using RAxML and visualized with iTOL [36, 37].
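As an illustration of the distance measure named above, the Kimura two-parameter distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below is a minimal stand-alone implementation, not the MEGA code path; the function name and the choice to skip gapped or ambiguous sites are assumptions made for the example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption for this sketch: sites with a gap or a non-ACGT symbol in
    either sequence are skipped (pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    transitions = transversions = compared = 0

    for a, b in zip(s1, s2):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For two identical sequences the distance is zero; a single transition in ten sites gives roughly 0.112, slightly more than the raw difference of 0.1 because the model corrects for multiple substitutions.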
OrthoANI values were computed using OAT software (v0.93.1) to assess genomic similarity [38]. The analysis was performed with bidirectional ANI calculation enabled, and the average value of the reciprocal comparisons was used. Genome-to-genome distances were estimated using the GGDC Form 2 option [24]. AAI values were determined using EzAAI v1.2 [39], while isDDH values were calculated using the Genome-to-Genome Distance Calculator (GGDC 2.1) with the BLAST+ alignment and formula 2 (identities/HSPlength) [40]. Additionally, intergenomic distances were used to construct a balanced minimum evolution tree with branch support via FASTME (v2.1.4) using the Type Strain Genome Server (TYGS).
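The two calculations described above can be sketched in a few lines. This is a simplified illustration, not the OAT or GGDC implementation: it assumes the high-scoring segment pairs (HSPs) are already available as (identities, alignment length) tuples, treats formula 2 as one minus the ratio of summed identities to summed HSP length, and omits the model that GGDC uses to convert a distance into an isDDH estimate. Function names are hypothetical.

```python
from typing import Iterable, Tuple


def formula2_distance(hsps: Iterable[Tuple[int, int]]) -> float:
    """Distance = 1 - (sum of identities / sum of HSP lengths).

    Each HSP is given as (identical_positions, alignment_length).
    """
    total_identities = 0
    total_length = 0
    for identities, length in hsps:
        if length <= 0 or identities < 0 or identities > length:
            raise ValueError(f"invalid HSP: ({identities}, {length})")
        total_identities += identities
        total_length += length
    if total_length == 0:
        raise ValueError("no HSPs supplied")
    return 1.0 - total_identities / total_length


def reciprocal_mean(forward: float, reverse: float) -> float:
    """Average of the two directional values (A vs B and B vs A)."""
    return (forward + reverse) / 2.0
```

Because the identities are normalized by HSP length rather than genome length, this kind of distance is less sensitive to incomplete draft assemblies, which is one reason it is commonly preferred for draft genomes.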
Genome Analysis and Annotation
To facilitate gene analysis, gene prediction and translation were performed across the genomes of the 12 analyzed haloarchaea using Prodigal v2.6.3 [41]. MSA was conducted using the ClustalW tool in BioEdit version 7.2.5 without specifying a particular reference sequence [27]. Residues conserved in 70% or more of the aligned sequences were shaded to highlight highly conserved regions, following the threshold commonly applied in previous MSA studies [42]. This step served as a foundation for identifying conserved regions and assessing sequence similarities across the genomes, facilitating subsequent gene identification and functional annotation. BR biosynthetic genes were annotated using antiSMASH 7.0 (https://antismash.secondarymetabolites.org/) [43] with strict detection criteria to identify and analyze biosynthetic gene clusters (BGCs) in the haloarchaeal genomes. Additionally, Rapid Annotation using Subsystem Technology (RAST, https://rast.nmpdr.org/rast.cgi) analysis was performed to detect BR and HR biosynthetic genes [44]. For the analysis of the BR biosynthetic pathway, six key structural genes were selected: crtE (geranylgeranyl pyrophosphate synthase), crtB (phytoene synthase), crtI (phytoene desaturase), lyeJ (lycopene elongase), crtD (carotenoid-3,4-desaturase), and cruF (carotenoid 2'',3''-hydratase) [14]. These genes are responsible for the dedicated steps converting the C_40_ carotenoid backbone into BR [15]. To identify the genes involved in HR biosynthesis, a reference library of previously reported amino acid sequences was curated [45], followed by an analysis of conserved regions within the biosynthetic genes. Based on this analysis, the hop gene, which encodes halo-opsin, a retinal-binding membrane protein that functions as a light-driven chloride pump [46], was selected as the target for assessing the potential for HR biosynthesis in the 12 haloarchaeal strains subjected to genome analysis. Subsequently, the annotated sequences were compared with the reference library to identify hop genes within the analyzed haloarchaeal genomes.
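The 70% shading rule described above can be expressed as a per-column check on the alignment. The following is a minimal sketch rather than BioEdit's own shading logic: it counts exact residue identity only (no similarity groups), includes gapped sequences in the denominator, and uses a hypothetical function name.

```python
from collections import Counter
from typing import List


def conserved_columns(alignment: List[str], threshold: float = 0.70) -> List[bool]:
    """Flag alignment columns whose most common residue reaches the threshold.

    Assumptions for this sketch: all rows have equal length, '-' marks a
    gap, and the fraction is taken over all sequences (gaps included).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    flags = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            flags.append(False)  # all-gap column
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        flags.append(top_count / len(column) >= threshold)
    return flags
```

With four toy sequences `["DDAKD", "DDGRD", "DDSTD", "DEAKD"]`, the first, second, and last columns pass the 70% threshold (4/4, 3/4, and 4/4 identical residues), while the two variable middle columns do not.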
Comparison of Bacterioruberin and Halorhodopsin Biosynthetic Genes
To assess the orthology of BR biosynthetic genes in the 12 haloarchaeal species, sequence comparison was performed using BioEdit software. Amino acid sequence similarities of the BR biosynthetic genes were compared among the 12 species. The identified genes were also compared with their orthologs from Halobacterium salinarum DSM 3754^T^, the first strain in which bacterioruberin biosynthesis was reported [47]. For interspecies similarity analysis of hop, the amino acid sequences of the identified hop genes were analyzed using the MEGA 12 software [31]. A phylogenetic tree was constructed with the ML method, using reference sequences for comparison. The resulting tree was then examined to identify clustering patterns.
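Pairwise similarity values of the kind reported for these comparisons can be derived from an alignment as a percent identity. The sketch below is illustrative only and is not BioEdit's calculation: it assumes the two sequences are already aligned, and it excludes any position with a gap in either sequence from the denominator, which is one of several conventions in use. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned amino acid sequences.

    Assumption for this sketch: positions where either sequence has a
    gap ('-') are ignored, so the denominator is the number of
    residue-to-residue comparisons.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared
```

Choosing a different denominator (for example, full alignment length or the shorter sequence length) shifts the values, so identities are only comparable when computed under the same convention.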
Results
Sequencing, Assembly, and Phylogenetic Analysis
Taxonomic analysis of the 16S rRNA gene sequences obtained from whole-genome sequencing of the 12 strains (MBLA0001, MBLA0010, MBLA0028, MBLA0071, MBLA0123, MBLA0129, MBLA0131, MBLA0133, MBLA0135, MBLA0145, MBLA0170, and MBLA0217) showed 99.18–100% similarity to their most closely related strains (Table 1).
Phylogenetic analysis based on the 16S rRNA gene sequences was conducted using three different algorithms (ML, NJ, and MP) (Fig. S1). Most genera clustered together consistently across the different algorithms. However, strains belonging to the genus Haloarcula did not form a single clade but instead exhibited divergent lineage patterns, with varying positions depending on the method used. As a follow-up to the 16S analysis, phylogenetic analysis based on whole-genome sequences revealed six clearly separated groups. The twelve haloarchaeal strains analyzed in this study were distributed across these groups according to their respective genera (Fig. 1), each forming distinct clades with high genomic similarity. While most genera formed stable and consistent clusters, Haloarcula strains were not grouped into a single lineage but were distributed across multiple clades throughout the whole-genome-based phylogenomic tree (Fig. 1A). In contrast, the UBCG phylogenomic tree, which was also constructed from whole-genome sequences, confirmed that each clade was clearly distinguished at the genus level (Fig. 1B). In addition, the UBCG phylogenomic tree showed close clustering of the isolated strains with the reference strains.
General Features of the Genomes
The general features of the genomes are shown in Table 2. The twelve strains, all in the order Halobacteriales, belong to three families: Haloarculaceae, Halobacteriaceae, and Haloferacaceae. The Haloarculaceae strains (MBLA0131, MBLA0133, MBLA0135, and MBLA0170) had an average G+C content of 64.18 mol%, and their 4 draft genomes from three genera had an average size of 4.01 Mbp (standard deviation of 47,976 bp). The Halobacteriaceae strains (MBLA0001, MBLA0010, and MBLA0217) had an average G+C content of 66.32 mol%, and their 3 draft genomes from two genera had an average size of 2.74 Mbp (standard deviation of 234,194 bp). The Haloferacaceae strains (MBLA0028, MBLA0071, MBLA0123, MBLA0129, and MBLA0145) had an average G+C content of 66.57 mol%, and their 5 draft genomes from three genera had an average size of 3.52 Mbp (standard deviation of 485,927 bp). Members of the family Halobacteriaceae had smaller genomes (<3.0 Mbp) than those of the families Haloarculaceae and Haloferacaceae (>3.0 Mbp). The G+C content of the haloarchaeal strains ranged from 62.1 to 68.7 mol%, which is consistent with previous reports indicating a high G+C content in the genomic DNA of haloarchaea [48]. Furthermore, the genomic annotation used in this study predicted the number of coding sequences (CDSs) and RNAs for each strain (Table 2).
Genome Comparison
To assess whether each strain represents a distinct species at the genomic level, OrthoANI, AAI, and isDDH analyses were performed, and the resulting similarity values were compared against the respective species delineation thresholds (95–96% for OrthoANI, 95–96% for AAI, and 70% for isDDH, with values below the threshold indicating distinct species) (Fig. 2) [39, 49, 50]. Genome comparisons were conducted between the 12 haloarchaeal strains analyzed in this study and 33 reference strains. The OrthoANI, AAI, and isDDH results consistently highlighted phylogenomic divergence among the strains. In the OrthoANI analysis, pairwise similarity values ranged from 69.5% to 96.7%. The highest similarity (96.7%) was observed between Haloferax denitrificans MBLA0123 and Halodesulfurarchaeum formicicum HSR6, exceeding the species-level threshold of 95–96%. The lowest OrthoANI value (69.5%) was found between Haloarcula marismortui MBLA0131 and Halobacterium zhouii XZYJT26. In the AAI analysis, similarity values ranged from 60.52% to 96.92%. The highest AAI value (96.92%) was found between Haloferax denitrificans MBLA0123 and Haloferax sulfurifontis ATCC-BAA897, exceeding the species-level threshold of 95–96%. The lowest value (60.52%) was observed between Halorubrum trapanicum MBLA0071 and Halodesulfurarchaeum formicicum HSR6. Notably, the pair MBLA0123 and HSR6, which exceeded the threshold in OrthoANI, showed a much lower AAI of 60.94%, indicating that the interpretation depends on the method. The isDDH analysis yielded values ranging from 18.5% to 68.5%. The highest value (68.5%) was found between Haloferax denitrificans MBLA0123 and Haloferax sulfurifontis ATCC-BAA897 and did not exceed the species delineation cutoff of 70%. The lowest value (18.5%) was identified between Halobacterium salinarum MBLA0001 and Halodesulfurarchaeum formicicum HSR6. Additionally, MBLA0123 and HSR6 displayed a low isDDH value of 18.7%, confirming their genomic distinctiveness.
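The method-dependent outcome noted for MBLA0123 and HSR6 can be made explicit by applying the three cutoffs side by side. The sketch below is only an illustration of the thresholding logic: it uses the lower bound (95%) of the 95–96% range for OrthoANI and AAI as a simplifying assumption, and the class and function names are hypothetical. The input values are those reported in the text.

```python
from dataclasses import dataclass
from typing import Dict

# Cutoffs taken from the text; 95.0 is the lower bound of the 95-96% range
# (a simplification made for this sketch).
ANI_CUTOFF = 95.0
AAI_CUTOFF = 95.0
DDH_CUTOFF = 70.0


@dataclass
class PairMetrics:
    ani: float  # OrthoANI, %
    aai: float  # AAI, %
    ddh: float  # isDDH, %


def same_species_calls(m: PairMetrics) -> Dict[str, bool]:
    """Return, per metric, whether the pair meets the same-species cutoff."""
    return {
        "OrthoANI": m.ani >= ANI_CUTOFF,
        "AAI": m.aai >= AAI_CUTOFF,
        "isDDH": m.ddh >= DDH_CUTOFF,
    }


def metrics_agree(m: PairMetrics) -> bool:
    """True when all three metrics give the same call."""
    return len(set(same_species_calls(m).values())) == 1


# Values reported in the text for MBLA0123 vs. HSR6.
mbla0123_vs_hsr6 = PairMetrics(ani=96.7, aai=60.94, ddh=18.7)
calls = same_species_calls(mbla0123_vs_hsr6)
agree = metrics_agree(mbla0123_vs_hsr6)
```

For this pair only OrthoANI crosses its cutoff, so the three metrics disagree, which is the situation in which relying on a single metric would give an ambiguous taxonomic call.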
Genome Annotation
Predicted proteins were functionally categorized using the COGs database, and the COG categories were compared across the genomes of the 12 haloarchaeal strains in this study (Fig. 3 and Table S2) [51]. The COG categories that accounted for a major portion (>5%) of predicted functional genes in all strains were transcription (K, 6.70–8.13%), energy production and conversion (C, 5.07–6.13%), and amino acid transport and metabolism (E, 6.50–9.94%). In contrast, chromatin structure and dynamics (B) and intracellular trafficking, secretion, and vesicular transport (U) each accounted for a small portion (<1%) in all strains. The percentages of genes annotated under COG T (signal transduction mechanisms) and COG J (translation, ribosomal structure, and biogenesis) in strains MBLA0001, MBLA0010, and MBLA0217, which belong to the family Halobacteriaceae, differed from those in other families. Strains belonging to Halobacteriaceae exhibited a lower percentage of genes under COG T, ranging from 2.14% to 2.96%, compared with strains from other families, which ranged from 3.18% to 5.15%. Strains belonging to the family Haloarculaceae exhibited a COG T proportion of approximately 4.41%, which was higher than that observed in other families. For COG J, Halobacteriaceae strains showed higher percentages (MBLA0001, 6.34%; MBLA0010, 5.90%; MBLA0217, 6.82%) than strains from other families, which ranged from 4.24% to 5.51%. COG V, representing defense mechanisms, accounted for less than 1% in most strains belonging to the family Haloferacaceae, a lower value than in other families. For COG L, which includes proteins involved in replication, recombination, and repair, strain MBLA0001 contained 275 genes (approximately 10.90% of total annotated COGs), whereas the other strains ranged from a minimum of 3.24% in MBLA0071 to a maximum of 6.56% in MBLA0133.
Comparison of Bacterioruberin Biosynthetic Genes
Analysis of BR biosynthetic genes in the 12 haloarchaeal strains revealed that all BR-related genes except crtI were present in every strain. The amino acid sequences of the identified BR biosynthetic genes were compared to their homologs in Halobacterium salinarum DSM 3754^T^ to assess sequence similarity (Table S3). The results showed sequence similarity ranging from 48.6% to 100% across all genes. By family, strains belonging to Haloarculaceae exhibited sequence similarity between 48.6% and 72.0% (average: 59.8%), those in Halobacteriaceae ranged from 49.5% to 100.0% (average: 78.1%), and those in Haloferacaceae ranged from 48.6% to 72.7% (average: 57.9%). To further investigate these relationships, phylogenetic trees were constructed based on the amino acid sequences of the identified BR biosynthetic genes (Fig. S2). In the resulting trees, species from genera represented by two or more species diverged from a common root.
MSA confirmed the presence of conserved regions in BR biosynthetic genes across all 12 strains (Figs. 4 and S3). Additionally, amino acid sequence similarity analysis of BR biosynthetic genes across the 12 haloarchaeal species revealed sequence identities ranging from 46% to 98.2% (Fig. 5). Among the analyzed genes, crtE exhibited the highest level of conservation (62.4–98.2%, average 69.7%). The prenyltransferase domain, particularly the DDXXD motifs (Fig. 4A), which are essential for coordinating Mg^2+^ ions and binding the diphosphate substrate to initiate the prenyltransferase reaction [52], was conserved across all 12 analyzed haloarchaeal species. In contrast, crtB exhibited the lowest average sequence similarity (46.0–97.2%, average 57.0%). Despite this overall variability, crtB retained aspartate-rich motifs, which serve as the substrate-binding and Mg^2+^-binding sites, and a hydrophobic flap, which plays a critical role in substrate positioning and catalysis [53]; both features were conserved in all analyzed species.
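A motif such as DDXXD can be located in a protein sequence with a simple pattern search. The sketch below is illustrative and is not the procedure used in the study; the function name is hypothetical and the example sequence is a made-up toy string, not a CrtE sequence from any of the strains.

```python
import re
from typing import List, Tuple

# Lookahead so that overlapping occurrences are also reported.
# X is treated as any single uppercase letter.
DDXXD = re.compile(r"(?=(DD[A-Z]{2}D))")


def find_ddxxd(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched motif) for each DDXXD hit."""
    return [(m.start() + 1, m.group(1)) for m in DDXXD.finditer(protein.upper())]


# Toy sequence for illustration only (not real data).
toy_sequence = "MKLDDAVDGHTTDDLLDE"
hits = find_ddxxd(toy_sequence)
```

Running the search on each ortholog and comparing hit positions against the alignment columns is one simple way to check that a motif is retained at equivalent positions across strains.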
The crtI gene was identified in only seven of the 12 haloarchaeal strains (MBLA0001, MBLA0028, MBLA0071, MBLA0131, MBLA0133, MBLA0145, and MBLA0170), with sequence similarity ranging from 50.6% to 94.7% (average 60.4%). The crtD gene, which encodes a desaturase, exhibited sequence similarity ranging from 63.9% to 98% (average 72.0%). MSA of crtI and crtD revealed multiple conserved regions shared between the two genes (Fig. S4).
The lyeJ gene was found in all strains analyzed in this study and exhibited sequence similarity ranging from 49.5% to 93.1% (average 59.5%) (Fig. S3B). Similarly, cruF, which is involved in the final step of BR biosynthesis, showed sequence similarity ranging from 49.6% to 96.9% (average 58.3%). MSA of cruF from the 12 species revealed conserved amino acid regions (Fig. S3D).
Comparison of Halorhodopsin Biosynthetic Genes
The hop genes were identified in seven strains, including members of Haloarculaceae (MBLA0131, MBLA0133, MBLA0170), Halobacteriaceae (MBLA0001), and Haloferacaceae (MBLA0028, MBLA0071, MBLA0145) (Table 3). Although hop was found in all three families, no hop gene was identified in MBLA0010, MBLA0123, MBLA0129, MBLA0135, or MBLA0217. MBLA0010 belongs to the genus Halobacterium, and hop was confirmed in MBLA0001, a strain from the same genus. Similarly, MBLA0135 belongs to the genus Haloarcula, and hop was identified in MBLA0131 and MBLA0133, which also belong to this genus. The amino acid sequence similarity of the hop gene among the haloarchaea analyzed in this study ranged from 57.0% to 96.7%. MSA revealed several conserved regions in the amino acid sequences of the hop genes of the seven strains (Fig. 6A). These conserved regions were observed across species belonging to different families.
A phylogenetic analysis based on amino acid sequences revealed three distinct groups (Fig. 6B). Group A primarily consists of members of the Haloarculaceae family, including the genera Haloarcula and Halorientalis, and comprises strains MBLA0131, MBLA0133, and MBLA0170. Group B includes strains MBLA0028 and MBLA0071 from the Haloferacaceae family, specifically within the genus Halorubrum. Group C is the most diverse, encompassing species from three different families: Haloferacaceae, Haloarculaceae, and Halobacteriaceae. Within Haloferacaceae, the genera Haloplanus, Halobellus, and Haloquadratum were identified, while Haloarculaceae includes Halomicrobium and Halosegnis. The Halobacteriaceae family is represented by the genus Halobacterium, including strain MBLA0001. Groups A and B formed well-supported clades with high bootstrap values (>90%), indicating robust evolutionary relationships. In contrast, Group C displayed generally low bootstrap values (<70%), resulting in an unstable phylogenetic structure. Notably, the clade containing Haloplanus ruber MBLA0145 and Halobellus clavatus CGMCC 1.10118 showed borderline bootstrap support of 76%, while other branches within Group C lacked significant support, so the branching order of taxa within this group could not be resolved reliably.
Discussion
The 16S rRNA and genome-based phylogenetic analyses showed that most strains, except those belonging to the genus Haloarcula, tended to cluster at the genus level. This pattern may be attributed to the fact that Haloarcula is known to possess multiple SSU rRNA operons with substantial sequence divergence [54], and such intra-genomic heterogeneity may have weakened the vertical phylogenetic signal. To overcome these limitations, a UBCG-based phylogenomic analysis was performed using a set of conserved single-copy core genes. In this analysis, the unstable clustering pattern of Haloarcula observed in the previous phylogenetic trees was no longer detected. This improvement can be explained by the fact that UBCG relies on evolutionarily conserved, vertically inherited genes with minimal paralogy, thereby providing a high-resolution phylogenetic signal that more accurately reflects evolutionary relationships among haloarchaea. Based on this approach, the 12 haloarchaeal strains isolated in this study were shown to be genetically distinct from one another and clearly differentiated from previously reported species. These findings raise the possibility that the observed genetic divergence is not limited to simple sequence-level variation but may reflect broader evolutionary processes within haloarchaeal lineages.
The genome analysis revealed variations in genome size and G+C content across different families within Halobacteriales. Halobacteriaceae species, including Halobacterium and Salarchaeum, had smaller genome sizes (<3.0 Mbp), whereas Haloarculaceae and Haloferacaceae species had relatively larger genomes (>3.0 Mbp). The observed G+C content (62.1–68.7 mol%) aligns with previous findings [48], suggesting its role in genome stability and adaptation to high-salinity environments. Differences in genome size may correspond to different adaptive strategies. Halobacteriaceae, which possess comparatively small genomes, may have undergone evolutionary genome streamlining through the reduction of non-essential or redundant genes, potentially contributing to survival in stable hypersaline environments. In contrast, Haloferacaceae and Haloarculaceae have larger genomes, which may reflect an evolutionary tendency toward metabolic flexibility and could be associated with adaptation to fluctuating nutrient conditions and oxygen availability.
Genome comparisons based on OrthoANI, AAI, and isDDH suggested that the haloarchaeal strains analyzed in this study were generally distinct at the genomic level. However, some comparisons yielded unexpectedly high similarity values even between strains from different genera. These results may reflect not only methodological differences among genomic metrics but also the limited number of haloarchaeal strains that have been isolated and genomically characterized to date, which restricts the comparative framework. In such cases, reliance on a single comparative genomic analysis could lead to ambiguous taxonomic interpretations. Therefore, an integrative approach that combined multiple genome-based analyses was essential for improving the accuracy of species delineation and taxonomic classification. Collectively, these findings indicated that substantial genomic diversity existed even among closely related haloarchaeal strains. This underscores the need for broader comparative genomic analyses and continued discovery of novel strains to define species boundaries more precisely and to better understand evolutionary processes within the order Halobacteriales.
Functional annotation of predicted proteins showed that transcription, energy production and conversion, and amino acid metabolism were the most abundant COG categories across all strains, suggesting that these functions play important roles in the survival and adaptation of haloarchaea in hypersaline environments. Interestingly, Halobacteriaceae differed from the other families in the distributions of COG T (signal transduction mechanisms) and COG J (translation, ribosomal structure, and biogenesis). Halobacteriaceae strains showed a lower proportion of COG T genes (2.14–2.96%) but a relatively higher proportion of COG J genes (5.90–6.82%). This trend may be associated with lineage-specific regulatory adaptations. Given that Halobacteriaceae possess relatively small genomes, the reduced proportion of signal transduction genes may be related to a simplified regulatory system that favors rapid protein synthesis (high COG J) over complex environmental sensing (low COG T). This pattern may partially reflect an adaptive strategy to relatively stable hypersaline environments. In addition, the lower representation of defense mechanism-related genes (COG V) in Haloferacaceae strains may indicate differences in stress response strategies, suggesting that this family relies more on metabolic flexibility or carotenoid-mediated membrane protection to cope with environmental stress.
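The COG proportions discussed above are simple ratios of category counts to annotated genes. The following sketch shows one way such percentages could be computed; the function name and the toy assignments are hypothetical, and the convention that a gene with several letters counts once per letter is an assumption rather than the procedure used in this study.

```python
from collections import Counter


def cog_percentages(assignments):
    """Return {COG letter: percent of annotated genes carrying that letter}.

    `assignments` holds one string of COG letters per annotated gene.
    A gene assigned several letters (e.g. "KT") counts once for each letter,
    so the percentages can sum to more than 100.
    """
    annotated = [a for a in assignments if a]  # skip genes without a COG
    counts = Counter(letter for a in annotated for letter in a)
    total = len(annotated)
    return {letter: 100.0 * n / total for letter, n in counts.items()}


# Toy annotation for ten genes (invented values, not data from the paper).
toy = ["J", "J", "J", "T", "K", "E", "C", "K", "E", "C"]
pct = cog_percentages(toy)
j_to_t_ratio = pct["J"] / pct["T"]
```

Comparing the J and T percentages per strain, as in `j_to_t_ratio`, gives a single number for the translation-versus-signaling balance described in the text.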
Carotenoid biosynthetic pathways have been well characterized in several studies across diverse haloarchaeal lineages [15]. Therefore, we focused on comparing carotenoid biosynthetic gene content among the 12 newly sequenced isolates. The amino acid sequence similarity of BR biosynthetic genes across the 12 haloarchaeal strains ranged from 46% to 98.2%, indicating that conserved sequences are shared even among strains belonging to different families. This conservation is consistent with BR serving as an essential adaptive strategy in hypersaline environments and suggests that the biosynthetic pathway has been selectively retained for survival. Notably, while most genes involved in carotenoid biosynthesis were identified in the majority of strains, crtI was absent in five strains (MBLA0010, MBLA0123, MBLA0129, MBLA0135, and MBLA0217). Nevertheless, these strains exhibited pink pigmentation, indicating the production of BR. Previous studies have reported that BR can still be synthesized in the absence of crtI, possibly due to the activity of the enzyme encoded by crtD [15]. Mechanistically, both CrtI and CrtD catalyze the introduction of additional double bonds into the carotenoid backbone. CrtI acts along the central chain during the conversion of phytoene to lycopene, whereas CrtD introduces double bonds at the 3,4 (and 3′,4′) positions in downstream reactions. This functional similarity suggests that certain CrtD homologs may act as bifunctional desaturases capable of partially compensating for the loss of CrtI. Accordingly, the presence of crtD and the absence of crtI in our pigment-producing strains support the possibility of compensatory CrtD activity and imply a potential evolutionary relationship between the two enzymes, possibly resulting from gene duplication followed by functional divergence.
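The screening logic behind this observation (crtD present, crtI absent) can be written as a small presence/absence filter. In this sketch, the five MBLA strains are the ones named in the text as lacking crtI; the entry `hypothetical_strain` and the function name are invented for contrast, and only the two genes under discussion are tracked.

```python
# Presence (True) or absence (False) of the two desaturase genes per strain.
# The five MBLA entries follow the text; "hypothetical_strain" is an invented
# example of a strain that carries both genes.
gene_table = {
    "MBLA0010": {"crtI": False, "crtD": True},
    "MBLA0123": {"crtI": False, "crtD": True},
    "MBLA0129": {"crtI": False, "crtD": True},
    "MBLA0135": {"crtI": False, "crtD": True},
    "MBLA0217": {"crtI": False, "crtD": True},
    "hypothetical_strain": {"crtI": True, "crtD": True},
}


def compensation_candidates(table):
    """Return strains that lack crtI but carry crtD, sorted by name."""
    return sorted(
        strain
        for strain, genes in table.items()
        if genes["crtD"] and not genes["crtI"]
    )


candidates = compensation_candidates(gene_table)
```

Such a filter only flags candidates; whether CrtD actually compensates for CrtI in a given strain would still need experimental confirmation.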
Phylogenetic analysis of BR biosynthetic genes revealed consistent clustering patterns within genera, suggesting that these genes have evolved in a lineage-specific manner. Multiple sequence alignment (MSA) confirmed the conservation of functionally critical motifs, such as the DDXXD motif in crtE and the Asp-rich motif and hydrophobic flap in crtB, reinforcing their essential roles in gene function. The role of lyeJ in BR biosynthesis further reflects its specialization as a lycopene elongase unique to haloarchaea, distinguishing them from other C50 carotenoid-producing microorganisms such as Flavobacterium dehydrogenans, Corynebacterium glutamicum, and Micrococcus luteus [55]. This specialization highlights an evolutionary adaptation that enhances survival in high-salinity conditions. Similarly, cruF was found exclusively in haloarchaea and was identified in all 12 analyzed strains, indicating that it is a uniquely conserved gene. MSA further confirmed that cruF shares conserved regions across different families, indicating a high level of sequence conservation independent of taxonomic affiliation. Collectively, these results indicate that BR biosynthesis is a conserved core adaptation in haloarchaea maintained by strong evolutionary pressure, likely due to its role in membrane stabilization, oxidative stress resistance, and survival in hypersaline environments.
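A motif such as DDXXD (two aspartates, any two residues, one aspartate) can be located in an ungapped protein sequence with a regular expression. The sketch below is illustrative only: the function name and the toy sequence are invented, and it does not reproduce the alignment-based analysis used in this study.

```python
import re

# Two Asp, any two residues, then Asp. The lookahead lets the scan report
# overlapping occurrences as well as separate ones.
DDXXD = re.compile(r"(?=(DD[A-Z]{2}D))")


def find_ddxxd(protein: str):
    """Return (1-based start position, matched residues) for each DDXXD hit."""
    return [(m.start() + 1, m.group(1)) for m in DDXXD.finditer(protein.upper())]


# Invented toy sequence, not a real CrtE protein.
toy_seq = "MKTLFADDIMDNSGLRRGAPVDDLQDE"
hits = find_ddxxd(toy_seq)
```

The input is assumed to be an ungapped one-letter amino acid string; alignment columns containing gap characters would need to be stripped before scanning.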
The hop gene was detected in 7 of the 12 analyzed strains, and its presence was not consistent even among species within the same genus. This pattern suggests that species-specific gene acquisition or loss events occurred during evolution. Notably, no hop gene was detected in strains of the genus Haloferax, which is consistent with previous reports indicating that Haloferax species have adapted to low-oxygen or anaerobic environments by developing alternative energy-yielding pathways such as nitrate reduction [21]. These findings imply a metabolic shift from phototrophy toward anaerobic respiration and suggest that retention of the hop gene may not be essential for survival when alternative energy sources are available. Phylogenetic analysis of hop genes revealed three distinct clades. Two of these groups showed consistent clustering at the family level, whereas Group C was taxonomically heterogeneous, including members of the families Haloferacaceae, Haloarculaceae, and Halobacteriaceae. This pattern indicates that the evolution of the hop gene in haloarchaea has likely been influenced by horizontal gene transfer and lineage-specific gene loss, reflecting adaptation to diverse energy acquisition strategies in hypersaline environments.
Conclusion
This study presents the genomic characterization of 12 haloarchaeal strains isolated from Korean solar salterns. Whole-genome sequencing revealed clear taxonomic separation among strains, with family-level differences in genome size, G+C content, and functional gene repertoires. Comparative functional annotation based on COG classification showed lineage-dependent variation in genes related to transcription, translation, and stress response, consistent with adaptation to hypersaline environments. Analysis of BR and HR biosynthetic genes revealed distinct evolutionary patterns. Genes involved in BR biosynthesis were broadly conserved across all strains, although the phytoene desaturase gene crtI was absent in five strains. The presence of crtD in these strains suggests functional compensation and is consistent with a shared evolutionary origin of the two enzymes. In contrast, genes associated with HR biosynthesis showed irregular distribution, including variation among strains within the same genus, consistent with horizontal gene transfer or lineage-specific gene loss. These results indicate divergent evolutionary strategies for phototrophic traits in haloarchaea. Collectively, the genomic data provide a framework for understanding adaptive strategies of haloarchaea in hypersaline environments and support future studies on their physiology, evolution, and biotechnological potential.
Supplemental Materials
Supplementary data for this paper are available online only at http://jmb.or.kr.
References
- 1. Parte AC, Sardà Carbasse J, Meier-Kolthoff JP, Reimer LC, Göker M. 2020. List of prokaryotic names with standing in nomenclature (LPSN) moves to the DSMZ. Int. J. Syst. Evol. Microbiol. 70: 5607–5612. https://doi.org/10.1099/ijsem.0.004332 (PMID 32701423; PMC7723251)
- 2. Oren A. 2012. Taxonomy of the family Halobacteriaceae: a paradigm for changing concepts in prokaryote systematics. Int. J. Syst. Evol. Microbiol. 62: 263–271. https://doi.org/10.1099/ijs.0.038653-0 (PMID 22155757)
- 3. Hari RK, Patel TR, Martin AM. 1994. An overview of pigment production in biological systems: functions, biosynthesis, and applications in food industry. Food Rev. Int. 10: 49–70. https://doi.org/10.1080/87559129409540985
- 4. Singh A, Singh AK. 2017. Haloarchaea: worth exploring for their biotechnological potential. Biotechnol. Lett. 39: 1793–1800. https://doi.org/10.1007/s10529-017-2434-y (PMID 28900776)
- 5. Oren A. 2010. Industrial and environmental applications of halophilic microorganisms. Environ. Technol. 31: 825–834. https://doi.org/10.1080/09593330903370026 (PMID 20662374)
- 6. Klibanov AM. 2001. Improving enzymes by using them in organic solvents. Nature 409: 241–246. https://doi.org/10.1038/35051719 (PMID 11196652)
- 7. Zhao YX, Rao ZM, Xue YF, Gong P, Ji YZ, Ma YH. 2015. Poly(3-hydroxybutyrate-co-3-hydroxyvalerate) production by Haloarchaeon Halogranum amylolyticum. Appl. Microbiol. Biotechnol. 99: 7639–7649. https://doi.org/10.1007/s00253-015-6609-y (PMID 25947242)
- 8. Jendrossek D, Pfeiffer D. 2014. New insights in the formation of polyhydroxyalkanoate granules (carbonosomes) and novel functions of poly(3-hydroxybutyrate). Environ. Microbiol. 16: 2357–2373. https://doi.org/10.1111/1462-2920.12356 (PMID 24329995)
