A Proteoform-Resolved Atlas of Human Cardiac Histones
Zhan Gao, Isabella R. Clemmer, Hsin-Ju Chan, Kwame Osei, Holden T. Rogers, Thomas S. Weir, Matthew S. Fischer, Robert L. Gearhart, Scott J. Price, Yanlong Zhu, Allan R. Brasier, Jingshing Wu, Wuqiang Zhu, Ying Ge

TL;DR
This study creates a detailed map of histone proteins and their modifications in human heart tissue using advanced proteomics techniques.
Contribution
The study introduces a streamlined top-down proteomics workflow to resolve histone proteoforms and their PTMs in native human cardiac tissue.
Findings
A comprehensive cardiac histone proteoform atlas was assembled, revealing co-occurring PTM combinations.
The workflow enables baseline separation of all core histone families and linker histone H1 in a single LC-MS run.
Previously unobserved histone proteoforms were identified, providing insights into chromatin regulation in the heart.
Abstract
Histone variant composition and post-translational modifications (PTMs) of core and linker histones jointly orchestrate chromatin architecture and tissue-specific gene regulation. However, capturing the full complexity of the “histone code” in native human tissues remains challenging. Here we present a human cardiac histone proteoform atlas, comprising intact histone variants with combinatorial PTMs, enabled by a streamlined top-down proteomics workflow. In a single one-dimensional reversed-phase liquid chromatography (LC)-mass spectrometry (MS) run, we achieve baseline separation of all four core histone families (H2A, H2B, H3, and H4) together with the linker histone H1 directly from human myocardium. Notably, this intact-protein analysis preserves the connectivity of co-occurring PTMs to individual histone molecules to reveal the combinatorial histone code, enabling proteoform-level…
Introduction
Histones compact the eukaryotic genome into nucleosomes and provide a dynamic regulatory interface on which post-translational modifications (PTMs) and histone variants program chromatin structure and gene function^1,2^. The N-terminal tails of the four core histones (H2A, H2B, H3 and H4) are densely and combinatorially modified^3–5^, classically by acetylation^6–8^, methylation^9,10^, phosphorylation^11^, ubiquitination^12^ and sumoylation^13^, and more recently by a broader spectrum of lysine acylations^14–16^. These PTMs modulate chromatin compaction and recruit dedicated histone-directed “reader” enzymes that translate marks into transcriptional outcomes^17^. Beyond the core histones, the linker histone H1 family binds at the DNA entry and exit sites of the nucleosome, tuning nucleosome spacing to promote higher-order chromatin folding and thereby shaping genome accessibility^18^. Additionally, non-canonical histone variants such as H2A.Z, H2A.X, and macroH2A are non-allelic isoforms that replace canonical histones within nucleosomes, modulating nucleosome stability and DNA accessibility to fine-tune chromatin function^19–21^. Together, histone variant composition and combinatorial PTMs in core and linker histones constitute a complex “histone code” central to genome function across diverse biological processes^22^. However, capturing the full molecular complexity of this code remains challenging.
Histone-guided chromatin regulation is fundamental to cardiac development, homeostasis, and disease^23–26^. Despite extensive efforts linking chromatin dynamics to cardiac biology, most existing knowledge is derived from cultured cardiomyocytes and animal models using antibody-based profiling or genetic perturbations^27–29^. Moreover, most studies of histones have focused on individual histone families or specific variants^30–34^. As a result, a comprehensive, molecularly resolved understanding of histone variants and their combinatorial PTMs in the human heart remains elusive. With a growing emphasis on human-relevant, non-animal approaches, there is now a critical need to generate a proteoform-resolved map of cardiac histones directly from human myocardium.
Mass spectrometry (MS)-based proteomics is a cornerstone for mapping histone variants and their PTMs in the context of chromatin regulation^35–38^. However, in conventional bottom-up proteomics, histones are digested into small peptides, which disrupts PTM connectivity and limits the analysis of co-occurring modifications^39–42^. Middle-down proteomics analyzes longer peptides, but still struggles to differentiate highly homologous isoforms^43^. In contrast, top-down proteomics overcomes these limitations by directly analyzing intact proteins^44–46^, capturing the full diversity of proteoforms arising from sequence variation and combinatorial PTMs^47–49^. While powerful, top-down proteomics is more technically demanding than peptide-based approaches^44,45^. Previous top-down studies of histones often required extensive fractionation and/or multidimensional separations to achieve deep proteoform coverage^50–52^, and utilized complex or specialized MS workflows^53–55^. These efforts have largely focused on the core histones, whereas linker histone H1 has received comparatively little attention despite its central role in chromatin organization. Consequently, a streamlined top-down proteomics method capable of capturing all core histones and linker histone H1 remains lacking.
In this study, we developed a streamlined, high-performance top-down proteomics workflow for comprehensive proteoform analysis and applied it to human myocardium. We achieved high coverage across all four core histones together with linker histone H1, and quantitatively profiled histone variant composition and the PTM landscape in a single one-dimensional reversed-phase LC (1D RPLC)-MS run. Tandem mass spectrometry (MS/MS) provided confident PTM site localization and direct visualization of combinatorial PTMs on intact histone proteoforms. Notably, we detected multiple macroH2A proteoforms, which represent the largest H2A variants and possess important regulatory functions, directly in cardiac tissue. Together, this work provides the most comprehensive proteoform-resolved cardiac histone atlas to date and can function as a tissue-level reference for future mechanistic and translational studies. Because it uses a standard 1D RPLC–MS-based top-down platform, the workflow is readily transferable and broadly applicable for proteoform-level analysis of histones across diverse tissues.
Results
A streamlined top-down proteomics platform for comprehensive histone proteoform analysis
To enable rapid, comprehensive analysis of histones from endogenous human tissue, we developed a streamlined top-down proteomics workflow (Fig. 1). Starting from snap-frozen myocardium, tissue was cryopulverized and rinsed in nuclei isolation buffer (NIB). Nuclei were isolated from the tissue pellet with 0.3% NP-40, followed by one trifluoroacetic acid (TFA) wash to remove interfering proteins (Fig. 1, Steps 1–3). Because histones are rich in arginine and lysine, they carry a strong positive charge under acidic conditions. We therefore extracted histones from isolated nuclei with dilute sulfuric acid (0.2 M H_2_SO_4_), which disrupts electrostatic interactions with DNA and selectively solubilizes histones while leaving most non-histone proteins insoluble. The acid extract was then precipitated with trichloroacetic acid to obtain histones (Fig. 1, Step 4). Reconstituted intact histones were separated using a single, high-sensitivity 1D RPLC run at a microliter flow rate (6 μL/min) on a home-packed diphenyl column (Fig. 1, Step 5). The diphenyl stationary phase provides enhanced selectivity for basic histones, facilitating single-run separation; however, the workflow is compatible with other high-resolution RPLC formats. High-resolution MS was then used for histone proteoform identification and quantification (Fig. 1, Steps 6–7). Selected histones of interest underwent offline fractionation and electron-capture dissociation (ECD) MS/MS to further localize PTM sites (Fig. 1, Step 8).
We confirmed high reproducibility for workflows with and without the TFA wash, across both SDS-PAGE and LC-MS (Figs. S1–S3). To reduce interference from sarcomeric proteins intrinsic to cardiac tissue, we incorporated a brief TFA wash following nuclei isolation. This step effectively depleted residual sarcomeric contaminants, reduced ion suppression, and increased signal intensity for core histones without altering downstream relative abundance measurements (Figs. S4–S6). Based on these results, the TFA wash was included in the final workflow; a detailed evaluation of this optimization step is provided in Supplementary Note 1.
Within a single 1D RPLC-MS run, all four core histone families and linker H1 elute sequentially with baseline chromatographic separation (Fig. 2a). Linker histone H1 variants elute first, followed by well-resolved H4 and H2B variants. The H4 and H2B histone families often overlap on conventional C18 columns^56^ and typically require additional fractionation or multidimensional separation^51,57^. Here, the enhanced selectivity of the diphenyl stationary phase enables their baseline resolution within a single chromatographic dimension (Fig. 2a). The H3 region follows, with H3.3 and H3.2 eluting back-to-back. Multiple H2A variants then elute, and H3.1 appears last, trailing the H2A window as the most retained H3 isoform. Representative tandem mass spectra for protein identification and characterization are provided in Figs. S7–S11. Histone proteoforms identified with high mass accuracy are listed in Table S2, and all detected histone proteoforms are listed in Table S3. This efficient separation yields baseline-resolved retention windows for MS acquisition, thereby streamlining subsequent targeted top-down MS/MS experiments.
After averaging across retention time windows, the deconvoluted mass spectra more clearly reveal the PTM patterns of different histones (Fig. 2b). Among linker H1 histones, N-terminal acetylation predominates across all detected variants, although minor non-acetylated species are observed for H1.0 and H1.4. Succinylation was a frequently observed PTM, with higher levels than acetylation and phosphorylation in H1.4 and H1.5. We also observed several proteoforms produced by genetic polymorphisms in H1.2, underscoring the breadth of proteoforms captured by top-down proteomics analysis. Within the H2A elution window, we detected multiple canonical and non-canonical variants, all of which were N-terminally acetylated and further diversified by additional PTMs, including phosphorylation and succinylation. Across the H1 and H2A families, sequence variation together with combinatorial PTMs generates characteristic, family-specific proteoform fingerprints. Several H2A variants (H2A1C, H2A2A, H2A2B, H2A2C, H2A.J, and H2A.X) exhibit a conserved, family-specific PTM profile, whereas other H2A variants, such as H2A.V, H2A.Z and H2A1J, show substantially fewer modified states. H2B variants behave differently: multiple variants co-elute within a narrow retention window, and none show N-terminal acetylation in our dataset. Because individual H2B variants differ by only one or a few amino acids, a single intact-mass peak likely represents a mixture of closely related sequence isoforms or isomers. As a result, targeted top-down MS/MS is required to distinguish and confidently assign individual H2B variants within this co-eluting window.
H3 variants (H3.1, H3.2, and H3.3) exhibit the greatest proteoform diversity among the histone families, with extensive and combinatorial methylation across all variants. The deconvoluted spectrum of each H3 variant presents a ladder of peaks separated by +14 Da increments, reflecting successive methylation states. Because acetylation and trimethylation differ by only 0.036 Da, these modifications are effectively isobaric at the intact-mass level. To organize this complexity, we annotated H3 proteoforms by their total number of methyl equivalents (Me-Equiv). Across variants, H3.1, H3.2, and H3.3 each span more than 10 Me-Equiv states, with variant-specific distributions: H3.2 extends beyond 15 Me-Equivs and is centered around 7, whereas H3.1 and H3.3 reach up to 14 Me-Equivs and are centered around 5. For histone H4, averaging across the ~3 min H4 elution window yields a more compact envelope dominated by +14 Da increments, overlaid with trimethylation/acetylation (+42 Da) and low-abundance succinylation (+100 Da). Unlike H3, H4 proteoforms are dominated by a compact set of recurring modification states, yielding a more clearly resolved pattern of PTM co-occurrence. These data establish baseline separation of histone families and enable proteoform-resolved quantification and online MS/MS within a single 1D RPLC–MS run.
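The Me-Equiv annotation described above can be illustrated with a minimal sketch. It assumes monoisotopic intact masses and a hypothetical unmodified reference mass; the function name, the tolerance, and the example mass are illustrative and not taken from the study.

```python
# Monoisotopic mass added by one methyl group (net CH2), in Da.
METHYL_DA = 14.01565


def methyl_equivalents(observed_mass, unmodified_mass, tolerance_da=0.5):
    """Return the integer Me-Equiv count for an intact-mass shift.

    Returns None when the shift is not within tolerance_da of a whole
    number of methyl groups (for example, a +80 Da phosphorylation).
    """
    shift = observed_mass - unmodified_mass
    n = round(shift / METHYL_DA)
    if n < 0 or abs(shift - n * METHYL_DA) > tolerance_da:
        return None
    return n


# Hypothetical unmodified mass, used only to exercise the function.
base = 15000.0
five_methyls = methyl_equivalents(base + 5 * METHYL_DA, base)
one_acetyl = methyl_equivalents(base + 42.0106, base)
```

Under these assumptions a single acetylation is counted as three Me-Equivs, which mirrors the near-isobaric relationship between acetylation and trimethylation noted in the text.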
A quantitative, proteoform-resolved landscape of cardiac histones from a single LC–MS run
Having deconstructed the nucleosome—separating and measuring intact histones at proteoform level—we then constructed a quantitative histone proteoform landscape from human tissue (Fig. 3). We converted deconvoluted MS1 features into relative abundance bar plots (Fig. 3a–f, Fig. S12), where each bar represents a single proteoform. This visualization simplifies proteoform complexity and provides a direct basis for quantitative comparisons.
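As a rough illustration of how deconvoluted MS1 features can be turned into relative-abundance bars, the sketch below normalizes each proteoform intensity to the summed intensity of its group. The function name and the input values are hypothetical, and the exact normalization used in the study is not specified in this section.

```python
def relative_abundance(intensities):
    """Map {proteoform: intensity} to {proteoform: percent of summed intensity}."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return {name: 100.0 * value / total for name, value in intensities.items()}


# Hypothetical deconvoluted intensities for three proteoforms of one variant.
example = relative_abundance({"N-Ac": 3.0e6, "N-Ac + P": 0.6e6, "N-Ac + Suc": 0.4e6})
```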
Acetylation is a prominent PTM across the cardiac histone proteome, with high-mass-accuracy intact-mass measurements revealing the characteristic +42.01 Da mass shift while distinguishing it from the trimethylation increment (+42.05 Da)^58^ (Fig. S13, Table S2). Methylation, phosphorylation, and succinylation were also consistently detected. In several histone families (H1, H2A, and H4), a single N-acetylated “base” proteoform dominates, accompanied by lower-abundance species carrying additional phosphorylation, acetylation, or succinylation (Fig. 3a–c). For example, in linker H1.4 (Fig. 3a), the N-terminally acetylated proteoform is predominant (~64.5%). Additional PTMs, including phosphorylation, acetylation, and succinylation, create multiple combinatorial states at lower relative abundance. A similar distribution is evident for H2A.J, H2A2A, and H2A2C (Fig. 3b, Fig. S12b–c): an N-acetylated base proteoform dominates (~70.0%), accompanied by less-abundant proteoforms bearing acetylation (~7.0%), phosphorylation (~6.0%), and succinylation (~2.0%). H4 (Fig. 3c) spans a ladder of Me-Equiv states with multiple acetylations and sparse succinylation (~2.0%). To organize the dense H3 proteoforms, we sorted them by Me-Equivs, yielding variant-specific histograms for H3.1, H3.2, and H3.3 (Fig. 3d–f) that extend to high Me-Equiv counts but lack a zero Me-Equiv peak.
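The mass accuracy needed to separate acetylation from trimethylation can be estimated from elemental compositions. The sketch below is a back-of-the-envelope calculation using standard monoisotopic atomic masses; the 11,300 Da protein mass is an approximate, illustrative value for an H4-sized protein, not a measurement from this study.

```python
# Monoisotopic atomic masses in Da.
C, H, O = 12.0, 1.00782503, 15.99491462

ACETYL = 2 * C + 2 * H + O   # net C2H2O added by one acetylation
TRIMETHYL = 3 * C + 6 * H    # net C3H6 added by three methylations


def ppm_gap(protein_mass_da):
    """Mass difference between trimethylation and acetylation, in ppm of the intact mass."""
    return (TRIMETHYL - ACETYL) / protein_mass_da * 1e6


# Illustrative intact mass of roughly 11.3 kDa (assumed value).
gap_ppm = ppm_gap(11300.0)
```

For a protein of this size the gap is only a few ppm, which is why the assignment depends on high-mass-accuracy intact-mass measurement.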
We next quantified the relative abundances of individual variants within each histone family, pooling measurements across the entire dataset (Fig. 3g–i). Among linker histones, H1.4 is the major H1 variant, whereas H1.10 is the least abundant (1.6 ± 0.01%); H1.0 (16.3 ± 0.4%) and other somatic variants account for a smaller proportion (Fig. 3g). Canonical H2As constitute the majority of the H2A pool, with H2A1C (32.3 ± 0.5%), H2A2C (18.9 ± 0.4%), and H2A2A (14.3 ± 0.6%) being the most abundant species. Among non-canonical variants, H2A.Z (0.6 ± 0.03%) and H2A.X (2.0 ± 0.2%) are present at relatively low levels, whereas H2A.J is comparatively enriched (5.7 ± 0.4%). Notably, in addition to annotated H2A variants, we detected several large (~20 kDa), unknown H2A-like proteins (U.H2A1–3) and a truncated macroH2A1 proteoform (T.mH2A1), with U.H2A2 being the most abundant of these (6.2 ± 0.5%). To our knowledge, this is the first report of multiple macroH2A proteoforms detected by top-down proteomics from cardiac tissue, which we describe in detail in the following sections. Within the H3 family (Fig. 3i), H3.3 is the dominant H3 species (74.8 ± 0.3%), followed by H3.2 (18.6 ± 0.3%) and H3.1 (6.6 ± 0.5%). This distribution is consistent with prior bottom-up studies showing that H3.3 steadily overtakes canonical H3.1/H3.2 and becomes the predominant H3 variant in adult human tissues^32^.
PTM levels exhibit distinct, variant-specific patterns that differ both between histone families and among variants within each family (Fig. 3j–l, Supplementary Note 2). Overall phosphorylation is modest (0.02–0.23 mol P/mol protein) but is enriched on large H2A-like species, including truncated macroH2A1 and U.H2A variants, whereas linker H1 variants and H2A.J show the lowest phosphorylation levels (Fig. 3j). In contrast, succinylation is concentrated on linker H1 variants—most prominently H1.5, H1.2, and H1.4—and remains uniformly low across the H2A family (Fig. 3k). Acetylation is the most abundant modification overall, reaching near-stoichiometric levels on canonical H2As and remaining high on H2A1J and H2A.J, while being comparatively lower on H2A.Z, large H2A-like proteins, and linker H1 variants (Fig. 3l).
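One common way to express a PTM level in mol of modification per mol of protein is an abundance-weighted average of the number of modification groups per proteoform. The sketch below illustrates that calculation; it is an assumption about how such values can be derived and may differ from the procedure in Supplementary Note 2. The input numbers are hypothetical.

```python
def ptm_stoichiometry(proteoforms):
    """Abundance-weighted mean number of PTM groups per protein molecule.

    proteoforms: list of (relative_abundance, n_ptm_groups) pairs for one
    histone variant. Abundances may be in any consistent unit.
    """
    total = sum(abundance for abundance, _ in proteoforms)
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return sum(abundance * n for abundance, n in proteoforms) / total


# Hypothetical variant: 90% unphosphorylated, 8% mono-, 2% di-phosphorylated.
mol_p_per_mol = ptm_stoichiometry([(90.0, 0), (8.0, 1), (2.0, 2)])
```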
Mapping of histone combinatorial PTMs by top-down MS/MS
With variant abundances and PTM landscape established, we next applied top-down tandem MS (MS/MS) to resolve intact histone proteoforms. This approach enabled direct visualization of coexisting modifications on the same sequence, thereby resolving combinatorial PTMs (Fig. 4a). Because histone PTMs are interpreted by the modular chromatin-binding domains, distinct combinatorial PTMs can encode functionally divergent chromatin states. We prioritized H4 and H3 for detailed top-down MS/MS analysis because H4 encodes compact, highly structured combinatorial PTM states^57^, whereas H3 exhibits the greatest PTM diversity and complexity^59^, together providing stringent and biologically relevant benchmarks for proteoform-resolved chromatin analysis.
Focusing first on H4, we examined its elution window (43–47 min), where the base peak chromatogram (BPC) resolved four features corresponding to distinct H4 proteoforms (Fig. 4b). Averaging and deconvoluting the signal across each retention window yielded dominant peaks shifted by +42.01 Da, diagnostic of acetyl additions^58^ (Fig. 4e, Fig. S13, Table S2). High-resolution MS and online CID MS/MS confirmed the composition of the most abundant proteoform, carrying one acetylation (Ac) and one dimethylation (2Me), as 2MeAcH4 (Fig. S11a). Accordingly, the proteoform series was assigned to 2MeAcH4, 2Me2AcH4, 2Me3AcH4, and 2Me4AcH4, reflecting increasing acetylation on a dimethylated backbone (Fig. 4e–h). The extracted ion chromatograms (EICs) confirmed these assignments, showing partially overlapping yet distinct elution profiles, with longer retention as acetylation increases (Fig. 4c). Quantification yielded a clear hierarchy: 2MeAcH4 is most abundant, followed by 2Me2AcH4, with 2Me3AcH4 and 2Me4AcH4 occurring at low to trace levels (Fig. 4d). Within the 2Me4AcH4 window we also detected a trace peak consistent with a fifth acetyl group (Fig. 4h). In each spectrum, several +14 Da shoulder peaks (blue) relative to the main peak suggested a minor contribution from additional methylation states. Additional low-intensity peaks at +80 Da, +100 Da, and +118 Da indicated phosphorylation, succinylation, and one unknown PTM, respectively. Our LC gradient also baseline-separated minor mono-oxidized H4 proteoforms from the major non-oxidized H4 (Fig. S14). These mono-oxidized proteoforms eluted earlier (~41–43 min) (Fig. S14a) and contributed only a small fraction of total H4 signal (Fig. S14b–d). Using online CID top-down MS/MS, we localized this oxidation to methionine 84 (M84) (Fig. S11b). Although histones are now recognized as direct targets of reactive oxygen species (ROS)^60,61^, this M84 oxidation could also arise during sample handling and will require further validation.
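The reading of mass shifts described above (+14, +42, +80, +100 Da, with +118 Da left unassigned) can be sketched as a simple lookup against monoisotopic PTM masses. The table, function name, and tolerance below are illustrative assumptions rather than the annotation software used in the study.

```python
# Monoisotopic mass shifts (Da) for common histone modifications.
PTM_SHIFTS_DA = {
    "methylation": 14.01565,
    "oxidation": 15.99491,
    "acetylation": 42.01057,
    "trimethylation": 42.04695,
    "phosphorylation": 79.96633,
    "succinylation": 100.01604,
}


def annotate_shift(delta_da, tolerance_da=0.02):
    """Return the names of all PTMs whose mass shift matches delta_da within tolerance."""
    return [
        name
        for name, shift in PTM_SHIFTS_DA.items()
        if abs(delta_da - shift) <= tolerance_da
    ]


acetyl_hit = annotate_shift(42.011)   # matches acetylation only at this tolerance
unknown_hit = annotate_shift(118.0)   # no entry in the table, returned as an empty list
```

With a looser tolerance (for example 0.05 Da), a +42 Da shift would match both acetylation and trimethylation, which is the ambiguity the high-mass-accuracy measurement is meant to remove.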
To further characterize H4 combinatorial PTMs, we collected a fraction from the retention window with the highest H4 abundance, where the deconvoluted spectrum displayed four representative proteoforms: 2MeAcH4, 3MeAcH4, 4MeAcH4, and 2Me2AcH4 (Fig. 4i). We then directly infused the fraction into an ultra-high-resolution FTICR-MS and isolated the four proteoforms for ECD MS/MS (Fig. S15). ECD cleaves N–Cα bonds to yield c and z• fragment ions while preserving labile PTMs^62^. Near-continuous c ions across the N-terminal tail confirmed variable methylation states at arginine 3 (R3me1/R3me2) and lysine 20 (K20me1/K20me2/K20me3) and placed acetylation on the N-terminus (S1) (Fig. 4j, Supplementary Note 3, and Figs. S16–S19). Notably, the 2Me2AcH4 species comprised two isobaric proteoforms that shared the same intact mass but displayed distinct acetylation sites at K12 and K16 (Fig. 4k).
Relative to H4, H3 displays a broader diversity of co-occurring PTM combinations, making it well suited for top-down mapping of combinatorial states. From the main H3 elution region, we collected an offline fraction centered at 6 and 7 Me-Equivs (Fig. S20a). We then isolated both precursor ions and performed ECD FTICR-MS/MS to maximize fragment ion signal intensity (Fig. S20b, Fig. S21). By comparing mass shifts on all matched c ions across charge states, we observed multiple combinatorial methylation states (Fig. S19c, Fig. S22). K4 was predominantly unmethylated, with a trace monomethylated state; the K9 region was primarily di-methylated, with minor me1/me3 and unmodified ions also observed. For the c28 ion, methylation shifted toward higher levels, showing tri- to tetra-methylation, with tetra-methylation being predominant. Finally, the c41 ion reported the overall N-terminal methylation status, revealing four distinct states spanning four to seven methylations. These results show that top-down MS/MS can effectively map combinatorial PTM states on intact histones.
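Because each c ion reports the total modification mass N-terminal to its cleavage site, methylation within a sequence segment can be inferred by differencing cumulative counts between successive c ions. The sketch below illustrates this logic for a single proteoform; the function name and the example counts are hypothetical and do not reproduce the measured H3 states, which are mixtures of several proteoforms.

```python
def segment_counts(cumulative):
    """Convert cumulative methyl counts on c ions into per-segment counts.

    cumulative: list of (c_ion_index, methyl_count) pairs, where methyl_count
    is the number of methyl groups carried by residues 1..c_ion_index.
    Returns a list of ((first_residue, last_residue), methyls_in_segment).
    """
    segments = []
    prev_index, prev_count = 0, 0
    for index, count in sorted(cumulative):
        segments.append(((prev_index + 1, index), count - prev_count))
        prev_index, prev_count = index, count
    return segments


# Hypothetical single proteoform read out at c8, c14, c28 and c41.
example = segment_counts([(8, 0), (14, 2), (28, 4), (41, 6)])
```

In this made-up example, two methyl groups fall between residues 9 and 14, two between 15 and 28, and two between 29 and 41; residue-level localization within a segment requires c ions that bracket the individual site.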
Top-down MS/MS for distinguishing histone sequence isoforms
Top-down MS/MS enables confident discrimination of highly homologous histone sequence isoforms and isomers by directly interrogating intact proteins and their diagnostic fragment ions. This capability is particularly critical for the canonical H2B family which comprises many closely related variants that differ by only a few residues (Fig. 5a). These variants often share identical intact masses and generate indistinguishable tryptic peptides, which makes isoform-level analysis difficult using antibody- or peptide-centric methods^63^.
In our dataset, the H2B signal resolved into two main elution windows (Fig. 5b–c). In the early window, deconvoluted intact masses matched H2B1N and H2B1L, with adjacent peaks likely reflecting proteoforms derived from them (Fig. 5b). In the later window, multiple isoforms eluted sequentially. H2B1K and H2B1D eluted first, together with a prominent peak likely composed of three variants (H2B1C/1J/1O), followed by H2B2E as the dominant species (Fig. 5c, 5e). Co-elution across this window resulted in residual overlap among late-eluting variants. Quantification at the intact-mass level showed that H2B2E and the H2B1C/1J/1O cluster comprised the majority of the H2B family (Fig. 5d). We also detected truncated H2BC5 aa[27–126] in a later retention window (~50 min), along with its mono-methylated form (Fig. S23).
Online CID did not yield sufficient diagnostic fragments to separate H2B1C, H2B1J, and H2B1O within the shared peak (Fig. S9). Moreover, the isotopic distribution of this precursor matches the theoretical distributions of H2B1C, H2B1J, and H2B1O equally well, precluding unambiguous annotation at the intact-mass level (Fig. 5g). To resolve this ambiguity, we performed ECD on the co-eluting peak (~13,765.5 Da) (Fig. 5f, Fig. S24). This strategy targeted three isoform-specific c/z ions that reported the specific substitutions within the H2B1C/1J/1O sequences (Fig. 5h–j, Fig. S25). Each diagnostic ion matched its theoretical mass with near-zero ppm error and showed high run-to-run reproducibility, providing confident assignments (Fig. S26). Combining these three orthogonal ion pairs, we deconvolved and quantified the composition of H2B1C, H2B1J, and H2B1O (Fig. 5k). H2B1J had the highest proportion (44.2 ± 6.5%), followed by H2B1C (37.9 ± 1.4%), with H2B1O lowest (18.0 ± 7.8%). This underscores the unique ability of top-down MS/MS to resolve highly homologous histone isoforms and quantify their relative abundances directly at the intact-proteoform level.
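A simplified version of this composition estimate is sketched below. It assumes that each isoform is represented by one diagnostic-ion intensity, that the three ions fragment with comparable efficiency, and that results are summarized as mean ± sample standard deviation across replicate runs. These assumptions, the function names, and the input intensities are illustrative; the exact calculation behind Fig. 5k is not described in this section.

```python
from statistics import mean, stdev


def isoform_fractions(diagnostic_intensity):
    """Percent contribution of each isoform from its diagnostic-ion intensity."""
    total = sum(diagnostic_intensity.values())
    if total <= 0:
        raise ValueError("summed diagnostic-ion intensity must be positive")
    return {iso: 100.0 * v / total for iso, v in diagnostic_intensity.items()}


def summarize(replicates):
    """Return {isoform: (mean_percent, sd_percent)} across replicate runs."""
    per_run = [isoform_fractions(run) for run in replicates]
    return {
        iso: (
            mean([run[iso] for run in per_run]),
            stdev([run[iso] for run in per_run]),
        )
        for iso in per_run[0]
    }


# Hypothetical diagnostic-ion intensities from two replicate ECD runs.
summary = summarize([
    {"H2B1C": 40.0, "H2B1J": 40.0, "H2B1O": 20.0},
    {"H2B1C": 30.0, "H2B1J": 50.0, "H2B1O": 20.0},
])
```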
Identification of macroH2A proteoforms using top-down proteomics
During proteoform-resolved analysis of the H2A family, we uncovered an unexpected class of large H2A-like species in human heart tissue, which we subsequently identified as truncated macroH2A proteoforms using top-down proteomics (Fig. 6). Within the H2A family elution window, we observed several larger H2A-like species (~22 kDa) that co-eluted with canonical H2A isoforms (Fig. 6a). Deconvoluted mass spectra showed the same modification pattern seen on H2As, with peaks spaced by +42 Da and +80 Da, consistent with acetylation/tri-methylation and phosphorylation, respectively (Fig. 6c–f). Based on the high-resolution, high-accuracy intact mass measurements and the PTM pattern, we inferred that these large unknown H2A (U.H2A) proteoforms originated from macroH2A, the largest histone variant. MacroH2A has a tripartite architecture comprising an N-terminal H2A-like histone domain, a short linker region, and a large C-terminal macrodomain^21^ (Fig. 6b). The macrodomain in the macroH2A1.1 splice isoform binds ADP-ribose (ADPr) metabolites^64^ and engages poly(ADP-ribose) polymerase (PARP)-dependent signaling^65^.
To further validate these unknown large proteins, we acquired targeted online CID MS/MS, isolating the most abundant charge-state peak for each of the three proteins (Fig. S27). The resulting spectra were processed using OmniScape to extract high-confidence sequence tags (short amino-acid stretches constrained by prefix and suffix mass information), which were then queried using MS-BLAST^66^. MS-BLAST returns high-scoring pairs (HSPs) representing significant local alignments between query tags and database sequences^67^. Each HSP is scored based on tag length and the number of exact or conservative matches, with a limited number of gaps permitted. The results consistently returned the H2A family as the top match. For two species (Fig. S27a–b), HSPs aligned to macroH2A1 and macroH2A2, with tags mapping across the H2A-like histone domain, corroborating assignment of these species as truncated macroH2A proteoforms. For the third species (U.H2A2) (Fig. S27c), MS-BLAST analysis did not return macroH2A among significant HSPs. Nevertheless, multiple independent lines of evidence support its classification as a macroH2A-derived proteoform. Its intact mass substantially exceeds that of canonical H2A and falls within the expected range of truncated macroH2A species. In addition, its PTM pattern mirrors that of canonical H2A variants, displaying characteristic +42 Da and +80 Da mass shifts. Most importantly, its top-down MS/MS fragments map exclusively to the conserved H2A-like histone domain shared by macroH2A isoforms, providing direct structural evidence of a macroH2A-related backbone.
Based on high-accuracy intact-mass measurements and domain-resolved MS/MS, we assigned these large H2A-like species as macroH2A-derived proteoforms. One species can be assigned as macroH2A1.1[1–213] (Fig. 6f–h, Fig. S28), whereas the precise C-terminal truncation boundary of U.H2A2 cannot be definitively localized due to the absence of y-ions spanning the proposed cleavage site. Importantly, all detected truncated species retain the H2A-like histone domain, the linker region, and part of the macrodomain, and they exhibit H2A-typical acetylation and phosphorylation patterns.
The histone proteoform tree: cardiac histone proteoform atlas
By integrating dispersed identifications into a proteoform-resolved, variant-centric resource for human heart tissue, we assembled a cardiac histone proteoform atlas that organizes variants and their predominant PTMs into a single view (Fig. 7). In total, around 500 unique histone proteoforms across the H2A, H2B, H3, H4, and linker H1 families were detected at the MS1 level (Tables S2–S3). The atlas tree is arranged by family, H2A (purple), H2B (blue), H3 (orange), H4 (green), and linker H1 (yellow), with each leaf denoting a histone variant and icons denoting PTM types observed on intact proteoforms in our dataset: N-terminal acetylation (N-Ac), lysine acetylation (Ac), methylation (Me), phosphorylation (P), and succinylation (Suc).
Beyond variant cataloging, the atlas reveals family-specific PTM landscapes at the tissue level. Linker histone H1 variants are notable for prominent succinylation on several members, accompanied by N-terminal acetylation and lower-level phosphorylation, together defining a PTM landscape that is distinct from that of core histones in cardiac chromatin. Among core histones, canonical and non-canonical H2A variants generally display characteristic patterns dominated by acetylation and phosphorylation, generating distinct PTM fingerprints that differentiate variants within this family. In contrast, H2B variants carry relatively sparse PTMs overall, with fewer detectable modification states despite extensive sequence similarity among variants. H3 variants are dominated by extensive and combinatorial methylation, consistent with their central role in epigenetic regulation and chromatin state encoding^9^. Finally, H4 commonly carries N-terminal acetylation together with multi-lysine acetylation and methylation, forming compact but well-defined combinatorial PTM states.
The histone proteoform atlas consolidates complex PTM patterns into a single proteoform-resolved snapshot while also providing a foundation for quantitative cross-study comparisons. This approach can be readily extended to other tissue types and disease contexts and serves as a framework for interpreting newly identified histone proteoform changes in a tissue-specific context.
Discussion
Histone genes give rise to extensive molecular diversity through sequence variation and combinatorial PTMs, generating hundreds of distinct proteoforms from a limited set of gene products^37^. This proteoform-level complexity underlies chromatin organization and gene regulation, yet it is largely obscured by the conventional antibody-based or peptide-centric methods that collapse variant-specific information, obscure co-occurring PTMs, and preclude direct quantification of intact proteoforms^35^. Comprehensive proteoform-resolved characterization of histones in human heart tissue has remained largely unexplored. To address this gap, we developed a streamlined top-down proteomics workflow that generates a cardiac histone proteoform atlas from a single LC–MS run, enabling direct, quantitative interrogation of histone variants and their combinatorial PTMs in native human myocardium.
Effective separation and quantification of all core and linker histones in a single LC-MS run
Our streamlined top-down proteomics method uses a single LC-MS run to achieve baseline separation of all four core histones together with linker histone H1 directly from human myocardium. By enriching histones while depleting interfering proteins, our workflow boosts histone signal intensity and enables broad proteoform coverage. Importantly, intact-protein analysis preserves connectivity between sequence variation and co-occurring PTMs, enabling direct proteoform-level quantification and confident discrimination of highly homologous or isomeric histone variants. This capability is particularly important for histone families in which small sequence differences or PTM combinations encode distinct functional states but are difficult to resolve using antibody-based or peptide-centric methods^35^.
Previously, Garcia et al. developed a hydrophilic interaction chromatography (HILIC)-based top-down strategy for analysis of H3 and identified distinct H3.2 proteoforms^68^. Young and co-workers established a high-throughput quantitative top-down workflow for analysis of histone H4 proteoforms^57^. To increase the separation capacity, Paša-Tolić and colleagues developed online nanoflow MDLC platforms coupling WCX-HILIC to MS/MS or to RPLC-MS/MS for the identification of core histone proteoforms^50,51^. Alternatively, Sun and co-workers coupled offline size-exclusion chromatography (SEC) with capillary zone electrophoresis (CZE)-MS/MS for histone proteoform identification, though the major histone families remained only partially resolved^52^. Recently, Kelleher and co-workers developed Nuc-MS, a native MS method using mass spectrometers with extended mass range and ultra-high mass range that interrogated intact nucleosomes to preserve nucleosome-level variant/PTM relationships and quantified nucleosome co-occupancy of histone H3.3 with variant H2A.Z^55^. Furthermore, Fernandez-Lima and co-workers developed a complex top-“double-down” approach combining ultraviolet photodissociation followed by ion mobility and mass-selected electron capture dissociation (UVPD-TIMS-q-ECD-ToF MS/MS) for the analysis of histone H4 proteoforms^54^. Most of these prior studies focused on the core histones, whereas the linker histone H1 has received little attention despite its central role in chromatin organization^69^.
In contrast, our single-run LC–MS strategy enables, for the first time, effective and reproducible proteoform-resolved analysis of all core and linker histones directly from native tissue. By consolidating effective histone variant separation, proteoform-level quantification, and PTM connectivity into a single run, this approach provides a broadly transferable and scalable framework for tissue-level histone proteoform analysis, lowering the barrier to systematic studies of histone variant composition and PTM crosstalk across complex biological systems.
Combinatorial histone PTM landscapes in the human heart
Histone regulation is not defined by individual PTM marks in isolation, but by a combinatorial “histone code” in which multiple PTMs co-exist on the same histone molecule to collectively shape chromatin architecture and regulatory output^1,2,22^. However, most chromatin assays profile single PTMs and often cannot resolve highly homologous variants, leaving the organization and stoichiometry of co-occurring PTMs incompletely defined^37,70^. Here, by directly measuring intact histone proteoforms, our top-down cardiac atlas enables variant-resolved quantification of PTM co-occurrence and stoichiometry in adult human myocardium.
Histone acetylation neutralizes lysine positive charge and promotes chromatin accessibility^71^. In human myocardium, acetylation is most prominent on H4 and canonical H2A variants. Multiple acetylation events frequently co-occur on the same H4 backbone together with defined methylation states, forming reproducible combinatorial proteoforms rather than independent marks. These intact H4 proteoforms capture coordinated acetylation within the N-terminal tail, consistent with known roles of H4 acetylation, particularly at K16, in disrupting higher-order chromatin folding and enabling chromatin remodeling during cardiac stress adaptation^71,72^. Histone methylation provides a stable yet versatile layer of chromatin regulation^9^. Our proteoform-resolved analysis shows that H3 variants (H3.1, H3.2, and H3.3) exhibit the greatest combinatorial complexity, each spanning more than ten methyl-equivalent states with variant-specific distributions. These layered methylation patterns co-exist on individual H3 molecules, revealing combinatorial landscapes that cannot be reconstructed from peptide-level data and providing direct insight into PTM crosstalk on intact histones in adult human heart tissue.
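The idea of counting "methyl-equivalent states" from intact mass can be illustrated with a short sketch. It assumes the common convention that one methyl equivalent is a mass shift of +14.01565 Da (CH2), so that a single acetylation (+42.01057 Da) and a trimethylation (+42.04695 Da) both appear as three equivalents at the intact-mass level. The function name, tolerance, and example masses are hypothetical and are not taken from the study.

```python
# Monoisotopic mass shifts in Da (standard values for CH2 and C2H2O).
METHYL = 14.01565       # one methyl equivalent (CH2)
ACETYL = 42.010565      # acetylation (C2H2O)
TRIMETHYL = 3 * METHYL  # trimethylation, nearly isobaric with acetylation


def methyl_equivalents(observed_mass, unmodified_mass, max_equiv=20, tol_da=0.5):
    """Return the nearest whole number of methyl equivalents, or None.

    tol_da is a hypothetical tolerance on the residual after rounding;
    shifts that are not close to a multiple of 14.01565 Da return None.
    """
    delta = observed_mass - unmodified_mass
    n = round(delta / METHYL)
    if n < 0 or n > max_equiv:
        return None
    if abs(delta - n * METHYL) > tol_da:
        return None
    return n


def ppm_gap(mass_a, mass_b):
    """Mass difference between two species in parts per million of mass_a."""
    return abs(mass_a - mass_b) / mass_a * 1e6


if __name__ == "__main__":
    base = 15000.0  # hypothetical unmodified histone mass, for illustration only
    print(methyl_equivalents(base + ACETYL, base))
    print(methyl_equivalents(base + TRIMETHYL, base))
    print(ppm_gap(base + TRIMETHYL, base + ACETYL))
```

Because acetylation and trimethylation differ by only about 0.036 Da, an intact-mass count of methyl equivalents cannot separate them without high mass accuracy or MS/MS, which is why such states are grouped as equivalents in this sketch.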
Histone phosphorylation is a key regulator of DNA damage responses and chromatin dynamics^11^. In human myocardium, phosphorylation is enriched on H2A variants, including both canonical H2As and large H2A-like species. Although precise site localization was not achieved for all phosphorylated proteoforms, prior studies identify H2A.X phosphorylation at serine 139 (γH2AX) as a hallmark of DNA damage signaling with prognostic relevance in heart failure^73^. Importantly, phosphorylation frequently co-occurs with acetylation on the same H2A proteoforms, supporting the concept that stress-responsive chromatin signaling is encoded by coordinated PTM crosstalk rather than isolated phosphorylation events^74^. Histone succinylation introduces a strong negative charge and can markedly alter histone–DNA interactions^75^. Succinylation is strongly enriched on linker histone H1 variants, where it co-occurs with N-terminal acetylation and low-level phosphorylation, defining a distinct combinatorial PTM landscape compared to core histones. Given the high metabolic demand of the heart, and prior links between succinylation, mitochondrial dysfunction, and cardiomyopathies^76,77^, these observations raise the possibility of a metabolism–chromatin regulatory axis in cardiomyocytes.
These findings demonstrate that combinatorial PTMs define histone proteoform states in the human heart. By preserving intact histones, top-down proteomics uniquely enables direct observation and quantification of coordinated PTM patterns across variants, providing a proteoform-level framework for understanding chromatin regulation in cardiac physiology and disease.
Resolving histone sequence isomers by top-down MS/MS
In addition to PTM diversity, histone sequence heterogeneity presents a major analytical challenge. Many histone variants differ by only one or a few amino acids and generate indistinguishable peptides upon proteolysis, rendering isoform-level analysis difficult for peptide-centric or antibody-based approaches^35^. This limitation is especially pronounced for the canonical H2B family, which comprises multiple highly homologous variants^63^.
In our dataset, several H2B variants co-eluted, and H2B1C, H2B1J, and H2B1O collapsed into a single MS1 feature. By isolating this precursor and performing ECD top-down MS/MS, we obtained isoform-specific diagnostic fragment ions that unambiguously distinguished H2B1C, H2B1J, and H2B1O and enabled their relative quantification within the same chromatographic peak. This analysis confirmed the co-existence of all three variants in human heart tissue and revealed a cardiac-specific H2B isoform composition that differs from those reported in other cell types, suggesting tissue-restricted isoform usage^78^.
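A minimal sketch of how relative isoform abundance might be estimated from isoform-specific fragment ions is given below. It assumes that the mean intensity of each isoform's diagnostic ions scales with that isoform's abundance and that fragmentation efficiency is similar across isoforms, which is a simplification. The function name and all intensity values are hypothetical; the sketch is not the quantification procedure used in the study.

```python
def relative_abundance(diagnostic_intensities):
    """Estimate relative isoform abundance within one co-eluting precursor.

    diagnostic_intensities maps isoform name -> list of intensities of
    fragment ions unique to that isoform. The mean (not the sum) is used so
    that isoforms with more diagnostic ions are not favored automatically.
    """
    means = {}
    for isoform, intensities in diagnostic_intensities.items():
        if not intensities:
            raise ValueError(f"no diagnostic ions for {isoform}")
        means[isoform] = sum(intensities) / len(intensities)
    total = sum(means.values())
    if total <= 0:
        raise ValueError("total diagnostic intensity must be positive")
    return {isoform: m / total for isoform, m in means.items()}


if __name__ == "__main__":
    # Invented intensities, for illustration only.
    example = {
        "H2B1C": [100.0, 300.0],
        "H2B1J": [100.0],
        "H2B1O": [50.0, 50.0],
    }
    for isoform, fraction in relative_abundance(example).items():
        print(isoform, round(fraction, 3))
```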
This example highlights a key strength of top-down proteomics: the ability to directly resolve sequence isomers at the intact-protein level, even when variants are isobaric and chromatographically inseparable. By establishing isoform-resolved baselines in native tissue, this approach provides a foundation for future studies examining how histone variant selection and associated proteoform states change across cardiac cell types, developmental stages, or disease conditions.
Top-down proteomics reveals unexpected proteoforms in human heart tissue
A key strength of unbiased, proteoform-resolved analysis is its ability to uncover previously unrecognized protein species directly in native tissue, without prior assumptions about variant composition or modification state. Applying this strategy to human myocardium, we identified multiple histone proteoforms that have not been reported in cardiac tissue, highlighting the extent to which chromatin complexity in the heart has remained underexplored at the intact-protein level.
Among these findings, the most unexpected was the detection of multiple truncated macroH2A proteoforms in human heart tissue by top-down proteomics. MacroH2A is the largest H2A variant and has three domains: an H2A-like histone domain, a linker region, and a macro domain^21^. MacroH2A has been extensively studied in cancer, where it restricts chromatin plasticity, enforces transcriptional restraint^79^, and limits cellular reprogramming and metastasis^80,81^. In contrast, macroH2A has not been systematically profiled in the human heart, and proteoform-level information has been lacking.
Using top-down proteomics, we directly detected macroH2A-derived proteoforms that retain the H2A-like histone domain and display H2A-typical acetylation and phosphorylation patterns, consistent with nucleosome incorporation. The identification of these large H2A-like proteoforms in native myocardium suggests a previously unrecognized layer of chromatin regulation in the heart. Although further validation will be required to define the precise boundaries and origin of these truncated forms, their consistent detection underscores the ability of top-down proteomics to reveal unexpected chromatin components that are inaccessible to peptide-centric or antibody-based approaches.
Limitations and outlook
Although we have made significant progress, several limitations remain. While our 1D RPLC approach achieved baseline separation of both core and linker histone families, the extreme complexity of certain co-eluting variants within a family and of highly modified proteoforms necessitates further resolution. Future integration of orthogonal separation dimensions, such as ion mobility spectrometry, could provide the additional gas-phase separation required to resolve these isobaric proteoforms^54^. Furthermore, although the use of online LC-MS/MS enabled rapid profiling, the limited MS/MS fragmentation achievable at the chromatographic timescale was sometimes insufficient to unambiguously localize all distal N-terminal PTMs. As a result, PTM site confirmation currently relies on offline fractionation followed by ultra-high-resolution FTICR-ECD MS/MS, a highly accurate but lower-throughput strategy that can be difficult to apply to low-abundance proteoforms. The future integration of advanced LC-MS/MS platforms that provide multiple dissociation modes for effective fragmentation at the chromatographic timescale will further increase the throughput^82^. Additionally, we mainly focused on the most prominent marks such as acetylation, methylation, phosphorylation, and succinylation. Lower-abundance PTMs, such as lactylation, may require targeted enrichment to cross the current detection threshold. Despite these limitations, this top-down proteomics workflow establishes a proteoform-resolved framework for the human cardiac epigenome that can be used to support analysis of disease-associated histone changes and future refinement toward site- and cell-type resolution.
Conclusion
In this study, we present the first top-down proteomics-based proteoform-resolved atlas of histones in human heart tissue. By unifying variant separation and quantification, PTM stoichiometry, and isoform discrimination within a single LC–MS analysis, this study overcomes long-standing limitations of peptide-centric and antibody-based chromatin analyses and reveals previously inaccessible layers of histone complexity in the myocardium. Beyond establishing a high-confidence reference map of cardiac histone proteoforms, our results uncover unanticipated features of chromatin regulation, including extensive combinatorial PTMs, tissue-specific isoform distribution, and the presence of macroH2A-derived proteoforms in the human heart. Together, this atlas provides a robust molecular framework for interrogating histone PTM crosstalk, isoform switching, and macroH2A biology across cardiac cell types, developmental stages, and disease states. While demonstrated here in cardiac tissue, the streamlined single-run top-down workflow is broadly applicable and establishes a general foundation for proteoform-resolved epigenomic analysis and systematic mapping of histone proteoform landscapes across diverse human tissues and biological systems.
Methods
Detailed methods are described in the Supplementary Methods.
Reagents and Chemicals
All reagents were purchased from Sigma-Aldrich, Inc. unless otherwise noted. LC-MS grade water was purchased from MilliporeSigma. LC-MS grade acetonitrile was purchased from Fisher Scientific. Halt Protease and Phosphatase Inhibitor Cocktail was purchased from Thermo Fisher Scientific (Cat. No. 87786). Amicon 0.5-mL cellulose centrifugal filters with a 10 kDa molecular weight cutoff (MWCO) were purchased from MilliporeSigma.
Human Cardiac Tissue Collection
Left ventricular myocardial tissue was obtained from nonfailing human donor hearts without known cardiac disease through the University of Wisconsin–Madison Organ and Tissue Donation program. Prior to dissection, hearts were maintained in cardioplegic solution. Samples were then snap-frozen immediately in liquid nitrogen and stored at −80°C. Use of human tissue was approved by the UW–Madison Institutional Review Board.
Histone Extraction from Cardiac Tissue
Cardiac tissue was cryopulverized and incubated in nuclei isolation buffer (NIB) for 5 min, followed by centrifugation (10,000 × g, 5 min, 4°C) to remove blood proteins. The pellet was resuspended in NIB supplemented with 0.3% NP-40 (Polytron homogenization; 5 min shaking per wash) to lyse cells and enrich nuclei, then washed with NIB (5,000 × g, 5 min, 4°C). In the optimized workflow, a single 0.1% TFA wash (1 mM TCEP; 10 min shaking) was inserted between the NIB washes to further deplete interfering proteins. Histones were extracted from the final pellet with 0.2 M H_2_SO_4_ (5 volumes; 2 h rotation, 4°C), clarified by centrifugation (21,000 × g, 15 min, 4°C), and precipitated from the supernatant with 33% (w/v) trichloroacetic acid (2 h on ice). The histone pellet was washed with ice-cold acetone, then air-dried.
Online top-down RPLC-MS/MS Analysis of histone proteoforms
Histone pellets were dissolved in 10% ACN: 0.2% FA in H_2_O, clarified by centrifugation (21,000 × g, 15 min, 4°C), and concentrated using 10 kDa Amicon filters pre-equilibrated in the same buffer. Concentrated extracts were transferred to LC–MS vials and analyzed by top-down RPLC–MS/MS on a Waters Acquity M-class UHPLC coupled to a Bruker maXis II QTOF. For each run, 2 μg protein was injected onto a home-packed diphenyl column (200 × 0.25 mm, 2.7 μm, 1000 Å) and separated at 6 μL/min with the following conditions: 0–5 min, 10% B; 5–15 min, 10–15% B; 15–45 min, 15–25% B; 45–75 min, 25–45% B; 75–90 min, 45–50% B; 90–100 min, 50–55% B; 100–115 min, 55–95% B; 115–120 min, 95–10% B. MS1 spectra were acquired over m/z 200–3000 at 1 Hz, and data-dependent MS/MS used CID to fragment the top three precursors per cycle with dynamic exclusion. Targeted CID was performed as needed with optimized isolation widths and collision energies.
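The gradient program above can be written as a table of breakpoints and queried for the mobile phase composition at any time point. The sketch below assumes linear ramps between the stated breakpoints, which is typical for LC gradients but is not stated explicitly in the text. The names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % mobile phase B) breakpoints taken from the gradient in the text.
GRADIENT = [
    (0, 10), (5, 10), (15, 15), (45, 25), (75, 45),
    (90, 50), (100, 55), (115, 95), (120, 10),
]

FLOW_UL_PER_MIN = 6.0  # flow rate stated in the text


def percent_b(t_min):
    """Return %B at time t_min, assuming linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise RuntimeError("unreachable for a well-formed gradient table")


def delivered_volume_ul(t_min):
    """Total mobile phase volume delivered up to t_min at constant flow."""
    return FLOW_UL_PER_MIN * t_min


if __name__ == "__main__":
    for t in (0, 10, 60, 117.5, 120):
        print(t, percent_b(t))
    print(delivered_volume_ul(120))
```

Storing the program as breakpoints rather than as segments keeps each segment boundary defined once, so the end of one ramp is always the start of the next.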
Offline FTICR Analysis of Histone Proteoforms
Individual histone fractions were collected after RPLC and analyzed by offline top-down MS/MS using a TriVersa NanoMate nano-ESI source coupled to a Bruker solariX XR 12 T FTICR-MS. Spectra were acquired over m/z 200–4000 under nano-ESI conditions, and targeted MS/MS was performed by isolating selected precursors and fragmenting them by electron capture dissociation (ECD).
Data Analysis
All data were processed and analyzed using Compass DataAnalysis v. 4.3 software, OmniScape software, and MASH Native^83^. The sophisticated numerical annotation procedure (SNAP) peak-picking algorithm (quality factor 0.4; signal-to-noise ratio (S/N) 3.0; intensity threshold 500) was applied to determine the monoisotopic mass of all detected ions. Denatured mass spectra were deconvoluted using the Maximum Entropy algorithm within the DataAnalysis 4.3 software with the resolving power for deconvolution set to 70,000. These algorithms were employed to generate mass lists containing the monoisotopic masses of proteoforms, which were subsequently used to determine the total number of proteoforms. The total protein phosphorylation, succinylation, and acetylation (P_total_, S_total_, and Ac_total_, respectively) for each protein were calculated using the following equations: . For each histone protein, six of the most abundant charge-state ions of each proteoform were selected to generate extracted ion chromatograms (EICs), and the area under the curve (AUC) was integrated to quantify protein expression. MS/MS data were exported from the DataAnalysis software and analyzed using MASH Native for protein identification, sequence mapping, and PTM localization. All fragment ions were manually validated with a mass tolerance of 20 ppm. Adobe Illustrator and R (version 4.2.1) were used for data visualization. BioRender was used for some graphics.
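The quantities named in this section can be illustrated with a short sketch covering three steps: integrating an EIC, computing an abundance-weighted PTM total, and checking a fragment mass against the 20 ppm tolerance. The weighted-average formula is a commonly used definition that is assumed here for illustration; it is not necessarily the equation used by the authors. All function names and example values are hypothetical.

```python
def eic_auc(times, intensities):
    """Area under an extracted ion chromatogram by the trapezoidal rule."""
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two matched time/intensity points")
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (intensities[i] + intensities[i - 1]) * (times[i] - times[i - 1])
    return area


def ptm_total(abundance_by_count):
    """Abundance-weighted mean number of PTM groups per protein molecule.

    abundance_by_count maps n (number of PTM groups on a proteoform) to the
    summed abundance (e.g. EIC AUC) of proteoforms carrying n groups.
    ASSUMPTION: this weighted average stands in for P_total, S_total, or
    Ac_total; the source does not show the actual equations.
    """
    total = sum(abundance_by_count.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    return sum(n * a for n, a in abundance_by_count.items()) / total


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def within_tolerance(observed, theoretical, tol_ppm=20.0):
    """True if the mass error is within tol_ppm (20 ppm, as stated in the text)."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


if __name__ == "__main__":
    print(eic_auc([0.0, 1.0, 2.0], [0.0, 10.0, 0.0]))
    print(ptm_total({0: 60.0, 1: 30.0, 2: 10.0}))
    print(within_tolerance(1000.01, 1000.0))
```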
Statistical Analysis
Welch’s two-sample t-test (two-sided) was performed to evaluate the histone signal differences between the no TFA wash group (n = 3) and the TFA wash group (n = 3). Differences between groups were considered statistically significant at an adjusted (adj.) p-value < 0.05, based on the Benjamini-Hochberg method for controlling the false discovery rate (FDR). The statistical analyses were performed using R (version 4.2.1), utilizing its built-in functions.
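The two statistical steps can be sketched with the Python standard library. This is an illustration of the general technique and not the authors' R code; in R the corresponding built-in functions would presumably be `t.test` (which performs Welch's test by default) and `p.adjust` with `method = "BH"`. The sketch computes the Welch statistic and the Welch–Satterthwaite degrees of freedom but leaves out the p-value itself, which requires the t-distribution CDF. Function names and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented intensities for two groups of n = 3, for illustration only.
    t, df = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
    print(t, df)
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```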
References
- 1. Talbert PB, Henikoff S (2010) Histone variants — ancient wrap artists of the epigenome. Nat Rev Mol Cell Biol 11:264–275. doi:10.1038/nrm2861. PMID: 20197778
- 2. Zentner GE, Henikoff S (2013) Regulation of nucleosome dynamics by histone modifications. Nat Struct Mol Biol 20:259–266. doi:10.1038/nsmb.2470. PMID: 23463310
- 3. Kouzarides T (2007) Chromatin Modifications and Their Function. Cell 128:693–705. doi:10.1016/j.cell.2007.02.005. PMID: 17320507
- 4. Millán-Zambrano G, Burton A, Bannister AJ, Schneider R (2022) Histone post-translational modifications — cause and consequence of genome function. Nat Rev Genet 23:563–580. doi:10.1038/s41576-022-00468-7. PMID: 35338361
- 5. Huang H, Sabari BR, Garcia BA, Allis CD, Zhao Y (2014) SnapShot: Histone Modifications. Cell 159:458–458.e1. doi:10.1016/j.cell.2014.09.037. PMID: 25303536. PMCID: PMC4324475
- 6. Shvedunova M, Akhtar A (2022) Modulation of cellular processes by histone and non-histone protein acetylation. Nat Rev Mol Cell Biol 23:329–349. doi:10.1038/s41580-021-00441-y. PMID: 35042977
- 7. Gräff J, Tsai L-H (2013) Histone acetylation: molecular mnemonics on the chromatin. Nat Rev Neurosci 14:97–111. doi:10.1038/nrn3427. PMID: 23324667
- 8. Narita T, Weinert BT, Choudhary C (2019) Functions and mechanisms of non-histone protein acetylation. Nat Rev Mol Cell Biol 20:156–174. doi:10.1038/s41580-018-0081-3. PMID: 30467427
