A transcriptome dataset from porcine stem cells with differing adipogenic capacity
Thomas Thrower, Susanna E. Riley, Katharina Grabowski, Cristina L. Esteves, F. Xavier Donadeu

TL;DR
This paper presents RNA sequencing data from pig stem cells with different fat-forming abilities to help improve cultivated meat production.
Contribution
The study provides a transcriptome dataset from porcine stem cells with differing adipogenic capacity, revealing differentially expressed genes.
Findings
PCA plots showed partial overlap in gene expression between high and low adipogenic cell populations.
30 genes were upregulated and 67 downregulated in high adipogenic cells, including known adipogenic genes like PPARG and FABP4.
The dataset is publicly available and could enhance understanding of pre-adipocyte biology in livestock.
Abstract
Mesenchymal stem cells (MSCs) are multipotent cells that can be readily harvested from animal body tissues and grown in culture. MSC cultures contain fat stem cells (pre-adipocytes) in addition to other mesenchymal progenitor cell types. Farm animal MSCs provide a cell source of choice for cultivated fat production, an important sector within the wider cultivated meat industry. However, MSCs are highly heterogeneous by nature, containing only a fraction of bona-fide progenitor cells capable of differentiating selectively into adipocytes, thus limiting the potential for industrial cultivated fat applications. Elucidating the molecular signatures of pre-adipocytes from farm animal species would facilitate selective enrichment of MSCs to enable efficient scale-up culture of fat. Here we describe bulk RNA sequencing datasets from clonal cell populations obtained by single-cell fluorescence…
Taxonomy
Topics: Genetic Mapping and Diversity in Plants and Animals · Pluripotent Stem Cells Research · Mesenchymal stem cell research
Specifications Table
| Field | Details |
|---|---|
| Subject | Biology |
| Specific subject area | Pre-adipocyte biology |
| Type of data | Table, Graph, Figure. Raw and processed bulk RNA-seq data |
| Data collection | Total RNA was extracted using TRIzol and the QIAGEN RNeasy Mini kit from clonal cell populations in exponential growth phase. RNA sequencing was performed with the Illumina NovaSeq 6000 platform. Software used for bioinformatic analyses: bcl2fastq, FastQC, Trimmomatic, STAR aligner, Subread, DESeq2 |
| Data source location | The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Edinburgh, UK |
| Data accessibility | Repository name: NCBI's Gene Expression Omnibus (GEO). Data identification number: Series GSE271977, BioProject PRJNA1134234. Direct URL to data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE271977. Instructions for accessing these data: Data not publicly available, will be made available prior to publication. To review GEO accession GSE271977: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE271977. Enter token: oxibuqkmzfcrhst |
| Related research article | None |
1. Value of the Data
- By identifying differences in gene expression between clones with high and low adipogenic potential, this transcriptome dataset provides novel insight into the molecular mechanisms underlying fat cell differentiation, allowing for increased understanding of fat development in livestock.
- The information contained herein will be particularly useful to those interested in development of biomarker-based strategies for progenitor cell enrichment of MSC populations or in development of specialised media formulations for selective, high-efficiency culture of pre-adipocytes from livestock species.
- The dataset provides valuable information beyond that already available in humans and rodents, allowing for increased understanding of adipogenesis across species as well as applications in the emerging field of cultured meat.
- The raw and processed files are organised in a manner that allows for their use in workflows.
2. Background
Controlling cell differentiation is at the core of any process seeking to harness the potential of stem cells, be it in regenerative medicine or other less conventional applications such as cultured fat production. MSC cultures are heterogeneous in nature, with highly variable differentiation capacity among individual populations that depends on multiple factors associated with animal and tissue source as well as culture conditions [1,2]. This is further compounded by limited understanding of MSC biology in livestock compared to model species. Identifying transcriptomic signatures associated with adipogenic capacity can provide important cues towards maximising the efficiency of MSC differentiation to the adipose lineage. Towards this aim, we compared RNA sequencing profiles between proliferating porcine MSC clonal populations with proven high and low adipogenic capacity. Bioinformatics analyses of the data thus obtained revealed unique transcriptome signatures associated with adipose progenitor commitment (the process by which MSCs lose their broad differentiation capacity and irreversibly commit to the adipose lineage), providing novel insight into the regulation of stem cell fate and fat tissue development in the pig.
3. Data Description
A total of 120 clones were successfully grown from MSCs derived from 4 different animals (17-32 clones per animal). Clones were induced to differentiate into adipocytes (see next section for details), which were visualised by staining with the lipid dye Oil Red O, followed by visual scoring of dye intensity (Fig. 1). Top- and bottom-scoring clones (high and low adipogenic capacity, respectively; n=3 each) were selected from each animal (24 clones in total), and their RNA was extracted and sequenced (see next section for details). Raw fastq files for each sample and the gene counts with associated Ensembl IDs and HGNC symbols are deposited in GEO (see record details above).
Fig. 1. Representative images of cultures of MSC clones with high and low adipogenic capacity. Two sets of clones from different animals are shown at either side of the broken line. MSCs were differentiated and then stained with Oil Red O to identify fat-laden adipocytes (in red). P and C in sample labels refer to pig number and clone number, respectively. Scale bar = 50 µm.
A description of each sample, along with barcode sequences, number of reads, and mean quality scores, is provided in Table 1. Both number of reads and quality scores were consistently high across samples. Fig. 2A shows the results of principal component analysis (PCA), indicating relatively low PC variances and some overlap between groups. A volcano plot (Fig. 2B) representing fold change (FC) in expression values of high vs low adipogenic samples against adjusted p-value (adjP) for all mapped genes (Supplementary file 1) shows a slight bias towards gene downregulation: of a total of 97 differentially expressed genes (DEGs, adjP<0.05) with |log2FC| ≥ 1, 30 were upregulated and 67 were downregulated (red dots in Fig. 2B). A heatmap of all DEGs with |log2FC| ≥ 1 (Fig. 2C) shows distinct separation between groups and the upregulation of many transcripts corresponding to adipogenic genes such as FABP4, PPARG, CD36, ANPEP and PLIN1 [1,3].
Table 1. IDs, barcode sequence, read number and quality score of samples included in this dataset. Samples with a quality score of >30 are of high quality (<0.1% error) [4]. In Sample ID, P and C refer to pig number and clone number, respectively.
| Sample ID | Adipogenic capacity | Barcode Sequence | # Reads | Mean Quality Score |
|---|---|---|---|---|
| P16C1 | High | CCCTCGTA+TAGAAGAG | 32,674,727 | 35.73 |
| P16C9 | High | TTGTGCCC+TTTCCATC | 22,347,130 | 35.87 |
| P16C11 | Low | TGTCCTCT+CGACGAAC | 19,071,578 | 35.80 |
| P16C14 | Low | TATTGTTC+AAACACAC | 32,679,271 | 35.89 |
| P16C22 | Low | GTTGTGTG+GTCGGACG | 23,137,996 | 35.76 |
| P16C26 | High | CGATATGG+GGGACGTA | 26,957,554 | 35.87 |
| P19C14 | Low | TATTTACC+GACGTGAG | 24,755,264 | 35.81 |
| P19C16 | High | CACCAAGC+AAAGGGAA | 20,315,028 | 35.79 |
| P19C21 | Low | TATATGGA+CTTGGCAG | 24,229,767 | 35.65 |
| P19C29 | High | GTATAGTC+TTGGTCTC | 32,905,798 | 35.89 |
| P19C30 | Low | CGGAGAGG+CGAGCGTC | 23,382,466 | 35.89 |
| P19C37 | High | TTTGGGAT+GTTCACGT | 22,319,975 | 35.90 |
| P34C1 | Low | CGTCTTGG+GACTGGCG | 19,924,863 | 35.82 |
| P34C2 | High | TTTCTCTA+CTCGACGT | 24,364,296 | 35.78 |
| P34C7 | High | CCAGCGAT+TCTAGTCA | 30,740,918 | 35.73 |
| P34C16 | Low | CGTAGCGA+TTGTTTAG | 24,080,348 | 35.81 |
| P34C20 | High | TGGGAGTG+CCCGTCTA | 18,977,122 | 35.81 |
| P34C21 | Low | GTTAACAT+TGCGGCGT | 26,205,364 | 35.81 |
| P35C8 | Low | CGAACCAC+TCAATCTT | 38,981,293 | 35.92 |
| P35C9 | High | TAGTCACA+GAGGTATA | 39,100,246 | 35.70 |
| P35C15 | Low | GTAGATGC+TGTTGCTC | 26,518,722 | 35.92 |
| P35C28 | High | AGAAGTGG+ATCAAGAG | 22,900,388 | 35.86 |
| P35C34 | Low | TACCGCTC+CGTCACTG | 19,405,735 | 35.87 |
| P35C39 | High | CGTGGATT+CTAAGAGT | 27,166,535 | 35.85 |
Fig. 2. (A) Principal component (PC) analysis showing MSC clones with low and high adipogenic capacity (blue and red, respectively) clustered according to degree of similarity in their gene expression profiles. Each point represents one sample. In ID names, P and C refer to pig and clone number, respectively. (B) Volcano plot representing log2 fold change (FC) expression values in high vs low adipogenic samples against -log10 (adjusted p-value). Each dot represents a different transcript. Unmapped genes were excluded. Vertical and horizontal lines denote FC and adjusted p-value (adjP) cut-offs, respectively, such that differentially expressed genes (DEGs, adjP < 0.05) are shown in red (|log2FC| ≥ 1) and blue (|log2FC| < 1), whereas genes with non-significant fold change (adjP ≥ 0.05) are shown in green (|log2FC| ≥ 1) and grey (|log2FC| < 1). (C) Heatmap of DEGs between high and low adipogenic samples showing distinct separation between the two groups. Gene IDs are shown on the right and sample IDs at the bottom. DEGs with adjP ≤ 0.05 and a |log2FC| ≥ 1 are shown, excluding unmapped genes, and counts were normalised to the mean count for each gene. Negative values (blue) represent decreased gene expression and positive values (red) represent increased gene expression.
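The colour coding in Fig. 2B follows directly from the two cut-offs stated above. As a minimal sketch in Python (not the authors' R code), the function below assigns a gene to one of the four categories from its log2FC and adjP. The gene names and values are hypothetical and serve only to exercise each branch; a strict adjP < 0.05 is used, as in the Fig. 2B caption.

```python
from collections import Counter

LOG2FC_CUTOFF = 1.0  # |log2FC| threshold stated in the text
ADJP_CUTOFF = 0.05   # adjusted p-value threshold (strict, as in Fig. 2B)


def volcano_category(log2fc, adjp):
    """Return the Fig. 2B colour category for one gene."""
    significant = adjp < ADJP_CUTOFF
    large_change = abs(log2fc) >= LOG2FC_CUTOFF
    if significant and large_change:
        return "red"    # DEG passing both cut-offs
    if significant:
        return "blue"   # significant, but small fold change
    if large_change:
        return "green"  # large fold change, not significant
    return "grey"       # passes neither cut-off


def direction(log2fc, adjp):
    """Return 'up' or 'down' (high vs low adipogenic) for red genes, else None."""
    if volcano_category(log2fc, adjp) != "red":
        return None
    return "up" if log2fc > 0 else "down"


# Hypothetical (log2FC, adjP) pairs for illustration; not values from the dataset.
example_genes = {
    "gene_a": (2.3, 0.001),
    "gene_b": (-1.4, 0.020),
    "gene_c": (0.4, 0.010),
    "gene_d": (1.8, 0.300),
    "gene_e": (-0.2, 0.900),
}

categories = {name: volcano_category(*vals) for name, vals in example_genes.items()}
tally = Counter(direction(*vals) for vals in example_genes.values())
print(categories)
print(tally["up"], "upregulated,", tally["down"], "downregulated")
```

Applied to the full results table in Supplementary file 1, the same two tests would be expected to recover the 30 upregulated and 67 downregulated genes reported above, provided the same cut-offs are used.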
4. Experimental Design, Materials and Methods
4.1. Derivation of MSC clones
MSCs derived from adipose tissue of 4 Large White x Landrace piglets [1] were grown in flasks coated with 0.1% gelatine (w/v, Sigma-Aldrich, G1393) in High Glucose DMEM media (Gibco, 41966) containing 10% Foetal Bovine Serum (v/v, Gibco, A5256801), 1% Penicillin-Streptomycin (v/v, Gibco, 15140), and 5 ng/µl human basic fibroblast growth factor (PeproTech, AF-100-18B), at 39°C with 5% CO_2_. MSCs were sorted into single cells using a BD FACSAria^TM^ Fusion flow cytometer and grown with 40% MSC-conditioned media (v/v) in 96-well plates. A total of 120 clones (across the 4 animals) reached confluence and were transferred to T75 flasks by detachment with Trypsin-EDTA (Gibco, 25200-056) after two washes with PBS. Upon reaching 70% confluence, cells in T75 flasks were split into three equal fractions and either frozen as cell stocks, resuspended in TRIzol (Invitrogen, AM9738) for RNA extraction (see below), or seeded (20,000/cm^2^) in quadruplicate for adipogenic differentiation.
4.2. Adipogenic differentiation
Upon reaching confluence, cells were differentiated by incubation with induction media (DMEM high glucose, 10% FBS (v/v), 1% penicillin/streptomycin (v/v), 1µM dexamethasone (Sigma-Aldrich, D4902), 0.5mM IBMX (Sigma-Aldrich, 41095), 2µM insulin (BioXtra, I9278-5ML), and 100µM indomethacin (STEMCELL Technologies, 73942)) for 5 days followed by maintenance media (DMEM high glucose, 10% FBS, 1% penicillin/streptomycin, and 2µM insulin) for a further 6 days. Differentiated cells were fixed in 4% PFA (v/v, VWR chemicals, VWR-P38-Sh) for 15 minutes followed by 3 × 5 min washes with PBS at room temperature, and then incubated with freshly made 0.4% Oil Red O (ORO) solution (v/v, diluted in isopropanol and filtered) for 15 minutes at room temperature and subsequently washed 5-10 times with milliQ water. ORO stain was then eluted by addition of 100% isopropanol to plates at 0.263 ml/cm^2^ followed by incubation for 10 minutes at room temperature. Eluate was homogenised by gentle pipetting and transferred to a round-bottom 96-well plate, and absorbance at 510 nm was read on a BioTek Synergy HTX reader. ORO scores thus obtained were subsequently normalised to a non-differentiated cell control, and used to classify clones into high and low adipogenic categories based on the number of adipocytes present. Clones that did not produce any adipocytes were not taken into account.
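The scoring and selection step above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: normalisation is taken to be a simple ratio to the control absorbance (the text does not say whether a ratio or a subtraction was used), three clones per group are taken as in the Data Description, clones without adipocytes are assumed to have been removed beforehand, and all absorbance values are hypothetical.

```python
def normalise_oro(absorbance, control_absorbance):
    """Express A510 relative to the non-differentiated control (ratio is an assumption)."""
    if control_absorbance <= 0:
        raise ValueError("control absorbance must be positive")
    return absorbance / control_absorbance


def select_extremes(scores, n=3):
    """Pick the n highest- and n lowest-scoring clones from one animal.

    scores maps clone ID -> normalised ORO score.
    """
    if len(scores) < 2 * n:
        raise ValueError("need at least 2*n clones to form two distinct groups")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {"high": ranked[:n], "low": ranked[-n:]}


# Hypothetical A510 readings for eight clones from one animal.
raw_absorbance = {
    "C1": 0.90, "C2": 0.30, "C3": 0.75, "C4": 0.22,
    "C5": 0.60, "C6": 0.18, "C7": 0.45, "C8": 0.25,
}
control = 0.15  # hypothetical reading for non-differentiated cells

scores = {clone: normalise_oro(a, control) for clone, a in raw_absorbance.items()}
groups = select_extremes(scores, n=3)
print(groups)
```

Running the selection once per animal and pooling the results gives the balanced design of 12 high and 12 low clones listed in Table 1.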
4.3. RNA sequencing and data analyses
A 20% volume of 1-bromo-3-chloropropane (v/v, Sigma-Aldrich, B9673) was added to TRIzol samples, thoroughly mixed for 15 seconds, incubated at room temperature for 3 minutes, and centrifuged at 12,000 x g for 15 minutes at 4°C. The supernatant was collected and RNA was purified using the Qiagen RNeasy Mini kit (Qiagen, 74104) according to the manufacturer’s instructions, with elution in nuclease-free water (Qiagen, 1039498). RNA quality and concentration were determined using a Nanodrop spectrophotometer (ND-1000), and samples were stored at -80°C before sequencing by Genewiz UK (Azenta Life Sciences).
In brief, RNA sequencing libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit followed by validation using a DNA Kit on an Agilent 5600 Fragment Analyzer (Agilent Technologies), and quantification using a Qubit 4.0 Fluorometer (Invitrogen). Sequencing was performed on the Illumina NovaSeq 6000 platform with a 2 × 150 bp paired-end configuration (v1.5). Image analysis and base calling were conducted using NovaSeq Control Software (v1.7). The raw sequencing data were converted to fastq format and de-multiplexed, permitting one mismatch in index sequence identification, using Illumina bcl2fastq (v2.20). Raw data quality control was conducted with FastQC [5], followed by removal of low-quality nucleotides and adapters with Trimmomatic (v0.36) [6]. Reads were aligned to the Sus scrofa 10.2 reference genome (Ensembl) using the STAR aligner (v2.5.2b) [7], and counts were generated using featureCounts from the Subread package (v1.5.2) [8], including only unique reads within exons.
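For readers who wish to re-run a comparable workflow on the deposited fastq files, the sketch below assembles example command lines for the quality control, trimming, alignment and counting steps. The tools and flags exist, but every path, file name, thread count and trimming setting is a placeholder rather than a parameter reported by the authors, and the `trimmomatic` wrapper name assumes a conda-style install. Commands are only built and printed, not executed.

```python
import shlex


def fastqc_cmd(fastqs, outdir="qc"):
    """Quality report per fastq file; the output directory must already exist."""
    return ["fastqc", "-o", outdir, *fastqs]


def trimmomatic_cmd(r1, r2, prefix, adapters="adapters.fa"):
    """Paired-end adapter and quality trimming (step settings are illustrative)."""
    return [
        "trimmomatic", "PE", r1, r2,
        f"{prefix}_R1.paired.fq.gz", f"{prefix}_R1.unpaired.fq.gz",
        f"{prefix}_R2.paired.fq.gz", f"{prefix}_R2.unpaired.fq.gz",
        f"ILLUMINACLIP:{adapters}:2:30:10", "SLIDINGWINDOW:4:15", "MINLEN:36",
    ]


def star_cmd(prefix, genome_dir, threads=4):
    """Align trimmed read pairs; writes <prefix>.Aligned.sortedByCoord.out.bam."""
    return [
        "STAR", "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", f"{prefix}_R1.paired.fq.gz", f"{prefix}_R2.paired.fq.gz",
        "--readFilesCommand", "zcat",
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--outFileNamePrefix", f"{prefix}.",
    ]


def featurecounts_cmd(bams, gtf, out="gene_counts.txt", threads=4):
    """Count fragments over exons per gene; multi-mapping reads are skipped by default."""
    return [
        "featureCounts", "-p", "-T", str(threads),
        "-t", "exon", "-g", "gene_id",
        "-a", gtf, "-o", out, *bams,
    ]


def build_pipeline(sample, r1, r2, genome_dir, gtf):
    """Return the four commands for one sample, in execution order."""
    bam = f"{sample}.Aligned.sortedByCoord.out.bam"
    return [
        fastqc_cmd([r1, r2]),
        trimmomatic_cmd(r1, r2, sample),
        star_cmd(sample, genome_dir),
        featurecounts_cmd([bam], gtf),
    ]


# Placeholder file names; the sample ID is taken from Table 1.
commands = build_pipeline(
    "P16C1", "P16C1_R1.fastq.gz", "P16C1_R2.fastq.gz", "star_index", "genes.gtf"
)
for cmd in commands:
    print(shlex.join(cmd))
```

The meaning of the featureCounts `-p` flag differs between Subread releases, so the documentation of the installed version should be checked before use.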
Differential expression analysis was performed by SER using DESeq2 (v1.42.1) [9], and DEGs were defined as genes with |log2FC| ≥ 1 and adjP ≤ 0.05. Data were processed and visualised using R (v4.3.3) [10] with R packages dplyr (v1.1.4) [11], ggplot2 (v3.5.0) [12], ggthemes (v5.1.0) [13], pheatmap (v1.0.12) [14], RColorBrewer (v1.1-3) [15], rlog (v0.1.0) [16], and tidyverse (v2.0.0) [17].
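The adjusted p-values (adjP) that DESeq2 reports are, by default, Benjamini-Hochberg corrected. The following sketch shows that correction in plain Python for a short list of hypothetical p-values. It is meant only to clarify what adjP represents and does not reproduce DESeq2's independent filtering or any other part of its model.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
print([round(p, 4) for p in adj_p])
```

A gene is then called a DEG when its adjusted value falls at or below the 0.05 cut-off and its |log2FC| is at least 1.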
Limitations
Screening for high and low adipogenic populations in this study was limited to clones with high proliferative capacity following single-cell seeding, i.e., clones that did not become confluent within 2 weeks after initial seeding were not considered. Thus, because fate commitment in MSCs is known to be associated with loss of proliferative capacity, our strategy may have missed a subset of bona-fide adipogenic progenitors, and the data need to be interpreted accordingly. In addition, the fact that gene expression profiles overlapped between the two experimental groups (Fig. 2A) likely limited experimental power and the number of differentially expressed transcripts we were able to identify in this study. Finally, data were annotated using a previous version of the pig genome, which may have led to some differentially expressed genes being missed.
Ethics Statement
The authors have read and follow the ethical requirements for publication in Data in Brief, and confirm that the current work does not involve human subjects, animal experiments, or any data collected from social media platforms.
CRediT authorship contribution statement
Thomas Thrower: Conceptualization, Investigation, Formal analysis, Writing – review & editing. Susanna E. Riley: Formal analysis, Visualization, Data curation, Writing – original draft, Writing – review & editing. Katharina Grabowski: Investigation, Formal analysis, Writing – review & editing. Cristina L. Esteves: Conceptualization, Writing – review & editing. F. Xavier Donadeu: Writing – review & editing, Conceptualization, Supervision, Project administration, Funding acquisition.
References
- [1] Thrower T., Riley S., Lee S., Esteves C.L., Donadeu F.X. A unique spontaneously immortalised cell line from pig with enhanced adipogenic capacity. NPJ Sci. Food 9 (1) (2025) 52. doi:10.1038/s41538-025-00413-y. PMID: 40254637. PMCID: PMC12010005.
- [2] Sugii S., Wong C.Y.Q., Lwin A.K.O., Chew L.J.M. Alternative fat: redefining adipocytes for biomanufacturing cultivated meat. Trends Biotechnol. 41 (5) (2023) 686-700. doi:10.1016/j.tibtech.2022.08.005. PMID: 36117023.
- [3] Cawthorn W.P., Scheller E.L., MacDougald O.A. Adipose tissue stem cells meet preadipocyte commitment: going back to the future. J. Lipid Res. 53 (2) (2012) 227-246. doi:10.1194/jlr.R021089. PMID: 22140268. PMCID: PMC3269153.
- [4] Edgar R., Domrachev M., Lash A.E. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 30 (1) (2002) 207-210. doi:10.1093/nar/30.1.207. PMID: 11752295. PMCID: PMC99122.
- [5] Andrews S. FastQC: a quality control tool for high throughput sequence data. Babraham Bioinformatics, 2010. Available from: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
- [6] Bolger A.M., Lohse M., Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30 (15) (2014) 2114-2120. doi:10.1093/bioinformatics/btu170. PMID: 24695404. PMCID: PMC4103590.
- [7] Dobin A., Davis C.A., Schlesinger F., Drenkow J., Zaleski C., Jha S. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29 (1) (2013) 15-21. doi:10.1093/bioinformatics/bts635. PMID: 23104886. PMCID: PMC3530905.
- [8] Liao Y., Smyth G.K., Shi W. The subread aligner: fast, accurate and scalable read mapping by seed-and-vote. Nucleic Acids Res. 41 (10) (2013) e108. doi:10.1093/nar/gkt214. PMID: 23558742. PMCID: PMC3664803.
