# Biological Functional Class Enrichment Analysis with R, an Annotated Tutorial for Bench Scientists

**Authors:** Kejin Hu

PMC · DOI: 10.3390/mps9010028 · Methods and Protocols · 2026-02-19

## TL;DR

This paper provides an accessible R tutorial for bench scientists to perform biological functional class enrichment analysis (FunCEA) against the GO, KEGG, and Reactome knowledge databases.

## Contribution

The paper introduces an annotated R tutorial with detailed scripts for functional class enrichment analysis (FunCEA), tailored for biomedical researchers with minimal computational expertise.

## Key findings

- The tutorial covers two popular FunCEA methods: over-representation analysis and functional class scoring.
- It provides annotated R code for visualizations such as dot plots, term-gene network plots, and GSEA plots.
- The tutorial uses freely available R packages and databases, eliminating the need for commercial software.
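
As a rough illustration of the workflow these findings describe, the following is a minimal sketch of an over-representation analysis followed by a dot plot. It is not the paper's script. It assumes the Bioconductor packages `clusterProfiler`, `org.Hs.eg.db`, and `enrichplot` are installed, and the six-gene input vector is a hypothetical toy list chosen only for illustration; a real analysis would start from a full list of differentially expressed genes.

```r
# Toy input: six human gene symbols (illustrative only, not a real DEG list).
genes <- c("POU5F1", "NANOG", "KLF4", "MYC", "SOX2", "BRD2")

ora <- NULL       # will hold the enrichment result, if the packages are available
dot_plot <- NULL  # will hold the ggplot object, if any term passes the cutoff

# Guard the Bioconductor calls so the script still runs where packages are missing.
if (requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {

  # ORA against GO biological process terms, with gene symbols as input keys.
  ora <- clusterProfiler::enrichGO(
    gene          = genes,
    OrgDb         = org.Hs.eg.db::org.Hs.eg.db,
    keyType       = "SYMBOL",
    ont           = "BP",
    pAdjustMethod = "BH",
    pvalueCutoff  = 0.05
  )

  # Draw a dot plot only when at least one term survived the cutoffs.
  if (!is.null(ora) && nrow(as.data.frame(ora)) > 0 &&
      requireNamespace("enrichplot", quietly = TRUE)) {
    dot_plot <- enrichplot::dotplot(ora, showCategory = 10)
  }
}
```

Calling `print(dot_plot)` displays the figure when one was produced. The other plot types named above have counterparts in `enrichplot` (for example `cnetplot()` and `gseaplot2()`), but their arguments are not shown here.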

## Abstract

High-throughput sequencing typically yields a list of differentially expressed genes (DEGs). Which functional groups of genes among the DEGs are meaningful factors underlying the differential biological or biomedical conditions under investigation? The process of answering this question can be called biological functional class enrichment analysis (FunCEA). R is a robust platform for FunCEA because it is accessible to general users and offers well-developed packages for enrichment analysis, visualization, and knowledge databases. Bench scientists in the biomedical sciences need accessible, easy-to-understand protocols for FunCEA. This R tutorial provides detailed R scripts and command lines for FunCEA, as well as for processing and visualizing the enrichment results. It is written with bench scientists in mind and provides supportive, comprehensible descriptions of the R scripts for each task (enrichment analysis, enrichment data processing, and visualization). It describes detailed procedures for the two popular FunCEA methods, over-representation analysis (ORA) and functional class scoring (FCS). The FunCEA introduced here uses three basic knowledge databases: Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Reactome. R code for various visualizations (dot plot, term-gene network plot, enrichment map plot, ridge plot, and GSEA plot) is presented and annotated. Because all analyses are conducted in R, no commercial software is needed, and clusterProfiler can still directly access the latest KEGG knowledge database.
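
To make the difference between the two methods concrete, here is a small base-R sketch. It assumes that ORA is scored with a one-sided hypergeometric test, which is the usual formulation but is not spelled out in this summary. All counts, gene names, and statistics are hypothetical. ORA needs only set sizes and an overlap, whereas FCS takes a statistic for every gene, ranked from highest to lowest.

```r
# --- ORA: one-sided hypergeometric test on toy counts (all numbers hypothetical) ---
N <- 20000  # genes in the background (universe)
K <- 200    # background genes annotated to the term of interest
n <- 500    # genes in the DEG list
k <- 15     # DEGs that are annotated to the term

# Overlap expected by chance if DEGs were drawn at random from the background.
expected_overlap <- n * K / N

# P(overlap >= k): upper tail of the hypergeometric distribution.
p_ora <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# --- FCS: the input is a ranked, named vector of per-gene statistics ---
# Hypothetical log2 fold changes for five placeholder genes.
stats <- c(GeneA = 2.1, GeneB = -1.4, GeneC = 0.3, GeneD = 3.0, GeneE = -0.2)

# Sort from most up-regulated to most down-regulated, keeping the gene names.
ranked <- sort(stats, decreasing = TRUE)
```

With these toy counts the observed overlap (15) is three times the chance expectation (5), so `p_ora` is small. The sorted, named vector `ranked` has the shape that GSEA-style functions generally expect as input.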

## Full-text entities

- **Genes:** BRD2 (bromodomain containing 2) [NCBI Gene 6046] {aka BRD2-IT1, D6S113E, FSH, FSHRG1, FSRG1, NAT}, POU5F1 (POU class 5 homeobox 1) [NCBI Gene 5460] {aka OCT3, OCT4, OCT4Borf1, OTF-3, OTF3, OTF4}, NANOG (Nanog homeobox) [NCBI Gene 79923], KLF4 (KLF transcription factor 4) [NCBI Gene 9314] {aka EZF, GKLF}, MYC (MYC proto-oncogene, bHLH transcription factor) [NCBI Gene 4609] {aka MRTL, MYCC, bHLHe39, c-Myc}, SOX2 (SRY-box transcription factor 2) [NCBI Gene 6657] {aka ANOP3, MCOPS3}
- **Chemicals:** ethanol (MESH:D000431), cholesterol (MESH:D002784), lipid (MESH:D008055), sterol (MESH:D013261), steroid (MESH:D013256)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12943126/full.md

## Figures

19 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12943126/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/PMC12943126/full.md

---
Source: https://tomesphere.com/paper/PMC12943126