Last updated: 2025-03-06
Knit directory: muse/
This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20200712) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 1fa9528. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rproj.user/
Ignored: data/1M_neurons_filtered_gene_bc_matrices_h5.h5
Ignored: data/293t/
Ignored: data/293t_3t3_filtered_gene_bc_matrices.tar.gz
Ignored: data/293t_filtered_gene_bc_matrices.tar.gz
Ignored: data/5k_Human_Donor1_PBMC_3p_gem-x_5k_Human_Donor1_PBMC_3p_gem-x_count_sample_filtered_feature_bc_matrix.h5
Ignored: data/5k_Human_Donor2_PBMC_3p_gem-x_5k_Human_Donor2_PBMC_3p_gem-x_count_sample_filtered_feature_bc_matrix.h5
Ignored: data/5k_Human_Donor3_PBMC_3p_gem-x_5k_Human_Donor3_PBMC_3p_gem-x_count_sample_filtered_feature_bc_matrix.h5
Ignored: data/5k_Human_Donor4_PBMC_3p_gem-x_5k_Human_Donor4_PBMC_3p_gem-x_count_sample_filtered_feature_bc_matrix.h5
Ignored: data/Parent_SC3v3_Human_Glioblastoma_filtered_feature_bc_matrix.tar.gz
Ignored: data/brain_counts/
Ignored: data/cl.obo
Ignored: data/cl.owl
Ignored: data/jurkat/
Ignored: data/jurkat:293t_50:50_filtered_gene_bc_matrices.tar.gz
Ignored: data/jurkat_293t/
Ignored: data/jurkat_filtered_gene_bc_matrices.tar.gz
Ignored: data/pbmc20k/
Ignored: data/pbmc20k_seurat/
Ignored: data/pbmc3k/
Ignored: data/pbmc4k_filtered_gene_bc_matrices.tar.gz
Ignored: data/pbmc_1k_v3_raw_feature_bc_matrix.h5
Ignored: data/refdata-gex-GRCh38-2020-A.tar.gz
Ignored: data/seurat_1m_neuron.rds
Ignored: data/t_3k_filtered_gene_bc_matrices.tar.gz
Ignored: r_packages_4.4.1/
Untracked files:
Untracked: analysis/bioc_scrnaseq.Rmd
Untracked: data/97516b79-8d08-46a6-b329-5d0a25b0be98.h5ad
Untracked: data/pbmc_1k_v3_filtered_feature_bc_matrix.h5
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/cluster_profiler.Rmd) and HTML (docs/cluster_profiler.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 1fa9528 | Dave Tang | 2025-03-06 | Universal enrichment analyser |
html | 2c080dc | Dave Tang | 2024-11-01 | Build site. |
Rmd | 124a8d9 | Dave Tang | 2024-11-01 | Using clusterProfiler |
Install.
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
install.packages('ggarchery')
install.packages('ggtangle')
BiocManager::install("clusterProfiler")
BiocManager::install("org.Hs.eg.db")
BiocManager::install("GO.db")
Load libraries.
suppressPackageStartupMessages(library(tidyverse))
suppressPackageStartupMessages(library(ggarchery))
suppressPackageStartupMessages(library(clusterProfiler))
suppressPackageStartupMessages(library(org.Hs.eg.db))
suppressPackageStartupMessages(library(GO.db))
Use an example differential gene expression results table.
de_res <- read_csv("https://raw.githubusercontent.com/davetang/muse/refs/heads/main/data/13970886_edger_res.csv", show_col_types = FALSE)
head(de_res)
# A tibble: 6 × 6
ensembl_gene_id logFC logCPM F PValue adjusted_pvalue
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ENSG00000000003 2.73 4.83 4.28 0.0684 0.109
2 ENSG00000000005 -7.00 0.541 17.6 0.00216 0.0138
3 ENSG00000000419 0.120 5.34 0.114 0.743 0.776
4 ENSG00000000457 -0.708 5.31 3.35 0.0993 0.145
5 ENSG00000000460 -0.897 3.95 2.66 0.136 0.186
6 ENSG00000000938 1.54 5.60 1.86 0.205 0.258
The {clusterProfiler} package uses the enrichGO() function for performing a Gene Ontology over-representation test. The input for gene is a vector of Entrez Gene IDs.
Our example data uses Ensembl gene IDs, so we need to convert these into Entrez Gene IDs. We can use the {org.Hs.eg.db} package provided by Bioconductor, which provides genome wide annotation for human, primarily based on mapping using Entrez Gene identifiers.
ensembl_to_entrez <- AnnotationDbi::select(
  org.Hs.eg.db,
  keys = de_res$ensembl_gene_id,
  columns = c("ENSEMBL", "ENTREZID"),
  keytype = "ENSEMBL"
)
'select()' returned 1:many mapping between keys and columns
de_res |>
  dplyr::inner_join(ensembl_to_entrez, by = dplyr::join_by(ensembl_gene_id == ENSEMBL)) |>
  dplyr::select(ensembl_gene_id, ENTREZID, dplyr::everything()) -> de_res
head(de_res)
# A tibble: 6 × 7
ensembl_gene_id ENTREZID logFC logCPM F PValue adjusted_pvalue
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ENSG00000000003 7105 2.73 4.83 4.28 0.0684 0.109
2 ENSG00000000005 64102 -7.00 0.541 17.6 0.00216 0.0138
3 ENSG00000000419 8813 0.120 5.34 0.114 0.743 0.776
4 ENSG00000000457 57147 -0.708 5.31 3.35 0.0993 0.145
5 ENSG00000000460 55732 -0.897 3.95 2.66 0.136 0.186
6 ENSG00000000938 2268 1.54 5.60 1.86 0.205 0.258
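The 'select()' message above reports a 1:many mapping, which means some Ensembl gene IDs map to more than one Entrez Gene ID, and the join then repeats those rows. A minimal base R sketch of how such cases could be detected and collapsed is shown here; toy_map and its IDs are hypothetical stand-ins for ensembl_to_entrez, and keeping the first Entrez ID per Ensembl ID is only one possible policy, not something the analysis on this page does.

```r
# Toy mapping table (hypothetical IDs) with the same columns as ensembl_to_entrez.
toy_map <- data.frame(
  ENSEMBL  = c("ENSG_A", "ENSG_B", "ENSG_B", "ENSG_C"),
  ENTREZID = c("1", "2", "3", NA),
  stringsAsFactors = FALSE
)

# Drop keys without an Entrez ID first (unmapped keys come back as NA).
mapped <- toy_map[!is.na(toy_map$ENTREZID), ]

# Count Entrez IDs per Ensembl ID; counts above 1 are the 1:many cases.
n_per_key <- table(mapped$ENSEMBL)
multi_mapped <- names(n_per_key)[n_per_key > 1]
multi_mapped

# One possible policy (an assumption): keep the first Entrez ID per Ensembl ID.
toy_unique <- mapped[!duplicated(mapped$ENSEMBL), ]
toy_unique
```

Whether to keep all mappings or only one depends on the downstream analysis; duplicated rows inflate row counts such as the number of significant genes.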
The example data contains results of a comparison between cancer samples and normal samples; a positive log fold change (logFC) indicates higher expression in cancer. Many genes are significantly up-regulated in cancer.
de_res |>
  dplyr::filter(logFC > 0, adjusted_pvalue < 0.05) |>
  nrow()
[1] 1440
Instead of examining the list of up-regulated genes individually, which would take a long time, we can see whether these genes have common Gene Ontology terms more often than expected.
First we’ll get the top 500 most significantly up-regulated genes.
de_res |>
  dplyr::filter(logFC > 0, adjusted_pvalue < 0.05) |>
  dplyr::slice_min(order_by = adjusted_pvalue, n = 500) |>
  dplyr::filter(!is.na(ENTREZID)) |>
  dplyr::pull(ENTREZID) -> sig_up_genes
Create the background, i.e., the universe, so we can test for over-representation.
de_res |>
  dplyr::filter(!is.na(ENTREZID)) |>
  dplyr::pull(ENTREZID) -> the_universe
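Because the query list and the universe are both derived from de_res, every query gene should also be in the universe. When the two come from different sources, a quick sanity check like the following base R sketch may help; toy_universe and toy_query are hypothetical vectors, not objects from this analysis.

```r
# Hypothetical Entrez IDs; the universe contains a duplicate on purpose.
toy_universe <- c("1", "2", "3", "4", "5", "5")
toy_query <- c("2", "5", "9")

# Query genes that are absent from the background.
not_in_universe <- setdiff(toy_query, toy_universe)
not_in_universe

# The universe as a set of distinct IDs.
universe_unique <- unique(toy_universe)
length(universe_unique)
```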
Gene Ontology terms are grouped into three ontologies: biological process (BP), molecular function (MF), and cellular component (CC).
Here, we’ll perform an over-representation analysis using biological processes.
ego_bp <- enrichGO(
  gene = sig_up_genes,
  universe = the_universe,
  OrgDb = org.Hs.eg.db,
  ont = "BP",
  pAdjustMethod = "BH",
  pvalueCutoff = 0.01,
  qvalueCutoff = 0.05,
  minGSSize = 10,
  maxGSSize = 500,
  readable = TRUE
)
head(ego_bp)
ID
GO:0016064 GO:0016064
GO:0019724 GO:0019724
GO:0002460 GO:0002460
GO:0002449 GO:0002449
GO:0002443 GO:0002443
GO:0006959 GO:0006959
Description
GO:0016064 immunoglobulin mediated immune response
GO:0019724 B cell mediated immunity
GO:0002460 adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains
GO:0002449 lymphocyte mediated immunity
GO:0002443 leukocyte mediated immunity
GO:0006959 humoral immune response
GeneRatio BgRatio RichFactor FoldEnrichment zScore pvalue
GO:0016064 30/385 196/17518 0.15306122 6.964484 12.587673 4.569415e-17
GO:0019724 30/385 199/17518 0.15075377 6.859492 12.461452 7.046005e-17
GO:0002460 37/385 382/17518 0.09685864 4.407194 10.092931 3.590714e-14
GO:0002449 35/385 368/17518 0.09510870 4.327569 9.670801 3.134308e-13
GO:0002443 38/385 466/17518 0.08154506 3.710406 8.889621 3.440135e-12
GO:0006959 30/385 299/17518 0.10033445 4.565348 9.321312 4.278292e-12
p.adjust qvalue
GO:0016064 1.319717e-13 1.180762e-13
GO:0019724 1.319717e-13 1.180762e-13
GO:0002460 4.483604e-11 4.011520e-11
GO:0002449 2.935280e-10 2.626220e-10
GO:0002443 2.577349e-09 2.305977e-09
GO:0006959 2.671080e-09 2.389839e-09
geneID
GO:0016064 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/IGHV3-48/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/FOXP3/IGKC/IGHV3-15/TREM2/CD28
GO:0019724 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/IGHV3-48/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/FOXP3/IGKC/IGHV3-15/TREM2/CD28
GO:0002460 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IL4I1/IRF4/JAK3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/CCL19/IGHV3-48/CXCL13/LILRB4/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/FOXP3/IGKC/IGHV3-15/PLA2G4A/TREM2/CD28
GO:0002449 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IL4I1/IGHV3-23/IGHV2-5/CD27/IGHV3-30/SLAMF7/IGHV3-48/LILRB4/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/CD2/FOXP3/LGALS9/IGKC/IGHV3-15/TREM2/CD28
GO:0002443 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IL4I1/JAK3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/SLAMF7/IGHV3-48/LILRB4/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/CCL3/IGHV1-69D/CD84/CD2/FOXP3/LGALS9/IGKC/IGHV3-15/TREM2/CD28
GO:0006959 WFDC2/IGHG4/IGHG2/IGHG3/IGKV3-20/CXCL14/POU2AF1/MMP7/CCL22/BPIFA1/CXCL9/CCL19/CXCL13/CXCL1/LTF/CXCL10/MS4A1/IGHM/IGHG1/POU2F2/CCL7/TNFRSF21/ADM/IGHA2/C4A/CCL3/CXCL8/CCL8/TREM2/CD28
Count
GO:0016064 30
GO:0019724 30
GO:0002460 37
GO:0002449 35
GO:0002443 38
GO:0006959 30
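To make the columns above more concrete, the statistics for GO:0016064 can be recomputed by hand from GeneRatio (30/385) and BgRatio (196/17518). This is a sketch that assumes the over-representation test is a one-sided hypergeometric test, which is the usual formulation; the variable names are ours. If the assumption holds, p_value should be close to the pvalue column, and the two ratios should match FoldEnrichment and RichFactor.

```r
# Counts taken from the GeneRatio and BgRatio columns for GO:0016064.
k <- 30     # query genes annotated with the term
n <- 385    # query genes counted in the GeneRatio denominator
M <- 196    # background genes annotated with the term
N <- 17518  # background genes counted in the BgRatio denominator

# One-sided hypergeometric test: probability of seeing k or more annotated
# genes when n genes are drawn without replacement from N.
p_value <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

# Effect sizes derived from the same four counts.
fold_enrichment <- (k / n) / (M / N)  # GeneRatio divided by BgRatio
rich_factor <- k / M                  # fraction of the term's genes in the query

c(p_value = p_value, fold_enrichment = fold_enrichment, rich_factor = rich_factor)
```

Note that the GeneRatio denominator is 385 rather than 500, so it is not simply the length of the input vector.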
Bar plot showing each enriched GO term coloured by the adjusted p-value.
barplot(ego_bp, showCategory=10)
Dot plot showing each enriched GO term with associated statistics.
dotplot(ego_bp, showCategory=10)
Heat plot showing the enriched GO terms on the y-axis and the genes on the x-axis. Genes with the associated GO term are highlighted.
heatplot(ego_bp, showCategory=10)
goplot() shows the gene ontology graph with the enriched GO terms highlighted.
goplot(ego_bp)
Warning: ggrepel: 1 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
Another nice feature of {clusterProfiler} is that you can plot multiple gene lists together. We can create a list of down-regulated genes.
de_res |>
  dplyr::filter(logFC < 0, adjusted_pvalue < 0.05) |>
  dplyr::slice_min(order_by = adjusted_pvalue, n = 500) |>
  dplyr::filter(!is.na(ENTREZID)) |>
  dplyr::pull(ENTREZID) -> sig_down_genes
Perform GO enrichment on two gene lists.
my_gene_list <- list(
  up_gene = sig_up_genes,
  down_gene = sig_down_genes
)
ego_bp_both <- compareCluster(
  geneCluster = my_gene_list,
  fun = "enrichGO",
  universe = the_universe,
  OrgDb = org.Hs.eg.db,
  keyType = "ENTREZID",
  ont = "BP",
  pvalueCutoff = 0.01,
  pAdjustMethod = "BH",
  qvalueCutoff = 0.05,
  minGSSize = 10,
  maxGSSize = 500,
  readable = TRUE
)
head(as.data.frame(ego_bp_both))
Cluster ID
1 up_gene GO:0016064
2 up_gene GO:0019724
3 up_gene GO:0002460
4 up_gene GO:0002449
5 up_gene GO:0002443
6 up_gene GO:0006959
Description
1 immunoglobulin mediated immune response
2 B cell mediated immunity
3 adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains
4 lymphocyte mediated immunity
5 leukocyte mediated immunity
6 humoral immune response
GeneRatio BgRatio RichFactor FoldEnrichment zScore pvalue
1 30/385 196/17518 0.15306122 6.964484 12.587673 4.569415e-17
2 30/385 199/17518 0.15075377 6.859492 12.461452 7.046005e-17
3 37/385 382/17518 0.09685864 4.407194 10.092931 3.590714e-14
4 35/385 368/17518 0.09510870 4.327569 9.670801 3.134308e-13
5 38/385 466/17518 0.08154506 3.710406 8.889621 3.440135e-12
6 30/385 299/17518 0.10033445 4.565348 9.321312 4.278292e-12
p.adjust qvalue
1 1.319717e-13 1.180762e-13
2 1.319717e-13 1.180762e-13
3 4.483604e-11 4.011520e-11
4 2.935280e-10 2.626220e-10
5 2.577349e-09 2.305977e-09
6 2.671080e-09 2.389839e-09
geneID
1 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/IGHV3-48/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/FOXP3/IGKC/IGHV3-15/TREM2/CD28
2 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/IGHV3-48/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/FOXP3/IGKC/IGHV3-15/TREM2/CD28
3 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IL4I1/IRF4/JAK3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/CCL19/IGHV3-48/CXCL13/LILRB4/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/FOXP3/IGKC/IGHV3-15/PLA2G4A/TREM2/CD28
4 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IL4I1/IGHV3-23/IGHV2-5/CD27/IGHV3-30/SLAMF7/IGHV3-48/LILRB4/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/IGHV1-69D/CD2/FOXP3/LGALS9/IGKC/IGHV3-15/TREM2/CD28
5 IGLC1/IGHG4/IGHG2/IGHG3/IGHV5-51/IGLL5/IGHV4-39/IGLC3/IL4I1/JAK3/IGHV3-23/IGHV2-5/CD27/IGHV3-30/SLAMF7/IGHV3-48/LILRB4/BATF/IGHM/IGHG1/IGLC2/IGHV4-61/IGHA2/IGHV3-7/IGHV3-21/IGHV3-33/IGHV4-59/C4A/CCL3/IGHV1-69D/CD84/CD2/FOXP3/LGALS9/IGKC/IGHV3-15/TREM2/CD28
6 WFDC2/IGHG4/IGHG2/IGHG3/IGKV3-20/CXCL14/POU2AF1/MMP7/CCL22/BPIFA1/CXCL9/CCL19/CXCL13/CXCL1/LTF/CXCL10/MS4A1/IGHM/IGHG1/POU2F2/CCL7/TNFRSF21/ADM/IGHA2/C4A/CCL3/CXCL8/CCL8/TREM2/CD28
Count
1 30
2 30
3 37
4 35
5 38
6 30
Dot plot with enriched GO terms by gene list.
dotplot(ego_bp_both, showCategory = 15)
enricher() is a universal enrichment analyzer.
TERM2NAME is needed to map the GOID to its term (description).
An easy way of obtaining this mapping is by extracting the information from the Bioconductor GO annotation database package ({GO.db}).
# extract a named vector of all terms
goterms <- Term(GOTERM)
# convert into a data frame
term2name <- data.frame(
  "term"=names(goterms),
  "name"=goterms
)
dim(term2name)
[1] 40940 2
head(term2name)
term name
GO:0000001 GO:0000001 mitochondrion inheritance
GO:0000002 GO:0000002 mitochondrial genome maintenance
GO:0000006 GO:0000006 high-affinity zinc transmembrane transporter activity
GO:0000007 GO:0000007 low-affinity zinc ion transmembrane transporter activity
GO:0000009 GO:0000009 alpha-1,6-mannosyltransferase activity
GO:0000010 GO:0000010 heptaprenyl diphosphate synthase activity
term2name can then be used when calling enricher(), by specifying TERM2NAME=term2name. Be sure, though, to check that term2name contains all GOIDs present in your TERM2GENE mapping.
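One way to run that check is to compare the two sets of GO IDs with setdiff(). The sketch below uses hypothetical toy tables (toy_term2gene and toy_term2name, with the made-up ID GO:9999999) rather than the real mappings; the same call could be applied to the term columns of the actual TERM2GENE and TERM2NAME data frames.

```r
# Toy stand-ins for the TERM2GENE and TERM2NAME tables.
toy_term2gene <- data.frame(
  term = c("GO:0000002", "GO:0000002", "GO:9999999"),
  gene = c("gene1", "gene2", "gene3")
)
toy_term2name <- data.frame(
  term = c("GO:0000001", "GO:0000002"),
  name = c("mitochondrion inheritance", "mitochondrial genome maintenance")
)

# GO IDs used in TERM2GENE that have no description in TERM2NAME.
missing_terms <- setdiff(unique(toy_term2gene$term), toy_term2name$term)
missing_terms
```

An empty result means every term has a description; a non-empty result may point to a version mismatch between the annotation source and {GO.db}.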
Demo input.
demo <- readr::read_csv("data/nfurzeri_gene_id_to_go_id.csv.gz", show_col_types = FALSE)
head(demo)
# A tibble: 6 × 2
ensembl_gene_id go_id
<chr> <chr>
1 ENSNFUG00015000040 <NA>
2 ENSNFUG00015000041 GO:0007156
3 ENSNFUG00015000041 GO:0005886
4 ENSNFUG00015000041 GO:0005737
5 ENSNFUG00015000041 GO:0050808
6 ENSNFUG00015000041 GO:0007411
Prepare a two-column data frame with term and gene.
demo |>
  dplyr::filter(!is.na(go_id)) |>
  dplyr::rename(term = go_id, gene = ensembl_gene_id) |>
  dplyr::select(term, gene) |>
  dplyr::arrange(term) -> term2gene
head(term2gene)
# A tibble: 6 × 2
term gene
<chr> <chr>
1 GO:0000002 ENSNFUG00015006992
2 GO:0000002 ENSNFUG00015020070
3 GO:0000002 ENSNFUG00015013825
4 GO:0000002 ENSNFUG00015019642
5 GO:0000009 ENSNFUG00015008531
6 GO:0000012 ENSNFUG00015012664
Test with random genes.
set.seed(1984)
my_genes <- sample(x = unique(term2gene$gene), size = 100, replace = FALSE)
res <- enricher(
  gene = my_genes,
  TERM2GENE = term2gene,
  TERM2NAME = term2name
)
res
#
# over-representation test
#
#...@organism UNKNOWN
#...@ontology UNKNOWN
#...@gene chr [1:100] "ENSNFUG00015003803" "ENSNFUG00015000174" "ENSNFUG00015015379" ...
#...pvalues adjusted by 'BH' with cutoff <0.05
#...0 enriched terms found
#...Citation
S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize multiomics data. Nature Protocols. 2024, 19(11):3292-3320
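The summary above notes that p-values were adjusted with 'BH' (Benjamini-Hochberg), which is one reason a random gene list is expected to return no enriched terms. As a small illustration with made-up p-values (p_raw is hypothetical, not taken from this analysis), base R's p.adjust() shows how the adjustment raises p-values and can reduce the number of terms passing a 0.05 cutoff.

```r
# Hypothetical raw p-values for four terms.
p_raw <- c(0.001, 0.01, 0.04, 0.5)

# Benjamini-Hochberg adjustment controls the false discovery rate.
p_bh <- p.adjust(p_raw, method = "BH")
p_bh

# Number of terms passing the cutoff before and after adjustment.
n_raw <- sum(p_raw < 0.05)
n_bh <- sum(p_bh < 0.05)
c(raw = n_raw, adjusted = n_bh)
```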
Test with genes associated with the same term.
term2gene |>
  dplyr::group_by(term) |>
  dplyr::summarise(n = n()) |>
  dplyr::arrange(-n) -> dev_null
my_term <- "GO:0006397"
dplyr::filter(term2name, term == my_term)
term name
GO:0006397 GO:0006397 mRNA processing
term2gene |>
  dplyr::filter(term == my_term) |>
  dplyr::pull(gene) |>
  head(75) -> enriched_genes
set.seed(1984)
my_genes <- sample(x = unique(term2gene$gene), size = 25, replace = FALSE)
my_genes <- union(enriched_genes, my_genes)
res <- enricher(
  gene = my_genes,
  TERM2GENE = term2gene,
  TERM2NAME = term2name
)
res |>
  as.data.frame()
ID Description
GO:0006397 GO:0006397 mRNA processing
GO:0008380 GO:0008380 RNA splicing
GO:0005681 GO:0005681 spliceosomal complex
GO:0000398 GO:0000398 mRNA splicing, via spliceosome
GO:0071011 GO:0071011 precatalytic spliceosome
GO:0046540 GO:0046540 U4/U6 x U5 tri-snRNP complex
GO:0000381 GO:0000381 regulation of alternative mRNA splicing, via spliceosome
GO:0071013 GO:0071013 catalytic step 2 spliceosome
GO:0005682 GO:0005682 U5 snRNP
GO:1990904 GO:1990904 ribonucleoprotein complex
GO:0016607 GO:0016607 nuclear speck
GO:0005688 GO:0005688 U6 snRNP
GO:0003729 GO:0003729 mRNA binding
GO:0005685 GO:0005685 U1 snRNP
GO:0000387 GO:0000387 spliceosomal snRNP assembly
GO:0043484 GO:0043484 regulation of RNA splicing
GO:0004527 GO:0004527 exonuclease activity
GO:0005686 GO:0005686 U2 snRNP
GO:0003730 GO:0003730 mRNA 3'-UTR binding
GO:0005654 GO:0005654 nucleoplasm
GO:0000932 GO:0000932 P-body
GO:0080090 GO:0080090 regulation of primary metabolic process
GO:0030968 GO:0030968 endoplasmic reticulum unfolded protein response
GO:0005687 GO:0005687 U4 snRNP
GO:0004521 GO:0004521 RNA endonuclease activity
GeneRatio BgRatio RichFactor FoldEnrichment zScore
GO:0006397 75/100 125/18585 0.60000000 111.510000 91.180214
GO:0008380 43/100 89/18585 0.48314607 89.792697 61.757998
GO:0005681 22/100 54/18585 0.40740741 75.716667 40.441272
GO:0000398 21/100 71/18585 0.29577465 54.969718 33.511130
GO:0071011 7/100 16/18585 0.43750000 81.309375 23.636973
GO:0046540 9/100 43/18585 0.20930233 38.898837 18.299604
GO:0000381 7/100 20/18585 0.35000000 65.047500 21.078008
GO:0071013 7/100 20/18585 0.35000000 65.047500 21.078008
GO:0005682 5/100 19/18585 0.26315789 48.907895 15.366841
GO:1990904 10/100 228/18585 0.04385965 8.151316 7.991207
GO:0016607 5/100 32/18585 0.15625000 29.039062 11.675924
GO:0005688 4/100 15/18585 0.26666667 49.560000 13.838156
GO:0003729 7/100 104/18585 0.06730769 12.509135 8.656782
GO:0005685 4/100 26/18585 0.15384615 28.592308 10.355171
GO:0000387 3/100 12/18585 0.25000000 46.462500 11.586780
GO:0043484 3/100 14/18585 0.21428571 39.825000 10.688523
GO:0004527 3/100 19/18585 0.15789474 29.344737 9.091801
GO:0005686 3/100 30/18585 0.10000000 18.585000 7.089774
GO:0003730 3/100 31/18585 0.09677419 17.985484 6.961453
GO:0005654 5/100 129/18585 0.03875969 7.203488 5.200222
GO:0000932 3/100 35/18585 0.08571429 15.930000 6.502521
GO:0080090 2/100 12/18585 0.16666667 30.975000 7.639565
GO:0030968 2/100 14/18585 0.14285714 26.550000 7.033916
GO:0005687 2/100 15/18585 0.13333333 24.780000 6.776593
GO:0004521 2/100 17/18585 0.11764706 21.864706 6.330150
pvalue p.adjust qvalue
GO:0006397 1.059713e-152 1.653153e-150 1.372050e-150
GO:0008380 2.787322e-77 2.174111e-75 1.804424e-75
GO:0005681 6.857863e-37 3.566089e-35 2.959709e-35
GO:0000398 1.048636e-31 4.089682e-30 3.394270e-30
GO:0071011 1.159786e-12 3.298732e-11 2.737814e-11
GO:0046540 1.268743e-12 3.298732e-11 2.737814e-11
GO:0000381 7.722209e-12 1.505831e-10 1.249779e-10
GO:0071013 7.722209e-12 1.505831e-10 1.249779e-10
GO:0005682 4.465615e-08 7.740399e-07 6.424217e-07
GO:1990904 4.185349e-07 6.529145e-06 5.418926e-06
GO:0016607 7.316677e-07 1.037638e-05 8.611974e-06
GO:0005688 1.029158e-06 1.337905e-05 1.110407e-05
GO:0003729 1.464263e-06 1.757115e-05 1.458335e-05
GO:0005685 1.077073e-05 1.200167e-04 9.960900e-05
GO:0000387 3.210311e-05 3.338724e-04 2.771006e-04
GO:0043484 5.270183e-05 5.138429e-04 4.264688e-04
GO:0004527 1.375797e-04 1.262496e-03 1.047821e-03
GO:0005686 5.522099e-04 4.785820e-03 3.972036e-03
GO:0003730 6.089961e-04 5.000179e-03 4.149946e-03
GO:0005654 6.625769e-04 5.168100e-03 4.289314e-03
GO:0000932 8.730213e-04 6.485301e-03 5.382538e-03
GO:0080090 1.826457e-03 1.295124e-02 1.074900e-02
GO:0030968 2.500695e-03 1.696123e-02 1.407714e-02
GO:0005687 2.875322e-03 1.868959e-02 1.551160e-02
GO:0004521 3.698227e-03 2.307694e-02 1.915293e-02
geneID
GO:0006397 ENSNFUG00015000047/ENSNFUG00015000176/ENSNFUG00015000433/ENSNFUG00015000717/ENSNFUG00015000995/ENSNFUG00015002878/ENSNFUG00015003278/ENSNFUG00015003412/ENSNFUG00015003645/ENSNFUG00015006374/ENSNFUG00015008466/ENSNFUG00015010592/ENSNFUG00015011476/ENSNFUG00015011629/ENSNFUG00015012572/ENSNFUG00015015893/ENSNFUG00015018395/ENSNFUG00015021027/ENSNFUG00015023111/ENSNFUG00015024532/ENSNFUG00015000276/ENSNFUG00015000347/ENSNFUG00015000470/ENSNFUG00015000633/ENSNFUG00015000679/ENSNFUG00015001197/ENSNFUG00015001330/ENSNFUG00015002687/ENSNFUG00015003553/ENSNFUG00015003613/ENSNFUG00015004037/ENSNFUG00015004391/ENSNFUG00015008722/ENSNFUG00015009124/ENSNFUG00015010820/ENSNFUG00015011580/ENSNFUG00015013273/ENSNFUG00015014123/ENSNFUG00015014399/ENSNFUG00015014448/ENSNFUG00015019426/ENSNFUG00015020527/ENSNFUG00015020545/ENSNFUG00015021436/ENSNFUG00015004299/ENSNFUG00015006219/ENSNFUG00015006538/ENSNFUG00015007623/ENSNFUG00015008859/ENSNFUG00015008871/ENSNFUG00015008954/ENSNFUG00015013147/ENSNFUG00015014506/ENSNFUG00015014956/ENSNFUG00015015816/ENSNFUG00015017783/ENSNFUG00015019674/ENSNFUG00015019903/ENSNFUG00015020693/ENSNFUG00015021220/ENSNFUG00015022748/ENSNFUG00015023101/ENSNFUG00015023143/ENSNFUG00015004762/ENSNFUG00015005084/ENSNFUG00015005095/ENSNFUG00015006742/ENSNFUG00015008749/ENSNFUG00015009426/ENSNFUG00015013451/ENSNFUG00015014267/ENSNFUG00015014760/ENSNFUG00015014763/ENSNFUG00015018080/ENSNFUG00015018929
GO:0008380 ENSNFUG00015000047/ENSNFUG00015000433/ENSNFUG00015002878/ENSNFUG00015003412/ENSNFUG00015008466/ENSNFUG00015010592/ENSNFUG00015011629/ENSNFUG00015012572/ENSNFUG00015015893/ENSNFUG00015018395/ENSNFUG00015021027/ENSNFUG00015023111/ENSNFUG00015000276/ENSNFUG00015000347/ENSNFUG00015000470/ENSNFUG00015001197/ENSNFUG00015002687/ENSNFUG00015004391/ENSNFUG00015009124/ENSNFUG00015010820/ENSNFUG00015011580/ENSNFUG00015013273/ENSNFUG00015014123/ENSNFUG00015014448/ENSNFUG00015019426/ENSNFUG00015020527/ENSNFUG00015008859/ENSNFUG00015008871/ENSNFUG00015013147/ENSNFUG00015014506/ENSNFUG00015019674/ENSNFUG00015019903/ENSNFUG00015020693/ENSNFUG00015021220/ENSNFUG00015022748/ENSNFUG00015023101/ENSNFUG00015005084/ENSNFUG00015008749/ENSNFUG00015014267/ENSNFUG00015014760/ENSNFUG00015014763/ENSNFUG00015018080/ENSNFUG00015018929
GO:0005681 ENSNFUG00015000433/ENSNFUG00015002878/ENSNFUG00015010592/ENSNFUG00015011629/ENSNFUG00015012572/ENSNFUG00015015893/ENSNFUG00015018395/ENSNFUG00015021027/ENSNFUG00015023111/ENSNFUG00015001197/ENSNFUG00015002687/ENSNFUG00015014123/ENSNFUG00015019426/ENSNFUG00015014506/ENSNFUG00015019903/ENSNFUG00015020693/ENSNFUG00015022748/ENSNFUG00015023101/ENSNFUG00015005084/ENSNFUG00015014760/ENSNFUG00015014763/ENSNFUG00015018929
GO:0000398 ENSNFUG00015000433/ENSNFUG00015002878/ENSNFUG00015003412/ENSNFUG00015010592/ENSNFUG00015011629/ENSNFUG00015012572/ENSNFUG00015015893/ENSNFUG00015023111/ENSNFUG00015001197/ENSNFUG00015002687/ENSNFUG00015014123/ENSNFUG00015008871/ENSNFUG00015014506/ENSNFUG00015019903/ENSNFUG00015020693/ENSNFUG00015022748/ENSNFUG00015005084/ENSNFUG00015008749/ENSNFUG00015014267/ENSNFUG00015014760/ENSNFUG00015014763
GO:0071011 ENSNFUG00015000433/ENSNFUG00015011629/ENSNFUG00015018395/ENSNFUG00015014123/ENSNFUG00015019674/ENSNFUG00015005084/ENSNFUG00015014763
GO:0046540 ENSNFUG00015002878/ENSNFUG00015011629/ENSNFUG00015014123/ENSNFUG00015008871/ENSNFUG00015019674/ENSNFUG00015019903/ENSNFUG00015020693/ENSNFUG00015005084/ENSNFUG00015014763
GO:0000381 ENSNFUG00015024532/ENSNFUG00015000470/ENSNFUG00015011580/ENSNFUG00015014123/ENSNFUG00015014448/ENSNFUG00015008859/ENSNFUG00015013147
GO:0071013 ENSNFUG00015010592/ENSNFUG00015018395/ENSNFUG00015001197/ENSNFUG00015004391/ENSNFUG00015019674/ENSNFUG00015005084/ENSNFUG00015014267
GO:0005682 ENSNFUG00015018395/ENSNFUG00015001197/ENSNFUG00015008871/ENSNFUG00015019674/ENSNFUG00015020693
GO:1990904 ENSNFUG00015000047/ENSNFUG00015018395/ENSNFUG00015024532/ENSNFUG00015008722/ENSNFUG00015010820/ENSNFUG00015014956/ENSNFUG00015019674/ENSNFUG00015021220/ENSNFUG00015005084/ENSNFUG00015014763
GO:0016607 ENSNFUG00015003412/ENSNFUG00015006374/ENSNFUG00015000276/ENSNFUG00015009124/ENSNFUG00015019426
GO:0005688 ENSNFUG00015002878/ENSNFUG00015011629/ENSNFUG00015019903/ENSNFUG00015005084
GO:0003729 ENSNFUG00015000047/ENSNFUG00015000995/ENSNFUG00015008466/ENSNFUG00015000347/ENSNFUG00015000470/ENSNFUG00015014448/ENSNFUG00015008954
GO:0005685 ENSNFUG00015008466/ENSNFUG00015010592/ENSNFUG00015018395/ENSNFUG00015019674
GO:0000387 ENSNFUG00015010592/ENSNFUG00015018395/ENSNFUG00015019674
GO:0043484 ENSNFUG00015000995/ENSNFUG00015014448/ENSNFUG00015018080
GO:0004527 ENSNFUG00015014399/ENSNFUG00015005095/ENSNFUG00015000334
GO:0005686 ENSNFUG00015000433/ENSNFUG00015018395/ENSNFUG00015019674
GO:0003730 ENSNFUG00015024532/ENSNFUG00015008722/ENSNFUG00015004762
GO:0005654 ENSNFUG00015021027/ENSNFUG00015023111/ENSNFUG00015010820/ENSNFUG00015019426/ENSNFUG00015018836
GO:0000932 ENSNFUG00015002878/ENSNFUG00015006219/ENSNFUG00015005084
GO:0080090 ENSNFUG00015011476/ENSNFUG00015021436
GO:0030968 ENSNFUG00015011476/ENSNFUG00015021436
GO:0005687 ENSNFUG00015018395/ENSNFUG00015014123
GO:0004521 ENSNFUG00015011476/ENSNFUG00015021436
Count
GO:0006397 75
GO:0008380 43
GO:0005681 22
GO:0000398 21
GO:0071011 7
GO:0046540 9
GO:0000381 7
GO:0071013 7
GO:0005682 5
GO:1990904 10
GO:0016607 5
GO:0005688 4
GO:0003729 7
GO:0005685 4
GO:0000387 3
GO:0043484 3
GO:0004527 3
GO:0005686 3
GO:0003730 3
GO:0005654 5
GO:0000932 3
GO:0080090 2
GO:0030968 2
GO:0005687 2
GO:0004521 2
Check some of the results.
go_term <- "GO:0007156"
dplyr::filter(term2name, term == go_term)
term
GO:0007156 GO:0007156
name
GO:0007156 homophilic cell adhesion via plasma membrane adhesion molecules
dplyr::filter(term2gene, gene %in% my_genes, term == go_term) |>
  nrow()
[1] 1
dplyr::filter(term2gene, term == go_term) |>
  nrow()
[1] 130
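These counts help explain why GO:0007156 is not among the enriched terms: 1 of the 100 query genes carries the term, out of 130 annotated genes in a background of 18585 (the BgRatio denominator above). A rough sketch of the expected overlap under random sampling follows; it assumes the same one-sided hypergeometric model as before, and the variable names are ours.

```r
n_query <- 100  # genes in my_genes (GeneRatio denominator)
n_term <- 130   # genes annotated with GO:0007156
n_bg <- 18585   # background size (BgRatio denominator)

# Expected number of annotated genes in a random query of this size.
expected_overlap <- n_query * n_term / n_bg

# Probability of observing at least one annotated gene by chance.
p_at_least_one <- phyper(0, n_term, n_bg - n_term, n_query, lower.tail = FALSE)

c(expected_overlap = expected_overlap, p_at_least_one = p_at_least_one)
```

An observed overlap of 1 is close to the expected value, so it is consistent with chance.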
dotplot(res, showCategory=10)
sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] GO.db_3.20.0 org.Hs.eg.db_3.20.0 AnnotationDbi_1.68.0
[4] IRanges_2.40.1 S4Vectors_0.44.0 Biobase_2.66.0
[7] BiocGenerics_0.52.0 clusterProfiler_4.14.6 ggarchery_0.4.3
[10] lubridate_1.9.4 forcats_1.0.0 stringr_1.5.1
[13] dplyr_1.1.4 purrr_1.0.4 readr_2.1.5
[16] tidyr_1.3.1 tibble_3.2.1 ggplot2_3.5.1
[19] tidyverse_2.0.0 workflowr_1.7.1
loaded via a namespace (and not attached):
[1] DBI_1.2.3 gson_0.1.0 rlang_1.1.5
[4] magrittr_2.0.3 DOSE_4.0.0 git2r_0.35.0
[7] compiler_4.4.1 RSQLite_2.3.9 getPass_0.2-4
[10] png_0.1-8 callr_3.7.6 vctrs_0.6.5
[13] reshape2_1.4.4 pkgconfig_2.0.3 crayon_1.5.3
[16] fastmap_1.2.0 XVector_0.46.0 labeling_0.4.3
[19] utf8_1.2.4 promises_1.3.2 rmarkdown_2.29
[22] tzdb_0.4.0 enrichplot_1.26.6 UCSC.utils_1.2.0
[25] ps_1.9.0 bit_4.5.0.1 xfun_0.51
[28] zlibbioc_1.52.0 cachem_1.1.0 aplot_0.2.5
[31] GenomeInfoDb_1.42.3 jsonlite_1.9.1 blob_1.2.4
[34] later_1.4.1 BiocParallel_1.40.0 parallel_4.4.1
[37] R6_2.6.1 bslib_0.9.0 stringi_1.8.4
[40] RColorBrewer_1.1-3 jquerylib_0.1.4 GOSemSim_2.32.0
[43] Rcpp_1.0.14 knitr_1.49 ggtangle_0.0.6
[46] R.utils_2.13.0 igraph_2.1.4 httpuv_1.6.15
[49] Matrix_1.7-0 splines_4.4.1 timechange_0.3.0
[52] tidyselect_1.2.1 qvalue_2.38.0 rstudioapi_0.17.1
[55] yaml_2.3.10 codetools_0.2-20 curl_6.2.1
[58] processx_3.8.6 lattice_0.22-6 plyr_1.8.9
[61] treeio_1.30.0 withr_3.0.2 KEGGREST_1.46.0
[64] evaluate_1.0.3 gridGraphics_0.5-1 Biostrings_2.74.1
[67] ggtree_3.14.0 pillar_1.10.1 whisker_0.4.1
[70] ggfun_0.1.8 generics_0.1.3 vroom_1.6.5
[73] rprojroot_2.0.4 hms_1.1.3 tidytree_0.4.6
[76] munsell_0.5.1 scales_1.3.0 glue_1.8.0
[79] lazyeval_0.2.2 tools_4.4.1 data.table_1.17.0
[82] fgsea_1.32.2 fs_1.6.5 fastmatch_1.1-6
[85] cowplot_1.1.3 grid_4.4.1 ape_5.8-1
[88] colorspace_2.1-1 nlme_3.1-164 patchwork_1.3.0
[91] GenomeInfoDbData_1.2.13 cli_3.6.4 gtable_0.3.6
[94] R.methodsS3_1.8.2 yulab.utils_0.2.0 sass_0.4.9
[97] digest_0.6.37 ggrepel_0.9.6 ggplotify_0.1.2
[100] farver_2.1.2 memoise_2.0.1 htmltools_0.5.8.1
[103] R.oo_1.27.0 lifecycle_1.0.4 httr_1.4.7
[106] bit64_4.6.0-1