Multiple single-cell datasets are analyzed jointly by identifying cross-dataset pairs of cells that belong to shared cell populations; these pairs are called anchors. This helps address two problems:
# Data normalization and variable feature selection
lung.list <- SplitObject(lung, split.by = "orig.ident")
lung.list <- lapply(X = lung.list, FUN = function(x) {
    x <- NormalizeData(x, verbose = FALSE)
    x <- FindVariableFeatures(x, verbose = FALSE)
})
# Select integration features
features <- SelectIntegrationFeatures(object.list = lung.list)
lung.list <- lapply(X = lung.list, FUN = function(x) {
    x <- ScaleData(x, features = features, verbose = FALSE)
    x <- RunPCA(x, features = features, verbose = FALSE)
})
# Iteratively find anchors that link cells across two datasets
anchors <- FindIntegrationAnchors(object.list = lung.list, reference = c(1, 2),
                                  reduction = "rpca", dims = 1:20)  # 1:50
lung.integrated <- IntegrateData(anchorset = anchors, dims = 1:20)
Major cell types
T cell subsets
immune.combined
An object of class Seurat
321136 features across 53647 samples within 4 assays
Active assay: peaks (133665 features, 133665 variable features)
 3 other assays present: RNA, ATAC, integrated
 3 dimensional reductions calculated: pca, umap, lsi
The differential expression test was performed on the normalized values stored in immune.combined[["RNA"]]@data. Here are the normalization steps:
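The individual steps are not preserved in this export. As a minimal sketch, assuming the default `LogNormalize` method of Seurat's `NormalizeData` (scale factor 10,000) was used, each count is divided by the total counts of its cell, multiplied by the scale factor, and transformed with `log1p`. The toy matrix and the helper name `log_normalize` are hypothetical and serve only as illustration.

```r
# Toy count matrix: genes in rows, cells in columns (illustrative values only).
counts <- matrix(c(1, 0, 3,
                   2, 4, 0),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2")))

# Hypothetical helper following the documented LogNormalize formula:
# log1p(count / total counts per cell * scale.factor).
log_normalize <- function(m, scale.factor = 10000) {
  totals <- colSums(m)                       # library size of each cell
  log1p(sweep(m, 2, totals, "/") * scale.factor)
}

norm <- log_normalize(counts)
```

Zero counts stay at zero after this transformation, and back-transforming with `expm1` gives columns that each sum to the scale factor.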
Volcano plots and QQ plots comparing against null
p1 + p2
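For reference, the points of a QQ plot against the null can be computed as sketched below. This assumes the null is a uniform distribution of p-values; the simulated p-values and the helper name `qq_points` are hypothetical and do not reproduce the plots of this analysis.

```r
# Hypothetical helper: pair observed p-values with their expected quantiles
# under a uniform null, both on the -log10 scale.
qq_points <- function(p) {
  n <- length(p)
  data.frame(
    expected = -log10(ppoints(n)),  # uniform quantiles, smallest p first
    observed = -log10(sort(p))      # observed p-values, smallest p first
  )
}

set.seed(1)
p_null <- runif(1000)               # simulated p-values (illustrative only)
qq <- qq_points(p_null)

# Points from a well-calibrated test lie close to the identity line.
plot(qq$expected, qq$observed,
     xlab = "expected -log10(p)", ylab = "observed -log10(p)")
abline(0, 1)
```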
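library(clusterProfiler)
library(org.Hs.eg.db)
# Map gene symbols to Entrez IDs for the significant genes and for all tested genes
eg <- bitr(rownames(sig.genes), fromType = "SYMBOL", toType = "ENTREZID", OrgDb = "org.Hs.eg.db")
all.eg <- bitr(rownames(DEG.test), fromType = "SYMBOL", toType = "ENTREZID", OrgDb = "org.Hs.eg.db")
# GO over-representation test; the background is the set of all tested genes.
# names(all.eg$ENTREZID) would be NULL, so the IDs themselves are passed as the universe.
go_list <- enrichGO(gene = eg$ENTREZID,
                    universe = all.eg$ENTREZID,
                    OrgDb = org.Hs.eg.db,
                    ont = "BP",
                    pAdjustMethod = "BH",
                    pvalueCutoff = 0.01,
                    qvalueCutoff = 0.05,
                    readable = TRUE)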
library(forcats)
library(enrichplot)
# Rich factor: genes of interest in a term divided by all background genes annotated to that term
go_list <- mutate(go_list, richFactor = Count / as.numeric(sub("/\\d+", "", BgRatio)))
ggplot(go_list, showCategory = 20,
       aes(richFactor, fct_reorder(Description, richFactor))) +
    geom_segment(aes(xend = 0, yend = Description)) +
    geom_point(aes(color = qvalue, size = Count)) +
    scale_color_viridis_c(guide = guide_colorbar(reverse = TRUE)) +
    scale_size_continuous(range = c(2, 10)) +
    theme_minimal() + theme(axis.text.y = element_text(size = 12)) +
    xlab("rich factor") +
    ylab(NULL)
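The rich factor used on the x axis can be checked by hand. The sketch below applies the same `sub()` expression to a toy table; the term names and numbers are made up for illustration, and only the "k/N" string format of `BgRatio` is taken from the enrichment output.

```r
# Toy enrichment rows; BgRatio uses the "k/N" string format (illustrative values only).
res <- data.frame(
  Description = c("term A", "term B"),
  Count = c(10, 5),
  BgRatio = c("200/18000", "50/18000")
)

# Strip "/N" to keep k, the number of background genes annotated to each term.
term_size <- as.numeric(sub("/\\d+", "", res$BgRatio))

# Rich factor: fraction of a term's annotated genes that appear in the gene list.
res$richFactor <- res$Count / term_size
```

A small term with few hits can therefore rank above a large term with many hits, which is why the plot also encodes `Count` as point size.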
Color scheme:
Red: negative log2FC
Blue: positive log2FC
Grey: data points not passing the Bonferroni-corrected cutoff of 0.05
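The color assignment can be sketched as follows. This is an illustration under assumptions: the column names `avg_log2FC` and `p_val` follow the usual Seurat marker-table layout, the toy values are made up, and the Bonferroni correction is applied with base R's `p.adjust`, which may differ from how the original plots computed it.

```r
# Toy differential expression table (illustrative values only).
deg <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  avg_log2FC = c(-1.5, 2.0, 0.3, -0.2),
  p_val = c(1e-6, 1e-8, 0.04, 0.5)
)

# Bonferroni correction: each p-value is multiplied by the number of tests, capped at 1.
deg$p_bonf <- p.adjust(deg$p_val, method = "bonferroni")

# Grey if the corrected p-value does not pass 0.05; otherwise color by the sign of log2FC.
deg$color <- ifelse(deg$p_bonf >= 0.05, "grey",
                    ifelse(deg$avg_log2FC < 0, "red", "blue"))
```

Note that `g3` has a raw p-value below 0.05 but is still drawn in grey, because its corrected p-value is 0.16.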
Identify the known pathways that are enriched with the differentially expressed genes in lungs