pySCENIC (https://pyscenic.readthedocs.io/) provides a faster implementation of SCENIC (in Python) that can be easily paralelized with Dask (e.g. to take advantage distributed systems). The results of pySCENIC can easily be imported into R.
The main results of SCENIC analysis are stored in the loom file:
Regulons
Regulon activity (AUC matrix and thresholds)
Embeddings (e.g. t-SNE and UMAP on the regulon activity)
The motif enrichment analysis and co-expression modules (e.g. GRNBoost/GENIE3 output) are stored in independent text files (mostly due to their bigger size).
# Output directories (adjust to yours):
pyScenicDir <- "pySCENIC_example/output"
library(SCENIC)
SCopeLoomR
provides functions to import the regulons, AUC, and embeddings from the loom file:
library(SCopeLoomR)
pyScenicLoomFile <- file.path(pyScenicDir, "SCENIC.loom")
loom <- open_loom(pyScenicLoomFile, mode="r")
# Read information from loom file:
regulons_incidMat <- get_regulons(loom)
regulons <- regulonsToGeneLists(regulons_incidMat)
regulonsAUC <- get_regulonsAuc(loom)
regulonsAucThresholds <- get_regulonThresholds(loom)
embeddings <- get_embeddings(loom)
In addition, the loom file also stores the expression matrix, and allows to save information about the cells (e.g. cell annotations), or results from previous clustering analyses:
exprMat <- get_dgem(loom)
cellInfo <- get_cellAnnotation(loom)
clusterings <- get_clusterings_withName(loom)
close_loom(loom)
The motif enrichment results provided by pySCENIC are in a slightly different format than the table provided by RcisTarget (e.g. R implementation of SCENIC) but they are equivalent. This table can be read from the text file and visualized/explored in a similar way:
# Read table
motifsDf <- data.table::fread(file.path(pyScenicDir, "motifs.tsv"), header = T, sep="\t")
maxRows <- 20 # (low value only for the tutorial)
# Visualize
tableSubset <- motifsDf[TF=="Dlx5"]
tableSubset <- tableSubset[1:maxRows,]
colsToShow <- colnames(motifsDf)[-c(2,9:11)]
viewMotifs(tableSubset, colsToShow=colsToShow)
GRNBoost_linkList <- importArboreto(file.path(pyScenicDir, "adjacencies.tsv"))
head(GRNBoost_linkList)
## TF Target Importance
## 1 Mef2c Camk2n1 183.0498
## 2 Mef2c Arpp19 153.1191
## 3 Olig1 Tspan2 147.6836
## 4 Olig1 Cnp 143.0022
## 5 Olig1 Plp1 134.5772
## 6 Mef2c Arpp21 132.2724
There are also some options to save/import only partial results (e.g. not using a .loom file):
Regulons and co-expression modules can be exported as .gmt files, which can be loaded with GSEABase
:
library(GSEABase)
regulons <- getGmt(file.path(pyScenicDir, "regulons.gmt"))
regulons <- geneIds(regulons)
names(regulons)[5:10]
## [1] "Ahr (9g)" "Alx4 (4g)" "Ar (17g)" "Arid5b (33g)"
## [5] "Arnt (68g)" "Arnt2 (861g)"
regulons[3]
## $`Acaa1a (43g)`
## [1] "Tox" "Fxr2" "Wnt3" "Rhoa"
## [5] "2610100L16Rik" "Plp1" "Hes5" "Atf4"
## [9] "Tcf7l2" "Ptma" "Peli2" "Xrcc6"
## [13] "Tnrc6a" "Sys1" "Yif1a" "Hnrnpf"
## [17] "Hsd17b12" "Kctd15" "Fabp5" "H2afv"
## [21] "Arrdc3" "Chd4" "Fntb" "Pik3c2b"
## [25] "Hist1h1c" "Klf7" "Atp1b3" "Sh3gl1"
## [29] "Ehd1" "Fam173b" "Abhd1" "Cyp51"
## [33] "Car14" "Mir1897" "Hhip" "Pigu"
## [37] "Eif1" "Sema3c" "Chd7" "Rab33a"
## [41] "Eef2" "Hsp90b1" "Chn2"
The “AUC matrix” provided by pySCENIC can be saved as text and imported with importAUCfromText
:
regulonAUC <- importAUCfromText(file.path(pyScenicDir, "aucMatrix.tsv"))
regulonAUC
## AUC for 550 regulons (rows) and 3005 cells (columns).
##
## Top-left corner of the AUC matrix:
## cells
## regulons 1772071015_C02 1772071017_G12 1772071017_A05
## 1810024B03Rik (52g) 0.00000000 0.00000000 0.000000000
## Abl1 (13g) 0.00000000 0.02803002 0.009673736
## Acaa1a (43g) 0.05773684 0.04434502 0.050155141
## Acaa1b (60g) 0.21207155 0.23960251 0.190507385
## Ahr (9g) 0.00000000 0.00000000 0.000000000
## Alx4 (4g) 0.00000000 0.00000000 0.000000000
## cells
## regulons 1772071014_B06 1772067065_H06
## 1810024B03Rik (52g) 0.00000000 0.00000000
## Abl1 (13g) 0.00000000 0.05237933
## Acaa1a (43g) 0.05552126 0.05285452
## Acaa1b (60g) 0.24700614 0.20984224
## Ahr (9g) 0.00000000 0.00000000
## Alx4 (4g) 0.00000000 0.00000000