Last updated: 2022-06-14

Checks: 7 0

Knit directory: rotation2/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20220607) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 6226526. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Unstaged changes:
    Modified:   .RData
    Modified:   .Rhistory

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/project_1.Rmd) and HTML (docs/project_1.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 6226526 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html dd2046d chenh19 2022-06-14 Build site.
Rmd 71f5d04 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html 2c4cab1 chenh19 2022-06-14 Build site.
Rmd 6e73c04 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html d46eaab chenh19 2022-06-14 Build site.
Rmd be56d9d chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html 02a0c26 chenh19 2022-06-14 Build site.
Rmd e2c15c0 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html abad46e chenh19 2022-06-14 Build site.
Rmd 68948e3 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html 741027b chenh19 2022-06-14 Build site.
Rmd 065d7e9 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html bb15812 chenh19 2022-06-14 Build site.
Rmd b6e0993 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html 93a27ae chenh19 2022-06-14 Build site.
Rmd b4a7331 chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html 27121a9 chenh19 2022-06-14 Build site.
Rmd fce1ffd chenh19 2022-06-14 wflow_publish("analysis/*.Rmd")
html 44517c1 chenh19 2022-06-13 Build site.
Rmd cc6a40a chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html da08f11 chenh19 2022-06-13 Build site.
Rmd 05ccc35 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 572f6ba chenh19 2022-06-13 Build site.
html b8870d3 chenh19 2022-06-13 Build site.
html 719925e chenh19 2022-06-13 Build site.
html e7541fa chenh19 2022-06-13 Build site.
html 9d9615d chenh19 2022-06-13 Build site.
Rmd 04feaa7 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html bbd8978 chenh19 2022-06-13 Build site.
Rmd 0ec2bfa chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html e5a5b52 chenh19 2022-06-13 Build site.
Rmd c43ae1f chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 4d8bd72 chenh19 2022-06-13 Build site.
html 3373521 chenh19 2022-06-13 Build site.
html af21ea8 chenh19 2022-06-13 Build site.
Rmd 6e56d75 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html f653f7b chenh19 2022-06-13 Build site.
Rmd 2723e7f chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html d69c892 chenh19 2022-06-13 Build site.
html 34d877d chenh19 2022-06-13 Build site.
html e72400b chenh19 2022-06-13 Build site.
html c411223 chenh19 2022-06-13 Build site.
html 1daccd2 chenh19 2022-06-13 Build site.
Rmd 63f46d2 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 26adb45 chenh19 2022-06-13 Build site.
html a6022a8 chenh19 2022-06-13 Build site.
Rmd 1215832 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 9abc4b8 chenh19 2022-06-13 Build site.
Rmd 7efcfe0 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html f18d385 chenh19 2022-06-13 Build site.
Rmd a7c1ce0 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html e991f56 chenh19 2022-06-13 Build site.
html 3c9b1d9 chenh19 2022-06-13 Build site.
Rmd ae1553a chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 34e0d02 chenh19 2022-06-13 Build site.
Rmd e69aa83 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 9be31af chenh19 2022-06-13 Build site.
Rmd ead84c2 chenh19 2022-06-13 wflow_publish("analysis/*.Rmd")
html 0f41de8 chenh19 2022-06-13 Build site.
html 31ad035 chenh19 2022-06-13 Build site.
html bdf3b44 chenh19 2022-06-13 Build site.
html 8d0890c chenh19 2022-06-13 Build site.
Rmd 26f455b chenh19 2022-06-13 update
html 26f455b chenh19 2022-06-13 update

1. Understand STING-seq

a. Read the Morris paper

Morris et al. 2021: STING-Seq

Some key points:

Some key ideas:

  • STING-seq: Systematic Targeting and Inbition of Noncoding GWAS loci with scRNA-seq
  • prioritizes candidate cis-regulatory elements (cCREs, 1kb<distance to TSS<1Mb) using fine-mapped GWAS
  • selected 88 variants (in 56 loci) with enhancer activity
  • dual CRISPR inhibition: dCas9 as the GPS, MeCP2 and KRAB as the repressors
  • confirming dual CRISPRi efficacy: gRNAs target TSS of MRPS23, CTSB, FSCN1
  • CRIPSRi on the 88 variants: two gRNAs for each variant, both within 200bp of the variant
  • ECCITE-seq: captures gRNAs and epitopes

Some data processing steps and results:

  • QC: remove cells with low total reads or excessive mitochondrial reads, gRNA assignment UMI>5 (9,343 cells after QC)
  • Kallisto: counting
  • Seurat: QC and reference mapping? read more on the official website
  • SCEPTRE: gRNA_to_gene-expression pairwise test
  • non-targeting gRNA-gene pairs: not significant (negative ctrl)
  • TSS-targeting gRNA-gene pairs: expression significantly decreased (positive ctrl)
  • 37 of the 88 variants were significant
  • Trans-regulatory elements: I’ll come back later

Note:

2. Prelim QC for STING-seq data

Read a RNA-seq analysis pipeline review

Yalamanchili et al. 2017: RNA-seq analysis pipeline

Some key points:

Protocol-1 (differential expression of genes):

  • demuxed raw reads (FastQC)
  • trimming reads (awk)
  • aligning reads (TopHat2)
  • counting reads (HTSeq; may filter out genes with low counts before next step)
  • detect DE using counted reads (DEseq2)
  • more QC (PCA/correlation heatmap)

Protocol-2 (differential usage fo isoforms):

  • Protocol-1
  • counting isoforms (Kallisto)
  • detect DU using counted isoforms (Sleuth)
  • more QC (aslo PCA/correlation heatmap)

Protocol-3 (crypic splicing):

  • Protocol-1
  • detect differential junstions (CrypSplice)

b. Download data

rsync -a hchen131@midway2.rcc.uchicago.edu:/project2/xuanyao/data/Morris_2021 ~/Desktop

c. Perform FastQC on all fastq files

# install fastqc
sudo apt-get update && sudo apt-get install fastqc -y

# run fastqc in parallel
cd ~/rotation2/
[ ! -d ../Morris_2021/fastqc_results/ ] && mkdir ../Morris_2021/fastqc_results/
fastqc --threads 16 ../Morris_2021/STINGseq_Morris_2021_raw/*.fastq.gz --outdir=../Morris_2021/fastqc_results/

Sample SRR14141135:

Sample SRR14141136:

Sample SRR14141137:

Sample SRR14141138:

Sample SRR14141139:

Sample SRR14141140:

Sample SRR14141141:

Sample SRR14141142:

Sample SRR14141143:

Sample SRR14141144:

Sample SRR14141145:

Sample SRR14141146:

d. A brief summary

  • samples: 12
  • length: 26bp or 57bp (trimmed?)
  • depth: 30-35x
  • overall quality: good (within ~40 bp)

3. Analyze QC’ed STING-seq data

Note:

  • cDNA, HTO (Hashing antibody-oligos), GDO (gRNAs)
  • ECCITE-seq:

a. Install R packages

sudo apt-get update && sudo apt-get install libgeos-dev -y
sudo -i R
installe.packages("Seurat")
q()

b. Import QC’ed data

f_exp_matrix <- Seurat::ReadMtx(mtx = "../Morris_2021/GSM5225857_cDNA.mtx", 
                                features = "../Morris_2021/GSM5225857_cDNA.features.txt", 
                                cells = "../Morris_2021/GSM5225857_cDNA.barcodes.txt", 
                                cell.column=1, 
                                feature.column=1, 
                                mtx.transpose=T)
f_grna_matrix <- Seurat::ReadMtx(mtx = "../Morris_2021/GSM5225858_GDO.mtx", 
                                 features = "../Morris_2021/GSM5225858_GDO.features.txt", 
                                 cells = "../Morris_2021/GSM5225858_GDO.barcodes.txt", 
                                 cell.column=1, 
                                 feature.column=1, 
                                 mtx.transpose=T)

sessionInfo()
R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] workflowr_1.7.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.8.3     bslib_0.3.1      compiler_4.2.0   pillar_1.7.0    
 [5] later_1.3.0      git2r_0.30.1     jquerylib_0.1.4  tools_4.2.0     
 [9] getPass_0.2-2    digest_0.6.29    jsonlite_1.8.0   evaluate_0.15   
[13] tibble_3.1.7     lifecycle_1.0.1  pkgconfig_2.0.3  rlang_1.0.2     
[17] cli_3.3.0        rstudioapi_0.13  yaml_2.3.5       xfun_0.31       
[21] fastmap_1.1.0    httr_1.4.3       stringr_1.4.0    knitr_1.39      
[25] sass_0.4.1       fs_1.5.2         vctrs_0.4.1      rprojroot_2.0.3 
[29] glue_1.6.2       R6_2.5.1         processx_3.6.0   fansi_1.0.3     
[33] rmarkdown_2.14   callr_3.7.0      magrittr_2.0.3   whisker_0.4     
[37] ps_1.7.0         promises_1.2.0.1 htmltools_0.5.2  ellipsis_0.3.2  
[41] httpuv_1.6.5     utf8_1.2.2       stringi_1.7.6    crayon_1.5.1