Last updated: 2022-05-12
Checks: 5 passed, 2 warnings
Knit directory: cTWAS_analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility, it’s best to always run the code in an empty environment.
The command set.seed(20211220) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
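As a minimal illustration of why the seed matters, the sketch below resets the same seed before two random draws and checks that they match. The `sample()` call and the variable names are hypothetical and are not part of this analysis.

```r
# Hypothetical example: the same seed yields the same random subsample.
set.seed(20211220)
first_draw <- sample(1:100, 5)

# Resetting the seed restores the random number generator state.
set.seed(20211220)
second_draw <- sample(1:100, 5)

# Both draws are identical because they start from the same RNG state.
identical(first_draw, second_draw)
```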
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.
absolute | relative |
---|---|
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/data/ | data |
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/code/ctwas_config.R | code/ctwas_config.R |
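A minimal sketch of the suggested change, assuming the code runs from the project root (the knit directory listed above). The variable names `data_dir` and `config_file` are illustrative and may not match the names used in the R Markdown file.

```r
# Build paths relative to the project root instead of hard-coding
# the /project2/... prefix, so the code can run on another machine.
data_dir <- "data"
config_file <- file.path("code", "ctwas_config.R")

# Report whether the config file is visible from the current directory.
if (file.exists(config_file)) {
  message("Found config at: ", config_file)
} else {
  message("Config not found; check the working directory: ", getwd())
}
```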
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 011327d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .ipynb_checkpoints/
Ignored: data/AF/
Untracked files:
Untracked: G_list.RData
Untracked: Rplot.png
Untracked: SCZ_annotation.xlsx
Untracked: analysis/.ipynb_checkpoints/
Untracked: code/.ipynb_checkpoints/
Untracked: code/AF_out/
Untracked: code/Autism_out/
Untracked: code/BMI_S_out/
Untracked: code/BMI_out/
Untracked: code/Glucose_out/
Untracked: code/LDL_S_out/
Untracked: code/SCZ_2014_EUR_out/
Untracked: code/SCZ_2018_S_out/
Untracked: code/SCZ_2018_out/
Untracked: code/SCZ_2020_Single_out/
Untracked: code/SCZ_2020_out/
Untracked: code/SCZ_S_out/
Untracked: code/SCZ_out/
Untracked: code/T2D_out/
Untracked: code/ctwas_config.R
Untracked: code/mapping.R
Untracked: code/out/
Untracked: code/process_scz_2018_snps.R
Untracked: code/run_AF_analysis.sbatch
Untracked: code/run_AF_analysis.sh
Untracked: code/run_AF_ctwas_rss_LDR.R
Untracked: code/run_Autism_analysis.sbatch
Untracked: code/run_Autism_analysis.sh
Untracked: code/run_Autism_ctwas_rss_LDR.R
Untracked: code/run_BMI_analysis.sbatch
Untracked: code/run_BMI_analysis.sh
Untracked: code/run_BMI_analysis_S.sbatch
Untracked: code/run_BMI_analysis_S.sh
Untracked: code/run_BMI_ctwas_rss_LDR.R
Untracked: code/run_BMI_ctwas_rss_LDR_S.R
Untracked: code/run_Glucose_analysis.sbatch
Untracked: code/run_Glucose_analysis.sh
Untracked: code/run_Glucose_ctwas_rss_LDR.R
Untracked: code/run_LDL_analysis_S.sbatch
Untracked: code/run_LDL_analysis_S.sh
Untracked: code/run_LDL_ctwas_rss_LDR_S.R
Untracked: code/run_SCZ_2014_EUR_analysis.sbatch
Untracked: code/run_SCZ_2014_EUR_analysis.sh
Untracked: code/run_SCZ_2014_EUR_ctwas_rss_LDR.R
Untracked: code/run_SCZ_2018_analysis.sbatch
Untracked: code/run_SCZ_2018_analysis.sh
Untracked: code/run_SCZ_2018_analysis_S.sbatch
Untracked: code/run_SCZ_2018_analysis_S.sh
Untracked: code/run_SCZ_2018_ctwas_rss_LDR.R
Untracked: code/run_SCZ_2018_ctwas_rss_LDR_S.R
Untracked: code/run_SCZ_2020_Single_analysis.sbatch
Untracked: code/run_SCZ_2020_Single_analysis.sh
Untracked: code/run_SCZ_2020_Single_ctwas_rss_LDR.R
Untracked: code/run_SCZ_2020_analysis.sbatch
Untracked: code/run_SCZ_2020_analysis.sh
Untracked: code/run_SCZ_2020_ctwas_rss_LDR.R
Untracked: code/run_SCZ_analysis.sbatch
Untracked: code/run_SCZ_analysis.sh
Untracked: code/run_SCZ_analysis_S.sbatch
Untracked: code/run_SCZ_analysis_S.sh
Untracked: code/run_SCZ_ctwas_rss_LDR.R
Untracked: code/run_SCZ_ctwas_rss_LDR_S.R
Untracked: code/run_T2D_analysis.sbatch
Untracked: code/run_T2D_analysis.sh
Untracked: code/run_T2D_ctwas_rss_LDR.R
Untracked: code/wflow_build.R
Untracked: code/wflow_build.sbatch
Untracked: data/.ipynb_checkpoints/
Untracked: data/BMI/
Untracked: data/GO_Terms/
Untracked: data/PGC3_SCZ_wave3_public.v2.tsv
Untracked: data/SCZ/
Untracked: data/SCZ_2014_EUR/
Untracked: data/SCZ_2018/
Untracked: data/SCZ_2018_S/
Untracked: data/SCZ_2020/
Untracked: data/SCZ_2020_Single/
Untracked: data/SCZ_S/
Untracked: data/Supplementary Table 15 - MAGMA.xlsx
Untracked: data/Supplementary Table 20 - Prioritised Genes.xlsx
Untracked: data/T2D/
Untracked: data/UKBB/
Untracked: data/UKBB_SNPs_Info.text
Untracked: data/gene_OMIM.txt
Untracked: data/gene_pip_0.8.txt
Untracked: data/mashr_Heart_Atrial_Appendage.db
Untracked: data/mashr_sqtl/
Untracked: data/scz_2018.RDS
Untracked: data/summary_known_genes_annotations.xlsx
Untracked: data/untitled.txt
Untracked: top_genes_32.txt
Untracked: top_genes_37.txt
Untracked: top_genes_43.txt
Untracked: top_genes_81.txt
Untracked: z_snp_pos_SCZ.RData
Untracked: z_snp_pos_SCZ_2014_EUR.RData
Untracked: z_snp_pos_SCZ_2018.RData
Untracked: z_snp_pos_SCZ_2020.RData
Unstaged changes:
Deleted: analysis/BMI_S_results.Rmd
Modified: analysis/SCZ_2018_Brain_Amygdala_S.Rmd
Modified: analysis/SCZ_2018_Brain_Anterior_cingulate_cortex_BA24_S.Rmd
Modified: analysis/SCZ_2018_Brain_Caudate_basal_ganglia_S.Rmd
Modified: analysis/SCZ_2018_Brain_Cerebellar_Hemisphere_S.Rmd
Modified: analysis/SCZ_2018_Brain_Cerebellum_S.Rmd
Modified: analysis/SCZ_2018_Brain_Cortex_S.Rmd
Modified: analysis/SCZ_2018_Brain_Frontal_Cortex_BA9_S.Rmd
Modified: analysis/SCZ_2018_Brain_Hippocampus_S.Rmd
Modified: analysis/SCZ_2018_Brain_Hypothalamus_S.Rmd
Modified: analysis/SCZ_2018_Brain_Nucleus_accumbens_basal_ganglia_S.Rmd
Modified: analysis/SCZ_2018_Brain_Putamen_basal_ganglia_S.Rmd
Modified: analysis/SCZ_2018_Brain_Spinal_cord_cervical_c-1_S.Rmd
Modified: analysis/SCZ_2018_Brain_Substantia_nigra_S.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/SCZ_2018_Brain_Hippocampus_S.Rmd) and HTML (docs/SCZ_2018_Brain_Hippocampus_S.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 011327d | sq-96 | 2022-05-12 | update |
Rmd | 6c6abbd | sq-96 | 2022-05-12 | update |
library(reticulate)
use_python("/scratch/midway2/shengqian/miniconda3/envs/PythonForR/bin/python", required = TRUE)
#number of imputed weights
nrow(qclist_all)
[1] 17848
#number of imputed weights by chromosome
table(qclist_all$chr)
   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
1685 1258 1054  701  707  931 1057  622  737  814 1072  981  359  635  616  686
  17   18   19   20   21   22
1225  243 1275  611   30  549
#number of imputed weights without missing variants
sum(qclist_all$nmiss==0)
[1] 15888
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.8902
finish
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
gene snp
0.0079441 0.0003125
gene snp
13.20 10.19
[1] 105318
[1] 7021 6309950
gene snp
0.00699 0.19087
[1] 0.01552 1.07050
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
3251 LRP8 1_33 1.1226 33.65 0.0003019 4.820 5 5
2405 GIGYF1 7_62 1.0221 34.53 0.0002916 5.266 3 3
2126 FAM177A1 14_9 1.0157 24.42 0.0002097 -4.872 11 13
3236 LPCAT4 15_10 0.9460 25.64 0.0002120 4.892 3 3
3155 LINC00320 21_6 0.9267 29.45 0.0002312 5.336 3 3
587 B3GAT1 11_84 0.8969 23.77 0.0001658 -4.448 8 12
4180 PAK6 15_14 0.8895 30.73 0.0002308 5.588 1 1
1450 CRTAP 3_24 0.8847 20.12 0.0001488 3.929 2 2
3327 MAD1L1 7_3 0.8812 69.62 0.0003687 8.182 6 7
6037 THAP8 19_25 0.8507 19.76 0.0001358 -3.847 2 2
4225 PCBP2 12_33 0.8426 26.30 0.0001773 -4.953 2 2
1660 DGKZ 11_28 0.8422 48.30 0.0003253 7.216 2 2
6256 TPGS2 18_19 0.8198 28.26 0.0001709 -4.088 4 4
1322 COA8 14_54 0.7858 46.02 0.0002582 -7.407 7 10
4042 NT5C2 10_66 0.7796 48.83 0.0002566 -8.541 11 13
112 ACTR1B 2_57 0.7780 20.17 0.0001136 3.978 5 5
298 ANAPC7 12_67 0.7579 38.23 0.0001921 6.385 4 4
5322 SF3B1 2_117 0.7509 46.51 0.0002443 -7.053 2 2
1799 DPYSL3 5_86 0.7392 22.59 0.0001172 -4.157 1 1
6768 ZDHHC20 13_2 0.7325 23.83 0.0001200 -4.615 2 2
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
3327 MAD1L1 7_3 0.8812 69.62 0.0003687 8.182 6 7
1660 DGKZ 11_28 0.8422 48.30 0.0003253 7.216 2 2
3251 LRP8 1_33 1.1226 33.65 0.0003019 4.820 5 5
2405 GIGYF1 7_62 1.0221 34.53 0.0002916 5.266 3 3
1322 COA8 14_54 0.7858 46.02 0.0002582 -7.407 7 10
4042 NT5C2 10_66 0.7796 48.83 0.0002566 -8.541 11 13
5322 SF3B1 2_117 0.7509 46.51 0.0002443 -7.053 2 2
3155 LINC00320 21_6 0.9267 29.45 0.0002312 5.336 3 3
4180 PAK6 15_14 0.8895 30.73 0.0002308 5.588 1 1
3236 LPCAT4 15_10 0.9460 25.64 0.0002120 4.892 3 3
2126 FAM177A1 14_9 1.0157 24.42 0.0002097 -4.872 11 13
298 ANAPC7 12_67 0.7579 38.23 0.0001921 6.385 4 4
4225 PCBP2 12_33 0.8426 26.30 0.0001773 -4.953 2 2
6256 TPGS2 18_19 0.8198 28.26 0.0001709 -4.088 4 4
587 B3GAT1 11_84 0.8969 23.77 0.0001658 -4.448 8 12
1450 CRTAP 3_24 0.8847 20.12 0.0001488 3.929 2 2
2318 FXR1 3_111 0.5794 44.40 0.0001380 6.837 4 4
6037 THAP8 19_25 0.8507 19.76 0.0001358 -3.847 2 2
6768 ZDHHC20 13_2 0.7325 23.83 0.0001200 -4.615 2 2
1667 DHPS 19_10 0.7106 24.82 0.0001190 -4.396 1 1
[1] 0.01837
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
3286 LSM2 6_26 9.745e-05 222.29 2.004e-11 -11.599 1 1
603 BAG6 6_26 1.162e-04 221.94 2.767e-11 -11.590 6 7
1604 DDR1 6_25 1.826e-01 105.59 3.276e-05 11.175 2 2
823 C6orf136 6_24 1.015e-01 82.59 8.072e-06 -11.031 2 2
2263 FLOT1 6_24 2.106e-01 81.23 3.399e-05 -10.981 6 6
724 BTN3A2 6_20 1.394e-01 92.71 8.680e-06 -10.665 5 5
4603 PPT2 6_26 3.691e-05 152.97 1.889e-12 -10.061 5 5
1892 EGFL8 6_26 2.777e-05 142.24 1.029e-12 -9.625 3 4
1014 CCHCR1 6_25 3.239e-02 68.64 3.490e-07 -9.508 6 9
2509 GPSM3 6_26 2.168e-06 124.08 5.539e-15 9.377 1 1
7016 ZSCAN31 6_22 1.725e-02 58.35 9.581e-08 -9.321 2 2
2674 HLA-DMA 6_27 1.319e-01 69.70 4.883e-06 8.727 6 10
4042 NT5C2 10_66 7.796e-01 48.83 2.566e-04 -8.541 11 13
3327 MAD1L1 7_3 8.812e-01 69.62 3.687e-04 8.182 6 7
6349 TSNARE1 8_93 3.299e-02 53.87 3.712e-07 7.961 4 6
4314 PGBD1 6_22 3.918e-02 40.35 3.080e-07 -7.746 2 2
722 BTN2A1 6_20 4.151e-02 51.43 3.563e-07 -7.727 3 3
7014 ZSCAN26 6_22 3.226e-02 38.78 2.883e-07 7.631 4 4
723 BTN3A1 6_20 5.359e-02 47.80 4.827e-07 7.490 4 4
700 BRD2 6_27 2.136e-01 46.77 1.498e-05 7.455 6 7
#number of genes for gene set enrichment
length(genes)
[1] 50
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
row | Term | Overlap | Adjusted.P.value | Genes |
---|---|---|---|---|
1 | protein phosphatase type 2A complex (GO:0000159) | 2/17 | 0.04714 | PPP2R5B;PPP2R2A |
2 | microtubule cytoskeleton (GO:0015630) | 5/331 | 0.04714 | DYNC1I2;ACTR1B;ANAPC7;KIF21B;MAD1L1 |
[1] "GO_Molecular_Function_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
row | Description | FDR | Ratio | BgRatio |
---|---|---|---|---|
129 | Abnormality of head or neck | 0.006894 | 2/22 | 5/9703 |
7 | Adenocarcinoma of prostate | 0.016828 | 2/22 | 20/9703 |
29 | Measles | 0.016828 | 1/22 | 1/9703 |
48 | Electroencephalogram abnormal | 0.016828 | 1/22 | 1/9703 |
97 | Sporadic Breast Carcinoma | 0.016828 | 1/22 | 1/9703 |
102 | Primary peritoneal carcinoma | 0.016828 | 1/22 | 1/9703 |
104 | Osteogenesis Imperfecta Type VII | 0.016828 | 1/22 | 1/9703 |
106 | BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1 | 0.016828 | 1/22 | 1/9703 |
107 | BREAST CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1 | 0.016828 | 1/22 | 1/9703 |
108 | OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1 | 0.016828 | 1/22 | 1/9703 |
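To make the Ratio and BgRatio columns concrete, here is a sketch of how an overlap such as 2/22 against 5/9703 could be scored with a one-sided hypergeometric test. This is an assumption for illustration: the page does not show which test produced the FDR column, and the reading of Ratio and BgRatio as (overlap / annotated input genes) and (term size / background size) is also assumed.

```r
# Assumption: enrichment is scored with a one-sided hypergeometric test.
# Numbers are taken from the "Abnormality of head or neck" row.
overlap      <- 2     # input genes annotated to the term (Ratio numerator)
n_input      <- 22    # input genes with any annotation (Ratio denominator)
term_size    <- 5     # background genes annotated to the term (BgRatio numerator)
n_background <- 9703  # background size (BgRatio denominator)

# P(X >= overlap) when drawing n_input genes without replacement.
# phyper(q, ..., lower.tail = FALSE) returns P(X > q), hence overlap - 1.
p_value <- phyper(overlap - 1, term_size, n_background - term_size,
                  n_input, lower.tail = FALSE)
p_value
```

The result is a raw p-value for a single term, so it is not expected to equal the FDR column, which is adjusted for multiple testing across all terms.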
Warning: replacing previous import 'lifecycle::last_warnings' by
'rlang::last_warnings' when loading 'hms'
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet, minNum =
minNum, : No significant gene set is identified based on FDR 0.05!
NULL
Warning: ggrepel: 3 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
#number of genes in known annotations
print(length(known_annotations))
[1] 130
#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))
[1] 54
#significance threshold for TWAS
print(sig_thresh)
[1] 4.49
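The definition of sig_thresh is not shown on this page. The value 4.49 is consistent with a two-sided Bonferroni cutoff at 0.05 across 7021 genes (the first value of the pair 7021 6309950 printed earlier, assumed here to be the gene count). The sketch below reproduces that calculation under those assumptions; `n_genes`, `alpha`, and `sig_thresh_sketch` are illustrative names.

```r
# Assumption: sig_thresh is a two-sided Bonferroni z-score cutoff.
n_genes <- 7021   # assumed number of genes tested
alpha   <- 0.05   # assumed family-wise error rate

# z-score whose two-sided p-value equals alpha / n_genes
sig_thresh_sketch <- qnorm(1 - alpha / n_genes / 2)
round(sig_thresh_sketch, 2)
```

Under this reading, a gene counts as a TWAS hit when the absolute value of its z-score exceeds the cutoff.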
#number of ctwas genes
length(ctwas_genes)
[1] 13
#number of TWAS genes
length(twas_genes)
[1] 129
#show novel genes (ctwas genes not in TWAS genes)
ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
587 B3GAT1 11_84 0.8969 23.77 0.0001658 -4.448 8 12
1450 CRTAP 3_24 0.8847 20.12 0.0001488 3.929 2 2
6037 THAP8 19_25 0.8507 19.76 0.0001358 -3.847 2 2
6256 TPGS2 18_19 0.8198 28.26 0.0001709 -4.088 4 4
#sensitivity / recall
print(sensitivity)
ctwas TWAS
0.03077 0.13846
#specificity
print(specificity)
ctwas TWAS
0.9987 0.9841
#precision / PPV
print(precision)
ctwas TWAS
0.3077 0.1395
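The code that computes these metrics is not shown, so the following is a sketch of one way to obtain sensitivity, specificity, and precision from sets of gene names. The function `gene_metrics` and its arguments are hypothetical, and the gene names are toy data. The choice of denominators (all known genes for sensitivity, tested genes outside the known set for specificity) is an assumption, although it is consistent with the printed values, e.g. 0.3077 and 0.03077 match 4/13 and 4/130.

```r
# Sketch of the three metrics from sets of gene names.
# `detected`, `known`, and `tested` are hypothetical argument names.
gene_metrics <- function(detected, known, tested) {
  tp <- length(intersect(detected, known))     # detected genes that are known
  negatives <- setdiff(tested, known)          # tested genes that are not known
  tn <- length(setdiff(negatives, detected))   # negatives that were not detected
  c(sensitivity = tp / length(known),
    specificity = tn / length(negatives),
    precision   = tp / length(detected))
}

# Toy data, not from this analysis.
tested   <- paste0("G", 1:20)
known    <- c("G1", "G2", "G3", "G4", "K5")   # K5 was never tested
detected <- c("G1", "G2", "G10")

metrics <- gene_metrics(detected, known, tested)
metrics
```

Because sensitivity here divides by every known gene, including those without imputed expression, it is capped below 1 whenever some known genes were never tested.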
sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.4.0 forcats_0.5.1 stringr_1.4.0 purrr_0.3.4
[5] readr_1.4.0 tidyr_1.1.3 tidyverse_1.3.1 tibble_3.1.7
[9] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0 cowplot_1.1.1
[13] ggplot2_3.3.5 dplyr_1.0.7 reticulate_1.20 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] fs_1.5.0 lubridate_1.7.10 doParallel_1.0.16 httr_1.4.2
[5] rprojroot_2.0.2 tools_4.1.0 backports_1.2.1 doRNG_1.8.2
[9] bslib_0.2.5.1 utf8_1.2.1 R6_2.5.0 vipor_0.4.5
[13] DBI_1.1.1 colorspace_2.0-2 withr_2.4.2 ggrastr_1.0.1
[17] tidyselect_1.1.1 curl_4.3.2 compiler_4.1.0 git2r_0.28.0
[21] rvest_1.0.0 cli_3.0.0 Cairo_1.5-15 xml2_1.3.2
[25] labeling_0.4.2 sass_0.4.0 scales_1.1.1 systemfonts_1.0.4
[29] apcluster_1.4.9 digest_0.6.27 rmarkdown_2.9 svglite_2.0.0
[33] pkgconfig_2.0.3 htmltools_0.5.1.1 dbplyr_2.1.1 highr_0.9
[37] rlang_1.0.2 rstudioapi_0.13 jquerylib_0.1.4 farver_2.1.0
[41] generics_0.1.0 jsonlite_1.7.2 magrittr_2.0.1 Matrix_1.3-3
[45] ggbeeswarm_0.6.0 Rcpp_1.0.7 munsell_0.5.0 fansi_0.5.0
[49] lifecycle_1.0.0 stringi_1.6.2 whisker_0.4 yaml_2.2.1
[53] plyr_1.8.6 grid_4.1.0 ggrepel_0.9.1 parallel_4.1.0
[57] promises_1.2.0.1 crayon_1.4.1 lattice_0.20-44 haven_2.4.1
[61] hms_1.1.0 knitr_1.33 pillar_1.7.0 igraph_1.2.6
[65] rjson_0.2.20 rngtools_1.5 reshape2_1.4.4 codetools_0.2-18
[69] reprex_2.0.0 glue_1.4.2 evaluate_0.14 data.table_1.14.0
[73] modelr_0.1.8 png_0.1-7 vctrs_0.3.8 httpuv_1.6.1
[77] foreach_1.5.1 cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
[81] xfun_0.24 broom_0.7.8 later_1.2.0 iterators_1.0.13
[85] beeswarm_0.4.0 ellipsis_0.3.2