Last updated: 2022-05-12
Checks: 5 passed, 2 warnings
Knit directory: cTWAS_analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20211220) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
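As a small illustration of why the seed matters, the sketch below reuses the seed reported above and draws the same random subsample twice. The vector being sampled and the sample size are arbitrary choices for the example and are not part of the analysis.

```r
# Re-seeding before each draw reproduces the same "random" result.
set.seed(20211220)
first_draw <- sample(1:100, 5)

set.seed(20211220)
second_draw <- sample(1:100, 5)

# TRUE: both draws started from the same random number generator state
identical(first_draw, second_draw)
```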
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.
absolute | relative |
---|---|
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/data/ | data |
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/code/ctwas_config.R | code/ctwas_config.R |
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 011327d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .ipynb_checkpoints/
Ignored: data/AF/
Untracked files:
Untracked: G_list.RData
Untracked: Rplot.png
Untracked: SCZ_annotation.xlsx
Untracked: analysis/.ipynb_checkpoints/
Untracked: code/.ipynb_checkpoints/
Untracked: code/AF_out/
Untracked: code/Autism_out/
Untracked: code/BMI_S_out/
Untracked: code/BMI_out/
Untracked: code/Glucose_out/
Untracked: code/LDL_S_out/
Untracked: code/SCZ_2014_EUR_out/
Untracked: code/SCZ_2018_S_out/
Untracked: code/SCZ_2018_out/
Untracked: code/SCZ_2020_Single_out/
Untracked: code/SCZ_2020_out/
Untracked: code/SCZ_S_out/
Untracked: code/SCZ_out/
Untracked: code/T2D_out/
Untracked: code/ctwas_config.R
Untracked: code/mapping.R
Untracked: code/out/
Untracked: code/process_scz_2018_snps.R
Untracked: code/run_AF_analysis.sbatch
Untracked: code/run_AF_analysis.sh
Untracked: code/run_AF_ctwas_rss_LDR.R
Untracked: code/run_Autism_analysis.sbatch
Untracked: code/run_Autism_analysis.sh
Untracked: code/run_Autism_ctwas_rss_LDR.R
Untracked: code/run_BMI_analysis.sbatch
Untracked: code/run_BMI_analysis.sh
Untracked: code/run_BMI_analysis_S.sbatch
Untracked: code/run_BMI_analysis_S.sh
Untracked: code/run_BMI_ctwas_rss_LDR.R
Untracked: code/run_BMI_ctwas_rss_LDR_S.R
Untracked: code/run_Glucose_analysis.sbatch
Untracked: code/run_Glucose_analysis.sh
Untracked: code/run_Glucose_ctwas_rss_LDR.R
Untracked: code/run_LDL_analysis_S.sbatch
Untracked: code/run_LDL_analysis_S.sh
Untracked: code/run_LDL_ctwas_rss_LDR_S.R
Untracked: code/run_SCZ_2014_EUR_analysis.sbatch
Untracked: code/run_SCZ_2014_EUR_analysis.sh
Untracked: code/run_SCZ_2014_EUR_ctwas_rss_LDR.R
Untracked: code/run_SCZ_2018_analysis.sbatch
Untracked: code/run_SCZ_2018_analysis.sh
Untracked: code/run_SCZ_2018_analysis_S.sbatch
Untracked: code/run_SCZ_2018_analysis_S.sh
Untracked: code/run_SCZ_2018_ctwas_rss_LDR.R
Untracked: code/run_SCZ_2018_ctwas_rss_LDR_S.R
Untracked: code/run_SCZ_2020_Single_analysis.sbatch
Untracked: code/run_SCZ_2020_Single_analysis.sh
Untracked: code/run_SCZ_2020_Single_ctwas_rss_LDR.R
Untracked: code/run_SCZ_2020_analysis.sbatch
Untracked: code/run_SCZ_2020_analysis.sh
Untracked: code/run_SCZ_2020_ctwas_rss_LDR.R
Untracked: code/run_SCZ_analysis.sbatch
Untracked: code/run_SCZ_analysis.sh
Untracked: code/run_SCZ_analysis_S.sbatch
Untracked: code/run_SCZ_analysis_S.sh
Untracked: code/run_SCZ_ctwas_rss_LDR.R
Untracked: code/run_SCZ_ctwas_rss_LDR_S.R
Untracked: code/run_T2D_analysis.sbatch
Untracked: code/run_T2D_analysis.sh
Untracked: code/run_T2D_ctwas_rss_LDR.R
Untracked: code/wflow_build.R
Untracked: code/wflow_build.sbatch
Untracked: data/.ipynb_checkpoints/
Untracked: data/BMI/
Untracked: data/GO_Terms/
Untracked: data/PGC3_SCZ_wave3_public.v2.tsv
Untracked: data/SCZ/
Untracked: data/SCZ_2014_EUR/
Untracked: data/SCZ_2018/
Untracked: data/SCZ_2018_S/
Untracked: data/SCZ_2020/
Untracked: data/SCZ_2020_Single/
Untracked: data/SCZ_S/
Untracked: data/Supplementary Table 15 - MAGMA.xlsx
Untracked: data/Supplementary Table 20 - Prioritised Genes.xlsx
Untracked: data/T2D/
Untracked: data/UKBB/
Untracked: data/UKBB_SNPs_Info.text
Untracked: data/gene_OMIM.txt
Untracked: data/gene_pip_0.8.txt
Untracked: data/mashr_Heart_Atrial_Appendage.db
Untracked: data/mashr_sqtl/
Untracked: data/scz_2018.RDS
Untracked: data/summary_known_genes_annotations.xlsx
Untracked: data/untitled.txt
Untracked: top_genes_32.txt
Untracked: top_genes_37.txt
Untracked: top_genes_43.txt
Untracked: top_genes_81.txt
Untracked: z_snp_pos_SCZ.RData
Untracked: z_snp_pos_SCZ_2014_EUR.RData
Untracked: z_snp_pos_SCZ_2018.RData
Untracked: z_snp_pos_SCZ_2020.RData
Unstaged changes:
Deleted: analysis/BMI_S_results.Rmd
Modified: analysis/SCZ_2018_Brain_Amygdala_S.Rmd
Modified: analysis/SCZ_2018_Brain_Anterior_cingulate_cortex_BA24_S.Rmd
Modified: analysis/SCZ_2018_Brain_Caudate_basal_ganglia_S.Rmd
Modified: analysis/SCZ_2018_Brain_Cerebellar_Hemisphere_S.Rmd
Modified: analysis/SCZ_2018_Brain_Cerebellum_S.Rmd
Modified: analysis/SCZ_2018_Brain_Cortex_S.Rmd
Modified: analysis/SCZ_2018_Brain_Frontal_Cortex_BA9_S.Rmd
Modified: analysis/SCZ_2018_Brain_Hippocampus_S.Rmd
Modified: analysis/SCZ_2018_Brain_Hypothalamus_S.Rmd
Modified: analysis/SCZ_2018_Brain_Nucleus_accumbens_basal_ganglia_S.Rmd
Modified: analysis/SCZ_2018_Brain_Putamen_basal_ganglia_S.Rmd
Modified: analysis/SCZ_2018_Brain_Spinal_cord_cervical_c-1_S.Rmd
Modified: analysis/SCZ_2018_Brain_Substantia_nigra_S.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/SCZ_2018_Brain_Cortex_S.Rmd) and HTML (docs/SCZ_2018_Brain_Cortex_S.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 011327d | sq-96 | 2022-05-12 | update |
Rmd | 6c6abbd | sq-96 | 2022-05-12 | update |
library(reticulate)
use_python("/scratch/midway2/shengqian/miniconda3/envs/PythonForR/bin/python", required = TRUE)
#number of imputed weights
nrow(qclist_all)
[1] 23372
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
2106 1670 1401 900 973 1205 1349 834 978 1043 1384 1297 477 810 795 919
17 18 19 20 21 22
1718 307 1661 776 45 724
#number of imputed weights without missing variants
sum(qclist_all$nmiss==0)
[1] 20390
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.8724
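The summaries above are mutually consistent, which can be checked from the printed values alone. The sketch below does not use the original qclist_all object (it is not reproduced on this page); the variable names are introduced only for this check.

```r
# Per-chromosome counts copied from table(qclist_all$chr) above (chr 1 to 22)
per_chr <- c(2106, 1670, 1401, 900, 973, 1205, 1349, 834, 978, 1043, 1384,
             1297, 477, 810, 795, 919, 1718, 307, 1661, 776, 45, 724)

n_weights  <- 23372  # nrow(qclist_all), as printed above
n_complete <- 20390  # sum(qclist_all$nmiss == 0), as printed above

# The per-chromosome counts should add up to the total number of weights
sum(per_chr) == n_weights

# Proportion of imputed weights with no missing variants
prop_complete <- n_complete / n_weights
signif(prop_complete, 4)
```

About 13% of the imputed weights (2982 of 23372) therefore have at least one missing variant.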
finish
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
gene snp
0.0082338 0.0003052
gene snp
10.48 10.44
[1] 105318
[1] 7921 6309950
gene snp
0.006487 0.190885
[1] 0.02274 1.09222
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
3637 LRP8 1_33 1.2198 32.31 0.0003555 -5.003 10 10
6207 SLC8B1 12_68 1.2142 22.26 0.0002788 -4.047 10 10
3503 LINC00320 21_6 1.0957 28.55 0.0003103 5.336 5 5
5376 R3HDM2 12_36 1.0921 42.83 0.0004461 -6.634 7 9
2336 FAM177A1 14_9 0.9825 23.31 0.0001838 4.849 11 14
3423 LAMA5 20_36 0.9539 28.66 0.0002078 -4.269 16 21
4682 PAK6 15_14 0.9137 29.77 0.0002303 5.588 3 3
983 CAMKK2 12_74 0.9028 35.27 0.0002014 4.060 8 10
5328 PTPRF 1_27 0.8882 35.71 0.0002541 6.680 6 6
2665 GIGYF1 7_62 0.8820 26.71 0.0001958 -5.266 3 3
1600 CRTAP 3_24 0.8650 19.74 0.0001395 3.929 2 2
592 ATP2B2 3_8 0.8284 25.09 0.0001526 4.229 5 7
5364 PYROXD2 10_62 0.8166 23.48 0.0001259 3.755 11 15
1165 CD46 1_105 0.8156 19.39 0.0001137 -3.933 11 14
6316 SNRPA1 15_50 0.8033 20.88 0.0001262 -4.098 2 3
1899 DNAJB1 19_12 0.7951 20.11 0.0001097 4.009 5 7
4540 NTRK3 15_41 0.7867 23.89 0.0001304 4.457 3 3
317 ANAPC7 12_67 0.7853 39.18 0.0002002 6.385 6 6
115 ACTR1B 2_57 0.7830 20.03 0.0001145 3.978 5 5
4465 NPIPB14P 16_37 0.7707 18.23 0.0000945 3.742 13 17
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
5376 R3HDM2 12_36 1.0921 42.83 0.0004461 -6.634 7 9
3637 LRP8 1_33 1.2198 32.31 0.0003555 -5.003 10 10
3503 LINC00320 21_6 1.0957 28.55 0.0003103 5.336 5 5
6207 SLC8B1 12_68 1.2142 22.26 0.0002788 -4.047 10 10
666 BAG6 6_26 0.2068 637.42 0.0002588 11.590 9 9
417 APOM 6_26 0.2068 637.42 0.0002588 11.590 2 2
5328 PTPRF 1_27 0.8882 35.71 0.0002541 6.680 6 6
4682 PAK6 15_14 0.9137 29.77 0.0002303 5.588 3 3
1465 COA8 14_54 0.7266 43.77 0.0002146 7.429 4 7
5969 SF3B1 2_117 0.7200 45.12 0.0002121 7.053 3 3
3423 LAMA5 20_36 0.9539 28.66 0.0002078 -4.269 16 21
983 CAMKK2 12_74 0.9028 35.27 0.0002014 4.060 8 10
317 ANAPC7 12_67 0.7853 39.18 0.0002002 6.385 6 6
2665 GIGYF1 7_62 0.8820 26.71 0.0001958 -5.266 3 3
3191 IRF3 19_34 0.7132 39.57 0.0001907 -6.461 2 2
1825 DGKZ 11_28 0.6445 46.77 0.0001845 7.216 2 2
2336 FAM177A1 14_9 0.9825 23.31 0.0001838 4.849 11 14
256 AKT3 1_128 0.7349 34.40 0.0001688 -6.291 4 4
2616 GATAD2A 19_15 0.6131 46.80 0.0001634 -6.668 5 5
592 ATP2B2 3_8 0.8284 25.09 0.0001526 4.229 5 7
[1] 0.01944
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
4849 PGBD1 6_22 6.697e-02 158.26 2.166e-06 -13.087 3 5
417 APOM 6_26 2.068e-01 637.42 2.588e-04 11.590 2 2
666 BAG6 6_26 2.068e-01 637.42 2.588e-04 11.590 9 9
7388 VARS1 6_26 1.246e-01 638.44 9.411e-05 -11.548 1 1
1767 DDR1 6_25 1.135e-01 101.67 1.195e-05 -11.175 2 2
912 C6orf136 6_24 8.036e-02 79.81 4.893e-06 11.031 2 2
2492 FLOT1 6_24 1.604e-01 78.48 1.914e-05 10.981 5 6
811 BTN3A2 6_20 8.836e-02 91.92 4.950e-06 -10.759 5 5
808 BTN2A1 6_20 1.160e-01 83.45 4.270e-06 10.185 6 7
5168 PPT2 6_26 9.858e-12 474.58 4.379e-25 -10.061 5 5
2079 EGFL8 6_26 4.045e-12 473.96 7.332e-26 10.036 6 7
5236 PRRT1 6_26 3.440e-12 472.86 5.315e-26 -10.018 1 1
5674 RNF5 6_26 7.171e-13 467.32 2.282e-27 -9.714 1 1
2785 GPSM3 6_26 2.278e-13 424.06 2.090e-28 9.377 2 2
1121 CCHCR1 6_25 9.482e-02 62.94 2.689e-06 -9.376 11 15
7913 ZSCAN31 6_22 3.256e-02 55.91 2.849e-07 9.321 3 4
2960 HLA-DMB 6_27 4.018e-02 68.96 8.983e-07 8.860 2 2
7910 ZSCAN23 6_22 1.033e-02 45.88 4.652e-08 -8.541 1 1
4526 NT5C2 10_66 5.965e-01 47.55 1.509e-04 -8.511 11 15
7699 ZNF192P1 6_22 9.360e-03 51.75 4.305e-08 8.024 1 2
#number of genes for gene set enrichment
length(genes)
[1] 79
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
Term | Overlap | Adjusted.P.value | Genes |
---|---|---|---|
cadherin binding (GO:0045296) | 7/322 | 0.02739 | CAST;DNAJB1;DLG1;PDXDC1;PAK6;CD46;KTN1 |
Row | Description | FDR | Ratio | BgRatio |
---|---|---|---|---|
38 | Measles | 0.03514 | 1/43 | 1/9703 |
73 | Congenital absent nipple | 0.03514 | 1/43 | 1/9703 |
117 | Congenital absence of breast with absent nipple | 0.03514 | 1/43 | 1/9703 |
155 | Sporadic Breast Carcinoma | 0.03514 | 1/43 | 1/9703 |
159 | Primary peritoneal carcinoma | 0.03514 | 1/43 | 1/9703 |
167 | Osteogenesis Imperfecta Type VII | 0.03514 | 1/43 | 1/9703 |
168 | Familial encephalopathy with neuroserpin inclusion bodies | 0.03514 | 1/43 | 1/9703 |
173 | Retinitis Pigmentosa 46 | 0.03514 | 1/43 | 1/9703 |
174 | BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1 | 0.03514 | 1/43 | 1/9703 |
175 | BREAST CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1 | 0.03514 | 1/43 | 1/9703 |
Warning: replacing previous import 'lifecycle::last_warnings' by
'rlang::last_warnings' when loading 'hms'
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet, minNum =
minNum, : No significant gene set is identified based on FDR 0.05!
NULL
Warning: ggrepel: 42 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
#number of genes in known annotations
print(length(known_annotations))
[1] 130
#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))
[1] 55
#significance threshold for TWAS
print(sig_thresh)
[1] 4.516
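The printed threshold is consistent with a two-sided Bonferroni correction at a family-wise error rate of 0.05 across 7921 genes. This formula is an assumption inferred from the printed numbers; the code that actually computed sig_thresh is not shown on this page, and the gene count is taken from the first value of the "[1] 7921 6309950" output above on the further assumption that it is the number of genes.

```r
n_genes <- 7921  # assumed gene count, from the "[1] 7921 6309950" output
alpha   <- 0.05  # assumed family-wise error rate

# Two-sided Bonferroni threshold on the z-score scale:
# a gene is called significant by TWAS when |z| exceeds this value
sig_thresh_sketch <- qnorm(1 - alpha / n_genes / 2)
round(sig_thresh_sketch, 3)
```

Under these assumptions the result agrees with the printed 4.516 to three decimal places.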
#number of ctwas genes
length(ctwas_genes)
[1] 15
#number of TWAS genes
length(twas_genes)
[1] 154
#show novel genes (ctwas genes not in TWAS genes)
ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]
genename region_tag susie_pip mu2 PVE z num_intron num_sqtl
592 ATP2B2 3_8 0.8284 25.09 0.0001526 4.229 5 7
983 CAMKK2 12_74 0.9028 35.27 0.0002014 4.060 8 10
1165 CD46 1_105 0.8156 19.39 0.0001137 -3.933 11 14
1600 CRTAP 3_24 0.8650 19.74 0.0001395 3.929 2 2
3423 LAMA5 20_36 0.9539 28.66 0.0002078 -4.269 16 21
5364 PYROXD2 10_62 0.8166 23.48 0.0001259 3.755 11 15
6207 SLC8B1 12_68 1.2142 22.26 0.0002788 -4.047 10 10
6316 SNRPA1 15_50 0.8033 20.88 0.0001262 -4.098 2 3
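The gene counts above are consistent with cTWAS genes being those with susie_pip above 0.8 (15 rows of the PIP-sorted table exceed it) and TWAS genes being those with |z| above sig_thresh. That selection rule is inferred from the printed results, not taken from the analysis code. A minimal sketch on four rows copied from the tables above, with all other columns omitted:

```r
# Four rows copied from the result tables above
toy_res <- data.frame(
  genename  = c("LRP8", "CAMKK2", "NT5C2", "PTPRF"),
  susie_pip = c(1.2198, 0.9028, 0.5965, 0.8882),
  z         = c(-5.003, 4.060, -8.511, 6.680)
)

pip_thresh <- 0.8    # ASSUMED PIP cutoff for calling a cTWAS gene
sig_thresh <- 4.516  # TWAS threshold printed above

ctwas_genes <- toy_res$genename[toy_res$susie_pip > pip_thresh]
twas_genes  <- toy_res$genename[abs(toy_res$z) > sig_thresh]

# Novel genes: selected by cTWAS but not significant by TWAS
novel_genes <- setdiff(ctwas_genes, twas_genes)
novel_genes
```

In this toy subset only CAMKK2 is novel: its PIP is above the cutoff but its |z| of 4.060 falls below the TWAS threshold, which matches its presence in the novel-gene table.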
#sensitivity / recall
print(sensitivity)
ctwas TWAS
0.02308 0.11538
#specificity
print(specificity)
ctwas TWAS
0.9985 0.9823
#precision / PPV
print(precision)
ctwas TWAS
0.2000 0.0974
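The three metrics can be reproduced from the counts printed earlier. The sketch below rests on assumptions that are not stated on this page: the true-positive counts (3 for ctwas, 15 for TWAS) are back-solved from the printed precision, the negative set is taken to be the analysed genes that are not known annotations, and 7921 is assumed to be the number of analysed genes.

```r
n_genes         <- 7921  # ASSUMED gene count, from "[1] 7921 6309950"
n_known         <- 130   # length(known_annotations)
n_known_imputed <- 55    # known genes with imputed expression
n_selected      <- c(ctwas = 15, TWAS = 154)

# ASSUMPTION: true positives back-solved from the printed precision
true_pos <- c(ctwas = 3, TWAS = 15)

# Sensitivity uses all 130 known genes as the denominator
sensitivity <- true_pos / n_known
precision   <- true_pos / n_selected

# Negatives: analysed genes outside the known set
n_negative  <- n_genes - n_known_imputed
false_pos   <- n_selected - true_pos
specificity <- (n_negative - false_pos) / n_negative

signif(rbind(sensitivity, specificity, precision), 4)
```

If the denominator assumption holds, sensitivity cannot exceed 55/130 (about 0.42) for either method, because only 55 of the 130 known genes have imputed expression.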
sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.4.0 forcats_0.5.1 stringr_1.4.0 purrr_0.3.4
[5] readr_1.4.0 tidyr_1.1.3 tidyverse_1.3.1 tibble_3.1.7
[9] WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0 cowplot_1.1.1
[13] ggplot2_3.3.5 dplyr_1.0.7 reticulate_1.20 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] fs_1.5.0 lubridate_1.7.10 doParallel_1.0.16 httr_1.4.2
[5] rprojroot_2.0.2 tools_4.1.0 backports_1.2.1 doRNG_1.8.2
[9] bslib_0.2.5.1 utf8_1.2.1 R6_2.5.0 vipor_0.4.5
[13] DBI_1.1.1 colorspace_2.0-2 withr_2.4.2 ggrastr_1.0.1
[17] tidyselect_1.1.1 curl_4.3.2 compiler_4.1.0 git2r_0.28.0
[21] rvest_1.0.0 cli_3.0.0 Cairo_1.5-15 xml2_1.3.2
[25] labeling_0.4.2 sass_0.4.0 scales_1.1.1 systemfonts_1.0.4
[29] apcluster_1.4.9 digest_0.6.27 rmarkdown_2.9 svglite_2.0.0
[33] pkgconfig_2.0.3 htmltools_0.5.1.1 dbplyr_2.1.1 highr_0.9
[37] rlang_1.0.2 rstudioapi_0.13 jquerylib_0.1.4 farver_2.1.0
[41] generics_0.1.0 jsonlite_1.7.2 magrittr_2.0.1 Matrix_1.3-3
[45] ggbeeswarm_0.6.0 Rcpp_1.0.7 munsell_0.5.0 fansi_0.5.0
[49] lifecycle_1.0.0 stringi_1.6.2 whisker_0.4 yaml_2.2.1
[53] plyr_1.8.6 grid_4.1.0 ggrepel_0.9.1 parallel_4.1.0
[57] promises_1.2.0.1 crayon_1.4.1 lattice_0.20-44 haven_2.4.1
[61] hms_1.1.0 knitr_1.33 pillar_1.7.0 igraph_1.2.6
[65] rjson_0.2.20 rngtools_1.5 reshape2_1.4.4 codetools_0.2-18
[69] reprex_2.0.0 glue_1.4.2 evaluate_0.14 data.table_1.14.0
[73] modelr_0.1.8 png_0.1-7 vctrs_0.3.8 httpuv_1.6.1
[77] foreach_1.5.1 cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
[81] xfun_0.24 broom_0.7.8 later_1.2.0 iterators_1.0.13
[85] beeswarm_0.4.0 ellipsis_0.3.2