Last updated: 2022-02-21
Checks: 6 passed, 1 warning
Knit directory: cTWAS_analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20211220) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.
absolute | relative |
---|---|
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/data/ | data |
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/code/ctwas_config.R | code/ctwas_config.R |
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version bbf6737. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .ipynb_checkpoints/
Untracked files:
Untracked: Rplot.png
Untracked: analysis/Glucose_Adipose_Subcutaneous.Rmd
Untracked: analysis/Glucose_Adipose_Visceral_Omentum.Rmd
Untracked: analysis/Splicing_Test.Rmd
Untracked: code/.ipynb_checkpoints/
Untracked: code/AF_out/
Untracked: code/BMI_S_out/
Untracked: code/BMI_out/
Untracked: code/Glucose_out/
Untracked: code/LDL_S_out/
Untracked: code/T2D_out/
Untracked: code/ctwas_config.R
Untracked: code/mapping.R
Untracked: code/out/
Untracked: code/run_AF_analysis.sbatch
Untracked: code/run_AF_analysis.sh
Untracked: code/run_AF_ctwas_rss_LDR.R
Untracked: code/run_BMI_analysis.sbatch
Untracked: code/run_BMI_analysis.sh
Untracked: code/run_BMI_analysis_S.sbatch
Untracked: code/run_BMI_analysis_S.sh
Untracked: code/run_BMI_ctwas_rss_LDR.R
Untracked: code/run_BMI_ctwas_rss_LDR_S.R
Untracked: code/run_Glucose_analysis.sbatch
Untracked: code/run_Glucose_analysis.sh
Untracked: code/run_Glucose_ctwas_rss_LDR.R
Untracked: code/run_LDL_analysis_S.sbatch
Untracked: code/run_LDL_analysis_S.sh
Untracked: code/run_LDL_ctwas_rss_LDR_S.R
Untracked: code/run_T2D_analysis.sbatch
Untracked: code/run_T2D_analysis.sh
Untracked: code/run_T2D_ctwas_rss_LDR.R
Untracked: data/.ipynb_checkpoints/
Untracked: data/AF/
Untracked: data/BMI/
Untracked: data/BMI_S/
Untracked: data/Glucose/
Untracked: data/LDL_S/
Untracked: data/T2D/
Untracked: data/TEST/
Untracked: data/UKBB/
Untracked: data/UKBB_SNPs_Info.text
Untracked: data/gene_OMIM.txt
Untracked: data/gene_pip_0.8.txt
Untracked: data/mashr_Heart_Atrial_Appendage.db
Untracked: data/mashr_sqtl/
Untracked: data/summary_known_genes_annotations.xlsx
Untracked: data/untitled.txt
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/BMI_Brain_Cortex.Rmd) and HTML (docs/BMI_Brain_Cortex.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 9824912 | sq-96 | 2022-02-20 | Build site. |
Rmd | 43d1820 | sq-96 | 2022-02-20 | update |
html | 1bdb351 | sq-96 | 2022-02-14 | Build site. |
html | 376c5ad | sq-96 | 2022-02-14 | Build site. |
Rmd | 13a0188 | sq-96 | 2022-02-14 | update |
html | 91f38fa | sq-96 | 2022-02-13 | Build site. |
Rmd | eb13ecf | sq-96 | 2022-02-13 | update |
html | e6bc169 | sq-96 | 2022-02-13 | Build site. |
Rmd | 87fee8b | sq-96 | 2022-02-13 | update |
#number of imputed weights
nrow(qclist_all)
[1] 11768
#number of imputed weights by chromosome
table(qclist_all$chr)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
1184 820 686 454 547 667 562 437 448 483 708 635 232 389 392 543
17 18 19 20 21 22
711 182 893 367 129 299
#number of imputed weights without missing variants
sum(qclist_all$nmiss==0)
[1] 9186
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.7806
#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
gene snp
0.0081589 0.0002882
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
gene snp
24.27 17.33
#report sample size
print(sample_size)
[1] 336107
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1] 11768 7535010
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
gene snp
0.006933 0.111980
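The group PVE above can be re-derived by hand from the printed estimates. The following is a minimal sketch that copies the rounded values from the output (so small rounding differences are expected); the variable names ending in `_check` are illustrative and do not appear in the original analysis.

```r
# Values copied from the printed output above (rounded as displayed)
prior_check     <- c(gene = 0.0081589, snp = 0.0002882)  # estimated group prior
prior_var_check <- c(gene = 24.27,     snp = 17.33)      # estimated group prior variance
size_check      <- c(gene = 11768,     snp = 7535010)    # group size
n_check         <- 336107                                # sample size

# Same formula as estimated_group_pve: prior variance * prior * group size / sample size
pve_check <- prior_var_check * prior_check * size_check / n_check
print(signif(pve_check, 4))
```

Because the inputs are the rounded printed values, the SNP figure may differ from the reported 0.111980 in the fifth decimal place.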
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.1376 17.8636
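The comparison above sums a per-element PVE of the form PIP*mu2/sample_size. As a rough check, this sketch recomputes that quantity for two genes using values copied from the gene tables in this report; the data frame and the `PVE_check` column are illustrative names, not objects from the original code.

```r
# Sample size and per-gene values copied from the printed output in this report
sample_size_check <- 336107
genes_check <- data.frame(
  genename  = c("PPM1M", "MAPK6"),
  susie_pip = c(1.00000, 0.98993),
  mu2       = c(515.52, 27658.60)
)

# Per-gene PVE as described in the comment: PIP * mu2 / sample size
genes_check$PVE_check <- genes_check$susie_pip * genes_check$mu2 / sample_size_check
print(genes_check)
```

On these printed values, MAPK6 alone accounts for roughly 0.081 of the 0.1376 gene-level total.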
Top 20 genes ranked by susie_pip (PIP):
genename region_tag susie_pip mu2 PVE z num_eqtl
7845 PPM1M 3_36 1.0000 515.52 1.534e-03 4.386 2
766 MAPK6 15_21 0.9899 27658.60 8.146e-02 -4.662 1
3469 CCND2 12_4 0.9597 28.83 8.231e-05 -5.088 2
6507 TMEM219 16_24 0.9251 602.84 1.659e-03 12.063 1
9172 EFEMP2 11_36 0.9141 105.05 2.857e-04 -8.201 1
10518 MAPK11 22_24 0.9042 26.69 7.180e-05 -4.904 1
13972 NOL12 22_15 0.9030 62.83 1.688e-04 -4.505 2
646 NTHL1 16_2 0.8587 31.15 7.958e-05 5.296 1
6673 TADA1 1_82 0.7945 23.83 5.633e-05 -4.112 3
3936 XPO5 6_33 0.7575 36.46 8.218e-05 5.843 1
6116 ECE2 3_113 0.7479 30.25 6.730e-05 -5.315 1
10911 SKOR1 15_31 0.7418 54.44 1.202e-04 -9.754 1
4803 YWHAQ 2_6 0.7373 26.28 5.766e-05 4.911 1
13989 HIST1H2BE 6_20 0.7371 29.08 6.378e-05 -6.515 1
1413 CBX5 12_33 0.7103 25.61 5.411e-05 -4.691 1
8996 MRPL36 5_2 0.7098 22.91 4.838e-05 -4.294 2
8986 KCNK3 2_16 0.7052 48.31 1.014e-04 6.753 1
4701 CSNK1G2 19_2 0.6893 31.16 6.390e-05 -5.493 1
7344 NLRX1 11_71 0.6785 27.73 5.598e-05 5.171 2
8133 METTL3 14_2 0.6713 24.53 4.899e-05 -4.435 1
Top 20 genes ranked by mu2:
genename region_tag susie_pip mu2 PVE z num_eqtl
11 SEMA3F 3_35 0.000e+00 74412 0.000e+00 7.582 1
11131 C6orf106 6_28 0.000e+00 66708 0.000e+00 -9.175 2
7839 CAMKV 3_35 0.000e+00 53036 0.000e+00 -9.848 1
3949 SPDEF 6_28 0.000e+00 52296 0.000e+00 -9.270 1
642 TAF11 6_28 0.000e+00 50843 0.000e+00 -4.738 1
8020 CCDC171 9_13 0.000e+00 50628 0.000e+00 7.997 1
2226 PIK3R2 19_16 0.000e+00 47199 0.000e+00 -7.140 1
30 RBM5 3_35 0.000e+00 42354 0.000e+00 12.473 1
40 RBM6 3_35 0.000e+00 40962 0.000e+00 12.536 1
143 NADK 1_2 0.000e+00 40719 0.000e+00 5.478 2
7012 ZNF689 16_24 1.958e-13 39221 2.285e-14 6.014 1
7841 MST1R 3_35 0.000e+00 34975 0.000e+00 -12.626 1
2238 TMEM59L 19_16 0.000e+00 29070 0.000e+00 6.060 2
766 MAPK6 15_21 9.899e-01 27659 8.146e-02 -4.662 1
5570 LYSMD2 15_21 0.000e+00 26175 0.000e+00 -4.403 1
5565 MFAP1 15_16 5.408e-05 23671 3.809e-06 4.303 1
4963 HEY2 6_84 0.000e+00 23331 0.000e+00 3.066 1
7835 RNF123 3_35 0.000e+00 23172 0.000e+00 -10.957 1
1475 MAST3 19_16 0.000e+00 22329 0.000e+00 5.994 1
12095 CKMT1A 15_16 0.000e+00 21648 0.000e+00 4.119 2
Top 20 genes ranked by PVE:
genename region_tag susie_pip mu2 PVE z num_eqtl
766 MAPK6 15_21 0.98993 27658.60 8.146e-02 -4.662 1
7837 MFSD8 4_84 0.50000 7628.12 1.135e-02 2.512 1
7838 ABHD18 4_84 0.50000 7628.12 1.135e-02 -2.512 1
3146 LANCL1 2_124 0.47751 4728.54 6.718e-03 -3.535 1
291 CPS1 2_124 0.47751 4728.54 6.718e-03 -3.535 1
6507 TMEM219 16_24 0.92505 602.84 1.659e-03 12.063 1
7845 PPM1M 3_36 1.00000 515.52 1.534e-03 4.386 2
11102 TTC30B 2_107 0.38049 762.78 8.635e-04 -3.137 1
9172 EFEMP2 11_36 0.91413 105.05 2.857e-04 -8.201 1
13972 NOL12 22_15 0.90301 62.83 1.688e-04 -4.505 2
6960 GPR61 1_67 0.62221 79.86 1.478e-04 8.755 1
10911 SKOR1 15_31 0.74183 54.44 1.202e-04 -9.754 1
5705 C18orf8 18_12 0.66061 58.70 1.154e-04 7.575 2
14182 DHRS11 17_22 0.61553 62.56 1.146e-04 -8.142 1
8986 KCNK3 2_16 0.70520 48.31 1.014e-04 6.753 1
7478 TAL1 1_29 0.56798 49.14 8.304e-05 -6.866 1
3469 CCND2 12_4 0.95969 28.83 8.231e-05 -5.088 2
3936 XPO5 6_33 0.75750 36.46 8.218e-05 5.843 1
646 NTHL1 16_2 0.85866 31.15 7.958e-05 5.296 1
14186 CTC-543D15.8 19_9 0.02817 937.42 7.858e-05 3.963 1
Top 20 genes ranked by absolute z score:
genename region_tag susie_pip mu2 PVE z num_eqtl
5315 ADCY3 2_15 1.485e-04 273.97 1.211e-07 13.649 1
7841 MST1R 3_35 0.000e+00 34975.11 0.000e+00 -12.626 1
40 RBM6 3_35 0.000e+00 40962.06 0.000e+00 12.536 1
30 RBM5 3_35 0.000e+00 42353.53 0.000e+00 12.473 1
6507 TMEM219 16_24 9.251e-01 602.84 1.659e-03 12.063 1
9439 KCTD13 16_24 1.152e-03 491.21 1.684e-06 11.491 1
7835 RNF123 3_35 0.000e+00 23172.22 0.000e+00 -10.957 1
9556 NUPR1 16_23 2.730e-01 68.38 5.553e-05 -10.540 1
10881 CLN3 16_23 8.431e-02 67.59 1.695e-05 10.453 1
1888 MAPK3 16_24 3.635e-09 951.95 1.029e-11 10.247 2
8399 ZNF646 16_24 1.054e-08 7159.87 2.246e-10 -10.000 1
8398 ZNF668 16_24 1.054e-08 7159.87 2.246e-10 10.000 1
9120 C1QTNF4 11_29 1.746e-03 104.43 5.425e-07 9.950 2
8750 INO80E 16_24 1.977e-10 1116.43 6.568e-13 9.923 2
7839 CAMKV 3_35 0.000e+00 53036.18 0.000e+00 -9.848 1
486 PRSS8 16_24 7.979e-10 6796.28 1.613e-11 -9.765 1
10911 SKOR1 15_31 7.418e-01 54.44 1.202e-04 -9.754 1
11900 LAT 16_23 2.780e-01 55.49 4.590e-05 -9.553 1
2634 MTCH2 11_29 4.161e-05 90.75 1.123e-08 -9.551 1
12917 CTC-467M3.3 5_52 0.000e+00 460.00 0.000e+00 9.482 1
Proportion of genes with |z| above the TWAS significance threshold (consistent with 283/11768):
[1] 0.02405
#number of genes for gene set enrichment
length(genes)
[1] 50
Uploading data to Enrichr... Done.
Querying GO_Biological_Process_2021... Done.
Querying GO_Cellular_Component_2021... Done.
Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"
Version | Author | Date |
---|---|---|
9824912 | sq-96 | 2022-02-20 |
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"
[1] Term Overlap Adjusted.P.value Genes
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"
Term Overlap Adjusted.P.value Genes
1 MAP kinase activity (GO:0004707) 2/14 0.04648 MAPK11;MAPK6
Description FDR Ratio BgRatio
105 Interfrontal craniofaciosynostosis 0.03608 1/21 1/9703
106 Osteoglophonic dwarfism 0.03608 1/21 1/9703
144 Disproportionate tall stature 0.03608 1/21 1/9703
146 Ceroid Lipofuscinosis, Neuronal, 7 0.03608 1/21 1/9703
147 Holoprosencephaly, Ectrodactyly, and Bilateral Cleft Lip-Palate 0.03608 1/21 1/9703
176 CHROMOSOME 8p11 MYELOPROLIFERATIVE SYNDROME 0.03608 1/21 1/9703
179 CUTIS LAXA, AUTOSOMAL RECESSIVE, TYPE IB 0.03608 1/21 1/9703
184 PULMONARY HYPERTENSION, PRIMARY, 4 0.03608 1/21 1/9703
190 MEGALENCEPHALY-POLYMICROGYRIA-POLYDACTYLY-HYDROCEPHALUS SYNDROME 3 0.03608 1/21 1/9703
191 CONE-ROD DYSTROPHY 20 0.03608 1/21 1/9703
Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet, minNum =
minNum, : No significant gene set is identified based on FDR 0.05!
NULL
Warning: ggrepel: 11 unlabeled data points (too many overlaps). Consider
increasing max.overlaps
#number of genes in known annotations
print(length(known_annotations))
[1] 41
#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))
[1] 25
#significance threshold for TWAS
print(sig_thresh)
[1] 4.599
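The report does not show how sig_thresh was defined. The printed value is consistent with a two-sided Bonferroni correction at alpha = 0.05 over the 11768 imputed genes, so the sketch below assumes that definition; `alpha_check`, `n_genes_check`, and `sig_thresh_check` are illustrative names.

```r
# ASSUMPTION: two-sided Bonferroni threshold on |z|; the original definition of
# sig_thresh is not shown in this report
alpha_check   <- 0.05
n_genes_check <- 11768  # number of imputed weights printed earlier

# Upper-tail normal quantile at alpha / (2 * number of tests)
sig_thresh_check <- qnorm(1 - alpha_check / n_genes_check / 2)
print(round(sig_thresh_check, 3))
```

Under this assumption, a gene counts as a TWAS gene when its absolute z score exceeds the threshold.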
#number of ctwas genes
length(ctwas_genes)
[1] 8
#number of TWAS genes
length(twas_genes)
[1] 283
#show novel genes (ctwas genes not in TWAS genes)
ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]
genename region_tag susie_pip mu2 PVE z num_eqtl
7845 PPM1M 3_36 1.000 515.52 0.0015338 4.386 2
13972 NOL12 22_15 0.903 62.83 0.0001688 -4.505 2
#sensitivity / recall
print(sensitivity)
ctwas TWAS
0.00000 0.09756
#specificity
print(specificity)
ctwas TWAS
0.9993 0.9762
#precision / PPV
print(precision)
ctwas TWAS
0.00000 0.01413
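The three metrics above can be reproduced from counts alone. This is a sketch under stated assumptions: the true-positive counts (0 for ctwas, 4 for TWAS) are not printed in the report and are inferred from the ratios, and the specificity denominator is assumed to be the imputed genes outside the known annotations (11768 - 25). All names ending in `_check` are illustrative.

```r
# Counts printed in the report
n_known_check    <- 41     # genes in known annotations
n_known_imputed  <- 25     # known genes with imputed expression
n_imputed_check  <- 11768  # genes with imputed expression
detected_check   <- c(ctwas = 8, TWAS = 283)

# ASSUMPTION: true positives inferred from the printed ratios, not printed themselves
tp_check <- c(ctwas = 0, TWAS = 4)

# Sensitivity: detected known genes over all known genes
sensitivity_check <- tp_check / n_known_check
# Precision: detected known genes over all detected genes
precision_check <- tp_check / detected_check

# ASSUMPTION: negatives are imputed genes not in the known annotations
n_negative_check  <- n_imputed_check - n_known_imputed
fp_check          <- detected_check - tp_check
specificity_check <- (n_negative_check - fp_check) / n_negative_check

print(round(rbind(sensitivity_check, specificity_check, precision_check), 5))
```

Note that only 25 of the 41 known genes have imputed expression, so sensitivity computed against all 41 cannot exceed 25/41 for either method.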
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] readxl_1.3.1 forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7
[5] purrr_0.3.4 readr_2.1.1 tidyr_1.1.4 tidyverse_1.3.1
[9] tibble_3.1.6 WebGestaltR_0.4.4 disgenet2r_0.99.2 enrichR_3.0
[13] cowplot_1.0.0 ggplot2_3.3.5 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] fs_1.5.2 lubridate_1.8.0 bit64_4.0.5 doParallel_1.0.16
[5] httr_1.4.2 rprojroot_2.0.2 tools_3.6.1 backports_1.4.1
[9] doRNG_1.8.2 utf8_1.2.2 R6_2.5.1 vipor_0.4.5
[13] DBI_1.1.1 colorspace_2.0-2 withr_2.4.3 ggrastr_1.0.1
[17] tidyselect_1.1.1 bit_4.0.4 curl_4.3.2 compiler_3.6.1
[21] git2r_0.26.1 cli_3.1.0 rvest_1.0.2 Cairo_1.5-12.2
[25] xml2_1.3.3 labeling_0.4.2 scales_1.1.1 apcluster_1.4.8
[29] digest_0.6.29 rmarkdown_2.11 svglite_1.2.2 pkgconfig_2.0.3
[33] htmltools_0.5.2 dbplyr_2.1.1 fastmap_1.1.0 highr_0.9
[37] rlang_0.4.12 rstudioapi_0.13 RSQLite_2.2.8 jquerylib_0.1.4
[41] farver_2.1.0 generics_0.1.1 jsonlite_1.7.2 vroom_1.5.7
[45] magrittr_2.0.1 Matrix_1.2-18 ggbeeswarm_0.6.0 Rcpp_1.0.7
[49] munsell_0.5.0 fansi_0.5.0 gdtools_0.1.9 lifecycle_1.0.1
[53] stringi_1.7.6 whisker_0.3-2 yaml_2.2.1 plyr_1.8.6
[57] grid_3.6.1 blob_1.2.2 ggrepel_0.9.1 parallel_3.6.1
[61] promises_1.0.1 crayon_1.4.2 lattice_0.20-38 haven_2.4.3
[65] hms_1.1.1 knitr_1.36 pillar_1.6.4 igraph_1.2.10
[69] rjson_0.2.20 rngtools_1.5.2 reshape2_1.4.4 codetools_0.2-16
[73] reprex_2.0.1 glue_1.5.1 evaluate_0.14 data.table_1.14.2
[77] modelr_0.1.8 vctrs_0.3.8 tzdb_0.2.0 httpuv_1.5.1
[81] foreach_1.5.1 cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
[85] cachem_1.0.6 xfun_0.29 broom_0.7.10 later_0.8.0
[89] iterators_1.0.13 beeswarm_0.2.3 memoise_2.0.1 ellipsis_0.3.2