Last updated: 2022-03-14

Checks: 5 2

Knit directory: cTWAS_analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20211220) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/data/ data
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/code/ctwas_config.R code/ctwas_config.R

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 4c71b11. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .ipynb_checkpoints/
    Ignored:    analysis/figure/
    Ignored:    data/AF/

Untracked files:
    Untracked:  Rplot.png
    Untracked:  analysis/.ipynb_checkpoints/
    Untracked:  analysis/SCZ_2014_EUR_Brain_Amygdala.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Anterior_cingulate_cortex_BA24.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Caudate_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Cerebellar_Hemisphere.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Cerebellum.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Cortex.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Frontal_Cortex_BA9.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Hippocampus.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Hypothalamus.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Nucleus_accumbens_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Putamen_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Spinal_cord_cervical_c-1.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Substantia_nigra.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Cortex.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Frontal_Cortex_BA9.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Hypothalamus.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Putamen_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_Cross_Tissue_Analysis.Rmd
    Untracked:  code/.ipynb_checkpoints/
    Untracked:  code/AF_out/
    Untracked:  code/Autism_out/
    Untracked:  code/BMI_S_out/
    Untracked:  code/BMI_out/
    Untracked:  code/Glucose_out/
    Untracked:  code/LDL_S_out/
    Untracked:  code/SCZ_2014_EUR_out/
    Untracked:  code/SCZ_2020_out/
    Untracked:  code/SCZ_S_out/
    Untracked:  code/SCZ_out/
    Untracked:  code/T2D_out/
    Untracked:  code/ctwas_config.R
    Untracked:  code/mapping.R
    Untracked:  code/out/
    Untracked:  code/run_AF_analysis.sbatch
    Untracked:  code/run_AF_analysis.sh
    Untracked:  code/run_AF_ctwas_rss_LDR.R
    Untracked:  code/run_Autism_analysis.sbatch
    Untracked:  code/run_Autism_analysis.sh
    Untracked:  code/run_Autism_ctwas_rss_LDR.R
    Untracked:  code/run_BMI_analysis.sbatch
    Untracked:  code/run_BMI_analysis.sh
    Untracked:  code/run_BMI_analysis_S.sbatch
    Untracked:  code/run_BMI_analysis_S.sh
    Untracked:  code/run_BMI_ctwas_rss_LDR.R
    Untracked:  code/run_BMI_ctwas_rss_LDR_S.R
    Untracked:  code/run_Glucose_analysis.sbatch
    Untracked:  code/run_Glucose_analysis.sh
    Untracked:  code/run_Glucose_ctwas_rss_LDR.R
    Untracked:  code/run_LDL_analysis_S.sbatch
    Untracked:  code/run_LDL_analysis_S.sh
    Untracked:  code/run_LDL_ctwas_rss_LDR_S.R
    Untracked:  code/run_SCZ_2014_EUR_analysis.sbatch
    Untracked:  code/run_SCZ_2014_EUR_analysis.sh
    Untracked:  code/run_SCZ_2014_EUR_ctwas_rss_LDR.R
    Untracked:  code/run_SCZ_2020_analysis.sbatch
    Untracked:  code/run_SCZ_2020_analysis.sh
    Untracked:  code/run_SCZ_2020_ctwas_rss_LDR.R
    Untracked:  code/run_SCZ_analysis.sbatch
    Untracked:  code/run_SCZ_analysis.sh
    Untracked:  code/run_SCZ_analysis_S.sbatch
    Untracked:  code/run_SCZ_analysis_S.sh
    Untracked:  code/run_SCZ_ctwas_rss_LDR.R
    Untracked:  code/run_SCZ_ctwas_rss_LDR_S.R
    Untracked:  code/run_T2D_analysis.sbatch
    Untracked:  code/run_T2D_analysis.sh
    Untracked:  code/run_T2D_ctwas_rss_LDR.R
    Untracked:  code/wflow_build.R
    Untracked:  code/wflow_build.sbatch
    Untracked:  data/.ipynb_checkpoints/
    Untracked:  data/BMI/
    Untracked:  data/PGC3_SCZ_wave3_public.v2.tsv
    Untracked:  data/SCZ/
    Untracked:  data/SCZ_2014_EUR/
    Untracked:  data/SCZ_2020/
    Untracked:  data/SCZ_S/
    Untracked:  data/T2D/
    Untracked:  data/UKBB/
    Untracked:  data/UKBB_SNPs_Info.text
    Untracked:  data/gene_OMIM.txt
    Untracked:  data/gene_pip_0.8.txt
    Untracked:  data/mashr_Heart_Atrial_Appendage.db
    Untracked:  data/mashr_sqtl/
    Untracked:  data/summary_known_genes_annotations.xlsx
    Untracked:  data/untitled.txt

Unstaged changes:
    Modified:   analysis/SCZ_Brain_Amygdala.Rmd
    Modified:   analysis/SCZ_Brain_Anterior_cingulate_cortex_BA24.Rmd
    Modified:   analysis/SCZ_Brain_Caudate_basal_ganglia.Rmd
    Modified:   analysis/SCZ_Brain_Cerebellar_Hemisphere.Rmd
    Modified:   analysis/SCZ_Brain_Cerebellum.Rmd
    Modified:   analysis/SCZ_Brain_Cortex.Rmd
    Modified:   analysis/SCZ_Brain_Frontal_Cortex_BA9.Rmd
    Modified:   analysis/SCZ_Brain_Hippocampus.Rmd
    Modified:   analysis/SCZ_Brain_Hypothalamus.Rmd
    Modified:   analysis/SCZ_Brain_Nucleus_accumbens_basal_ganglia.Rmd
    Modified:   analysis/SCZ_Brain_Putamen_basal_ganglia.Rmd
    Modified:   analysis/SCZ_Brain_Spinal_cord_cervical_c-1.Rmd
    Modified:   analysis/SCZ_Brain_Substantia_nigra.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.


Weight QC

#number of imputed weights
nrow(qclist_all)
[1] 9924
#number of imputed weights by chromosome
table(qclist_all$chr)

  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20 
978 708 563 386 486 590 469 358 376 379 592 578 217 311 325 437 581 153 766 299 
 21  22 
116 256 
#number of imputed weights without missing variants
sum(qclist_all$nmiss==0)
[1] 7824
#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)
[1] 0.7884

Check convergence of parameters

#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)
     gene       snp 
0.0049685 0.0002741 
#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)
  gene    snp 
12.644  8.111 
#report sample size
print(sample_size)
[1] 77096
#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)
[1]    9924 7352670
#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)
    gene      snp 
0.008087 0.212030 
#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))
[1] 0.03525 1.76138

Genes with highest PIPs

          genename region_tag susie_pip   mu2       PVE      z num_eqtl
10169       ZNF823      19_10    0.9579 30.33 0.0003769  5.485        1
11222   AC012074.2       2_15    0.8420 23.94 0.0002615  4.648        2
5798       ARFGAP2      11_29    0.7653 25.77 0.0002558  4.839        1
5997        TMEM56       1_58    0.7542 24.38 0.0002385 -4.357        1
2362        B3GAT1      11_84    0.6276 26.35 0.0002145 -4.621        2
3172         HSDL2       9_57    0.6099 27.04 0.0002139 -4.378        1
2406           MDK      11_28    0.5655 39.32 0.0002884 -6.344        1
10520        TCTN1      12_67    0.5139 24.23 0.0001615  4.673        1
9014        FAM83H       8_94    0.4727 25.19 0.0001544  4.308        2
3822        SPECC1      17_17    0.4581 31.53 0.0001873  4.357        2
3251        CYSTM1       5_83    0.4432 21.51 0.0001237 -3.750        1
677        PPP2R5B      11_36    0.4411 25.94 0.0001484 -4.610        1
6937      SERPINI1      3_103    0.4351 20.80 0.0001174 -4.030        1
2728        PDCD10      3_103    0.4351 20.80 0.0001174 -4.030        1
11888 RP11-220I1.5       9_28    0.4231 23.87 0.0001310 -4.442        1
454        TRAPPC3       1_22    0.4136 25.69 0.0001378  5.058        1
2983        MAP7D1       1_22    0.4136 25.69 0.0001378  5.058        1
377         CTNNA1       5_82    0.4065 24.74 0.0001304  4.946        1
11703 RP11-65M17.3      11_66    0.4043 23.18 0.0001216  4.340        1
9316         ACOT1      14_34    0.4036 30.20 0.0001581  4.148        2

Genes with largest effect sizes

       genename region_tag susie_pip    mu2       PVE        z num_eqtl
11179 HIST1H2BN       6_21 1.281e-06 998.37 1.659e-08 10.77288        1
6279      MMP16       8_63 0.000e+00 527.12 0.000e+00  3.64776        1
10468   ABHD16A       6_27 3.563e-05 159.90 7.389e-08  8.93412        1
10465      MSH5       6_27 2.583e-05 158.53 5.311e-08  8.87309        1
2719       PCCB       3_84 0.000e+00 148.08 0.000e+00 -4.36131        1
11458       C4A       6_27 5.314e-07 145.82 1.005e-09  8.44500        1
11188    TRIM26       6_24 1.998e-15 137.32 3.559e-18 -5.53874        1
10707     DDAH2       6_27 1.151e-11 134.83 2.014e-14  7.66104        1
1315     ZNF184       6_21 0.000e+00 125.18 0.000e+00  1.10170        2
2033       MPP6       7_21 4.756e-03 113.19 6.982e-06 -3.30235        1
10473      APOM       6_27 9.090e-12 105.50 1.244e-14  7.02253        2
4693       IER3       6_24 4.070e-07  90.59 4.783e-10  7.63213        1
3563  HIST1H2BJ       6_21 0.000e+00  84.55 0.000e+00  0.02624        2
10444      RNF5       6_27 4.741e-14  83.06 5.108e-17  7.92121        1
9573     BTN3A2       6_20 8.678e-03  69.94 7.872e-06  9.17987        3
5592    PPP1R18       6_24 1.332e-15  68.21 1.179e-18  1.41699        2
10458   SLC44A4       6_27 1.993e-12  67.92 1.756e-15  6.41397        1
10443      AGER       6_27 5.118e-14  64.02 4.250e-17 -2.62745        1
10448     FKBPL       6_27 3.605e-12  59.03 2.760e-15 -4.22665        1
8984  HIST1H2BC       6_20 8.247e-03  52.71 5.638e-06 -7.97807        1

Genes with highest PVE

          genename region_tag susie_pip   mu2       PVE      z num_eqtl
10169       ZNF823      19_10    0.9579 30.33 0.0003769  5.485        1
2406           MDK      11_28    0.5655 39.32 0.0002884 -6.344        1
11222   AC012074.2       2_15    0.8420 23.94 0.0002615  4.648        2
5798       ARFGAP2      11_29    0.7653 25.77 0.0002558  4.839        1
5997        TMEM56       1_58    0.7542 24.38 0.0002385 -4.357        1
2362        B3GAT1      11_84    0.6276 26.35 0.0002145 -4.621        2
3172         HSDL2       9_57    0.6099 27.04 0.0002139 -4.378        1
3822        SPECC1      17_17    0.4581 31.53 0.0001873  4.357        2
2782        LMAN2L       2_57    0.3026 42.92 0.0001684 -4.484        2
10520        TCTN1      12_67    0.5139 24.23 0.0001615  4.673        1
9316         ACOT1      14_34    0.4036 30.20 0.0001581  4.148        2
10971    LINC00390      13_17    0.2831 42.17 0.0001549 -3.799        1
9014        FAM83H       8_94    0.4727 25.19 0.0001544  4.308        2
677        PPP2R5B      11_36    0.4411 25.94 0.0001484 -4.610        1
10685        SIPA1      11_36    0.3694 30.57 0.0001465  3.824        1
9695      HIST1H1C       6_20    0.2654 41.65 0.0001434  4.329        2
454        TRAPPC3       1_22    0.4136 25.69 0.0001378  5.058        1
2983        MAP7D1       1_22    0.4136 25.69 0.0001378  5.058        1
437       MPHOSPH9      12_75    0.2358 43.39 0.0001327  6.650        1
11888 RP11-220I1.5       9_28    0.4231 23.87 0.0001310 -4.442        1

Genes with largest z scores

       genename region_tag susie_pip    mu2       PVE      z num_eqtl
11179 HIST1H2BN       6_21 1.281e-06 998.37 1.659e-08 10.773        1
9573     BTN3A2       6_20 8.678e-03  69.94 7.872e-06  9.180        3
10468   ABHD16A       6_27 3.563e-05 159.90 7.389e-08  8.934        1
10465      MSH5       6_27 2.583e-05 158.53 5.311e-08  8.873        1
11458       C4A       6_27 5.314e-07 145.82 1.005e-09  8.445        1
8984  HIST1H2BC       6_20 8.247e-03  52.71 5.638e-06 -7.978        1
10444      RNF5       6_27 4.741e-14  83.06 5.108e-17  7.921        1
5778      CNNM2      10_66 1.121e-01  41.42 6.021e-05 -7.876        1
10707     DDAH2       6_27 1.151e-11 134.83 2.014e-14  7.661        1
4693       IER3       6_24 4.070e-07  90.59 4.783e-10  7.632        1
10473      APOM       6_27 9.090e-12 105.50 1.244e-14  7.023        2
11226   ZSCAN31       6_22 4.605e-03  35.18 2.101e-06 -6.706        2
437    MPHOSPH9      12_75 2.358e-01  43.39 1.327e-04  6.650        1
9777       DPYD       1_60 5.668e-03  39.51 2.904e-06 -6.455        1
10458   SLC44A4       6_27 1.993e-12  67.92 1.756e-15  6.414        1
2406        MDK      11_28 5.655e-01  39.32 2.884e-04 -6.344        1
10231   ZKSCAN8       6_22 2.107e-03  42.30 1.156e-06  6.135        1
8969     HARBI1      11_28 1.319e-01  35.67 6.100e-05  6.084        1
10486    CCHCR1       6_25 3.405e-03  27.92 1.233e-06 -5.868        3
3264      SNX19      11_81 3.857e-02  39.37 1.970e-05  5.837        3

Comparing z scores and PIPs

[1] 0.006046

GO enrichment analysis for genes with PIP>0.5

#number of genes for gene set enrichment
length(genes)
[1] 8
Uploading data to Enrichr... Done.
  Querying GO_Biological_Process_2021... Done.
  Querying GO_Cellular_Component_2021... Done.
  Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"

                                                                                  Term
1                         positive regulation of neutrophil extravasation (GO:2000391)
2                         regulation of inflammatory response to wounding (GO:0106014)
3                negative regulation of regulatory T cell differentiation (GO:0045590)
4                                         Golgi transport vesicle coating (GO:0048200)
5            negative regulation of cardiac muscle cell apoptotic process (GO:0010667)
6                  leukocyte chemotaxis involved in inflammatory response (GO:0002232)
7                                  regulation of hepatocyte proliferation (GO:2000345)
8                                  regulation of neutrophil extravasation (GO:2000389)
9                                           COPI coating of Golgi vesicle (GO:0048205)
10                                            COPI-coated vesicle budding (GO:0035964)
11                        protein localization to ciliary transition zone (GO:1904491)
12                                          regulation of bone remodeling (GO:0046850)
13          negative regulation of striated muscle cell apoptotic process (GO:0010664)
14                 positive regulation of oligodendrocyte differentiation (GO:0048714)
15                                              adrenal gland development (GO:0030325)
16                     chondroitin sulfate proteoglycan metabolic process (GO:0050654)
17                        regulation of epithelial cell apoptotic process (GO:1904035)
18                    regulation of cardiac muscle cell apoptotic process (GO:0010665)
19                                          aminoglycan metabolic process (GO:0006022)
20                          positive regulation of cellular extravasation (GO:0002693)
21                      positive regulation of keratinocyte proliferation (GO:0010838)
22                             regulation of leukocyte cell-cell adhesion (GO:1903037)
23                                positive regulation of neuron migration (GO:2001224)
24         positive regulation of vascular endothelial cell proliferation (GO:1905564)
25          regulation of leukocyte adhesion to vascular endothelial cell (GO:1904994)
26                          negative regulation of T cell differentiation (GO:0045581)
27                           positive regulation of macrophage chemotaxis (GO:0010759)
28                                                              oogenesis (GO:0048477)
29 positive regulation of leukocyte adhesion to vascular endothelial cell (GO:1904996)
30                    positive regulation of smooth muscle cell migration (GO:0014911)
31                                    regulation of cartilage development (GO:0061035)
32                      positive regulation of glial cell differentiation (GO:0045687)
33                           positive regulation of cartilage development (GO:0061036)
34                  regulation of vascular endothelial cell proliferation (GO:1905562)
35                           positive regulation of neutrophil chemotaxis (GO:0090023)
36                          regulation of oligodendrocyte differentiation (GO:0048713)
37                            positive regulation of macrophage migration (GO:1905523)
38                                     regulation of leukocyte chemotaxis (GO:0002688)
39                                   regulation of granulocyte chemotaxis (GO:0071622)
40                                        regulation of tissue remodeling (GO:0034103)
41                      positive regulation of lymphocyte differentiation (GO:0045621)
42                            negative regulation of response to wounding (GO:1903035)
43             positive regulation of neural precursor cell proliferation (GO:2000179)
44                            positive regulation of neutrophil migration (GO:1902624)
45                          positive regulation of granulocyte chemotaxis (GO:0071624)
46                                    regulation of macrophage chemotaxis (GO:0010758)
47                               regulation of mononuclear cell migration (GO:0071675)
48                      regulation of neural precursor cell proliferation (GO:2000177)
49                  positive regulation of morphogenesis of an epithelium (GO:1905332)
50                      lymphocyte activation involved in immune response (GO:0002285)
51                                    negative regulation of ossification (GO:0030279)
52                               regulation of keratinocyte proliferation (GO:0010837)
53                                           endocrine system development (GO:0035270)
54                                    regulation of neutrophil chemotaxis (GO:0090022)
55                        regulation of regulatory T cell differentiation (GO:0045589)
56                              regulation of chondrocyte differentiation (GO:0032330)
57                                                 glial cell development (GO:0021782)
58                  chondroitin sulfate proteoglycan biosynthetic process (GO:0050650)
59                                         regulation of neuron migration (GO:2001222)
60                          T cell activation involved in immune response (GO:0002286)
61                            positive regulation of response to wounding (GO:1903036)
62                             positive regulation of leukocyte migration (GO:0002687)
63                                      proteoglycan biosynthetic process (GO:0030166)
64                    positive regulation of leukocyte cell-cell adhesion (GO:1903039)
65                       positive regulation of interleukin-12 production (GO:0032735)
66                                               female gamete generation (GO:0007292)
67                                         protein localization to cilium (GO:0061512)
68                                             regulation of ossification (GO:0030278)
69     positive regulation of substrate adhesion-dependent cell spreading (GO:1900026)
70                        regulation of actin cytoskeleton reorganization (GO:2000249)
71               negative regulation of epithelial cell apoptotic process (GO:1904036)
72                                   regulation of T cell differentiation (GO:0045580)
73            positive regulation of epithelial to mesenchymal transition (GO:0010718)
74                          positive regulation of T cell differentiation (GO:0045582)
75                                      positive regulation of chemotaxis (GO:0050921)
76                              positive regulation of cell-cell adhesion (GO:0022409)
77  positive regulation of cell morphogenesis involved in differentiation (GO:0010770)
78              regulation of substrate adhesion-dependent cell spreading (GO:1900024)
79                                regulation of interleukin-12 production (GO:0032655)
80                                    glycosaminoglycan metabolic process (GO:0030203)
81                            positive regulation of leukocyte chemotaxis (GO:0002690)
82                                                Notch signaling pathway (GO:0007219)
83                         positive regulation of cell-substrate adhesion (GO:0010811)
84                        negative regulation of neuron apoptotic process (GO:0043524)
85                                                      gland development (GO:0048732)
86                                   negative regulation of cell adhesion (GO:0007162)
87                               positive regulation of T cell activation (GO:0050870)
88                     regulation of epithelial to mesenchymal transition (GO:0010717)
89                  positive regulation of endothelial cell proliferation (GO:0001938)
90                                   positive regulation of cell adhesion (GO:0045785)
91  retrograde vesicle-mediated transport, Golgi to endoplasmic reticulum (GO:0006890)
   Overlap Adjusted.P.value   Genes
1      1/5          0.02399     MDK
2      1/5          0.02399     MDK
3      1/5          0.02399     MDK
4      1/6          0.02399 ARFGAP2
5      1/6          0.02399     MDK
6      1/6          0.02399     MDK
7      1/6          0.02399     MDK
8      1/6          0.02399     MDK
9      1/6          0.02399 ARFGAP2
10     1/6          0.02399 ARFGAP2
11     1/6          0.02399   TCTN1
12     1/7          0.02399     MDK
13     1/8          0.02399     MDK
14     1/8          0.02399     MDK
15    1/10          0.02399     MDK
16    1/10          0.02399  B3GAT1
17    1/10          0.02399     MDK
18    1/11          0.02399     MDK
19    1/11          0.02399  B3GAT1
20    1/11          0.02399     MDK
21    1/11          0.02399     MDK
22    1/12          0.02399     MDK
23    1/13          0.02399     MDK
24    1/13          0.02399     MDK
25    1/13          0.02399     MDK
26    1/14          0.02399     MDK
27    1/14          0.02399     MDK
28    1/16          0.02399     MDK
29    1/16          0.02399     MDK
30    1/17          0.02399     MDK
31    1/18          0.02399     MDK
32    1/18          0.02399     MDK
33    1/18          0.02399     MDK
34    1/18          0.02399     MDK
35    1/19          0.02399     MDK
36    1/19          0.02399     MDK
37    1/19          0.02399     MDK
38    1/19          0.02399     MDK
39    1/20          0.02399     MDK
40    1/20          0.02399     MDK
41    1/20          0.02399     MDK
42    1/21          0.02399     MDK
43    1/22          0.02399     MDK
44    1/22          0.02399     MDK
45    1/22          0.02399     MDK
46    1/22          0.02399     MDK
47    1/23          0.02399     MDK
48    1/23          0.02399     MDK
49    1/23          0.02399     MDK
50    1/24          0.02399     MDK
51    1/24          0.02399     MDK
52    1/24          0.02399     MDK
53    1/24          0.02399     MDK
54    1/25          0.02416     MDK
55    1/26          0.02416     MDK
56    1/26          0.02416     MDK
57    1/26          0.02416     MDK
58    1/27          0.02430  B3GAT1
59    1/28          0.02430     MDK
60    1/28          0.02430     MDK
61    1/28          0.02430     MDK
62    1/29          0.02476     MDK
63    1/31          0.02563  B3GAT1
64    1/31          0.02563     MDK
65    1/34          0.02725     MDK
66    1/34          0.02725     MDK
67    1/35          0.02755   TCTN1
68    1/36          0.02755     MDK
69    1/36          0.02755     MDK
70    1/37          0.02755     MDK
71    1/37          0.02755     MDK
72    1/39          0.02863     MDK
73    1/42          0.03039     MDK
74    1/43          0.03069     MDK
75    1/45          0.03168     MDK
76    1/47          0.03264     MDK
77    1/50          0.03404     MDK
78    1/51          0.03404     MDK
79    1/51          0.03404     MDK
80    1/54          0.03514  B3GAT1
81    1/54          0.03514     MDK
82    1/60          0.03853     MDK
83    1/70          0.04390     MDK
84    1/71          0.04390     MDK
85    1/71          0.04390     MDK
86    1/73          0.04459     MDK
87    1/75          0.04527     MDK
88    1/76          0.04535     MDK
89    1/77          0.04542     MDK
90    1/80          0.04664     MDK
91    1/85          0.04897 ARFGAP2
[1] "GO_Cellular_Component_2021"

[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"

[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)

DisGeNET enrichment analysis for genes with PIP>0.5

                      Description     FDR Ratio BgRatio
31            JOUBERT SYNDROME 13 0.01958   1/5  1/9703
1               Anxiety Disorders 0.06101   1/5 44/9703
2            Diabetic Nephropathy 0.06101   1/5 44/9703
5      Nodular glomerulosclerosis 0.06101   1/5 41/9703
9                Memory Disorders 0.06101   1/5 43/9703
12              Memory impairment 0.06101   1/5 44/9703
17       Anxiety States, Neurotic 0.06101   1/5 44/9703
19 Familial aplasia of the vermis 0.06101   1/5 30/9703
21   Age-Related Memory Disorders 0.06101   1/5 43/9703
22      Memory Disorder, Semantic 0.06101   1/5 43/9703

WebGestalt enrichment analysis for genes with PIP>0.5

Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...
Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet, minNum =
minNum, : No significant gene set is identified based on FDR 0.05!
NULL

PIP Manhattan Plot

Warning: 'timedatectl' indicates the non-existent timezone name 'n/a'
Warning: Your system is mis-configured: '/etc/localtime' is not a symlink
Warning: It is strongly recommended to set envionment variable TZ to 'America/
Chicago' (or equivalent)

Sensitivity, specificity and precision for silver standard genes

#number of genes in known annotations
print(length(known_annotations))
[1] 130
#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))
[1] 52
#significance threshold for TWAS
print(sig_thresh)
[1] 4.563
#number of ctwas genes
length(ctwas_genes)
[1] 2
#number of TWAS genes
length(twas_genes)
[1] 60
#show novel genes (ctwas genes with not in TWAS genes)
ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]
[1] genename   region_tag susie_pip  mu2        PVE        z          num_eqtl  
<0 rows> (or 0-length row.names)
#sensitivity / recall
print(sensitivity)
   ctwas     TWAS 
0.007692 0.023077 
#specificity
print(specificity)
 ctwas   TWAS 
0.9999 0.9942 
#precision / PPV
print(precision)
ctwas  TWAS 
 0.50  0.05 

cTWAS is more precise than TWAS in distinguishing silver standard and bystander genes

#number of genes in known annotations (with imputed expression)
print(length(known_annotations))
[1] 52
#number of bystander genes (with imputed expression)
print(length(unrelated_genes))
[1] 584
#subset results to genes in known annotations or bystanders
ctwas_gene_res_subset <- ctwas_gene_res[ctwas_gene_res$genename %in% c(known_annotations, unrelated_genes),]

#assign ctwas and TWAS genes
ctwas_genes <- ctwas_gene_res_subset$genename[ctwas_gene_res_subset$susie_pip>0.8]
twas_genes <- ctwas_gene_res_subset$genename[abs(ctwas_gene_res_subset$z)>sig_thresh]

#significance threshold for TWAS
print(sig_thresh)
[1] 4.563
#number of ctwas genes (in known annotations or bystanders)
length(ctwas_genes)
[1] 1
#number of TWAS genes (in known annotations or bystanders)
length(twas_genes)
[1] 8
#sensitivity / recall
sensitivity
  ctwas    TWAS 
0.01923 0.05769 
#specificity / (1 - False Positive Rate)
specificity
 ctwas   TWAS 
1.0000 0.9914 
#precision / PPV / (1 - False Discovery Rate)
precision
ctwas  TWAS 
1.000 0.375 

pip_range <- (0:1000)/1000
sensitivity <- rep(NA, length(pip_range))
specificity <- rep(NA, length(pip_range))

for (index in 1:length(pip_range)){
  pip <- pip_range[index]
  ctwas_genes <- ctwas_gene_res_subset$genename[ctwas_gene_res_subset$susie_pip>=pip]
  sensitivity[index] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
  specificity[index] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
}

plot(1-specificity, sensitivity, type="l", xlim=c(0,1), ylim=c(0,1), main="", xlab="1 - Specificity", ylab="Sensitivity")
title(expression("ROC Curve for cTWAS (black) and TWAS (" * phantom("red") * ")"))
title(expression(phantom("ROC Curve for cTWAS (black) and TWAS (") * "red" * phantom(")")), col.main="red")

sig_thresh_range <- seq(from=0, to=max(abs(ctwas_gene_res_subset$z)), length.out=length(pip_range))

for (index in 1:length(sig_thresh_range)){
  sig_thresh_plot <- sig_thresh_range[index]
  twas_genes <- ctwas_gene_res_subset$genename[abs(ctwas_gene_res_subset$z)>=sig_thresh_plot]
  sensitivity[index] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
  specificity[index] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
}

lines(1-specificity, sensitivity, xlim=c(0,1), ylim=c(0,1), col="red", lty=1)

abline(a=0,b=1,lty=3)

#add previously computed points from the analysis
ctwas_genes <- ctwas_gene_res_subset$genename[ctwas_gene_res_subset$susie_pip>0.8]
twas_genes <- ctwas_gene_res_subset$genename[abs(ctwas_gene_res_subset$z)>sig_thresh]

points(1-specificity_plot["ctwas"], sensitivity_plot["ctwas"], pch=21, bg="black")
points(1-specificity_plot["TWAS"], sensitivity_plot["TWAS"], pch=21, bg="red")

Undetected silver standard genes have low TWAS z-scores or stronger signal from nearby variants

#table of outcomes for silver standard genes
-sort(-table(silver_standard_case))
silver_standard_case
          Not Imputed Insignificant z-score         Nearby SNP(s) 
                   78                    49                     2 
 Detected (PIP > 0.8) 
                    1 
#show inconclusive genes
silver_standard_case[silver_standard_case=="Inconclusive"]
named character(0)


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] GenomicRanges_1.36.1 GenomeInfoDb_1.20.0  IRanges_2.18.1      
 [4] S4Vectors_0.22.1     BiocGenerics_0.30.0  biomaRt_2.40.1      
 [7] readxl_1.3.1         forcats_0.5.1        stringr_1.4.0       
[10] dplyr_1.0.7          purrr_0.3.4          readr_2.1.1         
[13] tidyr_1.1.4          tidyverse_1.3.1      tibble_3.1.6        
[16] WebGestaltR_0.4.4    disgenet2r_0.99.2    enrichR_3.0         
[19] cowplot_1.0.0        ggplot2_3.3.5        workflowr_1.7.0     

loaded via a namespace (and not attached):
  [1] ggbeeswarm_0.6.0       colorspace_2.0-2       rjson_0.2.20          
  [4] ellipsis_0.3.2         rprojroot_2.0.2        XVector_0.24.0        
  [7] fs_1.5.2               rstudioapi_0.13        farver_2.1.0          
 [10] ggrepel_0.9.1          bit64_4.0.5            AnnotationDbi_1.46.0  
 [13] fansi_1.0.2            lubridate_1.8.0        xml2_1.3.3            
 [16] codetools_0.2-16       doParallel_1.0.17      cachem_1.0.6          
 [19] knitr_1.36             jsonlite_1.7.2         apcluster_1.4.8       
 [22] Cairo_1.5-12.2         broom_0.7.10           dbplyr_2.1.1          
 [25] compiler_3.6.1         httr_1.4.2             backports_1.4.1       
 [28] assertthat_0.2.1       Matrix_1.2-18          fastmap_1.1.0         
 [31] cli_3.1.0              later_0.8.0            prettyunits_1.1.1     
 [34] htmltools_0.5.2        tools_3.6.1            igraph_1.2.10         
 [37] GenomeInfoDbData_1.2.1 gtable_0.3.0           glue_1.6.2            
 [40] reshape2_1.4.4         doRNG_1.8.2            Rcpp_1.0.8            
 [43] Biobase_2.44.0         cellranger_1.1.0       jquerylib_0.1.4       
 [46] vctrs_0.3.8            svglite_1.2.2          iterators_1.0.14      
 [49] xfun_0.29              ps_1.6.0               rvest_1.0.2           
 [52] lifecycle_1.0.1        rngtools_1.5.2         XML_3.99-0.3          
 [55] zlibbioc_1.30.0        getPass_0.2-2          scales_1.1.1          
 [58] vroom_1.5.7            hms_1.1.1              promises_1.0.1        
 [61] yaml_2.2.1             curl_4.3.2             memoise_2.0.1         
 [64] ggrastr_1.0.1          gdtools_0.1.9          stringi_1.7.6         
 [67] RSQLite_2.2.8          highr_0.9              foreach_1.5.2         
 [70] rlang_1.0.1            pkgconfig_2.0.3        bitops_1.0-7          
 [73] evaluate_0.14          lattice_0.20-38        labeling_0.4.2        
 [76] bit_4.0.4              processx_3.5.2         tidyselect_1.1.1      
 [79] plyr_1.8.6             magrittr_2.0.2         R6_2.5.1              
 [82] generics_0.1.1         DBI_1.1.2              pillar_1.6.4          
 [85] haven_2.4.3            whisker_0.3-2          withr_2.4.3           
 [88] RCurl_1.98-1.5         modelr_0.1.8           crayon_1.5.0          
 [91] utf8_1.2.2             tzdb_0.2.0             rmarkdown_2.11        
 [94] progress_1.2.2         grid_3.6.1             data.table_1.14.2     
 [97] blob_1.2.2             callr_3.7.0            git2r_0.26.1          
[100] reprex_2.0.1           digest_0.6.29          httpuv_1.5.1          
[103] munsell_0.5.0          beeswarm_0.2.3         vipor_0.4.5