SCZ - Brain Putamen basal ganglia

Last updated: 2022-03-14

Checks: 5 passed, 2 warnings

Knit directory: cTWAS_analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: uncommitted changes

The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

Seed: set.seed(20211220)

The command set.seed(20211220) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
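
As a minimal illustration of what the seed provides, the sketch below resets the same seed before two random draws and checks that they match. The `sample()` call is a stand-in chosen for illustration, not a step taken from this analysis.

```r
# Illustrative only: sample() stands in for any step that uses randomness.
set.seed(20211220)
first_draw <- sample(1:100, 5)

# Resetting the same seed restarts the random number stream.
set.seed(20211220)
second_draw <- sample(1:100, 5)

# Both draws contain the same five numbers in the same order.
print(identical(first_draw, second_draw))
```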

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: absolute

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute	relative
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/data/	data
/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/code/ctwas_config.R	code/ctwas_config.R
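
The suggested relative paths are the absolute paths with the project root removed. A minimal sketch of that conversion in base R follows; `project_root` and the helper `to_relative()` are names introduced here for illustration and are not part of workflowr.

```r
# Assumed project root: the common prefix of the two absolute paths listed above.
project_root <- "/project2/xinhe/shengqian/cTWAS/cTWAS_analysis"

absolute_paths <- c(
  "/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/data/",
  "/project2/xinhe/shengqian/cTWAS/cTWAS_analysis/code/ctwas_config.R"
)

# Hypothetical helper: strip the root prefix and any leading or trailing slashes.
to_relative <- function(path, root) {
  stopifnot(all(startsWith(path, root)))
  rel <- substring(path, nchar(root) + 1)
  rel <- sub("^/+", "", rel)
  sub("/+$", "", rel)
}

relative_paths <- to_relative(absolute_paths, project_root)
print(relative_paths)
```

Paths written this way resolve correctly as long as the code is run from the project's knit directory, whichever machine it is on.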

Repository version: 4c71b11

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 4c71b11. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .ipynb_checkpoints/
    Ignored:    data/AF/

Untracked files:
    Untracked:  Rplot.png
    Untracked:  analysis/.ipynb_checkpoints/
    Untracked:  analysis/SCZ_2014_EUR_Brain_Amygdala.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Anterior_cingulate_cortex_BA24.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Caudate_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Cerebellar_Hemisphere.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Cerebellum.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Cortex.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Frontal_Cortex_BA9.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Hippocampus.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Hypothalamus.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Nucleus_accumbens_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Putamen_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Spinal_cord_cervical_c-1.Rmd
    Untracked:  analysis/SCZ_2014_EUR_Brain_Substantia_nigra.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Cortex.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Frontal_Cortex_BA9.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Hypothalamus.Rmd
    Untracked:  analysis/SCZ_2020_Brain_Putamen_basal_ganglia.Rmd
    Untracked:  analysis/SCZ_Cross_Tissue_Analysis.Rmd
    Untracked:  code/.ipynb_checkpoints/
    Untracked:  code/AF_out/
    Untracked:  code/Autism_out/
    Untracked:  code/BMI_S_out/
    Untracked:  code/BMI_out/
    Untracked:  code/Glucose_out/
    Untracked:  code/LDL_S_out/
    Untracked:  code/SCZ_2014_EUR_out/
    Untracked:  code/SCZ_2020_out/
    Untracked:  code/SCZ_S_out/
    Untracked:  code/SCZ_out/
    Untracked:  code/T2D_out/
    Untracked:  code/ctwas_config.R
    Untracked:  code/mapping.R
    Untracked:  code/out/
    Untracked:  code/run_AF_analysis.sbatch
    Untracked:  code/run_AF_analysis.sh
    Untracked:  code/run_AF_ctwas_rss_LDR.R
    Untracked:  code/run_Autism_analysis.sbatch
    Untracked:  code/run_Autism_analysis.sh
    Untracked:  code/run_Autism_ctwas_rss_LDR.R
    Untracked:  code/run_BMI_analysis.sbatch
    Untracked:  code/run_BMI_analysis.sh
    Untracked:  code/run_BMI_analysis_S.sbatch
    Untracked:  code/run_BMI_analysis_S.sh
    Untracked:  code/run_BMI_ctwas_rss_LDR.R
    Untracked:  code/run_BMI_ctwas_rss_LDR_S.R
    Untracked:  code/run_Glucose_analysis.sbatch
    Untracked:  code/run_Glucose_analysis.sh
    Untracked:  code/run_Glucose_ctwas_rss_LDR.R
    Untracked:  code/run_LDL_analysis_S.sbatch
    Untracked:  code/run_LDL_analysis_S.sh
    Untracked:  code/run_LDL_ctwas_rss_LDR_S.R
    Untracked:  code/run_SCZ_2014_EUR_analysis.sbatch
    Untracked:  code/run_SCZ_2014_EUR_analysis.sh
    Untracked:  code/run_SCZ_2014_EUR_ctwas_rss_LDR.R
    Untracked:  code/run_SCZ_2020_analysis.sbatch
    Untracked:  code/run_SCZ_2020_analysis.sh
    Untracked:  code/run_SCZ_2020_ctwas_rss_LDR.R
    Untracked:  code/run_SCZ_analysis.sbatch
    Untracked:  code/run_SCZ_analysis.sh
    Untracked:  code/run_SCZ_analysis_S.sbatch
    Untracked:  code/run_SCZ_analysis_S.sh
    Untracked:  code/run_SCZ_ctwas_rss_LDR.R
    Untracked:  code/run_SCZ_ctwas_rss_LDR_S.R
    Untracked:  code/run_T2D_analysis.sbatch
    Untracked:  code/run_T2D_analysis.sh
    Untracked:  code/run_T2D_ctwas_rss_LDR.R
    Untracked:  code/wflow_build.R
    Untracked:  code/wflow_build.sbatch
    Untracked:  data/.ipynb_checkpoints/
    Untracked:  data/BMI/
    Untracked:  data/PGC3_SCZ_wave3_public.v2.tsv
    Untracked:  data/SCZ/
    Untracked:  data/SCZ_2014_EUR/
    Untracked:  data/SCZ_2020/
    Untracked:  data/SCZ_S/
    Untracked:  data/T2D/
    Untracked:  data/UKBB/
    Untracked:  data/UKBB_SNPs_Info.text
    Untracked:  data/gene_OMIM.txt
    Untracked:  data/gene_pip_0.8.txt
    Untracked:  data/mashr_Heart_Atrial_Appendage.db
    Untracked:  data/mashr_sqtl/
    Untracked:  data/summary_known_genes_annotations.xlsx
    Untracked:  data/untitled.txt

Unstaged changes:
    Modified:   analysis/SCZ_Brain_Amygdala.Rmd
    Modified:   analysis/SCZ_Brain_Anterior_cingulate_cortex_BA24.Rmd
    Modified:   analysis/SCZ_Brain_Caudate_basal_ganglia.Rmd
    Modified:   analysis/SCZ_Brain_Cerebellar_Hemisphere.Rmd
    Modified:   analysis/SCZ_Brain_Cerebellum.Rmd
    Modified:   analysis/SCZ_Brain_Cortex.Rmd
    Modified:   analysis/SCZ_Brain_Frontal_Cortex_BA9.Rmd
    Modified:   analysis/SCZ_Brain_Hippocampus.Rmd
    Modified:   analysis/SCZ_Brain_Hypothalamus.Rmd
    Modified:   analysis/SCZ_Brain_Nucleus_accumbens_basal_ganglia.Rmd
    Modified:   analysis/SCZ_Brain_Putamen_basal_ganglia.Rmd
    Modified:   analysis/SCZ_Brain_Spinal_cord_cervical_c-1.Rmd
    Modified:   analysis/SCZ_Brain_Substantia_nigra.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.

Weight QC

#number of imputed weights
nrow(qclist_all)

[1] 10890

#number of imputed weights by chromosome
table(qclist_all$chr)


   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
1069  765  627  423  520  642  537  391  403  438  640  628  225  357  364  500 
  17   18   19   20   21   22 
 661  173  823  318  118  268

#number of imputed weights without missing variants
sum(qclist_all$nmiss==0)

[1] 8221

#proportion of imputed weights without missing variants
mean(qclist_all$nmiss==0)

[1] 0.7549

Check convergence of parameters

#estimated group prior
estimated_group_prior <- group_prior_rec[,ncol(group_prior_rec)]
names(estimated_group_prior) <- c("gene", "snp")
estimated_group_prior["snp"] <- estimated_group_prior["snp"]*thin #adjust parameter to account for thin argument
print(estimated_group_prior)

     gene       snp 
0.0124812 0.0002562

#estimated group prior variance
estimated_group_prior_var <- group_prior_var_rec[,ncol(group_prior_var_rec)]
names(estimated_group_prior_var) <- c("gene", "snp")
print(estimated_group_prior_var)

 gene   snp 
9.991 8.252

#report sample size
print(sample_size)

[1] 77096

#report group size
group_size <- c(nrow(ctwas_gene_res), n_snps)
print(group_size)

[1]   10890 7352670

#estimated group PVE
estimated_group_pve <- estimated_group_prior_var*estimated_group_prior*group_size/sample_size #check PVE calculation
names(estimated_group_pve) <- c("gene", "snp")
print(estimated_group_pve)

   gene     snp 
0.01761 0.20163
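
The PVE formula can be checked by hand from the values printed in this section. The sketch below copies the displayed (rounded) estimates into new vectors, so the result is expected to agree with the printed PVE only up to rounding.

```r
# Values copied from the printed output above (rounded as displayed).
prior       <- c(gene = 0.0124812, snp = 0.0002562)  # estimated group prior
prior_var   <- c(gene = 9.991,     snp = 8.252)      # estimated group prior variance
group_size  <- c(gene = 10890,     snp = 7352670)    # number of genes, number of SNPs
sample_size <- 77096

# PVE per group = prior variance * prior inclusion probability * group size / sample size
pve <- prior_var * prior * group_size / sample_size
print(signif(pve, 4))
```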

#compare sum(PIP*mu2/sample_size) with above PVE calculation
c(sum(ctwas_gene_res$PVE),sum(ctwas_snp_res$PVE))

[1] 0.1259 1.7400

Genes with highest PIPs

          genename region_tag susie_pip     mu2       PVE      z num_eqtl
905          NT5C2      10_66    1.0000 3129.90 0.0405975 -8.190        1
10867       ZNF823      19_10    0.9810   29.50 0.0003754  5.485        1
4092         FEZF1       7_74    0.9807   27.71 0.0003525 -5.272        1
11990   AC012074.2       2_15    0.9252   22.17 0.0002660  4.623        1
8791         GNG12       1_42    0.9060   22.36 0.0002627  4.530        2
3043         SF3B1      2_117    0.9006   44.05 0.0005145  6.784        1
11497        AS3MT      10_66    0.8748  598.92 0.0067960  8.586        2
10737        PCBP2      12_33    0.8462   21.79 0.0002392  4.496        1
1657      KIAA0391      14_10    0.7730   23.84 0.0002390 -4.760        1
7857       PACSIN3      11_29    0.7560   23.10 0.0002266  4.629        1
7435      SERPINI1      3_103    0.7238   20.15 0.0001891 -4.030        1
6872         CNNM4       2_57    0.7212   22.58 0.0002113 -4.793        1
8900       MAP3K11      11_36    0.7110   22.26 0.0002053 -3.929        2
3935          KLC1      14_54    0.6842   41.27 0.0003663  6.966        1
2590           MDK      11_28    0.6718   38.05 0.0003315 -6.344        1
11110   LIN28B-AS1       6_70    0.6625   23.63 0.0002031 -4.736        2
5277         POC1B      12_54    0.6521   20.40 0.0001725  4.264        1
12516 RP11-65M17.3      11_66    0.6070   20.80 0.0001637  4.301        2
2337        ERLIN1      10_64    0.5760   22.43 0.0001676  4.370        1
700        PPP2R5B      11_36    0.5142   25.23 0.0001683 -4.585        1

Genes with largest effect sizes

         genename region_tag susie_pip     mu2       PVE       z num_eqtl
905         NT5C2      10_66 1.000e+00 3129.90 4.060e-02 -8.1897        1
6164        CNNM2      10_66 7.960e-06 3031.90 3.130e-07 -7.8764        1
11945   HIST1H2BN       6_21 6.776e-07  984.02 8.648e-09 10.7729        1
11497       AS3MT      10_66 8.748e-01  598.92 6.796e-03  8.5861        2
6711        MMP16       8_63 0.000e+00  520.75 0.000e+00  3.6449        1
13230 RP1-86C11.7       6_21 1.513e-12  426.71 8.377e-15  9.0332        1
5144       CALHM2      10_66 4.301e-11  426.15 2.377e-13 -3.3606        1
6156          INA      10_66 1.623e-10  310.71 6.539e-13 -3.6696        1
13650       HCP5B       6_24 7.559e-12  186.42 1.828e-14  2.4792        1
11190        MSH5       6_26 9.461e-03  175.14 2.149e-05  7.4967        2
11197        APOM       6_26 3.765e-05  154.56 7.548e-08  8.9450        1
2908         PCCB       3_84 1.617e-06  138.22 2.900e-09 -5.9913        1
12247         C4A       6_26 1.041e-07  136.75 1.847e-10  8.4587        2
13080       HCG17       6_24 1.471e-13  121.68 2.322e-16  4.0856        3
2196         MPP6       7_21 3.639e-03  110.55 5.218e-06 -3.4121        1
11165      NOTCH4       6_26 3.331e-16  101.41 4.381e-19  3.2643        2
3798    HIST1H2BJ       6_21 0.000e+00   99.23 0.000e+00  0.2007        2
9879       GRIN2A      16_10 7.640e-07   90.84 9.002e-10 -0.9830        2
11801      SAPCD1       6_26 1.435e-12   86.48 1.610e-15 -2.6196        1
10691    HLA-DQA1       6_26 1.796e-13   81.58 1.901e-16  1.7990        2

Genes with highest PVE

        genename region_tag susie_pip     mu2       PVE      z num_eqtl
905        NT5C2      10_66    1.0000 3129.90 0.0405975 -8.190        1
11497      AS3MT      10_66    0.8748  598.92 0.0067960  8.586        2
3043       SF3B1      2_117    0.9006   44.05 0.0005145  6.784        1
10867     ZNF823      19_10    0.9810   29.50 0.0003754  5.485        1
3935        KLC1      14_54    0.6842   41.27 0.0003663  6.966        1
4092       FEZF1       7_74    0.9807   27.71 0.0003525 -5.272        1
2590         MDK      11_28    0.6718   38.05 0.0003315 -6.344        1
11990 AC012074.2       2_15    0.9252   22.17 0.0002660  4.623        1
8791       GNG12       1_42    0.9060   22.36 0.0002627  4.530        2
10737      PCBP2      12_33    0.8462   21.79 0.0002392  4.496        1
1657    KIAA0391      14_10    0.7730   23.84 0.0002390 -4.760        1
7857     PACSIN3      11_29    0.7560   23.10 0.0002266  4.629        1
6872       CNNM4       2_57    0.7212   22.58 0.0002113 -4.793        1
8900     MAP3K11      11_36    0.7110   22.26 0.0002053 -3.929        2
11110 LIN28B-AS1       6_70    0.6625   23.63 0.0002031 -4.736        2
7435    SERPINI1      3_103    0.7238   20.15 0.0001891 -4.030        1
5406       FURIN      15_42    0.4177   32.97 0.0001786 -5.701        1
5277       POC1B      12_54    0.6521   20.40 0.0001725  4.264        1
700      PPP2R5B      11_36    0.5142   25.23 0.0001683 -4.585        1
2337      ERLIN1      10_64    0.5760   22.43 0.0001676  4.370        1

Genes with largest z scores

         genename region_tag susie_pip     mu2       PVE      z num_eqtl
11945   HIST1H2BN       6_21 6.776e-07  984.02 8.648e-09 10.773        1
13230 RP1-86C11.7       6_21 1.513e-12  426.71 8.377e-15  9.033        1
11197        APOM       6_26 3.765e-05  154.56 7.548e-08  8.945        1
11497       AS3MT      10_66 8.748e-01  598.92 6.796e-03  8.586        2
10244      BTN3A2       6_20 1.617e-02   58.46 1.226e-05  8.492        3
12247         C4A       6_26 1.041e-07  136.75 1.847e-10  8.459        2
905         NT5C2      10_66 1.000e+00 3129.90 4.060e-02 -8.190        1
6164        CNNM2      10_66 7.960e-06 3031.90 3.130e-07 -7.876        1
11190        MSH5       6_26 9.461e-03  175.14 2.149e-05  7.497        2
10593        TUBB       6_24 2.334e-08   77.05 2.332e-11 -7.349        1
11957      TRIM26       6_24 5.531e-12   61.93 4.443e-15 -7.107        2
10545     ZKSCAN3       6_22 1.690e-02   40.92 8.968e-06  7.035        1
3935         KLC1      14_54 6.842e-01   41.27 3.663e-04  6.966        1
3043        SF3B1      2_117 9.006e-01   44.05 5.145e-04  6.784        1
10732     ZSCAN26       6_22 9.943e-03   45.38 5.852e-06  6.759        3
13228   U91328.19       6_20 8.134e-02   45.11 4.759e-05 -6.580        1
2590          MDK      11_28 6.718e-01   38.05 3.315e-04 -6.344        1
11209      CCHCR1       6_26 2.433e-10   37.40 1.180e-13 -6.153        3
9596       HARBI1      11_28 1.660e-01   34.49 7.427e-05  6.084        1
12556      APOPT1      14_54 2.347e-02   31.58 9.616e-06 -6.006        2

Comparing z scores and PIPs

#proportion of significant z scores
mean(abs(ctwas_gene_res$z) > sig_thresh)

[1] 0.006979

GO enrichment analysis for genes with PIP>0.5

#number of genes for gene set enrichment
length(genes)

[1] 20

Uploading data to Enrichr... Done.
  Querying GO_Biological_Process_2021... Done.
  Querying GO_Cellular_Component_2021... Done.
  Querying GO_Molecular_Function_2021... Done.
Parsing results... Done.
[1] "GO_Biological_Process_2021"

[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Cellular_Component_2021"

[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)
[1] "GO_Molecular_Function_2021"

[1] Term             Overlap          Adjusted.P.value Genes           
<0 rows> (or 0-length row.names)

DisGeNET enrichment analysis for genes with PIP>0.5

                                                 Description      FDR Ratio BgRatio
59                                  Amaurosis hypertrichosis 0.008233   1/9  1/9703
60 Familial encephalopathy with neuroserpin inclusion bodies 0.008233   1/9  1/9703
63                Cone rod dystrophy amelogenesis imperfecta 0.008233   1/9  1/9703
66                                           Jalili syndrome 0.008233   1/9  1/9703
68                SPASTIC PARAPLEGIA 45, AUTOSOMAL RECESSIVE 0.008233   1/9  1/9703
69                                     CONE-ROD DYSTROPHY 20 0.008233   1/9  1/9703
70  HYPOGONADOTROPIC HYPOGONADISM 22 WITH OR WITHOUT ANOSMIA 0.008233   1/9  1/9703
71                SPASTIC PARAPLEGIA 62, AUTOSOMAL RECESSIVE 0.008233   1/9  1/9703
17                       Neoplasms, Glandular and Epithelial 0.010973   1/9  2/9703
25                                       Glandular Neoplasms 0.010973   1/9  2/9703

WebGestalt enrichment analysis for genes with PIP>0.5

Loading the functional categories...
Loading the ID list...
Loading the reference list...
Performing the enrichment analysis...

Warning in oraEnrichment(interestGeneList, referenceGeneList, geneSet, minNum =
minNum, : No significant gene set is identified based on FDR 0.05!

NULL

PIP Manhattan Plot

Warning: 'timedatectl' indicates the non-existent timezone name 'n/a'

Warning: Your system is mis-configured: '/etc/localtime' is not a symlink

Warning: It is strongly recommended to set envionment variable TZ to 'America/
Chicago' (or equivalent)

Sensitivity, specificity and precision for silver standard genes

#number of genes in known annotations
print(length(known_annotations))

[1] 130

#number of genes in known annotations with imputed expression
print(sum(known_annotations %in% ctwas_gene_res$genename))

[1] 60

#significance threshold for TWAS
print(sig_thresh)

[1] 4.583
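
The report does not show how `sig_thresh` was defined. One common choice, offered here as an assumption rather than the author's stated method, is a two-sided Bonferroni correction at a family-wise level of 0.05 across all imputed genes. The sketch below shows that this choice gives a value close to the printed threshold; `alpha`, `n_genes`, and `bonferroni_z` are names introduced for illustration.

```r
alpha   <- 0.05    # assumed family-wise error rate
n_genes <- 10890   # number of imputed weights reported under Weight QC

# Two-sided test: split the per-gene error rate across both tails.
bonferroni_z <- qnorm(1 - alpha / n_genes / 2)
print(round(bonferroni_z, 3))
```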

#number of ctwas genes
length(ctwas_genes)

[1] 8

#number of TWAS genes
length(twas_genes)

[1] 76

#show novel genes (ctwas genes not in TWAS genes)
ctwas_gene_res[ctwas_gene_res$genename %in% novel_genes,report_cols]

      genename region_tag susie_pip   mu2       PVE     z num_eqtl
8791     GNG12       1_42    0.9060 22.36 0.0002627 4.530        2
10737    PCBP2      12_33    0.8462 21.79 0.0002392 4.496        1

#sensitivity / recall
print(sensitivity)

  ctwas    TWAS 
0.01538 0.06154

#specificity
print(specificity)

 ctwas   TWAS 
0.9994 0.9937

#precision / PPV
print(precision)

 ctwas   TWAS 
0.2500 0.1053
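
The three metrics can be reproduced from counts. In the sketch below, the number of genes called by each method comes from the output above, while the true positive counts (2 and 8) are inferred from the printed precision, and the denominators (130 silver standard genes for sensitivity, 10890 - 60 imputed genes outside the silver standard for specificity) are assumptions that happen to reproduce the printed values. All variable names are introduced for illustration.

```r
n_called   <- c(ctwas = 8, TWAS = 76)  # genes called by each method (printed above)
n_true_pos <- c(ctwas = 2, TWAS = 8)   # inferred: printed precision * n_called
n_known    <- 130                      # genes in known annotations
n_negative <- 10890 - 60               # assumed: imputed genes not in known annotations

sensitivity_check <- n_true_pos / n_known                         # recall
specificity_check <- 1 - (n_called - n_true_pos) / n_negative     # 1 - false positive rate
precision_check   <- n_true_pos / n_called                        # PPV

print(round(rbind(sensitivity_check, specificity_check, precision_check), 4))
```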

cTWAS is more precise than TWAS in distinguishing silver standard and bystander genes

#number of genes in known annotations (with imputed expression)
print(length(known_annotations))

[1] 60

#number of bystander genes (with imputed expression)
print(length(unrelated_genes))

[1] 776

#subset results to genes in known annotations or bystanders
ctwas_gene_res_subset <- ctwas_gene_res[ctwas_gene_res$genename %in% c(known_annotations, unrelated_genes),]

#assign ctwas and TWAS genes
ctwas_genes <- ctwas_gene_res_subset$genename[ctwas_gene_res_subset$susie_pip>0.8]
twas_genes <- ctwas_gene_res_subset$genename[abs(ctwas_gene_res_subset$z)>sig_thresh]

#significance threshold for TWAS
print(sig_thresh)

[1] 4.583

#number of ctwas genes (in known annotations or bystanders)
length(ctwas_genes)

[1] 2

#number of TWAS genes (in known annotations or bystanders)
length(twas_genes)

[1] 17

#sensitivity / recall
sensitivity

  ctwas    TWAS 
0.03333 0.13333

#specificity / (1 - False Positive Rate)
specificity

 ctwas   TWAS 
1.0000 0.9884

#precision / PPV / (1 - False Discovery Rate)
precision

 ctwas   TWAS 
1.0000 0.4706
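
The definitions behind these three numbers can be written as one small function over gene sets. This is a sketch on toy gene names, not the code used to produce the results above; `classification_metrics()` is a hypothetical helper, and it assumes every called gene is either a silver standard gene or a bystander, as is the case after the subsetting step in this section.

```r
# Hypothetical helper; assumes `called` is a subset of c(positives, negatives).
classification_metrics <- function(called, positives, negatives) {
  true_pos  <- sum(called %in% positives)
  false_pos <- sum(called %in% negatives)
  c(
    sensitivity = true_pos / length(positives),        # recall
    specificity = 1 - false_pos / length(negatives),   # 1 - false positive rate
    precision   = if (length(called) > 0) true_pos / length(called) else NA_real_
  )
}

# Toy inputs, not results from this analysis.
silver    <- c("G1", "G2", "G3", "G4")        # toy silver standard genes
bystander <- c("B1", "B2", "B3", "B4", "B5")  # toy bystander genes
called    <- c("G1", "G2", "B1")              # toy calls from one method

metrics <- classification_metrics(called, silver, bystander)
print(metrics)
```

Read against the printed output, a precision of 0.4706 over 17 TWAS calls corresponds to 8 silver standard genes among them (8/17); that count is an inference from the output rather than a number the report prints.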

#sweep the PIP threshold from 0 to 1 and record cTWAS sensitivity and specificity at each value
pip_range <- (0:1000)/1000
sensitivity <- rep(NA, length(pip_range))
specificity <- rep(NA, length(pip_range))

for (index in seq_along(pip_range)){
  pip <- pip_range[index]
  ctwas_genes <- ctwas_gene_res_subset$genename[ctwas_gene_res_subset$susie_pip>=pip]
  sensitivity[index] <- sum(ctwas_genes %in% known_annotations)/length(known_annotations)
  specificity[index] <- sum(!(unrelated_genes %in% ctwas_genes))/length(unrelated_genes)
}

plot(1-specificity, sensitivity, type="l", xlim=c(0,1), ylim=c(0,1), main="", xlab="1 - Specificity", ylab="Sensitivity")
title(expression("ROC Curve for cTWAS (black) and TWAS (" * phantom("red") * ")"))
title(expression(phantom("ROC Curve for cTWAS (black) and TWAS (") * "red" * phantom(")")), col.main="red")

#sweep the |z| threshold over the same number of grid points and reuse the vectors for the TWAS curve
sig_thresh_range <- seq(from=0, to=max(abs(ctwas_gene_res_subset$z)), length.out=length(pip_range))

for (index in seq_along(sig_thresh_range)){
  sig_thresh_plot <- sig_thresh_range[index]
  twas_genes <- ctwas_gene_res_subset$genename[abs(ctwas_gene_res_subset$z)>=sig_thresh_plot]
  sensitivity[index] <- sum(twas_genes %in% known_annotations)/length(known_annotations)
  specificity[index] <- sum(!(unrelated_genes %in% twas_genes))/length(unrelated_genes)
}

lines(1-specificity, sensitivity, xlim=c(0,1), ylim=c(0,1), col="red", lty=1)

abline(a=0,b=1,lty=3)

#add previously computed points from the analysis
ctwas_genes <- ctwas_gene_res_subset$genename[ctwas_gene_res_subset$susie_pip>0.8]
twas_genes <- ctwas_gene_res_subset$genename[abs(ctwas_gene_res_subset$z)>sig_thresh]

points(1-specificity_plot["ctwas"], sensitivity_plot["ctwas"], pch=21, bg="black")
points(1-specificity_plot["TWAS"], sensitivity_plot["TWAS"], pch=21, bg="red")

Undetected silver standard genes have low TWAS z-scores or stronger signal from nearby variants

#table of outcomes for silver standard genes
-sort(-table(silver_standard_case))

silver_standard_case
          Not Imputed Insignificant z-score         Nearby SNP(s) 
                   70                    52                     6 
 Detected (PIP > 0.8) 
                    2

#show inconclusive genes
silver_standard_case[silver_standard_case=="Inconclusive"]

named character(0)

sessionInfo()

R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] GenomicRanges_1.36.1 GenomeInfoDb_1.20.0  IRanges_2.18.1      
 [4] S4Vectors_0.22.1     BiocGenerics_0.30.0  biomaRt_2.40.1      
 [7] readxl_1.3.1         forcats_0.5.1        stringr_1.4.0       
[10] dplyr_1.0.7          purrr_0.3.4          readr_2.1.1         
[13] tidyr_1.1.4          tidyverse_1.3.1      tibble_3.1.6        
[16] WebGestaltR_0.4.4    disgenet2r_0.99.2    enrichR_3.0         
[19] cowplot_1.0.0        ggplot2_3.3.5        workflowr_1.7.0     

loaded via a namespace (and not attached):
  [1] ggbeeswarm_0.6.0       colorspace_2.0-2       rjson_0.2.20          
  [4] ellipsis_0.3.2         rprojroot_2.0.2        XVector_0.24.0        
  [7] fs_1.5.2               rstudioapi_0.13        farver_2.1.0          
 [10] ggrepel_0.9.1          bit64_4.0.5            AnnotationDbi_1.46.0  
 [13] fansi_1.0.2            lubridate_1.8.0        xml2_1.3.3            
 [16] codetools_0.2-16       doParallel_1.0.17      cachem_1.0.6          
 [19] knitr_1.36             jsonlite_1.7.2         apcluster_1.4.8       
 [22] Cairo_1.5-12.2         broom_0.7.10           dbplyr_2.1.1          
 [25] compiler_3.6.1         httr_1.4.2             backports_1.4.1       
 [28] assertthat_0.2.1       Matrix_1.2-18          fastmap_1.1.0         
 [31] cli_3.1.0              later_0.8.0            prettyunits_1.1.1     
 [34] htmltools_0.5.2        tools_3.6.1            igraph_1.2.10         
 [37] GenomeInfoDbData_1.2.1 gtable_0.3.0           glue_1.6.2            
 [40] reshape2_1.4.4         doRNG_1.8.2            Rcpp_1.0.8            
 [43] Biobase_2.44.0         cellranger_1.1.0       jquerylib_0.1.4       
 [46] vctrs_0.3.8            svglite_1.2.2          iterators_1.0.14      
 [49] xfun_0.29              ps_1.6.0               rvest_1.0.2           
 [52] lifecycle_1.0.1        rngtools_1.5.2         XML_3.99-0.3          
 [55] zlibbioc_1.30.0        getPass_0.2-2          scales_1.1.1          
 [58] vroom_1.5.7            hms_1.1.1              promises_1.0.1        
 [61] yaml_2.2.1             curl_4.3.2             memoise_2.0.1         
 [64] ggrastr_1.0.1          gdtools_0.1.9          stringi_1.7.6         
 [67] RSQLite_2.2.8          highr_0.9              foreach_1.5.2         
 [70] rlang_1.0.1            pkgconfig_2.0.3        bitops_1.0-7          
 [73] evaluate_0.14          lattice_0.20-38        labeling_0.4.2        
 [76] bit_4.0.4              processx_3.5.2         tidyselect_1.1.1      
 [79] plyr_1.8.6             magrittr_2.0.2         R6_2.5.1              
 [82] generics_0.1.1         DBI_1.1.2              pillar_1.6.4          
 [85] haven_2.4.3            whisker_0.3-2          withr_2.4.3           
 [88] RCurl_1.98-1.5         modelr_0.1.8           crayon_1.5.0          
 [91] utf8_1.2.2             tzdb_0.2.0             rmarkdown_2.11        
 [94] progress_1.2.2         grid_3.6.1             data.table_1.14.2     
 [97] blob_1.2.2             callr_3.7.0            git2r_0.26.1          
[100] reprex_2.0.1           digest_0.6.29          httpuv_1.5.1          
[103] munsell_0.5.0          beeswarm_0.2.3         vipor_0.4.5