Last updated: 2020-07-24

Checks: 6 passed, 1 warning

Knit directory: causal-TWAS/

This reproducible R Markdown analysis was created with workflowr (version 1.6.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown is untracked by Git. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20191103) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/

Untracked files:
    Untracked:  analysis/simulation-multi-ukbchr22-gtex.adipose.Rmd
    Untracked:  analysis/simulation-multi-ukbchr22-gtex.adipose2.Rmd
    Untracked:  code/run_UKB_process.R
    Untracked:  code/workflow/
    Untracked:  code/wtccc/

Unstaged changes:
    Modified:   README.md
    Modified:   analysis/index.Rmd
    Deleted:    code/ctwas_polygenic_V1.R
    Deleted:    code/ctwas_spikeslab_V1.R
    Deleted:    code/gene_annotation.R
    Modified:   code/input_reformat.R
    Modified:   code/mr.ash2.R
    Modified:   code/mr.ash2_FBM.R
    Deleted:    code/run_WTCCC_data_process.R
    Modified:   code/run_gwas_snp.R
    Modified:   code/run_test_mr.ash2s.R
    Modified:   code/run_test_susie.R
    Deleted:    code/simulate-WTCCC-expr.R
    Deleted:    code/simulate-WTCCC-phenotype.R
    Modified:   code/simulate_phenotype.R
    Deleted:    code/train_expression.R
    Deleted:    code/workflow-WTCCC-polygenic-simulation.ipynb
    Deleted:    code/workflow-ashtest.ipynb
    Deleted:    code/workflow-ashtest2.ipynb
    Deleted:    code/workflow-ashtest3.ipynb
    Deleted:    code/workflow-data.ipynb

Staged changes:
    Deleted:    code/master_run3.sh

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.


The simulation was run 9 times on UKB chromosome 22.

library(mr.ash.alpha)
library(data.table)
suppressMessages({library(plotly)})
library(tidyr)
library(plyr)
simdatadir <- "~/causalTWAS/simulations/simulation_ashtest_20200616/"
outputdir <- "~/causalTWAS/simulations/simulation_ashtest_20200616/"
susiedir <- "~/causalTWAS/simulations/simulation_susietest_20200616/"
tags <- paste0('20200616-8-', 1:9)
tag2s <- c('zeroes-es', 'zerose-es', 'lassoes-es','lassoes-se')
get_files <- function(tag, tag2){
  par <- paste0(outputdir, tag, "-mr.ash2s.", tag2, ".param.txt")
  rpip <- paste0(outputdir, tag, "-mr.ash2s.", tag2, ".rPIP.txt")
  
  gmrash <- paste0(outputdir, tag, "-mr.ash2s.", tag2, ".expr.txt")
  smrash <- paste0(outputdir, tag, "-mr.ash2s.", tag2, ".snp.txt")   
  
  ggwas <- paste0(outputdir, tag, ".exprgwas.txt.gz")
  sgwas <- paste0(outputdir, tag, ".snpgwas.txt.gz")
  
  gsusie <- paste0(susiedir, tag, ".", tag2, ".L3.susieres.expr.txt")
  ssusie <- paste0(susiedir, tag, ".", tag2, ".L3.susieres.snp.txt")
  
  return(tibble::lst(par, rpip, gmrash, ggwas, smrash, sgwas, gsusie, ssusie))
}
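To make the naming scheme in get_files concrete, the standalone sketch below rebuilds the parameter-file path for every combination of run tag and strategy tag and checks whether each file is present. The helper name param_path is hypothetical and only mirrors the paste0 pattern used above; the directory and tags are the ones already defined in this analysis.

```r
# Same directory and tags as defined earlier in this analysis.
outputdir <- "~/causalTWAS/simulations/simulation_ashtest_20200616/"
tags  <- paste0("20200616-8-", 1:9)
tag2s <- c("zeroes-es", "zerose-es", "lassoes-es", "lassoes-se")

# Hypothetical helper mirroring the `par` entry built by get_files().
param_path <- function(tag, tag2) {
  paste0(outputdir, tag, "-mr.ash2s.", tag2, ".param.txt")
}

# One row per (run, strategy) combination: 9 runs x 4 strategies.
combos <- expand.grid(tag = tags, tag2 = tag2s, stringsAsFactors = FALSE)
combos$par <- mapply(param_path, combos$tag, combos$tag2, USE.NAMES = FALSE)

# Flag missing files before any table or plot is attempted.
combos$exists <- file.exists(path.expand(combos$par))

print(nrow(combos))   # 36
print(combos$par[1])
```

Rows with exists equal to FALSE point to runs whose output has not been produced or has been moved.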

Mr.ash2 parameter estimation

Results for the 9 simulation runs, using different initialization and update strategies.

show_param <- function(tag2){
  f <- lapply(tags, get_files, tag2 = tag2)
  parf <- lapply(f, '[[', "par")
  param <- do.call(rbind, lapply(parf, function(x) t(read.table(x))[2:1,]))
  knitr::kable(param)
}

NULL; expr-snp; expr-snp

show_param(tag2s[1])
gene.pi1 gene.pve snp.pi1 snp.pve
truth 0.0209205 0.0060041 0.0002559 0.0611847
estimated 0.0222753 0.0257131 0.0002721 0.0476191
truth 0.0209205 0.0125625 0.0002559 0.0623990
estimated 0.0293598 0.0148927 0.0001422 0.0565124
truth 0.0209205 0.0143559 0.0002559 0.0651471
estimated 0.0248155 0.0270832 0.0002346 0.0538022
truth 0.0209205 0.0095282 0.0002559 0.0504149
estimated 0.0392044 0.0134942 0.0002832 0.0285669
truth 0.0209205 0.0076604 0.0002559 0.0411680
estimated 0.0096836 0.0117105 0.0002105 0.0474628
truth 0.0209205 0.0088879 0.0002559 0.0465910
estimated 0.0327729 0.0160215 0.0001661 0.0392070
truth 0.0209205 0.0075311 0.0002559 0.0742512
estimated 0.0464974 0.0141466 0.0001922 0.0739697
truth 0.0209205 0.0095491 0.0002559 0.0748342
estimated 0.0183843 0.0170822 0.0002301 0.0774682
truth 0.0209205 0.0124448 0.0002559 0.0395446
estimated 0.0327868 0.0146459 0.0001889 0.0461935

NULL; snp-expr; expr-snp

show_param(tag2s[2])
gene.pi1 gene.pve snp.pi1 snp.pve
truth 0.0209205 0.0060041 0.0002559 0.0611847
estimated 0.0222753 0.0257131 0.0002721 0.0476191
truth 0.0209205 0.0125625 0.0002559 0.0623990
estimated 0.0293589 0.0148945 0.0001422 0.0565125
truth 0.0209205 0.0143559 0.0002559 0.0651471
estimated 0.0248155 0.0270832 0.0002346 0.0538022
truth 0.0209205 0.0095282 0.0002559 0.0504149
estimated 0.0392026 0.0134994 0.0002832 0.0285687
truth 0.0209205 0.0076604 0.0002559 0.0411680
estimated 0.0096836 0.0117105 0.0002105 0.0474627
truth 0.0209205 0.0088879 0.0002559 0.0465910
estimated 0.0327729 0.0160215 0.0001661 0.0392070
truth 0.0209205 0.0075311 0.0002559 0.0742512
estimated 0.0464974 0.0141466 0.0001922 0.0739697
truth 0.0209205 0.0095491 0.0002559 0.0748342
estimated 0.0183844 0.0170820 0.0002301 0.0774678
truth 0.0209205 0.0124448 0.0002559 0.0395446
estimated 0.0327868 0.0146459 0.0001889 0.0461936

lasso; expr-snp; expr-snp

show_param(tag2s[3])
gene.pi1 gene.pve snp.pi1 snp.pve
truth 0.0209205 0.0060041 0.0002559 0.0611847
estimated 0.0069797 0.0020441 0.0002275 0.0614937
truth 0.0209205 0.0125625 0.0002559 0.0623990
estimated 0.0207931 0.0147445 0.0001572 0.0622327
truth 0.0209205 0.0143559 0.0002559 0.0651471
estimated 0.0114924 0.0140703 0.0002667 0.0683366
truth 0.0209205 0.0095282 0.0002559 0.0504149
estimated 0.0136536 0.0040695 0.0003125 0.0504142
truth 0.0209205 0.0076604 0.0002559 0.0411680
estimated 0.0091555 0.0111420 0.0002774 0.0279415
truth 0.0209205 0.0088879 0.0002559 0.0465910
estimated 0.0113293 0.0123651 0.0001963 0.0511911
truth 0.0209205 0.0075311 0.0002559 0.0742512
estimated 0.0122711 0.0120373 0.0002557 0.0965410
truth 0.0209205 0.0095491 0.0002559 0.0748342
estimated 0.0101661 0.0119478 0.0002181 0.0838292
truth 0.0209205 0.0124448 0.0002559 0.0395446
estimated 0.0195901 0.0075159 0.0002253 0.0610403

lasso; expr-snp; snp-expr

show_param(tag2s[4])
gene.pi1 gene.pve snp.pi1 snp.pve
truth 0.0209205 0.0060041 0.0002559 0.0611847
estimated 0.0069853 0.0020446 0.0002259 0.0621952
truth 0.0209205 0.0125625 0.0002559 0.0623990
estimated 0.0212120 0.0139487 0.0001571 0.0622386
truth 0.0209205 0.0143559 0.0002559 0.0651471
estimated 0.0115027 0.0140764 0.0002651 0.0691414
truth 0.0209205 0.0095282 0.0002559 0.0504149
estimated 0.0136857 0.0040785 0.0003073 0.0534981
truth 0.0209205 0.0076604 0.0002559 0.0411680
estimated 0.0092058 0.0110433 0.0002779 0.0277027
truth 0.0209205 0.0088879 0.0002559 0.0465910
estimated 0.0116320 0.0118495 0.0001973 0.0507534
truth 0.0209205 0.0075311 0.0002559 0.0742512
estimated 0.0127229 0.0111653 0.0002554 0.0966526
truth 0.0209205 0.0095491 0.0002559 0.0748342
estimated 0.0102499 0.0117384 0.0002180 0.0838319
truth 0.0209205 0.0124448 0.0002559 0.0395446
estimated 0.0197915 0.0070771 0.0002242 0.0616208

Regional mr.ash2s PIP overview

Take simulation 1 (NULL; expr-snp; expr-snp) as an example. We use a region size of 500 kb and a PIP cutoff of 0.5 for SUSIE.

f <- get_files(tag= tags[1], tag2 = tag2s[1])
a <- read.table(f[["rpip"]], header = T)
plot(a$p0, a$rPIP, pch =19, col ='salmon', xlab = "position", ylab= "Sum of PIP")
grid()
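The plot above shows one summed PIP per region. As a minimal sketch of how such a regional sum could be formed from per-variable PIPs, the snippet below bins toy positions into fixed 500 kb windows and adds up the PIPs in each window. The toy data and the input column names pos and PIP are assumptions made for illustration; this is not necessarily how the rPIP.txt files were generated.

```r
# Toy per-variable results; values and input column names are made up.
toy <- data.frame(
  pos = c(100000, 250000, 499999, 500000, 900000, 1200000),
  PIP = c(0.10, 0.60, 0.05, 0.30, 0.20, 0.90)
)
region_size <- 500000  # 500 kb windows

# Assign each variable to a window by integer division of its position.
toy$region <- toy$pos %/% region_size

# Sum PIPs within each window; p0 is the window start position.
rpip <- aggregate(PIP ~ region, data = toy, FUN = sum)
rpip$p0 <- rpip$region * region_size
names(rpip)[names(rpip) == "PIP"] <- "rPIP"

print(rpip[, c("p0", "rPIP")])
```

A regional sum well above 1 suggests more than one signal in the window, while a sum near 0 suggests none.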

PIP scatter plot

mr.ash2s PIP vs. susie PIP.

scatter_plot_PIP<- function(tag2){
  f <- lapply(tags, get_files, tag2 = tag2)
  mrashf <- lapply(f, '[[', "gmrash")
  names(mrashf) <- tags
  
  susief <- lapply(f, '[[', "gsusie")
  names(susief) <- tags

  .tagname <- function(x, flist){
    a <- read.table(flist[[x]], header =T)
    a[, "name"] <- paste0(x, ":", a[, "name"])
    a
  }
  mrashres <- do.call(rbind, lapply(tags, .tagname, flist = mrashf))
  susieres <- do.call(rbind, lapply(tags, .tagname, flist = susief))
 
  res <- merge(mrashres, susieres, by = "name", all = T)
  
  res <- res[complete.cases(res),]
  res <- rename(res, c("PIP" = "mr.ash_PIP", "pip" = "SUSIE_PIP", "pip.null" = "SUSIE_PIP_null") )
  res$ifcausal <- mapvalues(res$ifcausal, 
          from=c(0,1), 
          to=c("Non causal", "Causal"))
  
  fig1 <- plot_ly(data = res, x = ~ mr.ash_PIP, y = ~ SUSIE_PIP, color = ~ ifcausal, 
                 colors = c( "salmon", "darkgreen"))
  
  fig2 <- plot_ly(data = res, x = ~ mr.ash_PIP, y = ~ SUSIE_PIP_null, color = ~ ifcausal, 
                 colors = c( "salmon", "darkgreen"))
  
  fig <- subplot(fig1, fig2, titleX = TRUE, titleY = T, margin = 0.1)
  fig
}

NULL; expr-snp; expr-snp

scatter_plot_PIP(tag2s[1])
Warning: `arrange_()` is deprecated as of dplyr 0.7.0.
Please use `arrange()` instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.

NULL; snp-expr; expr-snp

scatter_plot_PIP(tag2s[2])

lasso; expr-snp; expr-snp

scatter_plot_PIP(tag2s[3])

lasso; expr-snp; snp-expr

scatter_plot_PIP(tag2s[4])

ROC curve

ROC_plot<- function(tag2){
  f <- lapply(tags, get_files, tag2 = tag2)
  mrashf <- lapply(f, '[[', "gmrash")
  names(mrashf) <- tags
  
  susief <- lapply(f, '[[', "gsusie")
  names(susief) <- tags
  
  gwasf <- lapply(f, '[[', "ggwas")
  names(gwasf) <- tags

  .tagname <- function(x, flist, colnames = NULL){
    a <- read.table(flist[[x]], header =T)
    if (!is.null(colnames)){
      colnames(a) <- colnames
    }
    a[, "name"] <- paste0(x, ":", a[, "name"])
    a
  }
  mrashres <- do.call(rbind, lapply(tags, .tagname, flist = mrashf))
  susieres <- do.call(rbind, lapply(tags, .tagname, flist = susief))
  gwasres <- do.call(rbind, lapply(tags, .tagname, flist = gwasf, 
                                   colnames =  c("chr", "p0",   "p1", "name", "Estimate", "Std.Error", "t-value", "PVALUE")))

  res <- merge(mrashres, susieres, by = "name", all = T)
  res <- merge(res, gwasres, by = "name", all = T)
  
  res <- res[complete.cases(res),]
  res <- rename(res, c("PIP" = "mr.ash", "pip" = "SUSIE", "PVALUE" = "TWAS") )
  res[,"TWAS"] <- -log10(res[, "TWAS"])
  
  roccolors <-  c("red", "green", "blue")
  methods <- c("mr.ash", "SUSIE", "TWAS")
  plot(0, xlim=c(0,1), ylim=c(0,1), col="white", xlab = "FPR", ylab = "TPR")
  for (i in 1:3){
    method <- methods[i]
    bordered <- res[order(res[,method]),] 
    actuals <- bordered$ifcausal == 1
    sens <- (sum(actuals) - cumsum(actuals))/sum(actuals)
    spec <- cumsum(!actuals)/sum(!actuals)
    lines(1 - spec, sens, type = "l", col = roccolors[i])
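    abline(0, 1)  # diagonal reference line (chance performance)
    # AUC: for each causal gene, the fraction of non-causal genes ranked below it, averaged
    auc <- sum(spec*diff(c(0, 1 - sens)))
    cat("AUC for ", method, ": ", auc, "\n", sep = "")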
  }
  legend(0.6,0.3, legend= methods, col=roccolors, lty=1, cex=0.8)
  grid()
}
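The AUC expression inside ROC_plot can be checked on a small example. The sketch below applies the same cumulative-sum formula to made-up scores and labels and compares it with the pairwise definition of AUC, the fraction of (causal, non-causal) pairs in which the causal item scores higher. The toy vectors are illustrative only and contain no tied scores; with ties the two forms can differ.

```r
# Toy scores and labels (made up for illustration only, no ties).
score  <- c(0.05, 0.10, 0.20, 0.35, 0.50, 0.70, 0.80, 0.95)
causal <- c(0,    0,    1,    0,    0,    1,    1,    1)

# Cumulative-sum form, following the steps used in ROC_plot.
ord     <- order(score)            # ascending scores
actuals <- causal[ord] == 1
sens <- (sum(actuals) - cumsum(actuals)) / sum(actuals)
spec <- cumsum(!actuals) / sum(!actuals)
auc_cumsum <- sum(spec * diff(c(0, 1 - sens)))

# Pairwise form: share of (causal, non-causal) pairs ranked correctly.
pos <- score[causal == 1]
neg <- score[causal == 0]
auc_pairs <- mean(outer(pos, neg, ">"))

print(c(cumsum = auc_cumsum, pairs = auc_pairs))  # both 0.875
```

Both forms count, for each causal item, how many non-causal items fall below it, which is why they agree here.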

NULL; expr-snp; expr-snp

ROC_plot(tag2s[1])

AUC for mr.ash: 0.8532693
AUC for SUSIE: 0.8479238
AUC for TWAS: 0.8474589

NULL; snp-expr; expr-snp

ROC_plot(tag2s[2])

AUC for mr.ash: 0.8532693
AUC for SUSIE: 0.8479238
AUC for TWAS: 0.8474589

lasso; expr-snp; expr-snp

ROC_plot(tag2s[3])

AUC for mr.ash: 0.7802647
AUC for SUSIE: 0.8690825
AUC for TWAS: 0.8674838

lasso; expr-snp; snp-expr

ROC_plot(tag2s[4])

AUC for mr.ash: 0.7769784
AUC for SUSIE: 0.8691713
AUC for TWAS: 0.8674838

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/libopenblas_haswellp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] plyr_1.8.6          tidyr_0.8.3         plotly_4.9.2.9000  
[4] ggplot2_3.3.1       data.table_1.12.7   mr.ash.alpha_0.1-34

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6      highr_0.7         compiler_3.5.1   
 [4] pillar_1.4.4      later_0.7.5       git2r_0.26.1     
 [7] workflowr_1.6.0   tools_3.5.1       digest_0.6.25    
[10] viridisLite_0.3.0 jsonlite_1.6.1    evaluate_0.12    
[13] tibble_3.0.1      lifecycle_0.2.0   gtable_0.2.0     
[16] lattice_0.20-38   pkgconfig_2.0.2   rlang_0.4.6      
[19] Matrix_1.2-15     shiny_1.2.0       crosstalk_1.0.0  
[22] yaml_2.2.0        httr_1.4.1        withr_2.1.2      
[25] stringr_1.4.0     dplyr_1.0.0       knitr_1.20       
[28] htmlwidgets_1.3   generics_0.0.2    fs_1.3.1         
[31] vctrs_0.3.1       tidyselect_1.1.0  rprojroot_1.3-2  
[34] grid_3.5.1        glue_1.4.1        R6_2.3.0         
[37] rmarkdown_1.10    purrr_0.3.4       magrittr_1.5     
[40] backports_1.1.2   scales_1.0.0      promises_1.0.1   
[43] htmltools_0.3.6   ellipsis_0.3.1    xtable_1.8-3     
[46] mime_0.6          colorspace_1.3-2  httpuv_1.4.5     
[49] stringi_1.3.1     lazyeval_0.2.1    munsell_0.5.0    
[52] crayon_1.3.4