Last updated: 2018-07-27
workflowr checks: (Click a bullet for more information) ✔ R Markdown file: up-to-date 
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
 ✔ Environment: empty 
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
 ✔ Seed: 
set.seed(20180714) 
The command set.seed(20180714) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
 ✔ Session information: recorded 
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
 ✔ Repository version: f497f36 
wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    docs/.DS_Store
    Ignored:    docs/figure/.DS_Store
Untracked files:
    Untracked:  data/greedy19.rds
I previously concluded that using warmstarts is more advantageous than not when ebnm_fn = ebnm_pn. Here I repeat the experiment with ebnm_fn = ebnm_ash.
# devtools::install_github("stephenslab/flashr", ref="trackObj")
devtools::load_all("/Users/willwerscheid/GitHub/flashr")Loading flashr# devtools::install_github("stephenslab/ebnm")
devtools::load_all("/Users/willwerscheid/GitHub/ebnm")Loading ebnmgtex <- readRDS(gzcon(url("https://github.com/stephenslab/gtexresults/blob/master/data/MatrixEQTLSumStats.Portable.Z.rds?raw=TRUE")))
strong <- t(gtex$strong.z)# This block was run in advance.
res.no.warmstart <- flash_add_greedy(strong, Kmax=50, init_fn="udv_svd",
                                     ebnm_fn="ebnm_ash", verbose=TRUE)
res.warmstart <- flash_add_greedy(strong, Kmax=50, init_fn="udv_svd",
                                  ebnm_fn="ebnm_ash", warmstart=TRUE, 
                                  verbose=TRUE)
saveRDS(res.no.warmstart, "../data/warmstart2/nowarmstart.rds")
saveRDS(res.warmstart, "../data/warmstart2/warmstart.rds")res.no.warmstart <- readRDS("./data/warmstart2/nowarmstart.rds")
res.warmstart <- readRDS("./data/warmstart2/warmstart.rds")The total time (in seconds) needed to optimize factors is:
x1 <- unlist(res.no.warmstart$opt_time)
x2 <- unlist(res.warmstart$opt_time)
list(no.warmstart = sum(x1), warmstart = sum(x2))$no.warmstart
[1] 686.3349
$warmstart
[1] 606.5611The time required per factor/loading is as follows.
plot(x1, ylim=c(0, max(x1) + 1), pch=19, col="blue",
     xlab="Factor/loading index", ylab="Optimization time (s)")
points(x2, pch=17, col="red")
legend("topright", c("No warmstart", "Warmstart"),
       pch=c(19, 17), col=c("blue", "red"))
| Version | Author | Date | 
|---|---|---|
| 85127b9 | Jason Willwerscheid | 2018-07-27 | 
So using a warmstart often provides a speed-up, but not nearly as reliably as with ebnm_pn. Indeed, for several factor/loading pairs (17, 19, 22, 24, 26), the warmstart considerably slows things down.
As with ebnm_pn, using warmstarts yields a slightly worse overall objective.
list(no.warmstart = flash_get_objective(strong, res.no.warmstart$f),
     warmstart = flash_get_objective(strong, res.warmstart$f))$no.warmstart
[1] -1252812
$warmstart
[1] -1252855The per-factor difference in objective is as follows.
o1 <- sapply(res.no.warmstart$obj, 
             function(obj) {max(unlist(obj))})
o2 <- sapply(res.warmstart$obj, 
             function(obj) {max(unlist(obj))})
plot(o2 - o1, type='l',
     xlab="Factor/loading index",
     ylab="Diff. in obj. using warmstart",
     main="Difference in overall objective after adding each factor")
| Version | Author | Date | 
|---|---|---|
| 85127b9 | Jason Willwerscheid | 2018-07-27 | 
The advantages of using warmstarts are not nearly as compelling as they were with ebnm_pn. The speed-up is not as reliable, and the sacrifice in the final objective attained is greater. Further, I have not seen Rmosek fail for ebnm_ash in the same way that optim fails for ebnm_pn. I am not convinced that we should automatically use warmstarts for every iteration when ebnm_fn = ebnm_ash. Perhaps we could add a parameter use_warmstarts to the main FLASH functions.
sessionInfo()R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] ebnm_0.1-13   flashr_0.5-12
loaded via a namespace (and not attached):
 [1] Rcpp_0.12.17        pillar_1.2.1        plyr_1.8.4         
 [4] compiler_3.4.3      git2r_0.21.0        workflowr_1.0.1    
 [7] R.methodsS3_1.7.1   R.utils_2.6.0       iterators_1.0.9    
[10] tools_3.4.3         testthat_2.0.0      digest_0.6.15      
[13] tibble_1.4.2        evaluate_0.10.1     memoise_1.1.0      
[16] gtable_0.2.0        lattice_0.20-35     rlang_0.2.0        
[19] Matrix_1.2-12       foreach_1.4.4       commonmark_1.4     
[22] yaml_2.1.17         parallel_3.4.3      withr_2.1.1.9000   
[25] stringr_1.3.0       roxygen2_6.0.1.9000 xml2_1.2.0         
[28] knitr_1.20          devtools_1.13.4     rprojroot_1.3-2    
[31] grid_3.4.3          R6_2.2.2            rmarkdown_1.8      
[34] ggplot2_2.2.1       ashr_2.2-10         magrittr_1.5       
[37] whisker_0.3-2       backports_1.1.2     scales_0.5.0       
[40] codetools_0.2-15    htmltools_0.3.6     MASS_7.3-48        
[43] assertthat_0.2.0    softImpute_1.4      colorspace_1.3-2   
[46] stringi_1.1.6       lazyeval_0.2.1      munsell_0.4.3      
[49] doParallel_1.0.11   pscl_1.5.2          truncnorm_1.0-8    
[52] SQUAREM_2017.10-1   R.oo_1.21.0        This reproducible R Markdown analysis was created with workflowr 1.0.1