Processing math: 100%
  • Background
  • Simulate data
    • Session information

Last updated: 2017-04-14

Code version: 64083de

Background

I wanted to try out David’s vicar package, particularly mouthwash_secondstep, and see how the t distribution vs normal distribution works, particularly with the variance inflation option.

Simulate data

I start with “null” data, but with t errors.

set.seed(1)
n=1000
bhat = rt(n,df=4) # t with 4 df
shat = rep(1,n)
library(ashr)
bhat.ash.t4 = ash(bhat,shat,df = 4)
bhat.ash.norm = ash(bhat,shat)
get_pi0(bhat.ash.norm)
[1] 0.8433816
get_pi0(bhat.ash.t4)
[1] 0.988749
min(get_lfsr(bhat.ash.norm))
[1] 0
min(get_lfsr(bhat.ash.t4))
[1] 0.2116995

So we see that use of a normal likelihood with t data creates “false positives”, not suprisingly.

Now I wanted to check if this also occurs in mouthwash - the idea was that maybe the use of a variance inflation parameter in mouthwash would avoid this behaviour. However, it appeared not to.

library("vicar")
a = matrix(rep(1,n),nrow = n, ncol=1) # just put in an "intercept" confounder with no effect
a_seq = bhat.ash.norm$fitted_g$a
b_seq = bhat.ash.norm$fitted_g$b
lambda_seq = rep(1,length(a_seq))
lambda_seq[1] = 10
bhat.m.norm = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="normal", scale_var = FALSE, degrees_freedom = 100000) #I had to set very high df to get it to run
bhat.m.t4 = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="t", scale_var = FALSE, degrees_freedom = 4)
bhat.m.norm.c = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="normal", scale_var = TRUE, degrees_freedom = 100000) #I had to set very high df to get it to run
bhat.m.t4.c = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="t", scale_var = TRUE, degrees_freedom = 4)

bhat.m.norm$pi0
[1] 0.8440323
bhat.m.t4$pi0
[1] 0.988799
bhat.m.norm.c$pi0
[1] 0.6972446
bhat.m.t4.c$pi0
[1] 0.9484263
min(bhat.m.norm$result$lfsr)
[1] 0
min(bhat.m.t4$result$lfsr)
[1] 0.2131923
min(bhat.m.norm.c$result$lfsr)
[1] 0
min(bhat.m.t4.c$result$lfsr)
[1] 0.1497321

So actually the scaling seems to make things worse here. Actually this seems to be because the scaling parameter is estimated to be <1.

bhat.m.norm.c$xi
[1] 0.7074148
bhat.m.t4.c$xi
[1] 0.8854449

Session information

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ashr_2.1-8      vicar_0.1.6     workflowr_0.4.0 rmarkdown_1.4  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.10      rstudioapi_0.6    knitr_1.15.1     
 [4] magrittr_1.5      REBayes_0.73      MASS_7.3-45      
 [7] etrunct_0.1       doParallel_1.0.10 pscl_1.4.9       
[10] SQUAREM_2016.8-2  lattice_0.20-34   foreach_1.4.3    
[13] stringr_1.2.0     tools_3.3.2       grid_3.3.2       
[16] parallel_3.3.2    git2r_0.18.0      htmltools_0.3.5  
[19] iterators_1.0.8   assertthat_0.2.0  yaml_2.1.14      
[22] rprojroot_1.2     digest_0.6.12     Matrix_1.2-8     
[25] codetools_0.2-15  rsconnect_0.7     evaluate_0.10    
[28] stringi_1.1.2     Rmosek_7.1.2      backports_1.0.5  
[31] truncnorm_1.0-7  

This R Markdown site was created with workflowr