Last updated: 2017-04-14
Code version: 64083de
I wanted to try out David’s vicar package, particularly mouthwash_secondstep
, and see how the t distribution vs normal distribution works, particularly with the variance inflation option.
I start with “null” data, but with t errors.
set.seed(1)
n=1000
bhat = rt(n,df=4) # t with 4 df
shat = rep(1,n)
library(ashr)
bhat.ash.t4 = ash(bhat,shat,df = 4)
bhat.ash.norm = ash(bhat,shat)
get_pi0(bhat.ash.norm)
[1] 0.8433816
get_pi0(bhat.ash.t4)
[1] 0.988749
min(get_lfsr(bhat.ash.norm))
[1] 0
min(get_lfsr(bhat.ash.t4))
[1] 0.2116995
So we see that use of a normal likelihood with t data creates “false positives”, not suprisingly.
Now I wanted to check if this also occurs in mouthwash - the idea was that maybe the use of a variance inflation parameter in mouthwash would avoid this behaviour. However, it appeared not to.
library("vicar")
a = matrix(rep(1,n),nrow = n, ncol=1) # just put in an "intercept" confounder with no effect
a_seq = bhat.ash.norm$fitted_g$a
b_seq = bhat.ash.norm$fitted_g$b
lambda_seq = rep(1,length(a_seq))
lambda_seq[1] = 10
bhat.m.norm = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="normal", scale_var = FALSE, degrees_freedom = 100000) #I had to set very high df to get it to run
bhat.m.t4 = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="t", scale_var = FALSE, degrees_freedom = 4)
bhat.m.norm.c = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="normal", scale_var = TRUE, degrees_freedom = 100000) #I had to set very high df to get it to run
bhat.m.t4.c = mouthwash_second_step(bhat,shat,a,lambda_seq = lambda_seq,a_seq = a_seq, b_seq=b_seq,mixing_dist = "uniform", likelihood="t", scale_var = TRUE, degrees_freedom = 4)
bhat.m.norm$pi0
[1] 0.8440323
bhat.m.t4$pi0
[1] 0.988799
bhat.m.norm.c$pi0
[1] 0.6972446
bhat.m.t4.c$pi0
[1] 0.9484263
min(bhat.m.norm$result$lfsr)
[1] 0
min(bhat.m.t4$result$lfsr)
[1] 0.2131923
min(bhat.m.norm.c$result$lfsr)
[1] 0
min(bhat.m.t4.c$result$lfsr)
[1] 0.1497321
So actually the scaling seems to make things worse here. Actually this seems to be because the scaling parameter is estimated to be <1.
bhat.m.norm.c$xi
[1] 0.7074148
bhat.m.t4.c$xi
[1] 0.8854449
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ashr_2.1-8 vicar_0.1.6 workflowr_0.4.0 rmarkdown_1.4
loaded via a namespace (and not attached):
[1] Rcpp_0.12.10 rstudioapi_0.6 knitr_1.15.1
[4] magrittr_1.5 REBayes_0.73 MASS_7.3-45
[7] etrunct_0.1 doParallel_1.0.10 pscl_1.4.9
[10] SQUAREM_2016.8-2 lattice_0.20-34 foreach_1.4.3
[13] stringr_1.2.0 tools_3.3.2 grid_3.3.2
[16] parallel_3.3.2 git2r_0.18.0 htmltools_0.3.5
[19] iterators_1.0.8 assertthat_0.2.0 yaml_2.1.14
[22] rprojroot_1.2 digest_0.6.12 Matrix_1.2-8
[25] codetools_0.2-15 rsconnect_0.7 evaluate_0.10
[28] stringi_1.1.2 Rmosek_7.1.2 backports_1.0.5
[31] truncnorm_1.0-7
This R Markdown site was created with workflowr