Last updated: 2018-08-03

library(mashr)

Loading required package: ashr

source('../code/generateDataV.R')
source('../code/estimate_cor.R')
library(knitr)
library(kableExtra)

One example

\[ \hat{\beta}|\beta \sim N_{3}(\hat{\beta}; \beta, \left(\begin{matrix} 1 & 0.7 & 0.1 \\ 0.7 & 1 & 0.5 \\ 0.1 & 0.5 & 1 \end{matrix}\right)) \]

\[ \beta \sim \frac{1}{4}\delta_{0} + \frac{1}{4}N_{3}(0, \left(\begin{matrix} 1 & 0 &0\\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}\right)) + \frac{1}{4}N_{3}(0, \left(\begin{matrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{matrix}\right)) + \frac{1}{4}N_{3}(0, \left(\begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{matrix}\right)) \]

set.seed(1)
Sigma = cbind(c(1,0.7,0.1), c(0.7,1,0.5), c(0.1,0.5,1))
U0 = matrix(0,3,3)
U1 = matrix(0,3,3); U1[1,1] = 1
U2 = diag(3); U2[3,3] = 0
U3 = matrix(1,3,3)
data = generate_data(n=500, p=3, V=Sigma, Utrue = list(U0=U0, U1=U1,U2=U2,U3=U3))

m.data = mash_set_data(data$Bhat, data$Shat)
m.1by1 = mash_1by1(m.data)
strong = get_significant_results(m.1by1)

U.pca = cov_pca(m.data, 3, subset = strong)
U.ed = cov_ed(m.data, U.pca, subset = strong)
U.c = cov_canonical(m.data)

We run the algorithm in estimate cor mle with 3 different initial points for \(\rho\). The estimated correlation is

Vhat.max = estimateV(m.data, c(U.c, U.ed), init_rho = c(-0.5,0,0.5), tol=1e-3)
Vhat.max$V

          [,1]      [,2]      [,3]
[1,] 1.0000000 0.7131869 0.1343278
[2,] 0.7131869 1.0000000 0.5022083
[3,] 0.1343278 0.5022083 1.0000000

The running time for each pairwise correlation is

table = data.frame(t(Vhat.max$ttime))
colnames(table) = c('12','13','23')
table %>% kable() %>% kable_styling()

12	13	23
110.395	56.69	26.003

The time is the total running time with different initial point.

Using the original truncated correlation:

Vhat.tru = estimate_null_correlation(m.data)
Vhat.tru

           [,1]      [,2]       [,3]
[1,] 1.00000000 0.4621643 0.07423134
[2,] 0.46216433 1.0000000 0.41652572
[3,] 0.07423134 0.4165257 1.00000000

Checking the estiamted error:

maxError = c(max(abs(Vhat.max$V - Sigma)),
max(abs(Vhat.tru - Sigma)))
OpError = c(svd(Vhat.max$V - Sigma)$d[1],
svd(Vhat.tru - Sigma)$d[1])
table = data.frame(maxError = maxError, OpError = OpError, row.names = c('mle','trunc'))
table %>% kable() %>% kable_styling()

	maxError	OpError
mle	0.0343278	0.0375554
trunc	0.2378357	0.2609944

In mash model, the model with correlation from mle has larger loglikelihood.

m.data.mle = mash_set_data(data$Bhat, data$Shat, V=Vhat.max$V)
m.model.mle = mash(m.data.mle, c(U.c,U.ed), verbose = FALSE)

m.data.trunc = mash_set_data(data$Bhat, data$Shat, V=Vhat.tru)
m.model.trunc = mash(m.data.trunc, c(U.c,U.ed), verbose = FALSE)

table = data.frame(log_likelihood = c(get_loglik(m.model.mle), get_loglik(m.model.trunc)), row.names = c('mle', 'trunc'))
t(table) %>% kable() %>% kable_styling()

	mle	trunc
log_likelihood	-8905.915	-8942.591

More simulations

I randomly generate 10 positive definite correlation matrices, V. The sample size is 2000.

\[ \hat{z}|z \sim N_{5}(z, V) \] \[ z\sim\frac{1}{4}\delta_{0} + \frac{1}{4}N_{5}(0,\left(\begin{matrix} 1 & \mathbf{0}_{1\times 4} \\ \mathbf{0}_{4\times 1} & \mathbf{0}_{4\times 4} \end{matrix}\right)) + \frac{1}{4}N_{5}(0,\left(\begin{matrix} \mathbf{1}_{2\times 2} & \mathbf{0}_{1\times 3} \\ \mathbf{0}_{3\times 1} & \mathbf{0}_{3\times 3} \end{matrix}\right)) + \frac{1}{4}N_{5}(0,\mathbf{1}_{5\times 5}) \]

library(mashr)
n=500; p = 5
U0 = matrix(0,p,p)
U1 = U0; U1[1,1] = 1
U2 = U0; U2[c(1:2), c(1:2)] = 1
U3 = matrix(1, p,p)
Utrue = list(U0 = U0, U1 = U1, U2 = U2, U3 = U3)
result = V.simulation(n,p,10,Utrue)
saveRDS(result, '../output/MASH.10.mle.result.rds')

result = readRDS('../output/MASH.10.mle.result.rds')

The max elementwise norm is

temp = rbind(result$mMaxE.max, result$mMaxE.trun)
barplot(temp, axes=T,
        main="max elementwise error",
        names.arg = 1:10,
        legend.text = c("MLE", "Trunc"),
        ylim = c(0, max(colSums(temp))*2))

The spectral norm is

temp = rbind(result$m2E.max, result$m2E.trun)
barplot(temp, axes=T,
        main="Matrix 2 norm error",
        names.arg = 1:10,
        legend.text = c("MLE", "Trunc"),
        ylim = c(0, max(colSums(temp))*2))

The 10 estimated correlation matrices are all positive definite.

The total running time for each matrix is

t(result$Time) %>% kable() %>% kable_styling()

777.223

1078.659

651.365

745.071

1064.722

893.494

924.456

697.97

741.843

716.978

Session information

sessionInfo()

R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] kableExtra_0.9.0 knitr_1.20       mashr_0.2-9      ashr_2.2-10     

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18             highr_0.7               
 [3] compiler_3.5.1           pillar_1.3.0            
 [5] plyr_1.8.4               iterators_1.0.10        
 [7] tools_3.5.1              digest_0.6.15           
 [9] viridisLite_0.3.0        evaluate_0.11           
[11] tibble_1.4.2             lattice_0.20-35         
[13] pkgconfig_2.0.1          rlang_0.2.1             
[15] Matrix_1.2-14            foreach_1.4.4           
[17] rstudioapi_0.7           yaml_2.2.0              
[19] parallel_3.5.1           mvtnorm_1.0-8           
[21] xml2_1.2.0               httr_1.3.1              
[23] stringr_1.3.1            REBayes_1.3             
[25] hms_0.4.2                rprojroot_1.3-2         
[27] grid_3.5.1               R6_2.2.2                
[29] rmarkdown_1.10           rmeta_3.0               
[31] readr_1.1.1              magrittr_1.5            
[33] scales_0.5.0             backports_1.1.2         
[35] codetools_0.2-15         htmltools_0.3.6         
[37] MASS_7.3-50              rvest_0.3.2             
[39] assertthat_0.2.0         colorspace_1.3-2        
[41] stringi_1.2.4            Rmosek_8.0.69           
[43] munsell_0.5.0            pscl_1.5.2              
[45] doParallel_1.0.11        truncnorm_1.0-8         
[47] SQUAREM_2017.10-1        ExtremeDeconvolution_1.3
[49] crayon_1.3.4

This R Markdown site was created with workflowr

Estimate cor—max MASH

Yuxin Zou

2018-07-25

One example

More simulations

Session information