Last updated: 2018-08-20
workflowr checks:
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it's best to always run the code in an empty environment.
The command set.seed(1) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
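Note that all relevant files for the analysis need to be committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated: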
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.DS_Store
Ignored: analysis/.Rhistory
Ignored: analysis/figure/
Ignored: analysis/include/.DS_Store
Ignored: data/.DS_Store
Ignored: docs/.DS_Store
Ignored: output/.DS_Store
Untracked files:
Untracked: _workflowr.yml
Untracked: analysis/Classify.Rmd
Untracked: analysis/EstimateCorMaxEM.Rmd
Untracked: analysis/EstimateCorMaxEMGD.Rmd
Untracked: analysis/EstimateCorPrior.Rmd
Untracked: analysis/EstimateCorSol.Rmd
Untracked: analysis/HierarchicalFlashSim.Rmd
Untracked: analysis/MashLowSignal.Rmd
Untracked: analysis/Mash_GTEx.Rmd
Untracked: analysis/MeanAsh.Rmd
Untracked: analysis/OutlierDetection.Rmd
Untracked: analysis/OutlierDetection2.Rmd
Untracked: analysis/OutlierDetection3.Rmd
Untracked: analysis/OutlierDetection4.Rmd
Untracked: analysis/Test.Rmd
Untracked: analysis/mash_missing_row.Rmd
Untracked: code/MashClassify.R
Untracked: code/MashCorResult.R
Untracked: code/MashSource.R
Untracked: code/Weight_plot.R
Untracked: code/addemV.R
Untracked: code/estimate_cor.R
Untracked: code/generateDataV.R
Untracked: code/johnprocess.R
Untracked: code/sim_mean_sig.R
Untracked: code/summary.R
Untracked: data/Blischak_et_al_2015/
Untracked: data/scale_data.rds
Untracked: docs/figure/Classify.Rmd/
Untracked: docs/figure/OutlierDetection.Rmd/
Untracked: docs/figure/OutlierDetection2.Rmd/
Untracked: docs/figure/OutlierDetection3.Rmd/
Untracked: docs/figure/Test.Rmd/
Untracked: docs/figure/mash_missing_whole_row_5.Rmd/
Untracked: docs/include/
Untracked: output/AddEMV/
Untracked: output/CovED_UKBio_strong.rds
Untracked: output/CovED_UKBio_strong_Z.rds
Untracked: output/Flash_UKBio_strong.rds
Untracked: output/MASH.10.em2.result.rds
Untracked: output/MASH.10.mle.result.rds
Untracked: output/MASH.result.1.rds
Untracked: output/MASH.result.10.rds
Untracked: output/MASH.result.2.rds
Untracked: output/MASH.result.3.rds
Untracked: output/MASH.result.4.rds
Untracked: output/MASH.result.5.rds
Untracked: output/MASH.result.6.rds
Untracked: output/MASH.result.7.rds
Untracked: output/MASH.result.8.rds
Untracked: output/MASH.result.9.rds
Untracked: output/Mash_EE_Cov_0_plusR1.rds
Untracked: output/Trail 1/
Untracked: output/Trail 2/
Untracked: output/UKBio_mash_model.rds
Unstaged changes:
Modified: analysis/EstimateCorMaxEM2.Rmd
Modified: analysis/Mash_UKBio.Rmd
Modified: analysis/_site.yml
Modified: analysis/chunks.R
Modified: analysis/mash_missing_samplesize.Rmd
Modified: output/Flash_T2_0.rds
Modified: output/Flash_T2_0_mclust.rds
Modified: output/Mash_model_0_plusR1.rds
Modified: output/PresiAddVarCol.rds
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | b083054 | zouyuxin | 2018-08-20 | wflow_publish(“analysis/EstimateCorMax.Rmd”) |
html | 6281062 | zouyuxin | 2018-08-15 | Build site. |
Rmd | 3e3e128 | zouyuxin | 2018-08-15 | wflow_publish(c(“analysis/EstimateCor.Rmd”, “analysis/EstimateCorMax.Rmd”, |
html | bf28e18 | zouyuxin | 2018-08-14 | Build site. |
Rmd | 15b85be | zouyuxin | 2018-08-14 | wflow_publish(c(“analysis/EstimateCorMax.Rmd”, “analysis/EstimateCorMaxEM2.Rmd”)) |
html | 568fbe6 | zouyuxin | 2018-08-13 | Build site. |
Rmd | 3ae3f08 | zouyuxin | 2018-08-13 | wflow_publish(c(“analysis/EstimateCor.Rmd”, |
html | 1ceff72 | zouyuxin | 2018-08-03 | Build site. |
Rmd | 9d2623c | zouyuxin | 2018-08-03 | wflow_publish(c(“analysis/EstimateCorMax.Rmd”)) |
html | faa5b1a | zouyuxin | 2018-07-26 | Build site. |
Rmd | 7ac3df0 | zouyuxin | 2018-07-26 | wflow_publish(“analysis/EstimateCorMax.Rmd”) |
library(mashr)
Loading required package: ashr
source('../code/generateDataV.R')
source('../code/estimate_cor.R')
source('../code/summary.R')
library(kableExtra)
library(knitr)
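We want to estimate $\rho$.

$$\left.\begin{pmatrix}\hat{x}\\ \hat{y}\end{pmatrix}\right| \begin{pmatrix}x\\ y\end{pmatrix} \sim N\left(\begin{pmatrix}\hat{x}\\ \hat{y}\end{pmatrix}; \begin{pmatrix}x\\ y\end{pmatrix}, \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}\right)$$

$$\begin{pmatrix}x\\ y\end{pmatrix} \sim \sum_{k=0}^{K}\pi_k N\left(\begin{pmatrix}x\\ y\end{pmatrix}; 0, U_k\right)$$

$$\Rightarrow \begin{pmatrix}\hat{x}\\ \hat{y}\end{pmatrix} \sim \sum_{k=0}^{K}\pi_k N\left(\begin{pmatrix}\hat{x}\\ \hat{y}\end{pmatrix}; 0, \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}+U_k\right)$$

$$\Sigma_k = \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}+U_k = \begin{pmatrix}1 & \rho\\ \rho & 1\end{pmatrix}+\begin{pmatrix}u_{k11} & u_{k12}\\ u_{k21} & u_{k22}\end{pmatrix} = \begin{pmatrix}1+u_{k11} & \rho+u_{k12}\\ \rho+u_{k21} & 1+u_{k22}\end{pmatrix}$$

Let $\sigma_{k11}=\sqrt{1+u_{k11}}$, $\sigma_{k22}=\sqrt{1+u_{k22}}$, $\phi_k=\dfrac{\rho+u_{k12}}{\sigma_{k11}\sigma_{k22}}$.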
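As a quick numerical check of these definitions, the following sketch builds $\Sigma_k$ for one component and computes $\phi_k$. The values `rho = 0.5` and the all-ones `U_k` are illustrative choices, and the variable names are not taken from the analysis code.

```r
# Marginal covariance Sigma_k = V + U_k for a single mixture component.
rho <- 0.5
V   <- matrix(c(1, rho, rho, 1), 2, 2)   # residual correlation matrix
U_k <- matrix(1, 2, 2)                   # illustrative prior covariance (all ones)

Sigma_k <- V + U_k
sigma11 <- sqrt(Sigma_k[1, 1])           # sqrt(1 + u_k11)
sigma22 <- sqrt(Sigma_k[2, 2])           # sqrt(1 + u_k22)
phi_k   <- Sigma_k[1, 2] / (sigma11 * sigma22)

# phi_k is the correlation implied by Sigma_k, so it should agree with cov2cor()
phi_check <- cov2cor(Sigma_k)[1, 2]
```

Here $\Sigma_k$ has diagonal entries 2 and off-diagonal entries 1.5, so $\phi_k = 1.5 / 2 = 0.75$.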
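The loglikelihood is (with penalty)

$$l(\rho,\pi)=\sum_{i=1}^{n}\log\sum_{k=0}^{K}\pi_k N(x_i;0,\Sigma_k)+\sum_{k=0}^{K}(\lambda_k-1)\log\pi_k$$

The penalty on $\pi$ encourages over-estimation of $\pi_0$; $\lambda_k\geq 1$.

$$l(\rho,\pi)=\sum_{i=1}^{n}\log\sum_{k=0}^{K}\pi_k\frac{1}{2\pi\sigma_{k11}\sigma_{k22}\sqrt{1-\phi_k^2}}\exp\left(-\frac{1}{2(1-\phi_k^2)}\left[\frac{x_i^2}{\sigma_{k11}^2}+\frac{y_i^2}{\sigma_{k22}^2}-\frac{2\phi_k x_i y_i}{\sigma_{k11}\sigma_{k22}}\right]\right)+\sum_{k=0}^{K}(\lambda_k-1)\log\pi_k$$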
Note: This problem is convex with respect to $\pi$. With respect to $\rho$, the convexity depends on the data.
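The penalized loglikelihood can be evaluated directly from the bivariate normal density written in terms of $\sigma_{k11}$, $\sigma_{k22}$ and $\phi_k$. The sketch below is a base-R illustration, not the code in `code/estimate_cor.R`; the function names `log_dbvnorm0` and `penalized_loglik` are hypothetical, and `pi_k` is used for the mixture weights to avoid clashing with R's constant `pi`.

```r
# Log-density of a zero-mean bivariate normal, parameterized as in the formula above.
log_dbvnorm0 <- function(x, y, Sigma) {
  s1  <- sqrt(Sigma[1, 1])
  s2  <- sqrt(Sigma[2, 2])
  phi <- Sigma[1, 2] / (s1 * s2)
  q   <- x^2 / s1^2 + y^2 / s2^2 - 2 * phi * x * y / (s1 * s2)
  -log(2 * pi * s1 * s2 * sqrt(1 - phi^2)) - q / (2 * (1 - phi^2))
}

# Penalized loglikelihood l(rho, pi) for data X (n x 2) and a list of U_k.
penalized_loglik <- function(rho, pi_k, X, Ulist, lambda = rep(1, length(Ulist))) {
  V <- matrix(c(1, rho, rho, 1), 2, 2)
  # n x K matrix of log component densities
  logd <- matrix(unlist(lapply(Ulist, function(U) log_dbvnorm0(X[, 1], X[, 2], V + U))),
                 nrow = nrow(X))
  # log-sum-exp over components for numerical stability
  m   <- apply(logd, 1, max)
  mix <- m + log(drop(exp(logd - m) %*% pi_k))
  # terms with lambda_k = 1 contribute nothing to the penalty (avoids 0 * log(0))
  keep <- lambda != 1
  sum(mix) + sum((lambda[keep] - 1) * log(pi_k[keep]))
}

# Small check: one null component (U = 0), rho = 0, two observations.
X_demo   <- rbind(c(0, 0), c(1, 1))
val_demo <- penalized_loglik(0, 1, X_demo, list(matrix(0, 2, 2)))
```

For this demo the two points have standard bivariate normal log-densities $-\log(2\pi)$ and $-\log(2\pi) - 1$, so `val_demo` equals $-2\log(2\pi) - 1$.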
Algorithm:
Input: X, init_pi, init_rho, Ulist
Compute loglikelihood
delta = 1
while delta > tol
    Given pi, estimate rho by max loglikelihood (optim function)
    Given rho, estimate pi by max loglikelihood (convex problem)
    Compute loglikelihood
    Update delta
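The alternating scheme above can be sketched in base R as follows. This is not the `optimize_pi_rho_times` function sourced from `code/estimate_cor.R`: all function names are hypothetical, the $\rho$ step uses `optimize()` on the interval (-0.99, 0.99) rather than `optim`, and the $\pi$ step swaps in penalized EM updates as a stand-in for the convex solver.

```r
# Log-density of N2(0, Sigma) at each row of X, via the Cholesky factor.
log_dmvnorm0 <- function(X, Sigma) {
  R <- chol(Sigma)                              # Sigma = t(R) %*% R
  z <- backsolve(R, t(X), transpose = TRUE)     # solves t(R) z = t(X)
  -log(2 * pi) - sum(log(diag(R))) - 0.5 * colSums(z^2)
}

# n x K matrix of log component densities for a given rho.
logd_matrix <- function(rho, X, Ulist) {
  V <- matrix(c(1, rho, rho, 1), 2, 2)
  matrix(unlist(lapply(Ulist, function(U) log_dmvnorm0(X, V + U))), nrow = nrow(X))
}

pen_loglik <- function(rho, pi_k, X, Ulist, lambda) {
  logd <- logd_matrix(rho, X, Ulist)
  m    <- apply(logd, 1, max)
  keep <- lambda != 1
  sum(m + log(drop(exp(logd - m) %*% pi_k))) +
    sum((lambda[keep] - 1) * log(pi_k[keep]))
}

# Penalized EM updates for pi with rho held fixed (assumed stand-in for the convex solver).
update_pi <- function(rho, pi_k, X, Ulist, lambda, n_em = 50) {
  logd <- logd_matrix(rho, X, Ulist)
  lik  <- exp(logd - apply(logd, 1, max))
  for (it in seq_len(n_em)) {
    w    <- sweep(lik, 2, pi_k, "*")
    w    <- w / rowSums(w)                      # responsibilities
    pi_k <- colSums(w) + lambda - 1
    pi_k <- pi_k / sum(pi_k)
  }
  pi_k
}

optimize_pi_rho_sketch <- function(X, Ulist, init_pi, init_rho,
                                   lambda = rep(1, length(Ulist)),
                                   tol = 1e-5, max_iter = 100) {
  pi_k   <- init_pi
  rho    <- init_rho
  loglik <- pen_loglik(rho, pi_k, X, Ulist, lambda)
  delta  <- 1
  while (delta > tol && length(loglik) <= max_iter) {
    # Given pi, estimate rho by a one-dimensional search
    rho <- optimize(pen_loglik, interval = c(-0.99, 0.99), maximum = TRUE,
                    pi_k = pi_k, X = X, Ulist = Ulist, lambda = lambda)$maximum
    # Given rho, estimate pi
    pi_k   <- update_pi(rho, pi_k, X, Ulist, lambda)
    loglik <- c(loglik, pen_loglik(rho, pi_k, X, Ulist, lambda))
    delta  <- abs(diff(tail(loglik, 2)))
  }
  list(rho = rho, pi = pi_k, loglik = loglik)
}

# Usage on a small simulated null data set (illustrative values only).
set.seed(1)
V_demo <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
X_demo <- matrix(rnorm(400), 200, 2) %*% chol(V_demo)
fit <- optimize_pi_rho_sketch(X_demo, list(matrix(0, 2, 2), matrix(1, 2, 2)),
                              init_pi = c(0.5, 0.5), init_rho = 0)
```

Because the objective need not be concave in $\rho$, a one-dimensional search can stop at a local maximum; running from several starting values, as the analysis does with `init_rho = c(-0.7,0,0.7)`, is one way to guard against that.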
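$$\hat{\beta}\,|\,\beta \sim N_2\left(\hat{\beta};\beta,\begin{pmatrix}1 & 0.5\\ 0.5 & 1\end{pmatrix}\right)$$

$$\beta \sim \frac{1}{4}\delta_0+\frac{1}{4}N_2\left(0,\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}\right)+\frac{1}{4}N_2\left(0,\begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix}\right)+\frac{1}{4}N_2\left(0,\begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}\right)$$

n = 4000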
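The helper `generate_data` used in the analysis comes from `code/generateDataV.R`, which is not shown here. As a rough illustration of the simulation model above, the sketch below draws from an equal-weight mixture in base R. The function name `simulate_mixture_data` is hypothetical, the component labels are sampled at random (the sourced helper may assign them differently), and the returned fields `B`, `Bhat` and `Shat` are assumed from how `data` is used in the analysis.

```r
# Draw n effects from an equal-weight mixture of N2(0, U_k), then add N2(0, V) noise.
simulate_mixture_data <- function(n, V, Ulist) {
  p <- nrow(V)
  K <- length(Ulist)
  z <- sample(K, n, replace = TRUE)             # component label for each sample

  # Zero-mean multivariate normal draws; the eigen-based square root
  # also handles singular (rank-deficient) covariance matrices.
  rmvn0 <- function(m, S) {
    e <- eigen(S, symmetric = TRUE)
    A <- e$vectors %*% diag(sqrt(pmax(e$values, 0)), nrow(S))
    matrix(rnorm(m * nrow(S)), m) %*% t(A)
  }

  B <- matrix(0, n, p)
  for (k in seq_len(K)) {
    idx <- which(z == k)
    if (length(idx) > 0) B[idx, ] <- rmvn0(length(idx), Ulist[[k]])
  }
  Bhat <- B + rmvn0(n, V)                       # observed effects
  list(B = B, Bhat = Bhat, Shat = matrix(1, n, p))
}

# Usage with the four components of the model above (small n for illustration).
set.seed(1)
U0 <- matrix(0, 2, 2)
U1 <- U0; U1[1, 1] <- 1
U2 <- U0; U2[2, 2] <- 1
U3 <- matrix(1, 2, 2)
sim <- simulate_mixture_data(100, matrix(c(1, 0.5, 0.5, 1), 2, 2),
                             list(U0 = U0, U1 = U1, U2 = U2, U3 = U3))
```

The standard errors are set to 1 because the noise covariance has unit variances.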
set.seed(1)
n = 4000; p = 2
Sigma = matrix(c(1,0.5,0.5,1),p,p)
U0 = matrix(0,2,2)
U1 = U0; U1[1,1] = 1
U2 = U0; U2[2,2] = 1
U3 = matrix(1,2,2)
Utrue = list(U0=U0, U1=U1, U2=U2, U3=U3)
data = generate_data(n, p, Sigma, Utrue)
m.data = mash_set_data(data$Bhat, data$Shat)
U.c = cov_canonical(m.data)
grid = mashr:::autoselect_grid(m.data, sqrt(2))
Ulist = mashr:::normalize_Ulist(U.c)
xUlist = mashr:::expand_cov(Ulist,grid,usepointmass = TRUE)
result <- optimize_pi_rho_times(data$Bhat, xUlist, init_rho = c(-0.7,0,0.7))
Warning in REBayes::KWDual(A, rep(1, k), normalize(w), control = control): estimated mixing distribution has some negative values:
consider reducing rtol
Warning in mixIP(matrix_lik = structure(c(0.0627889120852815,
0.0114523005348735, : Optimization step yields mixture weights that are
either too small, or negative; weights have been corrected and renormalized
after the optimization.
plot(result[[1]]$loglik)
Version | Author | Date |
---|---|---|
568fbe6 | zouyuxin | 2018-08-13 |
faa5b1a | zouyuxin | 2018-07-26 |
The estimated ρ is 0.6124039.
m.data.mle = mash_set_data(data$Bhat, data$Shat, V = matrix(c(1,result[[1]]$rho,result[[1]]$rho,1),2,2))
U.c = cov_canonical(m.data.mle)
m.mle = mash(m.data.mle, U.c, verbose= FALSE)
null.ind = which(apply(data$B,1,sum) == 0)
The log likelihood is $-1.230177 \times 10^{4}$. There are 55 significant samples, 2 false positives. The RRMSE is 0.5964282.
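The code that produced these summaries is not shown on this page. One plausible way to compute them is sketched below in base R. The function names are hypothetical, and the RRMSE definition (root mean squared error of the posterior mean relative to that of the raw estimates) is an assumption rather than something stated in the source. Null samples are identified the same way as `null.ind` above.

```r
# Relative root mean squared error: posterior-mean error relative to raw-estimate error.
# (Assumed definition; not confirmed by the source.)
rrmse <- function(B, post_mean, Bhat) {
  sqrt(mean((B - post_mean)^2) / mean((B - Bhat)^2))
}

# Number of significant samples whose true effects sum to zero,
# mirroring null.ind = which(apply(data$B, 1, sum) == 0).
count_false_positives <- function(sig_idx, B) {
  null_idx <- which(rowSums(B) == 0)
  sum(sig_idx %in% null_idx)
}

# Toy example with made-up numbers, only to show the call pattern.
B_toy    <- rbind(c(0, 0), c(1, 0), c(0, 0), c(2, 2))
Bhat_toy <- B_toy + 1            # raw estimates off by 1 everywhere
pm_toy   <- B_toy + 0.5          # posterior means off by 0.5 everywhere
rr_toy   <- rrmse(B_toy, pm_toy, Bhat_toy)
fp_toy   <- count_false_positives(c(1, 2, 4), B_toy)
```

With a fitted mash object, the inputs would typically come from mashr accessors such as `get_pm(m.mle)`, `get_significant_results(m.mle)` and `get_loglik(m.mle)`.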
The estimated pi is:
barplot(get_estimated_pi(m.mle), las=2, cex.names = 0.7, main='MLE', ylim=c(0,0.8))
Version | Author | Date |
---|---|---|
bf28e18 | zouyuxin | 2018-08-14 |
568fbe6 | zouyuxin | 2018-08-13 |
The ROC curve:
m.data.correct = mash_set_data(data$Bhat, data$Shat, V=Sigma)
m.correct = mash(m.data.correct, U.c, verbose = FALSE)
m.correct.seq = ROC.table(data$B, m.correct)
m.mle.seq = ROC.table(data$B, m.mle)
Version | Author | Date |
---|---|---|
568fbe6 | zouyuxin | 2018-08-13 |
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.20 kableExtra_0.9.0 mashr_0.2-11 ashr_2.2-10
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 pillar_1.3.0 compiler_3.5.1
[4] git2r_0.23.0 plyr_1.8.4 workflowr_1.1.1
[7] R.methodsS3_1.7.1 R.utils_2.6.0 iterators_1.0.10
[10] tools_3.5.1 digest_0.6.15 viridisLite_0.3.0
[13] tibble_1.4.2 evaluate_0.11 lattice_0.20-35
[16] pkgconfig_2.0.1 rlang_0.2.1 Matrix_1.2-14
[19] foreach_1.4.4 rstudioapi_0.7 yaml_2.2.0
[22] parallel_3.5.1 mvtnorm_1.0-8 xml2_1.2.0
[25] httr_1.3.1 stringr_1.3.1 REBayes_1.3
[28] hms_0.4.2 rprojroot_1.3-2 grid_3.5.1
[31] R6_2.2.2 rmarkdown_1.10 rmeta_3.0
[34] readr_1.1.1 magrittr_1.5 whisker_0.3-2
[37] scales_0.5.0 backports_1.1.2 codetools_0.2-15
[40] htmltools_0.3.6 MASS_7.3-50 rvest_0.3.2
[43] assertthat_0.2.0 colorspace_1.3-2 stringi_1.2.4
[46] Rmosek_8.0.69 munsell_0.5.0 doParallel_1.0.11
[49] pscl_1.5.2 truncnorm_1.0-8 SQUAREM_2017.10-1
[52] crayon_1.3.4 R.oo_1.22.0
This reproducible R Markdown analysis was created with workflowr 1.1.1