Last updated: 2017-03-06
Code version: c7339fc
Suppose we have a logistic regression \(Y_i \mid X_i \sim \text{Bern}(p_i)\) where \[\log\left(\frac{p_i}{1-p_i}\right) = \mu + \theta X_i.\]
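We will assume that \(X_i \in \{-1,+1\}\), and assume independent priors for \(\mu\) and \(\theta\): \[\mu \sim N(0,100)\] \[\theta \sim N(0,1)\] Here the second argument of \(N\) is the variance, so \(\mu\) has prior standard deviation 10 and \(\theta\) has prior standard deviation 1.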
For illustration we simulate data where \(\mu=\theta=0\):
x = sample(c(-1,1),1000,replace=TRUE)
y = rbinom(1000,1,0.5)
#b is a vector b=(mu,theta)
#loglikelihood for logistic regression
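loglik = function(b){
  eta = b[1]+b[2]*x            #linear predictor (log-odds) for each observation
  p = exp(eta)/(1+exp(eta))    #inverse logit: P(Y_i=1)
  #Bernoulli log-likelihood, summed over observations
  return(sum(log(y*p+(1-y)*(1-p))))
}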
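The expression `exp(eta)/(1+exp(eta))` overflows when `eta` is large, which does not matter for moderate parameter values but can produce `NaN` far from the mode. The following is a minimal sketch of an algebraically equivalent form, using the identity \(\log(1+e^{\eta}) = \max(\eta,0) + \log(1+e^{-|\eta|})\). The name `loglik_stable` and the `set.seed` call are assumptions added for illustration; they are not part of the original analysis.

```r
set.seed(1)  # assumption: seed added only so this sketch is reproducible
x = sample(c(-1,1),1000,replace=TRUE)
y = rbinom(1000,1,0.5)

# log-likelihood as written in the text
loglik = function(b){
  eta = b[1]+b[2]*x
  p = exp(eta)/(1+exp(eta))
  return(sum(log(y*p+(1-y)*(1-p))))
}

# hypothetical alternative: y*eta - log(1+exp(eta)), computed without overflow
loglik_stable = function(b){
  eta = b[1]+b[2]*x
  log1pexp = pmax(eta,0) + log1p(exp(-abs(eta)))  # log(1+exp(eta))
  return(sum(y*eta - log1pexp))
}

b = c(0.3,-0.5)
loglik(b)                 # the two forms agree at moderate values
loglik_stable(b)
loglik(c(800,0))          # exp(800) is Inf, so p and the result are NaN
loglik_stable(c(800,0))   # finite
```

On the grid used later in this page the linear predictor stays small enough that both forms give the same answer, so this only matters if the grid or an optimizer wanders to extreme values.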
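#b is a vector b=(mu,theta)
#log prior: dnorm takes a standard deviation, so sd=10 for mu matches the N(0,100) prior
log_prior = function(b){
  return(dnorm(b[1],0,10, log=TRUE)+dnorm(b[2],0,1,log=TRUE))
}
#b is a vector b=(mu,theta)
#unnormalized log posterior = log likelihood + log prior
log_post = function(b){
  return(loglik(b)+log_prior(b))
}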
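Written out, the quantity returned by `log_post` is the log posterior up to an additive constant. The following is a sketch of that expression, with \(\eta_i\) introduced here as shorthand for the linear predictor: \[\log p(\mu,\theta \mid Y, X) = \sum_{i} \left[ Y_i \eta_i - \log\left(1+e^{\eta_i}\right) \right] - \frac{\mu^2}{200} - \frac{\theta^2}{2} + \text{const}, \qquad \eta_i = \mu + \theta X_i.\] The sum is the Bernoulli log-likelihood, since \(\log p_i = \eta_i - \log(1+e^{\eta_i})\) and \(\log(1-p_i) = -\log(1+e^{\eta_i})\); the two quadratic terms come from the \(N(0,100)\) and \(N(0,1)\) priors.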
Let’s compute a 95% credible interval (CI) for \(\theta\). First, try a discrete grid:
Note: This is still a work in progress.
m = seq(-10,10,length=100)
t = seq(-2,2,length=100)
df = expand.grid(m=m,t=t)
head(df)
#evaluate the log posterior at each grid point (each row of df is b=(mu,theta))
df$log_post = apply(df,1,log_post)
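One possible way to carry the grid approach through to an interval is sketched below: evaluate the log posterior on the grid, normalize the weights, sum over \(\mu\) to get the marginal for \(\theta\), and read off the 2.5% and 97.5% points of its cumulative sum. This is an illustrative sketch, not the author's finished method; the `set.seed` call and the names `lp`, `w`, `post_t`, `cdf_t` and `ci` are assumptions, and the log posterior is redefined in compact form so the block runs on its own.

```r
set.seed(1)  # assumption: seed added only so this sketch is reproducible
x = sample(c(-1,1),1000,replace=TRUE)
y = rbinom(1000,1,0.5)

# unnormalized log posterior for b=(mu,theta), same model and priors as above
log_post = function(b){
  eta = b[1]+b[2]*x
  ll = sum(dbinom(y,1,plogis(eta),log=TRUE))
  return(ll + dnorm(b[1],0,10,log=TRUE) + dnorm(b[2],0,1,log=TRUE))
}

m = seq(-10,10,length=100)
t = seq(-2,2,length=100)
df = expand.grid(m=m,t=t)   # m varies fastest

# log posterior at every grid point
df$lp = apply(as.matrix(df[,c("m","t")]),1,log_post)

# subtract the maximum before exponentiating to avoid underflow, then normalize
df$w = exp(df$lp - max(df$lp))
df$w = df$w/sum(df$w)

# marginal posterior of theta: rows index m, columns index t, so sum each column
post_t = colSums(matrix(df$w,nrow=length(m),ncol=length(t)))

# approximate 95% interval from the cumulative distribution on the theta grid
cdf_t = cumsum(post_t)
ci = c(t[which(cdf_t >= 0.025)[1]], t[which(cdf_t >= 0.975)[1]])
ci
```

With 1000 observations the posterior is concentrated on a scale much smaller than the spacing of this grid (about 0.2 in \(\mu\) and 0.04 in \(\theta\)), so the interval obtained this way is only a rough approximation. A finer grid centered near the posterior mode would likely be needed for a usable answer.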
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.15.1 MASS_7.3-45 expm_0.999-0
[4] Matrix_1.2-8 workflowr_0.4.0 rmarkdown_1.3.9004
loaded via a namespace (and not attached):
[1] Rcpp_0.12.9 lattice_0.20-34 gtools_3.5.0 digest_0.6.12
[5] rprojroot_1.2 mime_0.5 R6_2.2.0 grid_3.3.2
[9] xtable_1.8-2 backports_1.0.5 git2r_0.18.0 magrittr_1.5
[13] evaluate_0.10 stringi_1.1.2 tools_3.3.2 stringr_1.2.0
[17] shiny_1.0.0 httpuv_1.3.3 yaml_2.1.14 htmltools_0.3.5