Last updated: 2017-01-02
Code version: 55e11cf8f7785ad926b716fb52e4e87b342f38e1
Suppose we have a logistic regression \(Y_i \mid X_i \sim \text{Bern}(p_i)\) where \[\log(p_i/(1-p_i)) = \mu + \theta X_i.\]
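We will assume that \(X_i \in \{-1,+1\}\), and assume priors for \(\mu\) and \(\theta\): \[\mu \sim N(0,100)\] \[\theta \sim N(0,1)\] Here the second argument of \(N(\cdot,\cdot)\) is the variance, so the prior standard deviations are 10 for \(\mu\) and 1 for \(\theta\).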
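Combining the likelihood and priors above, the log-posterior can be written up to an additive constant as follows. This is a worked restatement rather than a new result; the shorthand \(\eta_i = \mu + \theta X_i\) is introduced here for convenience.
\[\log p(\mu,\theta \mid Y, X) = \sum_i \left[ Y_i \eta_i - \log\left(1+e^{\eta_i}\right) \right] - \frac{\mu^2}{200} - \frac{\theta^2}{2} + \text{const}.\]
The sum is the Bernoulli log-likelihood under the logit link, and the two quadratic terms come from the \(N(0,100)\) and \(N(0,1)\) prior densities.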
For illustration we simulate data where \(\mu=\theta=0\):
x = sample(c(-1,1),1000,replace=TRUE)
y = rbinom(1000,1,0.5)
# b is a vector b = (mu, theta)
# log-likelihood for logistic regression
loglik = function(b){
  eta = b[1] + b[2]*x            # linear predictor (log-odds)
  p = exp(eta)/(1 + exp(eta))    # inverse logit: P(Y_i = 1)
  return(sum(log(y*p + (1-y)*(1-p))))
}
# log-prior: mu ~ N(0, sd = 10), theta ~ N(0, sd = 1)
log_prior = function(b){
  return(dnorm(b[1], 0, 10, log=TRUE) + dnorm(b[2], 0, 1, log=TRUE))
}
# unnormalized log-posterior = log-likelihood + log-prior
log_post = function(b){
  return(loglik(b) + log_prior(b))
}
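The log-likelihood above computes exp(eta) directly, which is fine for moderate values but overflows when eta is very large. The sketch below compares it with one possible rewrite, y*eta - log(1 + exp(eta)), where the log term is evaluated in an overflow-safe way. The names loglik_naive and loglik_stable and the fixed seed are assumptions made for this illustration, not part of the original code.

```r
# Simulated data, as in the text (the seed is an assumption, for reproducibility)
set.seed(1)
x = sample(c(-1, 1), 1000, replace = TRUE)
y = rbinom(1000, 1, 0.5)

# Direct form, as in the text
loglik_naive = function(b) {
  eta = b[1] + b[2] * x
  p = exp(eta) / (1 + exp(eta))
  sum(log(y * p + (1 - y) * (1 - p)))
}

# Hypothetical alternative: sum of y*eta - log(1 + exp(eta)), with
# log(1 + exp(eta)) computed as max(eta, 0) + log1p(exp(-|eta|))
loglik_stable = function(b) {
  eta = b[1] + b[2] * x
  sum(y * eta - (pmax(eta, 0) + log1p(exp(-abs(eta)))))
}

b = c(0.3, -0.2)
c(naive = loglik_naive(b), stable = loglik_stable(b))                  # agree for moderate eta
c(naive = loglik_naive(c(800, 0)), stable = loglik_stable(c(800, 0)))  # naive form is NaN here
```

For grids where the absolute value of eta stays small (at most 12 if \(\mu\) ranges over [-10, 10] and \(\theta\) over [-2, 2]), the two forms agree and the direct version is adequate.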
Let’s compute a 95% credible interval (CI) for \(\theta\). First, try a discrete grid over \((\mu, \theta)\):
Note: This is still a work in progress.
m = seq(-10,10,length=100)
t = seq(-2,2,length=100)
df = expand.grid(m=m,t=t)
head(df)
# evaluate the log posterior at every (m, t) grid point
df$lp = mapply(function(m, t) log_post(c(m, t)), df$m, df$t)
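One way the grid calculation could be completed is sketched below: evaluate the log posterior at each grid point, normalize, sum over \(\mu\) to get the marginal for \(\theta\), and read the 2.5% and 97.5% points off the cumulative sum. This is a minimal sketch under stated assumptions, not the author's finished method; the seed, the column name lp, and the names w, post_t, cdf and ci are all introduced here for illustration.

```r
# Simulated data and log posterior, restated so the block runs on its own
set.seed(1)  # assumption: fixed seed for reproducibility
x = sample(c(-1, 1), 1000, replace = TRUE)
y = rbinom(1000, 1, 0.5)

log_post = function(b) {
  eta = b[1] + b[2] * x
  p = exp(eta) / (1 + exp(eta))
  sum(log(y * p + (1 - y) * (1 - p))) +
    dnorm(b[1], 0, 10, log = TRUE) + dnorm(b[2], 0, 1, log = TRUE)
}

# Grid over (mu, theta), as in the text
m = seq(-10, 10, length = 100)
t = seq(-2, 2, length = 100)
df = expand.grid(m = m, t = t)

# Log posterior at every grid point
df$lp = mapply(function(m, t) log_post(c(m, t)), df$m, df$t)

# Normalize on the grid; subtracting the maximum avoids underflow in exp()
w = exp(df$lp - max(df$lp))
w = w / sum(w)

# expand.grid varies m fastest, so rows index m and columns index t
post_t = colSums(matrix(w, nrow = length(m)))  # marginal posterior mass for theta

# Equal-tailed 95% interval from the discrete cumulative distribution
cdf = cumsum(post_t)
ci = c(lower = t[which(cdf >= 0.025)[1]], upper = t[which(cdf >= 0.975)[1]])
ci
```

The grid spaces \(\theta\) values about 0.04 apart and \(\mu\) values about 0.2 apart, so the interval endpoints are only resolved to the grid spacing. A finer grid over a narrower range would likely give a more accurate interval.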
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] knitr_1.14 MASS_7.3-45 expm_0.999-0 Matrix_1.2-7.1
[5] rmarkdown_1.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.7 lattice_0.20-34 gtools_3.5.0 digest_0.6.9
[5] assertthat_0.1 mime_0.4 R6_2.1.2 grid_3.3.2
[9] xtable_1.8-2 formatR_1.4 magrittr_1.5 evaluate_0.9
[13] stringi_1.1.1 tools_3.3.2 stringr_1.0.0 shiny_0.13.2
[17] httpuv_1.3.3 yaml_2.1.13 htmltools_0.3.5 tibble_1.2