Last updated: 2018-05-21
workflowr checks:
 ✔ R Markdown file: up-to-date 
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
 ✔ Environment: empty 
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
 ✔ Seed: set.seed(20180411) 
The command set.seed(20180411) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
 ✔ Session information: recorded 
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
 ✔ Repository version: b3b09a1 
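Make sure that all files relevant to the analysis have been committed to Git before generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated: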
Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .sos/
    Ignored:    exams/
    Ignored:    temp/
Untracked files:
    Untracked:  analysis/hmm.Rmd
    Untracked:  analysis/neanderthal.Rmd
    Untracked:  analysis/pca_cell_cycle.Rmd
    Untracked:  analysis/ridge_mle.Rmd
    Untracked:  data/reduced.chr12.90-100.data.txt
    Untracked:  data/reduced.chr12.90-100.snp.txt
    Untracked:  docs/figure/hmm.Rmd/
    Untracked:  docs/figure/pca_cell_cycle.Rmd/
    Untracked:  homework/fdr.aux
    Untracked:  homework/fdr.log
    Untracked:  tempETA_1_parBayesC.dat
    Untracked:  temp_ETA_1_parBayesC.dat
    Untracked:  temp_mu.dat
    Untracked:  temp_varE.dat
    Untracked:  tempmu.dat
    Untracked:  tempvarE.dat
Unstaged changes:
    Modified:   analysis/cell_cycle.Rmd
    Modified:   analysis/density_est_cell_cycle.Rmd
    Modified:   analysis/eb_vs_soft.Rmd
    Modified:   analysis/eight_schools.Rmd
    Modified:   analysis/svd_zip.Rmd
library(glmnet)
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-16
set.seed(1)
p = 100
n = 500
X = matrix(rnorm(n*p),ncol=p)
b = rnorm(p)
e = rnorm(n,0,sd=25)
Y = X %*% b + e
Now fit OLS, ridge regression and the lasso, and plot the ridge and lasso coefficient paths.
Y.ols = lm(Y~X) 
Y.ridge = glmnet(X,Y,alpha=0)
plot(Y.ridge)
Y.lasso = glmnet(X,Y,alpha=1)
plot(Y.lasso)
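The path plots can also be read off numerically. The sketch below re-simulates the data with the same settings as above so that it runs on its own, then tabulates, for each lambda on the lasso path, how many coefficients are nonzero. It assumes the `lambda` and `df` components of the fitted glmnet object; the name `path.lasso` is introduced here purely for illustration.

```r
library(glmnet)

# Re-create the simulated data used above (n = 500, p = 100, noise sd = 25)
set.seed(1)
p = 100
n = 500
X = matrix(rnorm(n*p),ncol=p)
b = rnorm(p)
e = rnorm(n,0,sd=25)
Y = X %*% b + e

Y.ridge = glmnet(X,Y,alpha=0)
Y.lasso = glmnet(X,Y,alpha=1)

# One row per lambda on the lasso path (largest lambda first):
# the penalty value and the number of nonzero coefficients at that value
path.lasso = data.frame(lambda = Y.lasso$lambda, nonzero = Y.lasso$df)
head(path.lasso)
tail(path.lasso)

# For comparison: number of nonzero ridge coefficients at the smallest lambda.
# Ridge shrinks coefficients towards zero but does not perform selection.
tail(Y.ridge$df, 1)
```

The lasso path starts with every coefficient at zero and adds variables as lambda decreases, which is the same progression shown in plot(Y.lasso).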
The library also allows you to run cross-validation easily:
cv.ridge = cv.glmnet(X,Y,alpha=0)
plot(cv.ridge)
cv.lasso = cv.glmnet(X,Y,alpha=1)
plot(cv.lasso)
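A cv.glmnet fit stores two commonly used choices of lambda: `lambda.min`, which minimizes the cross-validated error, and `lambda.1se`, the largest lambda whose error is within one standard error of that minimum. The sketch below, which again re-simulates the data so that it is self-contained, shows one way to compare the two for the lasso. The folds are assigned at random, so the exact values depend on the seed.

```r
library(glmnet)

# Re-create the simulated data used above (n = 500, p = 100, noise sd = 25)
set.seed(1)
p = 100
n = 500
X = matrix(rnorm(n*p),ncol=p)
b = rnorm(p)
e = rnorm(n,0,sd=25)
Y = X %*% b + e

cv.lasso = cv.glmnet(X,Y,alpha=1)

# The two candidate penalty values
cv.lasso$lambda.min
cv.lasso$lambda.1se

# Cross-validated mean squared error at each of them
cv.lasso$cvm[cv.lasso$lambda == cv.lasso$lambda.min]
cv.lasso$cvm[cv.lasso$lambda == cv.lasso$lambda.1se]

# Number of nonzero coefficients (dropping the intercept in row 1) at each choice
sum(as.matrix(coef(cv.lasso, s="lambda.min"))[-1,1] != 0)
sum(as.matrix(coef(cv.lasso, s="lambda.1se"))[-1,1] != 0)
```

Because `lambda.1se` is never smaller than `lambda.min`, it gives a fit that is at least as heavily penalized, at the cost of a slightly higher cross-validated error.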
Extract the coefficients at the value of lambda that minimizes the cross-validated error (lambda.min):
b.ridge = predict(Y.ridge, type="coefficients", s = cv.ridge$lambda.min)
b.lasso = predict(Y.lasso, type="coefficients", s = cv.lasso$lambda.min)
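b.ols = Y.ols$coefficients
Note that the fits include an intercept, which is not regularized. It is close to, but not exactly equal to, the mean of Y, because the columns of X have means near zero rather than exactly zero.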
length(b.lasso)
[1] 101
b.lasso[1]
[1] -1.38244
mean(Y)
[1] -1.233957
Compare the estimated coefficients with the truth:
btrue = c(0,b) # Here the 0 is the intercept (true value 0)
sum((btrue-0)^2) # This is the error if we just estimate 0 for everything and ignore the data. It is better than OLS!
[1] 85.30411
sum((btrue-Y.ols$coefficients)^2)
[1] 138.2715
sum((btrue-b.ridge)^2)
[1] 56.99797
sum((btrue-b.lasso)^2)
[1] 68.7212
sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
[1] glmnet_2.0-16 foreach_1.4.4 Matrix_1.2-14
loaded via a namespace (and not attached):
 [1] workflowr_1.0.1   Rcpp_0.12.16      codetools_0.2-15 
 [4] lattice_0.20-35   digest_0.6.15     rprojroot_1.3-2  
 [7] R.methodsS3_1.7.1 grid_3.3.2        backports_1.1.2  
[10] git2r_0.21.0      magrittr_1.5      evaluate_0.10.1  
[13] stringi_1.1.7     whisker_0.3-2     R.oo_1.22.0      
[16] R.utils_2.6.0     rmarkdown_1.9     iterators_1.0.9  
[19] tools_3.3.2       stringr_1.3.0     yaml_2.1.18      
[22] htmltools_0.3.6   knitr_1.20       
This reproducible R Markdown analysis was created with workflowr 1.0.1