Last updated: 2019-05-07

Checks: 6 passed, 0 failed

Knit directory: mcfa-fit/

This reproducible R Markdown analysis was created with workflowr (version 1.3.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

Seed: set.seed(20190507)

The command set.seed(20190507) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Repository version: 569bace

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/

Untracked files:
    Untracked:  analysis/anova_roc_hit.Rmd
    Untracked:  analysis/first-analysis.Rmd
    Untracked:  code/r_functions.R
    Untracked:  data/compiled_fit_results.txt
    Untracked:  docs/figure/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File	Version	Author	Date	Message
Rmd	569bace	noah-padgett	2019-05-07	getting files to show

Purpose of this file:

Create simple descriptive statistics across conditions
Generate tables of those statistics
Create simple boxplots for trends
Create publication boxplots

The output is numerous .eps files (figures) and LaTeX code for tables.

Packages and Set-Up

## Chunk options
knitr::opts_chunk$set(out.width = "225%")

# setwd('C:/Users/noahp/Dropbox/MCFA Thesis/Code Results')

## General packages
library(tidyverse)

-- Attaching packages ---------------------------------------------- tidyverse 1.2.1 --

v ggplot2 3.1.0       v purrr   0.2.5  
v tibble  2.0.1       v dplyr   0.8.0.1
v tidyr   0.8.2       v stringr 1.3.1  
v readr   1.3.1       v forcats 0.3.0

Warning: package 'dplyr' was built under R version 3.5.3

-- Conflicts ------------------------------------------------- tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

# Formatting and Tables
library(kableExtra)


Attaching package: 'kableExtra'

The following object is masked from 'package:dplyr':

    group_rows

library(xtable)
# For plotting
library(ggplot2)
theme_set(theme_bw())
# Data manipulating
library(dplyr)
# ROC Analysis
library(pROC)

Type 'citation("pROC")' for a citation.


Attaching package: 'pROC'

The following objects are masked from 'package:stats':

    cov, smooth, var

## One global parameter for printing figures (FALSE = do not save figures)
save.fig <- FALSE

Data Management

sim_results <- as_tibble(read.table("data/compiled_fit_results.txt", header = T, 
    sep = "\t"))

## Next, turn condition into a factor for plotting
sim_results$Condition <- as.factor(sim_results$Condition)

## Next, since TLI is non-normed, it can fall outside [0, 1]. Truncate any
## value greater than 1 to 1 and any value less than 0 to 0.
sim_results$TLI <- ifelse(sim_results$TLI > 1, 1, sim_results$TLI)
sim_results$TLI <- ifelse(sim_results$TLI < 0, 0, sim_results$TLI)
## Next, summarize the results of the chi-square test of model fit. This is
## done simply by comparing the p-value to alpha (0.05) and indicating
## whether the model was flagged as fitting or not.  Note: if p < 0.05 then
## this variable is flagged as 0, and 1 otherwise
sim_results$Chi2_pvalue_decision <- ifelse(sim_results$chisqu_pvalue < 0.05, 
    0, 1)
# 0 = rejected that these data fit this model 1 = failed to reject that
# these data fit this model
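
The two recoding steps above can be tried on a small toy data frame. This is only a minimal sketch and not part of the original analysis: the data frame `toy` and its values are invented for illustration, and `pmin()`/`pmax()` are used here as an alternative that should give the same result as the two `ifelse()` calls.

```r
# Hypothetical toy data (values invented for illustration only)
toy <- data.frame(
  TLI = c(-0.20, 0.85, 1.07),
  chisqu_pvalue = c(0.001, 0.05, 0.40)
)

# Truncate TLI to the interval [0, 1]
toy$TLI <- pmin(pmax(toy$TLI, 0), 1)

# 0 = rejected that the data fit the model (p < 0.05)
# 1 = failed to reject (p >= 0.05)
toy$Chi2_pvalue_decision <- as.numeric(toy$chisqu_pvalue >= 0.05)

print(toy)
```

Note that a p-value of exactly 0.05 is coded as 1 under both this sketch and the original `ifelse()` rule, because only values strictly below 0.05 count as a rejection.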

Adding Labels to Conditions

Currently, each condition is just an opaque ID: the number alone does not tell us which factor levels it represents. So, the first thing is to create meaningful labels for us to use. Remember, the 72 conditions for this study were formed by fully crossing (3 x 4 x 3 x 2 = 72):

Level-1 sample size (5, 10, 30)
Level-2 sample size (30, 50, 100, 200)
Observed indicator ICC (.1, .3, .5)
Latent variable ICC (.1, .5)

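## Design factors and simulation dimensions
ss_l1 <- c(5, 10, 30) ## level-1 sample size: blocks of 6 conditions each
ss_l2 <- c(30, 50, 100, 200) ## level-2 sample size: blocks of 18 conditions each
icc_ov <- c(.1, .3, .5) ## observed indicator ICC: blocks of 2 conditions each
icc_lv <- c(.1, .5) ## latent variable ICC: alternates every condition
nCon <- 72 # number of conditions
nRep <- 500 # number of replications per condition
nMod <- 12 ## number of estimated models per condition (4 models x 3 estimators)
## Total number of rows: nCon * nRep * nMod = 432,000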
ss_l2 <- c(rep(ss_l2[1], 18*nRep*nMod), rep(ss_l2[2], 18*nRep*nMod), 
           rep(ss_l2[3], 18*nRep*nMod), rep(ss_l2[4], 18*nRep*nMod))
ss_l1 <- rep(c(rep(ss_l1[1],6*nRep*nMod), rep(ss_l1[2],6*nRep*nMod), rep(ss_l1[3],6*nRep*nMod)), 4)
icc_ov <- rep(c(rep(icc_ov[1], 2*nRep*nMod), rep(icc_ov[2], 2*nRep*nMod), rep(icc_ov[3], 2*nRep*nMod)), 12)
icc_lv <- rep(c(rep(icc_lv[1], nRep*nMod), rep(icc_lv[2], nRep*nMod)), 36)
## Force these vectors to be column vectors
ss_l1 <- matrix(ss_l1, ncol=1)
ss_l2 <- matrix(ss_l2, ncol=1)
icc_ov <- matrix(icc_ov, ncol=1)
icc_lv <- matrix(icc_lv, ncol=1)
## Add the labels to the results data frame
sim_results <- sim_results[order(sim_results$Condition),]
sim_results <- cbind(sim_results, ss_l1, ss_l2, icc_ov, icc_lv)

## Force the conditions to be factors
sim_results$ss_l1 <- as.factor(sim_results$ss_l1)
sim_results$ss_l2 <- as.factor(sim_results$ss_l2)
sim_results$icc_ov <- as.factor(sim_results$icc_ov)
sim_results$icc_lv <- as.factor(sim_results$icc_lv)
sim_results$Model <- factor(sim_results$Model, levels = c('C','M1','M2','M12'), ordered = T)
## Set up iterators for remainder of script
mods <- c('C', 'M1', 'M2', 'M12')
ests <- c('MLR', 'ULSMV', 'WLSMV')
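
The block structure used above to build the labels (level-2 sample size changing slowest, latent variable ICC changing fastest) can be checked against a small lookup table. The following is a sketch only: it assumes the conditions are numbered 1 to 72 in exactly that order, which is what the `rep()` calls above imply, and the names `design` and `*_levels` are introduced here purely for illustration.

```r
# Factor levels of the simulation design, as listed above
ss_l1_levels  <- c(5, 10, 30)
ss_l2_levels  <- c(30, 50, 100, 200)
icc_ov_levels <- c(.1, .3, .5)
icc_lv_levels <- c(.1, .5)

# expand.grid() varies its first argument fastest, so the factors are
# listed from fastest-changing (icc_lv) to slowest-changing (ss_l2)
design <- expand.grid(
  icc_lv = icc_lv_levels,
  icc_ov = icc_ov_levels,
  ss_l1  = ss_l1_levels,
  ss_l2  = ss_l2_levels
)

# Assumption: conditions are numbered 1..72 in this order
design$Condition <- seq_len(nrow(design))

nrow(design)             # number of conditions
nrow(design) * 500 * 12  # rows expected with 500 replications and 12 models

# Spot-check the first condition of several blocks
design[c(1, 2, 3, 7, 19), ]
```

With a table like this, the labels could in principle be attached by joining on `Condition` (for example with `merge()`), which would not depend on the row order of the results, provided `Condition` has a compatible type in both tables.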

Descriptive Statistics

For the descriptive statistics, I will use dplyr to compute the summaries and store them in matrices that are easy to print. Each summary is printed as an HTML table and as an xtable (LaTeX-ready) table.

Convergence Summary

Convergence will be broken out by model (C, M1, M2, M12) and estimator (MLR, ULSMV, WLSMV). So, there will be 12 piecemeal tables and then one larger table.

## first table summary table
c <- sim_results %>% group_by(Model, Estimator) %>% summarise(Converge = mean(Converge))
# Next make the columns the estimator factor
c <- cbind(c[c$Estimator == "MLR", "Converge"], c[c$Estimator == "ULSMV", "Converge"], 
    c[c$Estimator == "WLSMV", "Converge"])
colnames(c) <- c("MLR", "ULSMV", "WLSMV")
rownames(c) <- c("C", "M1", "M2", "M12")
## Print results in a nice looking table in HTML
kable(c, format = "html") %>% kable_styling(full_width = T)

	MLR	ULSMV	WLSMV
C	0.9998056	0.9989722	0.9997778
M1	0.9819444	0.9737500	0.9645000
M2	0.9998056	0.9882500	0.9997500
M12	0.9997778	0.9853056	0.9919444

## Print out in tex
print(xtable(c, digits = 3), booktabs = T, include.rownames = T)

% latex table generated in R 3.5.2 by xtable 1.8-3 package
% Tue May 07 18:12:14 2019
\begin{table}[ht]
\centering
\begin{tabular}{rrrr}
  \toprule
 & MLR & ULSMV & WLSMV \\ 
  \midrule
C & 1.000 & 0.999 & 1.000 \\ 
  M1 & 0.982 & 0.974 & 0.965 \\ 
  M2 & 1.000 & 0.988 & 1.000 \\ 
  M12 & 1.000 & 0.985 & 0.992 \\ 
   \bottomrule
\end{tabular}
\end{table}
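
The Model-by-Estimator layout of the convergence table above can also be obtained in a single step with base R. This is a hedged sketch on invented data, not the code used for the reported results: the data frame `toy` is hypothetical, and it assumes `Converge` is a 0/1 indicator so that its mean is the proportion of converged replications.

```r
# Hypothetical toy results (values invented for illustration only)
toy <- data.frame(
  Model     = rep(c("C", "M1"), each = 6),
  Estimator = rep(rep(c("MLR", "ULSMV", "WLSMV"), each = 2), times = 2),
  Converge  = c(1, 1, 1, 0, 1, 1,
                1, 0, 0, 0, 1, 1)
)

# Mean of the 0/1 convergence indicator in each Model x Estimator cell;
# rows are models and columns are estimators
conv_tab <- with(toy, tapply(Converge, list(Model, Estimator), mean))
print(conv_tab)
```

The result is a plain matrix with models as row names and estimators as column names, so it can be passed to `kable()` or `xtable()` without setting the dimension names by hand.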

# ## Next, start the more descriptive tables across conditions
# ## loop around these iterators
# for(M in mods){
#   for(E in ests){
#     ### subset to the model (M) and estimator (E)
#     #M <- 'C'
#     #E <- 'MLR'
#     cat('\n\n ===============================\n')
#     cat('\nModel:\t', M)
#     cat('\nEstimator:\t', E, '\n')
#     sub_dat <- sim_results[ sim_results$Model == M & sim_results$Estimator == E,]
#     c <- sub_dat %>%
#       group_by(ss_l1, ss_l2, icc_ov, icc_lv) %>%
#       summarize(C = mean(Converge))
#     ## Now, print out the unformatted table
#     kable(c, format='html') %>%
#       kable_styling(full_width = T)
#     print(c)
#     ## now, make the columns the ICC conditions
#     c <- cbind(
#       c[ c$icc_ov == 0.1 & c$icc_lv == .1, c('ss_l1','ss_l2','C')],
#       c[ c$icc_ov == 0.3 & c$icc_lv == .1, 'C'],
#       c[ c$icc_ov == 0.5 & c$icc_lv == .1, 'C'],
#       c[ c$icc_ov == 0.1 & c$icc_lv == .5, 'C'],
#       c[ c$icc_ov == 0.3 & c$icc_lv == .5, 'C'],
#       c[ c$icc_ov == 0.5 & c$icc_lv == .5, 'C']
#     ) ## End cbind()
#     colnames(c) <- c('ss_l1','ss_l2', '.1' ,'.3', '.5', '.1' ,'.3', '.5')
#     ## Print results in a nice looking table
#     kable(c, format='html') %>%
#       kable_styling(full_width = T) %>%
#       add_header_above(c(' ' = 2, 'LV ICC = .10' = 3, 'LV ICC = .50'=3))
#     print(xtable(c, digits = 3), booktabs = T, include.rownames = F)
#   }
# }
# ## For a longtable instead
# c <- sim_results %>%
#   group_by(Model, Estimator, ss_l1, ss_l2, icc_ov, icc_lv) %>%
#   summarize(C = mean(Converge))
# ## Now, print out the unformatted table
# kable(c, format='html') %>%
#   kable_styling(full_width = T)
# ## now, make the columns the ICC conditions
# c <- cbind(
#   c[ c$icc_ov == 0.1 & c$icc_lv == .1, c('Model', 'Estimator', 'ss_l1','ss_l2','C')],
#   c[ c$icc_ov == 0.3 & c$icc_lv == .1, 'C'],
#   c[ c$icc_ov == 0.5 & c$icc_lv == .1, 'C'],
#   c[ c$icc_ov == 0.1 & c$icc_lv == .5, 'C'],
#   c[ c$icc_ov == 0.3 & c$icc_lv == .5, 'C'],
#   c[ c$icc_ov == 0.5 & c$icc_lv == .5, 'C']
# ) ## End cbind()
# colnames(c) <- c('Model', 'Estimator', 'ss_l1','ss_l2', '.1' ,'.3', '.5', '.1' ,'.3', '.5')
# ## Print results in a nice looking table
# kable(c, format='html') %>%
#   kable_styling(full_width = T) %>%
#   add_header_above(c(' ' = 4, 'LV ICC = .10' = 3, 'LV ICC = .50'=3))
# print(xtable(c, digits = 3), booktabs = T, include.rownames = F)

sessionInfo()

R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] pROC_1.13.0      xtable_1.8-3     kableExtra_1.0.1 forcats_0.3.0   
 [5] stringr_1.3.1    dplyr_0.8.0.1    purrr_0.2.5      readr_1.3.1     
 [9] tidyr_0.8.2      tibble_2.0.1     ggplot2_3.1.0    tidyverse_1.2.1 

loaded via a namespace (and not attached):
 [1] tidyselect_0.2.5  xfun_0.4          haven_2.0.0      
 [4] lattice_0.20-38   colorspace_1.4-0  generics_0.0.2   
 [7] htmltools_0.3.6   viridisLite_0.3.0 yaml_2.2.0       
[10] rlang_0.3.1       pillar_1.3.1      glue_1.3.0       
[13] withr_2.1.2       modelr_0.1.2      readxl_1.2.0     
[16] plyr_1.8.4        munsell_0.5.0     gtable_0.2.0     
[19] workflowr_1.3.0   cellranger_1.1.0  rvest_0.3.2      
[22] evaluate_0.12     knitr_1.21        highr_0.7        
[25] broom_0.5.1       Rcpp_1.0.0        scales_1.0.0     
[28] backports_1.1.3   formatR_1.5       webshot_0.5.1    
[31] jsonlite_1.6      fs_1.2.6          hms_0.4.2        
[34] digest_0.6.18     stringi_1.2.4     grid_3.5.2       
[37] rprojroot_1.3-2   cli_1.0.1         tools_3.5.2      
[40] magrittr_1.5      lazyeval_0.2.1    crayon_1.3.4     
[43] whisker_0.3-2     pkgconfig_2.0.2   xml2_1.2.0       
[46] lubridate_1.7.4   assertthat_0.2.0  rmarkdown_1.11   
[49] httr_1.4.0        rstudioapi_0.9.0  R6_2.3.0         
[52] nlme_3.1-137      git2r_0.24.0      compiler_3.5.2

Descriptive Statistics, Tables and Boxplots

noah-padgett

2019-05-07
