Last updated: 2019-10-17
Checks: 7 passed, 0 failed
Knit directory: LPS_honeybees/
This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20190701)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit).
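For example, a single wflow_publish() call commits the R Markdown source and rebuilds the HTML (the file name and commit message here are illustrative):
wflow_publish("analysis/experiment_2.Rmd", "Describe the change here")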
workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: data/.DS_Store
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 367c1bd | lukeholman | 2019-09-30 | First big commit |
library(brms)
library(bayesplot)
library(tidyverse)
library(gridExtra)
library(kableExtra)
library(bayestestR)
library(tidybayes)
source("code/helper_functions.R")
exp2_treatments <- c("Ringer CHC", "LPS CHC")
durations <- read_csv("data/data_collection_sheets/experiment_durations.csv") %>%
filter(experiment == 2) %>% select(-experiment)
outcome_tally <- read_csv(file = "data/clean_data/experiment_2_outcome_tally.csv") %>%
mutate(outcome = factor(outcome, levels = c("Stayed inside the hive", "Left of own volition", "Forced out")),
treatment = factor(treatment, levels = exp2_treatments))
# Re-formatted version of the same data, where each row is an individual bee. We need this format to run the brms model.
data_for_categorical_model <- outcome_tally %>%
mutate(id = 1:n()) %>%
split(.$id) %>%
map(function(x){
if(x$n[1] == 0) return(NULL)
data.frame(
treatment = x$treatment[1],
hive = x$hive[1],
colour = x$colour[1],
outcome = rep(x$outcome[1], x$n))
}) %>% do.call("rbind", .) %>% as_tibble() %>%
arrange(hive, treatment) %>%
mutate(outcome_numeric = as.numeric(outcome),
hive = as.character(hive),
treatment = factor(treatment, levels = exp2_treatments)) %>%
left_join(durations, by = "hive")
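The same per-bee expansion can be written more compactly with tidyr’s uncount(), which repeats each row n times and drops rows where n is zero. A minimal sketch (the object name is illustrative; this version is not used below):
data_alt <- outcome_tally %>%
  uncount(n) %>%                # expand the tally into one row per individual bee
  arrange(hive, treatment) %>%
  mutate(outcome_numeric = as.numeric(outcome),
         hive = as.character(hive)) %>%
  left_join(durations, by = "hive")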
outcome_tally %>%
kable(digits = 3) %>% kable_styling() %>%
scroll_box(height = "380px")
hive | treatment | colour | outcome | n |
---|---|---|---|---|
Arts | LPS CHC | Dark Blue | Stayed inside the hive | 56 |
Arts | LPS CHC | Dark Blue | Left of own volition | 5 |
Arts | LPS CHC | Dark Blue | Forced out | 7 |
Arts | Ringer CHC | Pink | Stayed inside the hive | 64 |
Arts | Ringer CHC | Pink | Left of own volition | 5 |
Arts | Ringer CHC | Pink | Forced out | 1 |
Garden | LPS CHC | Dark Blue | Stayed inside the hive | 70 |
Garden | LPS CHC | Dark Blue | Left of own volition | 2 |
Garden | LPS CHC | Dark Blue | Forced out | 3 |
Garden | Ringer CHC | Pink | Stayed inside the hive | 73 |
Garden | Ringer CHC | Pink | Left of own volition | 2 |
Garden | Ringer CHC | Pink | Forced out | 0 |
Skylab | LPS CHC | Pink | Stayed inside the hive | 93 |
Skylab | LPS CHC | Pink | Left of own volition | 2 |
Skylab | LPS CHC | Pink | Forced out | 5 |
Skylab | Ringer CHC | Dark Blue | Stayed inside the hive | 97 |
Skylab | Ringer CHC | Dark Blue | Left of own volition | 1 |
Skylab | Ringer CHC | Dark Blue | Forced out | 1 |
Zoology | LPS CHC | Pink | Stayed inside the hive | 38 |
Zoology | LPS CHC | Pink | Left of own volition | 4 |
Zoology | LPS CHC | Pink | Forced out | 6 |
Zoology | Ringer CHC | Dark Blue | Stayed inside the hive | 42 |
Zoology | Ringer CHC | Dark Blue | Left of own volition | 2 |
Zoology | Ringer CHC | Dark Blue | Forced out | 6 |
all_hives <- outcome_tally %>%
group_by(treatment, outcome) %>%
summarise(n = sum(n)) %>%
ungroup() %>% mutate(hive = "All hives")
outcome_tally %>%
group_by(treatment, hive, outcome) %>%
summarise(n = sum(n)) %>%
ungroup() %>%
mutate(hive = paste(hive, "hive")) %>%
full_join(all_hives, by = c("treatment", "hive", "outcome", "n")) %>%
mutate(prop = n / sum(n)) %>%
ggplot(aes(outcome, n, fill = treatment)) +
geom_bar(stat = "identity", position = "dodge") +
facet_wrap( ~ hive, nrow = 1, scales = "free_y") +
scale_fill_brewer(palette = "Set2", name = "Treatment") +
xlab("Outcome") + ylab("Number of bees") +
theme(axis.text.x = element_text(angle = 45, hjust=1))
We fit a multinomial logistic model with three possible outcomes describing what happened to each bee introduced to the hive: it stayed inside, left of its own volition, or was forced out by the other workers. To assess the effects of our predictor variables, we compare five models with different fixed effects, ranking them by posterior model probability.
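In brms’s categorical family, the first response category acts as the reference, so the linear predictors labelled mu2 and mu3 in the output below model the log-odds of ‘Left of own volition’ and ‘Forced out’ relative to ‘Stayed inside the hive’. A quick, purely illustrative check of the numeric coding:
data_for_categorical_model %>% distinct(outcome, outcome_numeric) %>% arrange(outcome_numeric)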
if(!file.exists("output/model_probabilities_expt2.rds")){
my_prior <- c(set_prior("normal(0, 2)", class = "b"))
run_brms_categorical <- function(formula){
formula <- gsub(" ", "", paste("outcome_numeric ~", formula))
brm(formula,
data = data_for_categorical_model,
prior = my_prior,
family = "categorical",
save_all_pars = TRUE,
control = list(adapt_delta = 0.999, max_treedepth = 15),
chains = 4, cores = 4, iter = 40000, seed = 1)
}
formula_list <- c(
"treatment:hive + treatment + hive + observation_time_minutes",
"treatment + hive + observation_time_minutes",
"treatment + observation_time_minutes",
"hive + observation_time_minutes",
"observation_time_minutes")
# run all 5 models, sequentially
model_list_expt2 <- lapply(formula_list, run_brms_categorical)
# Estimate the log marginal likelihood by bridge sampling, and use it to determine the posterior model probabilities
# See: vignette("bridgesampling_example_stan")
model_probabilities <- post_prob(model_list_expt2[[1]],
model_list_expt2[[2]],
model_list_expt2[[3]],
model_list_expt2[[4]],
model_list_expt2[[5]])
top_index <- which.max(unname(model_probabilities))
model_probabilities <- tibble(
`Fixed effects` = formula_list,
`Posterior model probability` = unname(model_probabilities)) %>%
arrange(-`Posterior model probability`)
model_probabilities <- model_probabilities %>%
mutate(`% of top model probability` = round(100 * `Posterior model probability` / `Posterior model probability`[1], 1),
`% of top model probability` = c("\\-", `% of top model probability`[2:length(`% of top model probability`)]))
# saveRDS(model_list_expt2[[1]], "output/full_model_expt2.rds")
saveRDS(model_list_expt2[[top_index]], "output/top_model_expt2.rds")
saveRDS(model_probabilities, "output/model_probabilities_expt2.rds")
}
top_model <- readRDS("output/top_model_expt2.rds")
model_probabilities <- readRDS("output/model_probabilities_expt2.rds")
model_probabilities %>%
kable(digits = 3) %>% kable_styling()
Fixed effects | Posterior model probability | % of top model probability |
---|---|---|
treatment + observation_time_minutes | 0.566 | - |
observation_time_minutes | 0.354 | 62.6 |
treatment + hive + observation_time_minutes | 0.048 | 8.4 |
hive + observation_time_minutes | 0.027 | 4.8 |
treatment:hive + treatment + hive + observation_time_minutes | 0.006 | 1 |
The model output suggests that the third possible outcome (i.e. being forced out of the hive; denoted mu3 in the table) was more common in the ‘LPS CHC’ treatment group. See Hypothesis testing and effect sizes below for more useful statistical results.
get_fixed_effects_with_p_values(top_model) %>%
kable(digits = 3) %>% kable_styling()
Parameter | Estimate | Est.Error | lower_95_CI | upper_95_CI | Rhat | Bulk_ESS | Tail_ESS | p | |
---|---|---|---|---|---|---|---|---|---|
mu2_Intercept | -6.396 | 1.191 | -8.931 | -4.262 | 1 | 44785 | 41916 | 0.000 | * |
mu3_Intercept | -6.002 | 0.993 | -8.059 | -4.176 | 1 | 46424 | 44048 | 0.000 | * |
mu2_treatmentLPSCHC | 0.362 | 0.431 | -0.474 | 1.223 | 1 | 60598 | 50515 | 0.199 | |
mu2_observation_time_minutes | 0.030 | 0.011 | 0.011 | 0.052 | 1 | 49351 | 45506 | 0.001 | * |
mu3_treatmentLPSCHC | 1.054 | 0.422 | 0.261 | 1.914 | 1 | 52801 | 48727 | 0.004 | * |
mu3_observation_time_minutes | 0.025 | 0.009 | 0.008 | 0.043 | 1 | 53563 | 48343 | 0.001 | * |
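The key contrast can also be examined directly with brms’s hypothesis() function; a minimal sketch using the fitted top model:
# One-sided test that being forced out is more likely under the LPS CHC treatment
hypothesis(top_model, "mu3_treatmentLPSCHC > 0")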
new <- expand.grid(treatment = levels(data_for_categorical_model$treatment),
observation_time_minutes = 120)
preds <- fitted(top_model, newdata = new, summary = FALSE)
dimnames(preds) <- list(NULL, new[,1], NULL)
plotting_data <- rbind(
as.data.frame(preds[,, 1]) %>% mutate(outcome = "Stayed inside the hive", posterior_sample = 1:n()),
as.data.frame(preds[,, 2]) %>% mutate(outcome = "Left voluntarily", posterior_sample = 1:n()),
as.data.frame(preds[,, 3]) %>% mutate(outcome = "Forced out", posterior_sample = 1:n())) %>%
gather(treatment, prop, `Ringer CHC`, `LPS CHC`) %>%
mutate(treatment = factor(treatment, c("Ringer CHC", "LPS CHC")),
outcome = factor(outcome, c("Stayed inside the hive", "Left voluntarily", "Forced out"))) %>%
as_tibble() %>% arrange(treatment, outcome)
ggplot(plotting_data, aes(treatment, 100 * prop)) +
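# geom_eye() from tidybayes shows the posterior density plus a point estimate and interval for each treatment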
geom_eye(aes(fill = treatment)) +
scale_fill_brewer(palette = "Pastel1", direction = -1) +
facet_wrap( ~ outcome, scales = "free_y") +
ylab("% bees leaving the hive this way") +
xlab("Mode of exit") +
theme_bw() +
theme(legend.position = "none")
This section calculates the posterior difference in treatment group means in order to perform null hypothesis testing and to estimate effect sizes along with their 95% credible intervals. In all cases, the effect size expresses the effect of the “LPS-treated bee CHCs” treatment relative to the “Ringer-treated bee CHCs” control.
my_summary <- function(df, columns, outcome) {
lapply(columns, function(x){
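# p_direction() (here returning a percentage) gives the share of posterior samples with the dominant sign;
# (100 - pd) / 100 converts this to the complementary proportion, reported as 'p' in the tables below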
p <- df %>% pull(!! x) %>% bayestestR::p_direction() %>% as.numeric()
p <- (100 - p) / 100
df %>% pull(!! x) %>% posterior_summary() %>% as_tibble() %>%
mutate(p=p, Outcome = outcome, Metric = x) %>% select(Outcome, Metric, everything())
}) %>% do.call("rbind", .)
}
stats_table <- rbind(
plotting_data %>%
filter(outcome == "Stayed inside the hive") %>%
mutate(prop = 1 - prop) %>%
spread(treatment, prop) %>%
mutate(`Absolute difference in % bees exiting the hive` = 100 * (`LPS CHC` - `Ringer CHC`),
`Log10 odds ratio` = log10(`LPS CHC` / `Ringer CHC`)) %>%
my_summary(c("Absolute difference in % bees exiting the hive",
"Log10 odds ratio"),
outcome = "Stayed inside the hive") %>%
mutate(p = c(" ", format(round(p[2], 4), nsmall = 4))),
plotting_data %>%
filter(outcome == "Left voluntarily") %>%
spread(treatment, prop) %>%
mutate(`Absolute difference in % bees leaving voluntarily` = 100 * (`LPS CHC` - `Ringer CHC`),
`Log10 odds ratio` = log10(`LPS CHC` / `Ringer CHC`)) %>%
my_summary(c("Absolute difference in % bees leaving voluntarily",
"Log10 odds ratio"),
outcome = "Left voluntarily") %>%
mutate(p = c(" ", format(round(p[2], 4), nsmall = 4))),
plotting_data %>%
filter(outcome == "Forced out") %>%
spread(treatment, prop) %>%
mutate(`Absolute difference in % bees leaving voluntarily` = 100 * (`LPS CHC` - `Ringer CHC`),
`Log10 odds ratio` = log10(`LPS CHC` / `Ringer CHC`)) %>%
my_summary(c("Absolute difference in % bees leaving voluntarily",
"Log10 odds ratio"),
outcome = "Forced out") %>%
mutate(p = c(" ", format(round(p[2], 4), nsmall = 4)))
)
stats_table[c(2,4,6), 1] <- " "
stats_table %>%
mutate(` ` = ifelse(p < 0.05, "\\*", ""),
` ` = replace(` `, p < 0.01, "**"),
` ` = replace(` `, p < 0.001, "***"),
` ` = replace(` `, p == " ", "")) %>%
kable(digits = 3) %>% kable_styling() %>%
row_spec(c(0,2,4,6), extra_css = "border-bottom: solid;")
Outcome | Metric | Estimate | Est.Error | Q2.5 | Q97.5 | p | |
---|---|---|---|---|---|---|---|
Stayed inside the hive | Absolute difference in % bees exiting the hive | 8.437 | 3.649 | 1.456 | 15.812 | | |
| Log10 odds ratio | 0.265 | 0.115 | 0.045 | 0.496 | 0.0092 | ** |
Left voluntarily | Absolute difference in % bees leaving voluntarily | 1.685 | 2.639 | -3.397 | 7.087 | | |
| Log10 odds ratio | 0.114 | 0.175 | -0.226 | 0.465 | 0.2559 | |
Forced out | Absolute difference in % bees leaving voluntarily | 6.751 | 2.831 | 1.574 | 12.657 | | |
| Log10 odds ratio | 0.415 | 0.172 | 0.094 | 0.768 | 0.0054 | ** |
sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] tidybayes_1.1.0 bayestestR_0.2.2 kableExtra_0.9.0 gridExtra_2.3
[5] forcats_0.4.0 stringr_1.4.0 dplyr_0.8.3 purrr_0.3.2
[9] readr_1.1.1 tidyr_0.8.2 tibble_2.1.3 ggplot2_3.1.0
[13] tidyverse_1.2.1 bayesplot_1.6.0 brms_2.10.0 Rcpp_1.0.2
loaded via a namespace (and not attached):
[1] colorspace_1.3-2 ggridges_0.5.0
[3] rsconnect_0.8.8 rprojroot_1.3-2
[5] ggstance_0.3.1 markdown_1.0
[7] base64enc_0.1-3 fs_1.3.1
[9] rstudioapi_0.10 rstan_2.19.2
[11] svUnit_0.7-12 DT_0.4
[13] mvtnorm_1.0-11 lubridate_1.7.4
[15] xml2_1.2.0 bridgesampling_0.4-0
[17] knitr_1.23 shinythemes_1.1.1
[19] jsonlite_1.6 workflowr_1.4.0
[21] broom_0.5.0 shiny_1.3.2
[23] compiler_3.5.1 httr_1.4.0
[25] backports_1.1.2 assertthat_0.2.1
[27] Matrix_1.2-14 lazyeval_0.2.2
[29] cli_1.1.0 later_0.8.0
[31] htmltools_0.3.6 prettyunits_1.0.2
[33] tools_3.5.1 igraph_1.2.1
[35] coda_0.19-2 gtable_0.2.0
[37] glue_1.3.1.9000 reshape2_1.4.3
[39] cellranger_1.1.0 nlme_3.1-137
[41] crosstalk_1.0.0 insight_0.3.0
[43] xfun_0.8 ps_1.3.0
[45] rvest_0.3.2 mime_0.7
[47] miniUI_0.1.1.1 gtools_3.8.1
[49] zoo_1.8-3 scales_1.0.0
[51] colourpicker_1.0 hms_0.4.2
[53] promises_1.0.1 Brobdingnag_1.2-5
[55] parallel_3.5.1 inline_0.3.15
[57] shinystan_2.5.0 RColorBrewer_1.1-2
[59] yaml_2.2.0 loo_2.1.0
[61] StanHeaders_2.19.0 stringi_1.4.3
[63] highr_0.8 dygraphs_1.1.1.6
[65] pkgbuild_1.0.2 rlang_0.4.0
[67] pkgconfig_2.0.2 matrixStats_0.54.0
[69] evaluate_0.14 lattice_0.20-35
[71] rstantools_2.0.0 htmlwidgets_1.3
[73] labeling_0.3 processx_3.2.1
[75] tidyselect_0.2.5 plyr_1.8.4
[77] magrittr_1.5 R6_2.4.0
[79] pillar_1.3.1.9000 haven_1.1.2
[81] whisker_0.3-2 withr_2.1.2
[83] xts_0.11-0 abind_1.4-5
[85] modelr_0.1.2 crayon_1.3.4
[87] arrayhelpers_1.0-20160527 rmarkdown_1.13
[89] grid_3.5.1 readxl_1.1.0
[91] callr_2.0.4 git2r_0.23.0
[93] threejs_0.3.1 digest_0.6.20
[95] xtable_1.8-4 httpuv_1.5.1
[97] stats4_3.5.1 munsell_0.5.0
[99] viridisLite_0.3.0 shinyjs_1.0