Last updated: 2021-05-12

Checks: 7 0

Knit directory: thesis/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20210321) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 7fd4ff2. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rproj.user/
    Ignored:    data/DB/
    Ignored:    data/raster/
    Ignored:    data/raw/
    Ignored:    data/vector/
    Ignored:    docker_command.txt
    Ignored:    output/acc/
    Ignored:    output/bayes/
    Ignored:    output/ffs/
    Ignored:    output/models/
    Ignored:    output/plots/
    Ignored:    output/test-results/
    Ignored:    renv/library/
    Ignored:    renv/staging/
    Ignored:    report/presentation/

Untracked files:
    Untracked:  analysis/assets/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/thesis-appendix.Rmd) and HTML (docs/thesis-appendix.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd de8ce5a Darius Görgen 2021-04-05 add content
html de8ce5a Darius Görgen 2021-04-05 add content

1 Appendix

var_iter = c("**cb**", "**sb**", "**ns**", "**os**")
unit_iter = c("*adm*", "*bas*")
combi = expand.grid(unit_iter, var_iter)
names_unit = combi$Var1
names_var = combi$Var2

Density Plots of Predicted Conflict Probability

files <- list.files("../output/acc/", pattern = ".rds", full.names = T)

data <- lapply(files, function(x){
  out = vars_detect(x)
  tmp = readRDS(x)
  
  thres_df = readRDS(
    paste0(
      "../output/test-results/test-results-", 
      paste(out$type, out$unit, out$var, sep = "-"),
      ".rds")
  )
  if(out$var == "all"){
    for(i in 1:length(tmp)){
      tmp[[i]]$df$var = "cb"
      out$var = "cb"
    }
  }
  thres_df %>%
    filter(name == "f2", month == 0) %>% 
    filter(score == max(score, na.rm = T)) %>%
    pull(rep) -> rep
  
  thres_df %>%
    filter(name == "threshold", rep == !!rep) %>%
    pull(score) -> threshold
  
  list(unit = out$unit, type = out$type, var = out$var, thres = threshold, df = tmp[[rep]]$df)
}) 

units <- unlist(lapply(data, `[[`, 1))
types <- unlist(lapply(data, `[[`, 2))
vars <- unlist(lapply(data, `[[`, 3))
dfs <-  lapply(data, `[[`, 4)


var_iter = c("cb", "sb", "ns", "os")
unit_iter = c("states", "basins")
type_iter = c("baseline", "structural", "environmental")


for(var in var_iter){
  tmp1 = data[vars == var]
  for(unit in unit_iter){
    unit_lab = if_else(unit == "states", "adm", "bas")
    units =  unlist(lapply(tmp1, `[[`, 1))
    tmp2 = tmp1[units == unit]
    types =  unlist(lapply(tmp2, `[[`, 2))
    thresholds = unlist(lapply(tmp2, `[[`, 4))
    thresholds = data.frame(type = factor(types, 
                                          levels = c("baseline", "structural", "environmental"),
                                          labels = c("CH", "SV", "EV")), 
                            thresholds = thresholds)
    
    df = do.call(rbind, lapply(tmp2, `[[`, 5))
    opts_current$set(label = paste("dens", unit,var, sep = "-")) # setting label 
    df %>%
      mutate(type = factor(type, 
                           levels = c("baseline", "structural", "environmental"),
                           labels = c("CH", "SV", "EV")),
             unit = factor(unit,
                           levels = c("states", "basins"),
                           labels = c("adm", "bas")),
             obsv = factor(obsv,
                           levels = c("peace", "conflict"),
                           labels = c("Peace", "Conflict"))) %>%
      ggplot() +
      geom_density(aes(x=pred, fill = obsv), alpha = .4) +
      geom_vline(data = thresholds, aes(xintercept=thresholds), linetype=2) +
      xlim(0,1) +
      labs(fill = "", x = "Predicted Probability", y = "Density", 
           subtitle = paste0("Unit: *", unit_lab, "*      Variable: **",var,"**")) +
      scale_fill_brewer(palette="Dark2") +
      facet_wrap(~type, scales = "free_y")+
      my_theme +
      theme(plot.subtitle =  element_markdown(size = 10)) -> plt_out
    print(plt_out)
  }
}
Predicted probability of **cb** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.1: Predicted probability of cb conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **cb** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.2: Predicted probability of cb conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **sb** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.3: Predicted probability of sb conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **sb** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.4: Predicted probability of sb conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **ns** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.5: Predicted probability of ns conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **ns** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.6: Predicted probability of ns conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **os** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.7: Predicted probability of os conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
Predicted probability of **os** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.8: Predicted probability of os conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version Author Date
de8ce5a Darius Görgen 2021-04-05

Additional Global Performance Metrics

#files <- list.files("../output/acc/", pattern = ".rds", full.names = T)
files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
# files <- files[-grep("test-results-regression-basins", files)]
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp %>%
    filter(month == 0) %>%
    group_by(name) %>%
    summarise(mean = mean(score, na.rm = T),
              median = median(score, na.rm = TRUE),
              sd = sd(score, na.rm = T),
              type = out$type,
              unit = out$unit,
              var = out$var) %>%
    ungroup()
}) 
acc_data = do.call(rbind, acc_data)

acc_data %<>% 
  mutate(var = factor(var, 
                      levels = c("all", "sb", "ns", "os"),
                      labels = c("cb", "sb", "ns", "os")),
         type = factor(type, 
                       labels = c("LR", "CH", "SV", "EV"),
                       levels = c("regression", "baseline", "structural", "environmental")),
         unit = if_else(unit == "basins", "bas", "adm"),
         unit = factor(unit, levels = c("adm", "bas"), labels = c("adm", "bas")))

pd <- position_dodge(0.1)
score = "median"
labs.unit = c("_adm_", "_bas_")
names(labs.unit) = c("adm", "bas")
labs.name = c("", "")
names(labs.name) = c("AUC", "AUC")

plt_auc = acc_data %>%
  dplyr::filter(name == "auc") %>%
  dplyr::select(type, unit, var, name, score = !!score, sd) %>%
  ggplot(aes(color=var, x=type, shape=var)) +
  geom_point(aes(y=score), position = position_dodge2(width = .5)) +
  geom_errorbar(aes(ymin=score-sd, ymax=score+sd, x=type), width=.5, position = position_dodge2(width = .5)) +
  scale_color_brewer(palette = "Dark2") +
  labs(x="", y = "AUC", color = "Conflict Class", shape = "Conflict Class") +
  ylim(0,1) +
  facet_wrap(~unit+name, labeller = labeller(unit = labs.unit, name = labs.name)) +
  guides(color=guide_legend(ncol=4)) +
  my_theme 

plt_spec = acc_data %>%
  filter(name == "specificity") %>%
  dplyr::select(type, unit, var, name, score = !!score, sd) %>%
  ggplot(aes(color=var, x=type, shape=var)) +
  geom_point(aes(y=score), position = position_dodge2(width = .5)) +
  geom_errorbar(aes(ymin=score-sd, ymax=score+sd, x=type), width=.5, position = position_dodge2(width = .5)) +
  scale_color_brewer(palette = "Dark2") +
  labs(x="Model type", y = "Specificity", color = "Conflict Class", shape = "Conflict Class") +
  ylim(0,1) +
  facet_wrap(~unit+name, labeller = labeller(unit = labs.unit, name = labs.name)) +
  guides(color=guide_legend(ncol=4)) +
  my_theme

if(knitr::is_latex_output()){
  ggarrange(plt_auc, plt_spec, ncol = 1, common.legend = T, legend = "bottom")
} else {
  require(plotly)
  subplot(plt_auc, plt_spec, nrows = 2) %>%
    layout(legend = list(orientation = "h", 
                         xanchor = "center",
                         x = 0.5, y = -.1))
}

Figure 1.9: Global performance of the AUC metric (top) and pecificity (bottom).

Additional Temporal Performance Metrics

#files <- list.files("../output/acc/", pattern = ".rds", full.names = T)
files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
# files <- files[-grep("test-results-regression-basins", files)]
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp$type = out$type
  
  tmp %>%
    filter(month != 0)
}) 
acc_data = do.call(rbind, acc_data)

acc_data %<>% 
  mutate(var = factor(var, labels = c("all", "sb", "ns", "os"),
                      levels = c("all", "sb", "ns", "os")),
         type = factor(type, 
                       labels = c("LR", "CH", "SV", "EV"),
                       levels = c("regression", "baseline", "structural", "environmental")),
         unit = if_else(unit == "basins", "bas", "adm"),
         unit = factor(unit, levels = c("adm", "bas"), labels = c("adm", "bas")))

labs.unit = c("_adm_", "_bas_")
names(labs.unit) = c("adm", "bas")
labs.var = c("__cb__", "__sb__", "__ns__", "__os__")
names(labs.var) = c("all", "sb", "ns", "os")
acc_data %>%
  filter(name == "auc") %>%
  dplyr::select(name, score, month, var, unit, type) %>%
  ggplot(aes(x=month, y=score, color=type, linetype=type))+
  stat_smooth(geom='line', alpha=.4, size=.5, se=FALSE, lwd=1)+
  facet_grid(unit~var, labeller = labeller(unit = labs.unit, var = labs.var)) +
  scale_x_continuous(breaks=c(1:12)) +
  ylim(0,1) +
  theme_classic() +
  scale_color_brewer(palette="Dark2") + 
  labs(subtitle = "AUC", x = "Prediction window (in months)", y = "Score", color = "Model", linetype = "Model") +
  guides(color=guide_legend(ncol=4)) +
  my_theme -> plt_out
plot_output(plt_out)

Figure 1.10: Time dependent performance of the AUC metric.

acc_data %>%
  filter(name == "specificity") %>%
  dplyr::select(name, score, month, var, unit, type) %>%
  ggplot(aes(x=month, y=score, color=type, linetype=type))+
  stat_smooth(geom='line', alpha=.4, size=.5, se=FALSE, lwd=1)+
  facet_grid(unit~var, labeller = labeller(unit = labs.unit, var = labs.var)) +
  scale_x_continuous(breaks=c(1:12)) +
  ylim(0,1) +
  theme_classic() +
  scale_color_brewer(palette="Dark2") + 
  labs(subtitle = "Specificity", x = "Prediction window (in months)", y = "Score", color = "Model", linetype="Model") +
  guides(color=guide_legend(ncol=4)) +
  my_theme -> plt_out
plot_output(plt_out)

Figure 1.11: Time dependent performance of specificity.

ROC Curves

files <- list.files("../output/acc/", pattern = ".rds", full.names = T)

auc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  
  df_auc <- lapply(1:length(tmp), function(i){
    rep = tmp[[i]]$rep
    FPR = tmp[[i]]$auc$curve[,1]
    recall = tmp[[i]]$auc$curve[,2]
    FPR = FPR[ c(rep(FALSE, 3), TRUE)] # reduce to every 4th entry to reduce file size
    recall = recall[ c(rep(FALSE, 3), TRUE)] # reduce to every 4th entry to reduce file size
    
    auc <- tibble(type = out$type,
                  unit = out$unit,
                  var = out$var,
                  rep = rep,
                  FPR = FPR,
                  recall = recall)
  })
  df_auc = do.call(rbind, df_auc)
}) 

auc_data <- do.call(rbind, auc_data)

auc_data %<>% 
  mutate(var = factor(var, 
                      labels = c("__cb__", "__sb__", "__ns__", "__os__"),
                      levels = c("all", "sb", "ns", "os")),
         type = factor(type, 
                       labels = c("CH", "SV", "EV"),
                       levels = c("baseline", "structural", "environmental")),
         unit = factor(unit, levels = c("states", "basins"), labels = c("adm", "bas")))
auc_data %>% 
  filter(unit == "adm") %>%
  mutate(group = paste(type,var,rep,sep="-")) %>%
  ggplot(aes(x=FPR, y=recall,color=type,linetype=type)) +
  geom_line(aes(group=group), size=.1, alpha = .5) +
  geom_smooth(alpha=1, size=.5, se=FALSE) +
  scale_color_manual(values = RColorBrewer::brewer.pal(n=4, name  = "Dark2")[2:4]) +
  facet_wrap(~var, ncol = 2) +
  labs(color = "Model", linetype="Model", y = "Sensitivity", x = "FPR", subtitle = "adm") +
  guides(color=guide_legend(override.aes = list(alpha = 1, ncol = 3))) +
  my_theme +
  theme(
    plot.subtitle =  element_markdown(size = 10, face =  "italic")
  ) 
ROC curves for *adm* district models.

Figure 1.12: ROC curves for adm district models.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
auc_data %>% 
  filter(unit == "bas") %>%
  mutate(group = paste(type,var,rep,sep="-")) %>%
  ggplot(aes(x=FPR, y=recall,color=type,linetype=type)) +
  geom_line(aes(group=group), size=.1, alpha = .5) +
  geom_smooth(alpha=1, size=.5, se=FALSE) +
  scale_color_manual(values = RColorBrewer::brewer.pal(n=4, name  = "Dark2")[2:4]) +
  facet_wrap(~var, ncol = 2) +
  labs(color = "Model", linetype="Model", y = "Sensitivity", x = "FPR", subtitle = "bas") +
  guides(color=guide_legend(override.aes = list(alpha = 1, ncol = 3))) +
  my_theme +
  theme(
    plot.subtitle =  element_markdown(size = 10, face =  "italic")
  )
ROC curves for *bas* district models.

Figure 1.13: ROC curves for bas district models.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
\begin{tiny}
adm <- st_read("../data/vector/states_mask.gpkg", quiet = T)
adm <- select(adm, geom)
adm <- st_make_valid(adm)
bas <- st_read("../data/vector/basins_simple.gpkg", quiet = T)
bas <- select(bas, geom)
bas <- st_make_valid(bas)
crs <- st_crs("EPSG:3857")
adm <- st_transform(adm, crs)
adm <- st_simplify(adm, dTolerance = 1500, preserveTopology = T)
bas <- st_transform(bas, crs) 
bas <- st_simplify(bas, dTolerance = 1000, preserveTopology = T)


files = list.files("../output/test-results/", pattern ="cnfrisk", full.names = T)

map_data = lapply(files, function(x){
  out = vars_detect(x)
  tmp = st_read(x, quiet = T)
  tmp = st_drop_geometry(tmp)
  tmp$unit = out$unit
  tmp$type = out$type
  tmp$var = out$var
  tmp %<>%
    select(unit,type,var,mean,obsv)
  if(out$unit == "basins") {
    st_as_sf(bind_cols(bas, tmp))
  } else {
    st_as_sf(bind_cols(adm, tmp))
  }
})

map_data = do.call(rbind, map_data)
map_data %>% 
  filter(unit == "states", var == "sb") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)
Spatial prediction of conflict class **sb** for *adm* districts.

Figure 1.14: Spatial prediction of conflict class sb for adm districts.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
map_data %>% 
  filter(unit == "states", var == "ns") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)
Spatial prediction of conflict class **ns** for *adm* districts.

Figure 1.15: Spatial prediction of conflict class ns for adm districts.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
map_data %>% 
  filter(unit == "states", var == "os") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)
Spatial prediction of conflict class **os** for *adm* districts.

Figure 1.16: Spatial prediction of conflict class os for adm districts.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
map_data %>% 
  filter(unit == "basins", var == "sb") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)
Spatial prediction of conflict class **sb** for *bas* districts.

Figure 1.17: Spatial prediction of conflict class sb for bas districts.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
map_data %>% 
  filter(unit == "basins", var == "ns") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)
Spatial prediction of conflict class **ns** for *bas* districts.

Figure 1.18: Spatial prediction of conflict class ns for bas districts.

Version Author Date
de8ce5a Darius Görgen 2021-04-05
map_data %>% 
  filter(unit == "basins", var == "os") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)
Spatial prediction of conflict class **os** for *bas* districts.

Figure 1.19: Spatial prediction of conflict class os for bas districts.

Version Author Date
de8ce5a Darius Görgen 2021-04-05

\end{tiny}

QQ-Plots and Residual Plots of Interaction Model

files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp %<>%
    filter(name == "f2", 
           month == 0)
  tmp$type = out$type
  tmp
}) 

acc_data = do.call(rbind, acc_data)
acc_data %<>%
  mutate(type = factor(type,
                       levels = c("regression", "baseline", "structural", "environmental"),
                       labels = c("LR", "CH", "SV", "EV")),
         var = factor(var,
                      levels = c("all", "sb", "ns", "os"),
                      labels = c("cb", "sb", "ns", "os")),
         unit = factor(unit, 
                       levels = c("states", "basins"),
                       labels = c("adm", "bas")))

acc_data %>%
  # filter(type != "LR") %>%
  group_by(var) %>%
  nest() %>%
  mutate(model = map(data, ~lm(score ~ unit*type, data = .x)),
         model = map(model, ~augment(.x)),
         plots_resid= map(model, 
                          ~ggplot(data = .x, aes(x=.fitted, y =.resid, shape = unit, color = type)) +
                            geom_point(alpha = .5) +
                            xlim(0.15,.7) +
                            labs(x="Fitted values", 
                                 y = "Residuals", 
                                 color = "Predictor set", 
                                 shape = "Aggregation unit", 
                                 subtitle = var) +
                            scale_color_brewer(palette = "Dark2") +
                            guides(colour = guide_legend(override.aes = list(alpha = 1)),
                                   shape = guide_legend(override.aes = list(alpha = 1))) +
                            theme_classic() +
                            theme(
                              plot.subtitle=element_markdown(face = "bold", hjust = .5),
                              plot.title = element_markdown(size = 10),
                              plot.caption = element_markdown(size = 10),
                              plot.tag = element_markdown(size = 10),
                              strip.background = element_blank(),
                              strip.text = element_markdown(size = 10),    
                              strip.text.x = element_markdown(size = 10),
                              strip.text.y = element_markdown(size = 10),
                              legend.position = "bottom",
                              legend.key.size = unit(0.4, "cm"),
                              legend.text = element_markdown(size = 9),
                              legend.title = element_markdown(size=10),
                              axis.text = element_markdown(size = 7),
                              axis.title=element_markdown(size=10)
                            )),
         plots_qq = map(model, 
                        ~ggplot(data = .x, aes(sample = score, shape = unit, color = type)) +  
                          stat_qq(alpha = .55) + 
                          ylim(0.05, 0.85) +
                          labs(x="Theoretical quantiles", 
                               y = "Sample quantiles", 
                               color = "Predictor set", 
                               shape = "Aggregation unit", 
                               subtitle = var) +
                          scale_color_brewer(palette = "Dark2") +
                          guides(colour = guide_legend(override.aes = list(alpha = 1)),
                                 shape = guide_legend(override.aes = list(alpha = 1))) +
                          theme_classic() +
                          theme(
                            plot.subtitle=element_markdown(face = "bold", hjust = .5),
                            plot.title = element_markdown(size = 10),
                            plot.caption = element_markdown(size = 10),
                            plot.tag = element_markdown(size = 10),
                            strip.background = element_blank(),
                            strip.text = element_markdown(size = 10),    
                            strip.text.x = element_markdown(size = 10),
                            strip.text.y = element_markdown(size = 10),
                            legend.position = "bottom",
                            legend.key.size = unit(0.4, "cm"),
                            legend.text = element_markdown(size = 9),
                            legend.title = element_markdown(size=10),
                            axis.text = element_markdown(size = 7),
                            axis.title=element_markdown(size=10)
                          )
         )
  ) -> lm_plots
qqplots = lm_plots$plots_qq

if(knitr::is_latex_output()){
  ggarrange(qqplots[[1]] + 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank()), 
            qqplots[[4]]+ 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank(),
                    axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()),
            qqplots[[2]], 
            qqplots[[3]] +
              theme(axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()), 
            ncol = 2, nrow = 2, common.legend = TRUE, legend="bottom")
} else {
  for (i in 1:length(qqplots)){
    var = as.character(qqplots[[i]]$labels$subtitle)
    require(plotly)
    ggplotly(qqplots[[i]] + ggtitle(var) + theme(plot.title = element_markdown(face = "bold"))) %>%
      layout(legend = list(orientation = "h", 
                           xanchor = "center",
                           x = 0.5, y = -.2)) -> plt_out
    print(plt_out)
  }
}
residplots = lm_plots$plots_resid
if(knitr::is_latex_output()){
  ggarrange(residplots[[1]] + 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank()), 
            residplots[[4]]+ 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank(),
                    axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()),
            residplots[[2]], 
            residplots[[3]] +
              theme(axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()), 
            ncol = 2, nrow = 2, common.legend = TRUE, legend="bottom")
} else {
  for (i in 1:length(residplots)){
    var = as.character(residplots[[i]]$labels$subtitle)
    require(plotly)
    ggplotly(residplots[[i]] + ggtitle(var) + theme(plot.title = element_markdown(face = "bold"))) %>%
      layout(legend = list(orientation = "h", 
                           xanchor = "center",
                           x = 0.5, y = -.2)) -> plt_out
    print(plt_out)
  }
}

Descriptive Statistics of Interaction Variables with Agricultural Mask

predictors = readRDS("../data/vector/predictor_cube.rds")
vars = c("AGRET", "AGRGPP", "AGRLST", "AGRPREC", "AGRANOM", "AGRSPI1", "AGRSPI3", "AGRSPI6", "AGRSPI12", "AGRSPEI1", "AGRSPEI3", "AGRSPEI6", "AGRSPEI12")
predictors %>% 
  filter(var %in% vars) %>%
  as_tibble() %>%
  mutate(var = factor(var,
                      labels = vars,
                      levels = vars)) %>%
  dplyr::select(id, unit, var, value) %>%
  filter(!(unit == "states" & id > 847)) %>%
  group_by(unit, var) %>%
  summarise("Min." = round(min(value, na.rm = T),3),
            "1st Qu." = round(quantile(value, 0.25, na.rm = T),3),
            "Median" = round(median(value, na.rm = T),3),
            "Mean" = round(mean(value, na.rm = T),3),
            "3rd Qu." = round(quantile(value, 0.75, na.rm = T),3),
            "Max." = round(max(value, na.rm = T),3),
            "NA's" = round(sum(is.na(value)),3)) %>%
  ungroup() %>%
  mutate(unit = if_else(unit == "basins","*bas*", "*adm*")) %>%
  mutate(unit = factor(unit, levels = c("*adm*", "*bas*"),
                       labels = c("*adm*", "*bas*"))) %>%
  arrange(var, unit) %>%
  dplyr::select(-var) %>%
  rename("Spatial Unit" = unit) %>%
  thesis_kable(escape = FALSE,
               caption = "Descriptive statistics of agricultural interaction variables. (Unit of measurment: AGRET - kg/m²; AGRGPP - kg C/m²; AGRLST - K; AGRPREC - mm AGRANOM - mm; others - dimensionless)", 
               caption.short = "Descriptive statistics of agricultural interaction variables.",
               linesep = c("", "\\addlinespace"),
               longtable = T) %>%
  kable_styling(latex_options = c("HOLD_position", "repeat_header")) %>%
  group_rows("AGRET", 1, 2) %>%
  group_rows("AGRGPP", 3, 4) %>% 
  group_rows("AGRLST", 5, 6) %>%
  group_rows("AGRPREC", 7, 8) %>%
  group_rows("AGRANOM", 9, 10) %>%
  group_rows("AGRSPI1", 11, 12) %>%
  group_rows("AGRSPI3", 13, 14) %>%
  group_rows("AGRSPI6", 15, 16) %>%
  group_rows("AGRSPI12", 17, 18) %>%
  group_rows("AGRSPEI1", 19, 20) %>%
  group_rows("AGRSPEI3", 21, 22) %>%
  group_rows("AGRSPEI6", 23, 24) %>%
  group_rows("AGRSPEI12", 25, 26)
Table 1.1: Descriptive statistics of agricultural interaction variables. (Unit of measurment: AGRET - kg/m²; AGRGPP - kg C/m²; AGRLST - K; AGRPREC - mm AGRANOM - mm; others - dimensionless)
Spatial Unit Min. 1st Qu. Median Mean 3rd Qu. Max. NA’s
AGRET
adm 0.000 0.253 9.996 118.725 138.828 1273.573 643
bas 0.000 0.000 0.647 28.715 12.693 973.495 39358
AGRGPP
adm 0.000 0.491 20.158 232.349 272.386 6251.350 478
bas 0.000 0.000 1.337 55.510 27.987 2465.299 38957
AGRLST
adm 0.000 0.137 8.428 60.278 104.179 312.276 0
bas 0.000 0.000 0.066 18.272 7.486 279.603 0
AGRPREC
adm 0.000 0.003 0.526 17.237 13.174 429.346 0
bas 0.000 0.000 0.001 4.473 0.500 304.854 0
AGRANOM
adm -154.115 -0.148 0.000 0.535 0.050 265.607 0
bas -138.119 -0.001 0.000 0.082 0.000 125.800 0
AGRSPI1
adm -3.554 -0.006 0.000 0.011 0.004 5.854 1100
bas -3.471 0.000 0.000 0.003 0.000 4.494 2571
AGRSPI3
adm -4.829 -0.004 0.000 0.012 0.007 5.773 2654
bas -2.411 0.000 0.000 0.003 0.000 3.517 3331
AGRSPI6
adm -3.935 -0.003 0.000 0.012 0.008 4.157 5180
bas -2.596 0.000 0.000 0.003 0.000 3.035 6005
AGRSPI12
adm -4.750 -0.003 0.000 0.012 0.008 3.555 10233
bas -2.481 0.000 0.000 0.003 0.000 2.726 12059
AGRSPEI1
adm -4.126 -0.005 0.000 0.007 0.006 3.306 973
bas -2.446 0.000 0.000 0.002 0.000 2.427 2230
AGRSPEI3
adm -2.740 -0.005 0.000 0.008 0.006 5.006 2651
bas -1.819 0.000 0.000 0.002 0.000 2.636 3310
AGRSPEI6
adm -2.423 -0.004 0.000 0.008 0.007 4.259 5183
bas -1.856 0.000 0.000 0.002 0.000 2.730 6005
AGRSPEI12
adm -3.134 -0.004 0.000 0.007 0.007 3.558 10246
bas -2.689 0.000 0.000 0.002 0.000 2.668 12059

Table of Global Performance Metrics

#files <- list.files("../output/acc/", pattern = ".rds", full.names = T)
files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
# files <- files[-grep("test-results-regression-basins", files)]
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp %>%
    filter(month == 0) %>%
    group_by(name) %>%
    summarise(target = mean(score, na.rm = T),
              sd = sd(score, na.rm = T),
              type = out$type,
              unit = out$unit,
              var = out$var) %>%
    ungroup()
}) 
acc_data = do.call(rbind, acc_data)

acc_data %<>% 
  mutate(var = factor(var, labels = c("**cb**", "**sb**", "**ns**", "**os**"),
                      levels = c("all", "sb", "ns", "os")),
         type = factor(type, labels = c("LR", "CH", "SV", "EV"),
                       levels = c("regression", "baseline", "structural", "environmental")),
         unit = if_else(unit == "basins", "bas", "adm"),
         unit = factor(unit, levels = c("adm", "bas"), labels = c("\\emph{adm}", "\\emph{bas}")))

acc_data %>%
  filter(name %in% c("f2", "auc", "aucpr", "sensitivity", "specificity", "precision")) %>%
  mutate(
    sd = paste0("(", as.character(round(sd,3)), ")"),
    score = paste0(round(target,3), " +/- ", sd)) %>%
  dplyr::select(type, unit, var, name, score, target) %>%
  group_by(name, var) %>%
  mutate(score = if_else(target == max(target), paste0("**", score, "**"), score )) %>%
  dplyr::select(-target) %>% 
  ungroup() %>%
  pivot_wider(names_from = name, values_from = c(score),
              id_cols = c(type, unit, var)) %>%
  arrange(type, unit, var) %>% 
  rename("F2" = f2, "AUC" = auc, "AUPR" = aucpr, "Precision" = precision, "Sensitivity" = sensitivity, "Specificity" = specificity) %>%
  rename(Theme = type, Unit = "unit", Variable = var) %>% 
  relocate(Theme, Unit, Variable, F2, AUC, AUPR, Precision, Sensitivity, Specificity) %>%
  ungroup %>%
  thesis_kable(escape = F,
               align = "ccccccccc", 
               longtable = T, 
               linesep = c("", "", "","\\addlinespace"),
               caption.short = "Global performance metrics for all model configurations.",
               caption = "Global performance metrics for all models configurations. Bold number indicate the best performance of the respective performance metric per conflict class. Standard deviation is given in brackets.") %>%
   kable_styling(latex_options = c("HOLD_position", "repeat_header"))
Table 1.2: Global performance metrics for all models configurations. Bold number indicate the best performance of the respective performance metric per conflict class. Standard deviation is given in brackets.
Theme Unit Variable F2 AUC AUPR Precision Sensitivity Specificity
LR cb 0.325 +/- (0.001) 0.671 +/- (0.001) 0.128 +/- (0.001) 0.11 +/- (0.002) 0.668 +/- (0.007) 0.598 +/- (0.007)
LR sb 0.242 +/- (0.003) 0.703 +/- (0.003) 0.084 +/- (0.002) 0.08 +/- (0.002) 0.541 +/- (0.02) 0.751 +/- (0.014)
LR ns 0.183 +/- (0.002) 0.739 +/- (0.002) 0.049 +/- (0.001) 0.055 +/- (0.002) 0.468 +/- (0.032) 0.842 +/- (0.019)
LR os 0.226 +/- (0.002) 0.681 +/- (0.002) 0.072 +/- (0.001) 0.075 +/- (0.001) 0.494 +/- (0.02) 0.777 +/- (0.014)
LR cb 0.389 +/- (0.001) 0.758 +/- (0.001) 0.168 +/- (0.001) 0.141 +/- (0.003) 0.708 +/- (0.016) 0.687 +/- (0.014)
LR sb 0.302 +/- (0.002) 0.775 +/- (0.003) 0.121 +/- (0.002) 0.105 +/- (0.003) 0.592 +/- (0.019) 0.8 +/- (0.012)
LR ns 0.216 +/- (0.002) 0.779 +/- (0.003) 0.069 +/- (0.002) 0.073 +/- (0.003) 0.477 +/- (0.022) 0.867 +/- (0.013)
LR os 0.287 +/- (0.003) 0.774 +/- (0.003) 0.105 +/- (0.002) 0.1 +/- (0.002) 0.549 +/- (0.012) 0.83 +/- (0.007)
CH cb 0.59 +/- (0.027) 0.918 +/- (0.002) 0.6 +/- (0.008) 0.283 +/- (0.046) 0.824 +/- (0.041) 0.845 +/- (0.045)
CH sb 0.604 +/- (0.045) 0.943 +/- (0.006) 0.634 +/- (0.019) 0.297 +/- (0.069) 0.832 +/- (0.022) 0.923 +/- (0.023)
CH ns 0.467 +/- (0.044) 0.943 +/- (0.003) 0.359 +/- (0.024) 0.178 +/- (0.043) 0.828 +/- (0.073) 0.922 +/- (0.029)
CH os 0.486 +/- (0.022) 0.909 +/- (0.003) 0.472 +/- (0.01) 0.207 +/- (0.027) 0.743 +/- (0.031) 0.902 +/- (0.021)
CH cb 0.664 +/- (0.014) 0.942 +/- (0.013) 0.573 +/- (0.107) 0.395 +/- (0.084) 0.822 +/- (0.073) 0.916 +/- (0.034)
CH sb 0.607 +/- (0.087) 0.954 +/- (0.003) 0.518 +/- (0.048) 0.552 +/- (0.08) 0.635 +/- (0.125) 0.982 +/- (0.01)
CH ns 0.375 +/- (0.107) 0.92 +/- (0.047) 0.266 +/- (0.04) 0.335 +/- (0.054) 0.404 +/- (0.151) 0.986 +/- (0.007)
CH os 0.5 +/- (0.039) 0.925 +/- (0.006) 0.42 +/- (0.05) 0.295 +/- (0.114) 0.683 +/- (0.151) 0.937 +/- (0.048)
SV cb 0.616 +/- (0.009) 0.92 +/- (0.007) 0.599 +/- (0.011) 0.402 +/- (0.036) 0.715 +/- (0.037) 0.923 +/- (0.015)
SV sb 0.633 +/- (0.022) 0.947 +/- (0.006) 0.613 +/- (0.022) 0.455 +/- (0.044) 0.706 +/- (0.044) 0.968 +/- (0.008)
SV ns 0.398 +/- (0.085) 0.933 +/- (0.006) 0.322 +/- (0.021) 0.345 +/- (0.09) 0.45 +/- (0.155) 0.98 +/- (0.014)
SV os 0.493 +/- (0.026) 0.906 +/- (0.01) 0.426 +/- (0.016) 0.275 +/- (0.048) 0.638 +/- (0.1) 0.94 +/- (0.023)
SV cb 0.649 +/- (0.014) 0.948 +/- (0.003) 0.625 +/- (0.011) 0.32 +/- (0.021) 0.876 +/- (0.016) 0.886 +/- (0.013)
SV sb 0.671 +/- (0.026) 0.963 +/- (0.004) 0.617 +/- (0.028) 0.356 +/- (0.036) 0.865 +/- (0.025) 0.948 +/- (0.01)
SV ns 0.452 +/- (0.02) 0.94 +/- (0.005) 0.275 +/- (0.026) 0.191 +/- (0.022) 0.696 +/- (0.046) 0.95 +/- (0.009)
SV os 0.518 +/- (0.033) 0.941 +/- (0.006) 0.48 +/- (0.026) 0.241 +/- (0.041) 0.747 +/- (0.074) 0.928 +/- (0.031)
EV cb 0.618 +/- (0.009) 0.924 +/- (0.003) 0.609 +/- (0.007) 0.401 +/- (0.044) 0.72 +/- (0.048) 0.922 +/- (0.02)
EV sb 0.647 +/- (0.026) 0.947 +/- (0.005) 0.65 +/- (0.012) 0.394 +/- (0.072) 0.783 +/- (0.039) 0.953 +/- (0.015)
EV ns 0.438 +/- (0.038) 0.936 +/- (0.007) 0.298 +/- (0.058) 0.275 +/- (0.091) 0.565 +/- (0.144) 0.967 +/- (0.018)
EV os 0.508 +/- (0.012) 0.903 +/- (0.008) 0.431 +/- (0.02) 0.237 +/- (0.027) 0.719 +/- (0.042) 0.921 +/- (0.016)
EV cb 0.689 +/- (0.006) 0.951 +/- (0.003) 0.639 +/- (0.021) 0.418 +/- (0.026) 0.824 +/- (0.017) 0.93 +/- (0.009)
EV sb 0.67 +/- (0.021) 0.963 +/- (0.003) 0.616 +/- (0.021) 0.42 +/- (0.036) 0.791 +/- (0.038) 0.964 +/- (0.007)
EV ns 0.487 +/- (0.016) 0.947 +/- (0.004) 0.305 +/- (0.025) 0.249 +/- (0.028) 0.647 +/- (0.051) 0.967 +/- (0.007)
EV os 0.553 +/- (0.014) 0.944 +/- (0.003) 0.5 +/- (0.011) 0.308 +/- (0.035) 0.697 +/- (0.04) 0.954 +/- (0.01)

sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] plotly_4.9.2.1     lubridate_1.7.9.2  rgdal_1.5-18       countrycode_1.2.0 
 [5] welchADF_0.3.2     rstatix_0.6.0      ggpubr_0.4.0       scales_1.1.1      
 [9] RColorBrewer_1.1-2 latex2exp_0.4.0    cubelyr_1.0.0      gridExtra_2.3     
[13] ggtext_0.1.1       magrittr_2.0.1     tmap_3.2           sf_0.9-7          
[17] raster_3.4-5       sp_1.4-4           forcats_0.5.0      stringr_1.4.0     
[21] purrr_0.3.4        readr_1.4.0        tidyr_1.1.2        tibble_3.0.6      
[25] tidyverse_1.3.0    huwiwidown_0.0.1   kableExtra_1.3.1   knitr_1.31        
[29] rmarkdown_2.7.3    bookdown_0.21      ggplot2_3.3.3      dplyr_1.0.2       
[33] devtools_2.3.2     usethis_2.0.0     

loaded via a namespace (and not attached):
  [1] readxl_1.3.1       backports_1.2.0    workflowr_1.6.2   
  [4] lwgeom_0.2-5       lazyeval_0.2.2     splines_3.6.3     
  [7] crosstalk_1.1.0.1  leaflet_2.0.3      digest_0.6.27     
 [10] htmltools_0.5.1.1  memoise_1.1.0      openxlsx_4.2.3    
 [13] remotes_2.2.0      modelr_0.1.8       prettyunits_1.1.1 
 [16] colorspace_2.0-0   rvest_0.3.6        haven_2.3.1       
 [19] xfun_0.21          leafem_0.1.3       callr_3.5.1       
 [22] crayon_1.4.0       jsonlite_1.7.2     lme4_1.1-26       
 [25] glue_1.4.2         stars_0.4-3        gtable_0.3.0      
 [28] webshot_0.5.2      car_3.0-10         pkgbuild_1.2.0    
 [31] abind_1.4-5        DBI_1.1.0          Rcpp_1.0.5        
 [34] viridisLite_0.3.0  gridtext_0.1.4     units_0.6-7       
 [37] foreign_0.8-71     htmlwidgets_1.5.3  httr_1.4.2        
 [40] ellipsis_0.3.1     farver_2.0.3       pkgconfig_2.0.3   
 [43] XML_3.99-0.3       dbplyr_2.0.0       labeling_0.4.2    
 [46] tidyselect_1.1.0   rlang_0.4.10       later_1.1.0.1     
 [49] tmaptools_3.1      munsell_0.5.0      cellranger_1.1.0  
 [52] tools_3.6.3        cli_2.3.0          generics_0.1.0    
 [55] broom_0.7.2        evaluate_0.14      yaml_2.2.1        
 [58] processx_3.4.5     leafsync_0.1.0     fs_1.5.0          
 [61] zip_2.1.1          nlme_3.1-150       whisker_0.4       
 [64] xml2_1.3.2         compiler_3.6.3     rstudioapi_0.13   
 [67] curl_4.3           png_0.1-7          e1071_1.7-4       
 [70] testthat_3.0.1     ggsignif_0.6.0     reprex_0.3.0      
 [73] statmod_1.4.35     stringi_1.5.3      highr_0.8         
 [76] ps_1.5.0           desc_1.2.0         lattice_0.20-41   
 [79] Matrix_1.2-18      markdown_1.1       nloptr_1.2.2.2    
 [82] classInt_0.4-3     vctrs_0.3.6        pillar_1.4.7      
 [85] lifecycle_0.2.0    data.table_1.13.2  httpuv_1.5.5      
 [88] R6_2.5.0           promises_1.1.1     KernSmooth_2.23-18
 [91] rio_0.5.16         sessioninfo_1.1.1  codetools_0.2-16  
 [94] dichromat_2.0-0    boot_1.3-25        MASS_7.3-53       
 [97] assertthat_0.2.1   pkgload_1.1.0      rprojroot_2.0.2   
[100] withr_2.4.1        mgcv_1.8-33        parallel_3.6.3    
[103] hms_1.0.0          grid_3.6.3         minqa_1.2.4       
[106] class_7.3-17       carData_3.0-4      git2r_0.27.1      
[109] base64enc_0.1-3