Last updated: 2021-05-12

Checks: 7 0

Knit directory: thesis/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

Seed: set.seed(20210321)

The command set.seed(20210321) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: 7fd4ff2

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 7fd4ff2. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rproj.user/
    Ignored:    data/DB/
    Ignored:    data/raster/
    Ignored:    data/raw/
    Ignored:    data/vector/
    Ignored:    docker_command.txt
    Ignored:    output/acc/
    Ignored:    output/bayes/
    Ignored:    output/ffs/
    Ignored:    output/models/
    Ignored:    output/plots/
    Ignored:    output/test-results/
    Ignored:    renv/library/
    Ignored:    renv/staging/
    Ignored:    report/presentation/

Untracked files:
    Untracked:  analysis/assets/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/thesis-appendix.Rmd) and HTML (docs/thesis-appendix.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	de8ce5a	Darius Görgen	2021-04-05	add content
html	de8ce5a	Darius Görgen	2021-04-05	add content

1 Appendix

var_iter = c("**cb**", "**sb**", "**ns**", "**os**")
unit_iter = c("*adm*", "*bas*")
combi = expand.grid(unit_iter, var_iter)
names_unit = combi$Var1
names_var = combi$Var2

Density Plots of Predicted Conflict Probability

files <- list.files("../output/acc/", pattern = ".rds", full.names = T)

data <- lapply(files, function(x){
  out = vars_detect(x)
  tmp = readRDS(x)
  
  thres_df = readRDS(
    paste0(
      "../output/test-results/test-results-", 
      paste(out$type, out$unit, out$var, sep = "-"),
      ".rds")
  )
  if(out$var == "all"){
    for(i in 1:length(tmp)){
      tmp[[i]]$df$var = "cb"
      out$var = "cb"
    }
  }
  thres_df %>%
    filter(name == "f2", month == 0) %>% 
    filter(score == max(score, na.rm = T)) %>%
    pull(rep) -> rep
  
  thres_df %>%
    filter(name == "threshold", rep == !!rep) %>%
    pull(score) -> threshold
  
  list(unit = out$unit, type = out$type, var = out$var, thres = threshold, df = tmp[[rep]]$df)
}) 

units <- unlist(lapply(data, `[[`, 1))
types <- unlist(lapply(data, `[[`, 2))
vars <- unlist(lapply(data, `[[`, 3))
dfs <-  lapply(data, `[[`, 4)


var_iter = c("cb", "sb", "ns", "os")
unit_iter = c("states", "basins")
type_iter = c("baseline", "structural", "environmental")


for(var in var_iter){
  tmp1 = data[vars == var]
  for(unit in unit_iter){
    unit_lab = if_else(unit == "states", "adm", "bas")
    units =  unlist(lapply(tmp1, `[[`, 1))
    tmp2 = tmp1[units == unit]
    types =  unlist(lapply(tmp2, `[[`, 2))
    thresholds = unlist(lapply(tmp2, `[[`, 4))
    thresholds = data.frame(type = factor(types, 
                                          levels = c("baseline", "structural", "environmental"),
                                          labels = c("CH", "SV", "EV")), 
                            thresholds = thresholds)
    
    df = do.call(rbind, lapply(tmp2, `[[`, 5))
    opts_current$set(label = paste("dens", unit,var, sep = "-")) # setting label 
    df %>%
      mutate(type = factor(type, 
                           levels = c("baseline", "structural", "environmental"),
                           labels = c("CH", "SV", "EV")),
             unit = factor(unit,
                           levels = c("states", "basins"),
                           labels = c("adm", "bas")),
             obsv = factor(obsv,
                           levels = c("peace", "conflict"),
                           labels = c("Peace", "Conflict"))) %>%
      ggplot() +
      geom_density(aes(x=pred, fill = obsv), alpha = .4) +
      geom_vline(data = thresholds, aes(xintercept=thresholds), linetype=2) +
      xlim(0,1) +
      labs(fill = "", x = "Predicted Probability", y = "Density", 
           subtitle = paste0("Unit: *", unit_lab, "*      Variable: **",var,"**")) +
      scale_fill_brewer(palette="Dark2") +
      facet_wrap(~type, scales = "free_y")+
      my_theme +
      theme(plot.subtitle =  element_markdown(size = 10)) -> plt_out
    print(plt_out)
  }
}

Predicted probability of **cb** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.1: Predicted probability of cb conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **cb** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.2: Predicted probability of cb conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **sb** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.3: Predicted probability of sb conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **sb** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.4: Predicted probability of sb conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **ns** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.5: Predicted probability of ns conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **ns** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.6: Predicted probability of ns conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **os** conflicts for *adm* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.7: Predicted probability of os conflicts for adm districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Predicted probability of **os** conflicts for *bas* districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Figure 1.8: Predicted probability of os conflicts for bas districts. Note that in order to increase visibility the scale on the y-axis differs from one facet to another.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

Additional Global Performance Metrics

#files <- list.files("../output/acc/", pattern = ".rds", full.names = T)
files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
# files <- files[-grep("test-results-regression-basins", files)]
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp %>%
    filter(month == 0) %>%
    group_by(name) %>%
    summarise(mean = mean(score, na.rm = T),
              median = median(score, na.rm = TRUE),
              sd = sd(score, na.rm = T),
              type = out$type,
              unit = out$unit,
              var = out$var) %>%
    ungroup()
}) 
acc_data = do.call(rbind, acc_data)

acc_data %<>% 
  mutate(var = factor(var, 
                      levels = c("all", "sb", "ns", "os"),
                      labels = c("cb", "sb", "ns", "os")),
         type = factor(type, 
                       labels = c("LR", "CH", "SV", "EV"),
                       levels = c("regression", "baseline", "structural", "environmental")),
         unit = if_else(unit == "basins", "bas", "adm"),
         unit = factor(unit, levels = c("adm", "bas"), labels = c("adm", "bas")))

pd <- position_dodge(0.1)
score = "median"

labs.unit = c("_adm_", "_bas_")
names(labs.unit) = c("adm", "bas")
labs.name = c("", "")
names(labs.name) = c("AUC", "AUC")

plt_auc = acc_data %>%
  dplyr::filter(name == "auc") %>%
  dplyr::select(type, unit, var, name, score = !!score, sd) %>%
  ggplot(aes(color=var, x=type, shape=var)) +
  geom_point(aes(y=score), position = position_dodge2(width = .5)) +
  geom_errorbar(aes(ymin=score-sd, ymax=score+sd, x=type), width=.5, position = position_dodge2(width = .5)) +
  scale_color_brewer(palette = "Dark2") +
  labs(x="", y = "AUC", color = "Conflict Class", shape = "Conflict Class") +
  ylim(0,1) +
  facet_wrap(~unit+name, labeller = labeller(unit = labs.unit, name = labs.name)) +
  guides(color=guide_legend(ncol=4)) +
  my_theme 

plt_spec = acc_data %>%
  filter(name == "specificity") %>%
  dplyr::select(type, unit, var, name, score = !!score, sd) %>%
  ggplot(aes(color=var, x=type, shape=var)) +
  geom_point(aes(y=score), position = position_dodge2(width = .5)) +
  geom_errorbar(aes(ymin=score-sd, ymax=score+sd, x=type), width=.5, position = position_dodge2(width = .5)) +
  scale_color_brewer(palette = "Dark2") +
  labs(x="Model type", y = "Specificity", color = "Conflict Class", shape = "Conflict Class") +
  ylim(0,1) +
  facet_wrap(~unit+name, labeller = labeller(unit = labs.unit, name = labs.name)) +
  guides(color=guide_legend(ncol=4)) +
  my_theme

if(knitr::is_latex_output()){
  ggarrange(plt_auc, plt_spec, ncol = 1, common.legend = T, legend = "bottom")
} else {
  require(plotly)
  subplot(plt_auc, plt_spec, nrows = 2) %>%
    layout(legend = list(orientation = "h", 
                         xanchor = "center",
                         x = 0.5, y = -.1))
}

Figure 1.9: Global performance of the AUC metric (top) and pecificity (bottom).

Additional Temporal Performance Metrics

#files <- list.files("../output/acc/", pattern = ".rds", full.names = T)
files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
# files <- files[-grep("test-results-regression-basins", files)]
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp$type = out$type
  
  tmp %>%
    filter(month != 0)
}) 
acc_data = do.call(rbind, acc_data)

acc_data %<>% 
  mutate(var = factor(var, labels = c("all", "sb", "ns", "os"),
                      levels = c("all", "sb", "ns", "os")),
         type = factor(type, 
                       labels = c("LR", "CH", "SV", "EV"),
                       levels = c("regression", "baseline", "structural", "environmental")),
         unit = if_else(unit == "basins", "bas", "adm"),
         unit = factor(unit, levels = c("adm", "bas"), labels = c("adm", "bas")))

labs.unit = c("_adm_", "_bas_")
names(labs.unit) = c("adm", "bas")
labs.var = c("__cb__", "__sb__", "__ns__", "__os__")
names(labs.var) = c("all", "sb", "ns", "os")

acc_data %>%
  filter(name == "auc") %>%
  dplyr::select(name, score, month, var, unit, type) %>%
  ggplot(aes(x=month, y=score, color=type, linetype=type))+
  stat_smooth(geom='line', alpha=.4, size=.5, se=FALSE, lwd=1)+
  facet_grid(unit~var, labeller = labeller(unit = labs.unit, var = labs.var)) +
  scale_x_continuous(breaks=c(1:12)) +
  ylim(0,1) +
  theme_classic() +
  scale_color_brewer(palette="Dark2") + 
  labs(subtitle = "AUC", x = "Prediction window (in months)", y = "Score", color = "Model", linetype = "Model") +
  guides(color=guide_legend(ncol=4)) +
  my_theme -> plt_out
plot_output(plt_out)

Figure 1.10: Time dependent performance of the AUC metric.

acc_data %>%
  filter(name == "specificity") %>%
  dplyr::select(name, score, month, var, unit, type) %>%
  ggplot(aes(x=month, y=score, color=type, linetype=type))+
  stat_smooth(geom='line', alpha=.4, size=.5, se=FALSE, lwd=1)+
  facet_grid(unit~var, labeller = labeller(unit = labs.unit, var = labs.var)) +
  scale_x_continuous(breaks=c(1:12)) +
  ylim(0,1) +
  theme_classic() +
  scale_color_brewer(palette="Dark2") + 
  labs(subtitle = "Specificity", x = "Prediction window (in months)", y = "Score", color = "Model", linetype="Model") +
  guides(color=guide_legend(ncol=4)) +
  my_theme -> plt_out
plot_output(plt_out)

Figure 1.11: Time dependent performance of specificity.

ROC Curves

files <- list.files("../output/acc/", pattern = ".rds", full.names = T)

auc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  
  df_auc <- lapply(1:length(tmp), function(i){
    rep = tmp[[i]]$rep
    FPR = tmp[[i]]$auc$curve[,1]
    recall = tmp[[i]]$auc$curve[,2]
    FPR = FPR[ c(rep(FALSE, 3), TRUE)] # reduce to every 4th entry to reduce file size
    recall = recall[ c(rep(FALSE, 3), TRUE)] # reduce to every 4th entry to reduce file size
    
    auc <- tibble(type = out$type,
                  unit = out$unit,
                  var = out$var,
                  rep = rep,
                  FPR = FPR,
                  recall = recall)
  })
  df_auc = do.call(rbind, df_auc)
}) 

auc_data <- do.call(rbind, auc_data)

auc_data %<>% 
  mutate(var = factor(var, 
                      labels = c("__cb__", "__sb__", "__ns__", "__os__"),
                      levels = c("all", "sb", "ns", "os")),
         type = factor(type, 
                       labels = c("CH", "SV", "EV"),
                       levels = c("baseline", "structural", "environmental")),
         unit = factor(unit, levels = c("states", "basins"), labels = c("adm", "bas")))

auc_data %>% 
  filter(unit == "adm") %>%
  mutate(group = paste(type,var,rep,sep="-")) %>%
  ggplot(aes(x=FPR, y=recall,color=type,linetype=type)) +
  geom_line(aes(group=group), size=.1, alpha = .5) +
  geom_smooth(alpha=1, size=.5, se=FALSE) +
  scale_color_manual(values = RColorBrewer::brewer.pal(n=4, name  = "Dark2")[2:4]) +
  facet_wrap(~var, ncol = 2) +
  labs(color = "Model", linetype="Model", y = "Sensitivity", x = "FPR", subtitle = "adm") +
  guides(color=guide_legend(override.aes = list(alpha = 1, ncol = 3))) +
  my_theme +
  theme(
    plot.subtitle =  element_markdown(size = 10, face =  "italic")
  )

Figure 1.12: ROC curves for adm district models.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

auc_data %>% 
  filter(unit == "bas") %>%
  mutate(group = paste(type,var,rep,sep="-")) %>%
  ggplot(aes(x=FPR, y=recall,color=type,linetype=type)) +
  geom_line(aes(group=group), size=.1, alpha = .5) +
  geom_smooth(alpha=1, size=.5, se=FALSE) +
  scale_color_manual(values = RColorBrewer::brewer.pal(n=4, name  = "Dark2")[2:4]) +
  facet_wrap(~var, ncol = 2) +
  labs(color = "Model", linetype="Model", y = "Sensitivity", x = "FPR", subtitle = "bas") +
  guides(color=guide_legend(override.aes = list(alpha = 1, ncol = 3))) +
  my_theme +
  theme(
    plot.subtitle =  element_markdown(size = 10, face =  "italic")
  )

Figure 1.13: ROC curves for bas district models.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

\begin{tiny}

adm <- st_read("../data/vector/states_mask.gpkg", quiet = T)
adm <- select(adm, geom)
adm <- st_make_valid(adm)
bas <- st_read("../data/vector/basins_simple.gpkg", quiet = T)
bas <- select(bas, geom)
bas <- st_make_valid(bas)
crs <- st_crs("EPSG:3857")
adm <- st_transform(adm, crs)
adm <- st_simplify(adm, dTolerance = 1500, preserveTopology = T)
bas <- st_transform(bas, crs) 
bas <- st_simplify(bas, dTolerance = 1000, preserveTopology = T)


files = list.files("../output/test-results/", pattern ="cnfrisk", full.names = T)

map_data = lapply(files, function(x){
  out = vars_detect(x)
  tmp = st_read(x, quiet = T)
  tmp = st_drop_geometry(tmp)
  tmp$unit = out$unit
  tmp$type = out$type
  tmp$var = out$var
  tmp %<>%
    select(unit,type,var,mean,obsv)
  if(out$unit == "basins") {
    st_as_sf(bind_cols(bas, tmp))
  } else {
    st_as_sf(bind_cols(adm, tmp))
  }
})

map_data = do.call(rbind, map_data)

map_data %>% 
  filter(unit == "states", var == "sb") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)

Figure 1.14: Spatial prediction of conflict class sb for adm districts.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

map_data %>% 
  filter(unit == "states", var == "ns") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)

Figure 1.15: Spatial prediction of conflict class ns for adm districts.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

map_data %>% 
  filter(unit == "states", var == "os") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)

Figure 1.16: Spatial prediction of conflict class os for adm districts.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

map_data %>% 
  filter(unit == "basins", var == "sb") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)

Figure 1.17: Spatial prediction of conflict class sb for bas districts.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

map_data %>% 
  filter(unit == "basins", var == "ns") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)

Figure 1.18: Spatial prediction of conflict class ns for bas districts.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

map_data %>% 
  filter(unit == "basins", var == "os") -> plt_data

breaks = seq(0,1,0.1)

plt_data %>% 
  filter(type == "baseline") %>%
  dplyr::select(obsv) %>%
  mutate(obsv = ifelse(obsv > 0, obsv, NA)) -> obsv_data

obsv_data %>%
  tm_shape() +
  tm_polygons("obsv", 
              palette = "PuBu",
              lwd=.01, 
              border.col = "grey90", 
              breaks = 0:12,
              style = "fixed", 
              title = "Conflict Months",
              showNA = FALSE,
              legend.hist = TRUE,
              labels = as.character(1:12)) +
  tm_legend(stack = "horizontal") + 
  tm_layout(title = "Observation",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_obsv

plt_data %>%
  filter(type == "baseline") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE) +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Conflict History (CH)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)   -> map_bas

plt_data %>%
  filter(type == "structural") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Structural Variables (SV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_str

plt_data %>%
  filter(type == "environmental") %>%
  tm_shape() +
  tm_polygons("mean", 
              palette = "PuBu", 
              lwd=.01, 
              border.col = "grey90", 
              title = "Probability of conflict", 
              breaks = breaks, 
              style = "fixed", 
              legend.hist = TRUE)  +
  tm_legend(stack = "horizontal") +
  tm_layout(title = "Environmental Variables (EV)",
            title.position = c("right", "top"),
            title.size = 0.7,
            legend.title.size = 0.7,
            legend.text.size = 0.45,
            legend.position = c("left", "bottom"),
            legend.outside.position = "bottom",
            legend.bg.color = "white",
            legend.bg.alpha = 0,
            legend.outside = TRUE,
            legend.hist.width = 1,
            legend.hist.height = .8,
            legend.hist.size = .5)  -> map_env

tmap_arrange(map_obsv, map_bas, map_str, map_env, ncol = 2)

Figure 1.19: Spatial prediction of conflict class os for bas districts.

Version	Author	Date
de8ce5a	Darius Görgen	2021-04-05

\end{tiny}

QQ-Plots and Residual Plots of Interaction Model

files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp %<>%
    filter(name == "f2", 
           month == 0)
  tmp$type = out$type
  tmp
}) 

acc_data = do.call(rbind, acc_data)
acc_data %<>%
  mutate(type = factor(type,
                       levels = c("regression", "baseline", "structural", "environmental"),
                       labels = c("LR", "CH", "SV", "EV")),
         var = factor(var,
                      levels = c("all", "sb", "ns", "os"),
                      labels = c("cb", "sb", "ns", "os")),
         unit = factor(unit, 
                       levels = c("states", "basins"),
                       labels = c("adm", "bas")))

acc_data %>%
  # filter(type != "LR") %>%
  group_by(var) %>%
  nest() %>%
  mutate(model = map(data, ~lm(score ~ unit*type, data = .x)),
         model = map(model, ~augment(.x)),
         plots_resid= map(model, 
                          ~ggplot(data = .x, aes(x=.fitted, y =.resid, shape = unit, color = type)) +
                            geom_point(alpha = .5) +
                            xlim(0.15,.7) +
                            labs(x="Fitted values", 
                                 y = "Residuals", 
                                 color = "Predictor set", 
                                 shape = "Aggregation unit", 
                                 subtitle = var) +
                            scale_color_brewer(palette = "Dark2") +
                            guides(colour = guide_legend(override.aes = list(alpha = 1)),
                                   shape = guide_legend(override.aes = list(alpha = 1))) +
                            theme_classic() +
                            theme(
                              plot.subtitle=element_markdown(face = "bold", hjust = .5),
                              plot.title = element_markdown(size = 10),
                              plot.caption = element_markdown(size = 10),
                              plot.tag = element_markdown(size = 10),
                              strip.background = element_blank(),
                              strip.text = element_markdown(size = 10),    
                              strip.text.x = element_markdown(size = 10),
                              strip.text.y = element_markdown(size = 10),
                              legend.position = "bottom",
                              legend.key.size = unit(0.4, "cm"),
                              legend.text = element_markdown(size = 9),
                              legend.title = element_markdown(size=10),
                              axis.text = element_markdown(size = 7),
                              axis.title=element_markdown(size=10)
                            )),
         plots_qq = map(model, 
                        ~ggplot(data = .x, aes(sample = score, shape = unit, color = type)) +  
                          stat_qq(alpha = .55) + 
                          ylim(0.05, 0.85) +
                          labs(x="Theoretical quantiles", 
                               y = "Sample quantiles", 
                               color = "Predictor set", 
                               shape = "Aggregation unit", 
                               subtitle = var) +
                          scale_color_brewer(palette = "Dark2") +
                          guides(colour = guide_legend(override.aes = list(alpha = 1)),
                                 shape = guide_legend(override.aes = list(alpha = 1))) +
                          theme_classic() +
                          theme(
                            plot.subtitle=element_markdown(face = "bold", hjust = .5),
                            plot.title = element_markdown(size = 10),
                            plot.caption = element_markdown(size = 10),
                            plot.tag = element_markdown(size = 10),
                            strip.background = element_blank(),
                            strip.text = element_markdown(size = 10),    
                            strip.text.x = element_markdown(size = 10),
                            strip.text.y = element_markdown(size = 10),
                            legend.position = "bottom",
                            legend.key.size = unit(0.4, "cm"),
                            legend.text = element_markdown(size = 9),
                            legend.title = element_markdown(size=10),
                            axis.text = element_markdown(size = 7),
                            axis.title=element_markdown(size=10)
                          )
         )
  ) -> lm_plots

qqplots = lm_plots$plots_qq

if(knitr::is_latex_output()){
  ggarrange(qqplots[[1]] + 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank()), 
            qqplots[[4]]+ 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank(),
                    axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()),
            qqplots[[2]], 
            qqplots[[3]] +
              theme(axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()), 
            ncol = 2, nrow = 2, common.legend = TRUE, legend="bottom")
} else {
  for (i in 1:length(qqplots)){
    var = as.character(qqplots[[i]]$labels$subtitle)
    require(plotly)
    ggplotly(qqplots[[i]] + ggtitle(var) + theme(plot.title = element_markdown(face = "bold"))) %>%
      layout(legend = list(orientation = "h", 
                           xanchor = "center",
                           x = 0.5, y = -.2)) -> plt_out
    print(plt_out)
  }
}

residplots = lm_plots$plots_resid
if(knitr::is_latex_output()){
  ggarrange(residplots[[1]] + 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank()), 
            residplots[[4]]+ 
              theme(axis.title.x = element_blank(),
                    axis.text.x = element_blank(),
                    axis.ticks.x = element_blank(),
                    axis.line.x = element_blank(),
                    axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()),
            residplots[[2]], 
            residplots[[3]] +
              theme(axis.title.y = element_blank(),
                    axis.text.y = element_blank(),
                    axis.ticks.y = element_blank(),
                    axis.line.y = element_blank()), 
            ncol = 2, nrow = 2, common.legend = TRUE, legend="bottom")
} else {
  for (i in 1:length(residplots)){
    var = as.character(residplots[[i]]$labels$subtitle)
    require(plotly)
    ggplotly(residplots[[i]] + ggtitle(var) + theme(plot.title = element_markdown(face = "bold"))) %>%
      layout(legend = list(orientation = "h", 
                           xanchor = "center",
                           x = 0.5, y = -.2)) -> plt_out
    print(plt_out)
  }
}

Descriptive Statistics of Interaction Variables with Agricultural Mask

predictors = readRDS("../data/vector/predictor_cube.rds")
vars = c("AGRET", "AGRGPP", "AGRLST", "AGRPREC", "AGRANOM", "AGRSPI1", "AGRSPI3", "AGRSPI6", "AGRSPI12", "AGRSPEI1", "AGRSPEI3", "AGRSPEI6", "AGRSPEI12")
predictors %>% 
  filter(var %in% vars) %>%
  as_tibble() %>%
  mutate(var = factor(var,
                      labels = vars,
                      levels = vars)) %>%
  dplyr::select(id, unit, var, value) %>%
  filter(!(unit == "states" & id > 847)) %>%
  group_by(unit, var) %>%
  summarise("Min." = round(min(value, na.rm = T),3),
            "1st Qu." = round(quantile(value, 0.25, na.rm = T),3),
            "Median" = round(median(value, na.rm = T),3),
            "Mean" = round(mean(value, na.rm = T),3),
            "3rd Qu." = round(quantile(value, 0.75, na.rm = T),3),
            "Max." = round(max(value, na.rm = T),3),
            "NA's" = round(sum(is.na(value)),3)) %>%
  ungroup() %>%
  mutate(unit = if_else(unit == "basins","*bas*", "*adm*")) %>%
  mutate(unit = factor(unit, levels = c("*adm*", "*bas*"),
                       labels = c("*adm*", "*bas*"))) %>%
  arrange(var, unit) %>%
  dplyr::select(-var) %>%
  rename("Spatial Unit" = unit) %>%
  thesis_kable(escape = FALSE,
               caption = "Descriptive statistics of agricultural interaction variables. (Unit of measurment: AGRET - kg/m²; AGRGPP - kg C/m²; AGRLST - K; AGRPREC - mm AGRANOM - mm; others - dimensionless)", 
               caption.short = "Descriptive statistics of agricultural interaction variables.",
               linesep = c("", "\\addlinespace"),
               longtable = T) %>%
  kable_styling(latex_options = c("HOLD_position", "repeat_header")) %>%
  group_rows("AGRET", 1, 2) %>%
  group_rows("AGRGPP", 3, 4) %>% 
  group_rows("AGRLST", 5, 6) %>%
  group_rows("AGRPREC", 7, 8) %>%
  group_rows("AGRANOM", 9, 10) %>%
  group_rows("AGRSPI1", 11, 12) %>%
  group_rows("AGRSPI3", 13, 14) %>%
  group_rows("AGRSPI6", 15, 16) %>%
  group_rows("AGRSPI12", 17, 18) %>%
  group_rows("AGRSPEI1", 19, 20) %>%
  group_rows("AGRSPEI3", 21, 22) %>%
  group_rows("AGRSPEI6", 23, 24) %>%
  group_rows("AGRSPEI12", 25, 26)

Table 1.1: Descriptive statistics of agricultural interaction variables. (Unit of measurment: AGRET - kg/m²; AGRGPP - kg C/m²; AGRLST - K; AGRPREC - mm AGRANOM - mm; others - dimensionless)
Spatial Unit	Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	NA’s
AGRET
adm	0.000	0.253	9.996	118.725	138.828	1273.573	643
bas	0.000	0.000	0.647	28.715	12.693	973.495	39358
AGRGPP
adm	0.000	0.491	20.158	232.349	272.386	6251.350	478
bas	0.000	0.000	1.337	55.510	27.987	2465.299	38957
AGRLST
adm	0.000	0.137	8.428	60.278	104.179	312.276	0
bas	0.000	0.000	0.066	18.272	7.486	279.603	0
AGRPREC
adm	0.000	0.003	0.526	17.237	13.174	429.346	0
bas	0.000	0.000	0.001	4.473	0.500	304.854	0
AGRANOM
adm	-154.115	-0.148	0.000	0.535	0.050	265.607	0
bas	-138.119	-0.001	0.000	0.082	0.000	125.800	0
AGRSPI1
adm	-3.554	-0.006	0.000	0.011	0.004	5.854	1100
bas	-3.471	0.000	0.000	0.003	0.000	4.494	2571
AGRSPI3
adm	-4.829	-0.004	0.000	0.012	0.007	5.773	2654
bas	-2.411	0.000	0.000	0.003	0.000	3.517	3331
AGRSPI6
adm	-3.935	-0.003	0.000	0.012	0.008	4.157	5180
bas	-2.596	0.000	0.000	0.003	0.000	3.035	6005
AGRSPI12
adm	-4.750	-0.003	0.000	0.012	0.008	3.555	10233
bas	-2.481	0.000	0.000	0.003	0.000	2.726	12059
AGRSPEI1
adm	-4.126	-0.005	0.000	0.007	0.006	3.306	973
bas	-2.446	0.000	0.000	0.002	0.000	2.427	2230
AGRSPEI3
adm	-2.740	-0.005	0.000	0.008	0.006	5.006	2651
bas	-1.819	0.000	0.000	0.002	0.000	2.636	3310
AGRSPEI6
adm	-2.423	-0.004	0.000	0.008	0.007	4.259	5183
bas	-1.856	0.000	0.000	0.002	0.000	2.730	6005
AGRSPEI12
adm	-3.134	-0.004	0.000	0.007	0.007	3.558	10246
bas	-2.689	0.000	0.000	0.002	0.000	2.668	12059

Table of Global Performance Metrics

#files <- list.files("../output/acc/", pattern = ".rds", full.names = T)
files <- list.files("../output/test-results/", pattern = "test-results", full.names = T)
# files <- files[-grep("test-results-regression-basins", files)]
acc_data <- lapply(files, function(x){
  out <- vars_detect(x)
  tmp = readRDS(x)
  tmp %>%
    filter(month == 0) %>%
    group_by(name) %>%
    summarise(target = mean(score, na.rm = T),
              sd = sd(score, na.rm = T),
              type = out$type,
              unit = out$unit,
              var = out$var) %>%
    ungroup()
}) 
acc_data = do.call(rbind, acc_data)

acc_data %<>% 
  mutate(var = factor(var, labels = c("**cb**", "**sb**", "**ns**", "**os**"),
                      levels = c("all", "sb", "ns", "os")),
         type = factor(type, labels = c("LR", "CH", "SV", "EV"),
                       levels = c("regression", "baseline", "structural", "environmental")),
         unit = if_else(unit == "basins", "bas", "adm"),
         unit = factor(unit, levels = c("adm", "bas"), labels = c("\\emph{adm}", "\\emph{bas}")))

acc_data %>%
  filter(name %in% c("f2", "auc", "aucpr", "sensitivity", "specificity", "precision")) %>%
  mutate(
    sd = paste0("(", as.character(round(sd,3)), ")"),
    score = paste0(round(target,3), " +/- ", sd)) %>%
  dplyr::select(type, unit, var, name, score, target) %>%
  group_by(name, var) %>%
  mutate(score = if_else(target == max(target), paste0("**", score, "**"), score )) %>%
  dplyr::select(-target) %>% 
  ungroup() %>%
  pivot_wider(names_from = name, values_from = c(score),
              id_cols = c(type, unit, var)) %>%
  arrange(type, unit, var) %>% 
  rename("F2" = f2, "AUC" = auc, "AUPR" = aucpr, "Precision" = precision, "Sensitivity" = sensitivity, "Specificity" = specificity) %>%
  rename(Theme = type, Unit = "unit", Variable = var) %>% 
  relocate(Theme, Unit, Variable, F2, AUC, AUPR, Precision, Sensitivity, Specificity) %>%
  ungroup %>%
  thesis_kable(escape = F,
               align = "ccccccccc", 
               longtable = T, 
               linesep = c("", "", "","\\addlinespace"),
               caption.short = "Global performance metrics for all model configurations.",
               caption = "Global performance metrics for all models configurations. Bold number indicate the best performance of the respective performance metric per conflict class. Standard deviation is given in brackets.") %>%
   kable_styling(latex_options = c("HOLD_position", "repeat_header"))

Table 1.2: Global performance metrics for all models configurations. Bold number indicate the best performance of the respective performance metric per conflict class. Standard deviation is given in brackets.
Theme	Variable	F2	AUC	AUPR	Precision	Sensitivity	Specificity
LR	cb	0.325 +/- (0.001)	0.671 +/- (0.001)	0.128 +/- (0.001)	0.11 +/- (0.002)	0.668 +/- (0.007)	0.598 +/- (0.007)
LR	sb	0.242 +/- (0.003)	0.703 +/- (0.003)	0.084 +/- (0.002)	0.08 +/- (0.002)	0.541 +/- (0.02)	0.751 +/- (0.014)
LR	ns	0.183 +/- (0.002)	0.739 +/- (0.002)	0.049 +/- (0.001)	0.055 +/- (0.002)	0.468 +/- (0.032)	0.842 +/- (0.019)
LR	os	0.226 +/- (0.002)	0.681 +/- (0.002)	0.072 +/- (0.001)	0.075 +/- (0.001)	0.494 +/- (0.02)	0.777 +/- (0.014)
LR	cb	0.389 +/- (0.001)	0.758 +/- (0.001)	0.168 +/- (0.001)	0.141 +/- (0.003)	0.708 +/- (0.016)	0.687 +/- (0.014)
LR	sb	0.302 +/- (0.002)	0.775 +/- (0.003)	0.121 +/- (0.002)	0.105 +/- (0.003)	0.592 +/- (0.019)	0.8 +/- (0.012)
LR	ns	0.216 +/- (0.002)	0.779 +/- (0.003)	0.069 +/- (0.002)	0.073 +/- (0.003)	0.477 +/- (0.022)	0.867 +/- (0.013)
LR	os	0.287 +/- (0.003)	0.774 +/- (0.003)	0.105 +/- (0.002)	0.1 +/- (0.002)	0.549 +/- (0.012)	0.83 +/- (0.007)
CH	cb	0.59 +/- (0.027)	0.918 +/- (0.002)	0.6 +/- (0.008)	0.283 +/- (0.046)	0.824 +/- (0.041)	0.845 +/- (0.045)
CH	sb	0.604 +/- (0.045)	0.943 +/- (0.006)	0.634 +/- (0.019)	0.297 +/- (0.069)	0.832 +/- (0.022)	0.923 +/- (0.023)
CH	ns	0.467 +/- (0.044)	0.943 +/- (0.003)	0.359 +/- (0.024)	0.178 +/- (0.043)	0.828 +/- (0.073)	0.922 +/- (0.029)
CH	os	0.486 +/- (0.022)	0.909 +/- (0.003)	0.472 +/- (0.01)	0.207 +/- (0.027)	0.743 +/- (0.031)	0.902 +/- (0.021)
CH	cb	0.664 +/- (0.014)	0.942 +/- (0.013)	0.573 +/- (0.107)	0.395 +/- (0.084)	0.822 +/- (0.073)	0.916 +/- (0.034)
CH	sb	0.607 +/- (0.087)	0.954 +/- (0.003)	0.518 +/- (0.048)	0.552 +/- (0.08)	0.635 +/- (0.125)	0.982 +/- (0.01)
CH	ns	0.375 +/- (0.107)	0.92 +/- (0.047)	0.266 +/- (0.04)	0.335 +/- (0.054)	0.404 +/- (0.151)	0.986 +/- (0.007)
CH	os	0.5 +/- (0.039)	0.925 +/- (0.006)	0.42 +/- (0.05)	0.295 +/- (0.114)	0.683 +/- (0.151)	0.937 +/- (0.048)
SV	cb	0.616 +/- (0.009)	0.92 +/- (0.007)	0.599 +/- (0.011)	0.402 +/- (0.036)	0.715 +/- (0.037)	0.923 +/- (0.015)
SV	sb	0.633 +/- (0.022)	0.947 +/- (0.006)	0.613 +/- (0.022)	0.455 +/- (0.044)	0.706 +/- (0.044)	0.968 +/- (0.008)
SV	ns	0.398 +/- (0.085)	0.933 +/- (0.006)	0.322 +/- (0.021)	0.345 +/- (0.09)	0.45 +/- (0.155)	0.98 +/- (0.014)
SV	os	0.493 +/- (0.026)	0.906 +/- (0.01)	0.426 +/- (0.016)	0.275 +/- (0.048)	0.638 +/- (0.1)	0.94 +/- (0.023)
SV	cb	0.649 +/- (0.014)	0.948 +/- (0.003)	0.625 +/- (0.011)	0.32 +/- (0.021)	0.876 +/- (0.016)	0.886 +/- (0.013)
SV	sb	0.671 +/- (0.026)	0.963 +/- (0.004)	0.617 +/- (0.028)	0.356 +/- (0.036)	0.865 +/- (0.025)	0.948 +/- (0.01)
SV	ns	0.452 +/- (0.02)	0.94 +/- (0.005)	0.275 +/- (0.026)	0.191 +/- (0.022)	0.696 +/- (0.046)	0.95 +/- (0.009)
SV	os	0.518 +/- (0.033)	0.941 +/- (0.006)	0.48 +/- (0.026)	0.241 +/- (0.041)	0.747 +/- (0.074)	0.928 +/- (0.031)
EV	cb	0.618 +/- (0.009)	0.924 +/- (0.003)	0.609 +/- (0.007)	0.401 +/- (0.044)	0.72 +/- (0.048)	0.922 +/- (0.02)
EV	sb	0.647 +/- (0.026)	0.947 +/- (0.005)	0.65 +/- (0.012)	0.394 +/- (0.072)	0.783 +/- (0.039)	0.953 +/- (0.015)
EV	ns	0.438 +/- (0.038)	0.936 +/- (0.007)	0.298 +/- (0.058)	0.275 +/- (0.091)	0.565 +/- (0.144)	0.967 +/- (0.018)
EV	os	0.508 +/- (0.012)	0.903 +/- (0.008)	0.431 +/- (0.02)	0.237 +/- (0.027)	0.719 +/- (0.042)	0.921 +/- (0.016)
EV	cb	0.689 +/- (0.006)	0.951 +/- (0.003)	0.639 +/- (0.021)	0.418 +/- (0.026)	0.824 +/- (0.017)	0.93 +/- (0.009)
EV	sb	0.67 +/- (0.021)	0.963 +/- (0.003)	0.616 +/- (0.021)	0.42 +/- (0.036)	0.791 +/- (0.038)	0.964 +/- (0.007)
EV	ns	0.487 +/- (0.016)	0.947 +/- (0.004)	0.305 +/- (0.025)	0.249 +/- (0.028)	0.647 +/- (0.051)	0.967 +/- (0.007)
EV	os	0.553 +/- (0.014)	0.944 +/- (0.003)	0.5 +/- (0.011)	0.308 +/- (0.035)	0.697 +/- (0.04)	0.954 +/- (0.01)

sessionInfo()

R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] plotly_4.9.2.1     lubridate_1.7.9.2  rgdal_1.5-18       countrycode_1.2.0 
 [5] welchADF_0.3.2     rstatix_0.6.0      ggpubr_0.4.0       scales_1.1.1      
 [9] RColorBrewer_1.1-2 latex2exp_0.4.0    cubelyr_1.0.0      gridExtra_2.3     
[13] ggtext_0.1.1       magrittr_2.0.1     tmap_3.2           sf_0.9-7          
[17] raster_3.4-5       sp_1.4-4           forcats_0.5.0      stringr_1.4.0     
[21] purrr_0.3.4        readr_1.4.0        tidyr_1.1.2        tibble_3.0.6      
[25] tidyverse_1.3.0    huwiwidown_0.0.1   kableExtra_1.3.1   knitr_1.31        
[29] rmarkdown_2.7.3    bookdown_0.21      ggplot2_3.3.3      dplyr_1.0.2       
[33] devtools_2.3.2     usethis_2.0.0     

loaded via a namespace (and not attached):
  [1] readxl_1.3.1       backports_1.2.0    workflowr_1.6.2   
  [4] lwgeom_0.2-5       lazyeval_0.2.2     splines_3.6.3     
  [7] crosstalk_1.1.0.1  leaflet_2.0.3      digest_0.6.27     
 [10] htmltools_0.5.1.1  memoise_1.1.0      openxlsx_4.2.3    
 [13] remotes_2.2.0      modelr_0.1.8       prettyunits_1.1.1 
 [16] colorspace_2.0-0   rvest_0.3.6        haven_2.3.1       
 [19] xfun_0.21          leafem_0.1.3       callr_3.5.1       
 [22] crayon_1.4.0       jsonlite_1.7.2     lme4_1.1-26       
 [25] glue_1.4.2         stars_0.4-3        gtable_0.3.0      
 [28] webshot_0.5.2      car_3.0-10         pkgbuild_1.2.0    
 [31] abind_1.4-5        DBI_1.1.0          Rcpp_1.0.5        
 [34] viridisLite_0.3.0  gridtext_0.1.4     units_0.6-7       
 [37] foreign_0.8-71     htmlwidgets_1.5.3  httr_1.4.2        
 [40] ellipsis_0.3.1     farver_2.0.3       pkgconfig_2.0.3   
 [43] XML_3.99-0.3       dbplyr_2.0.0       labeling_0.4.2    
 [46] tidyselect_1.1.0   rlang_0.4.10       later_1.1.0.1     
 [49] tmaptools_3.1      munsell_0.5.0      cellranger_1.1.0  
 [52] tools_3.6.3        cli_2.3.0          generics_0.1.0    
 [55] broom_0.7.2        evaluate_0.14      yaml_2.2.1        
 [58] processx_3.4.5     leafsync_0.1.0     fs_1.5.0          
 [61] zip_2.1.1          nlme_3.1-150       whisker_0.4       
 [64] xml2_1.3.2         compiler_3.6.3     rstudioapi_0.13   
 [67] curl_4.3           png_0.1-7          e1071_1.7-4       
 [70] testthat_3.0.1     ggsignif_0.6.0     reprex_0.3.0      
 [73] statmod_1.4.35     stringi_1.5.3      highr_0.8         
 [76] ps_1.5.0           desc_1.2.0         lattice_0.20-41   
 [79] Matrix_1.2-18      markdown_1.1       nloptr_1.2.2.2    
 [82] classInt_0.4-3     vctrs_0.3.6        pillar_1.4.7      
 [85] lifecycle_0.2.0    data.table_1.13.2  httpuv_1.5.5      
 [88] R6_2.5.0           promises_1.1.1     KernSmooth_2.23-18
 [91] rio_0.5.16         sessioninfo_1.1.1  codetools_0.2-16  
 [94] dichromat_2.0-0    boot_1.3-25        MASS_7.3-53       
 [97] assertthat_0.2.1   pkgload_1.1.0      rprojroot_2.0.2   
[100] withr_2.4.1        mgcv_1.8-33        parallel_3.6.3    
[103] hms_1.0.0          grid_3.6.3         minqa_1.2.4       
[106] class_7.3-17       carData_3.0-4      git2r_0.27.1      
[109] base64enc_0.1-3