Last updated: 2021-01-23

Checks: 7 passed, 0 failed

Knit directory: MHWflux/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(666) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
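As a minimal illustration of why seeding matters (this snippet is not part of the analysis itself), resetting the seed before a random draw reproduces that draw exactly:

```r
set.seed(666)
a <- rnorm(3) # first draw of three random normal values

set.seed(666)
b <- rnorm(3) # resetting the seed reproduces the identical draw

identical(a, b) # TRUE
```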

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 3f6b550. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/ALL_anom.Rda
    Ignored:    data/ALL_other.Rda
    Ignored:    data/ALL_ts_anom.Rda
    Ignored:    data/ERA5_down.Rda
    Ignored:    data/ERA5_down_anom.Rda
    Ignored:    data/ERA5_evp_anom.Rda
    Ignored:    data/ERA5_lhf_anom.Rda
    Ignored:    data/ERA5_lwr_anom.Rda
    Ignored:    data/ERA5_mslp_anom.Rda
    Ignored:    data/ERA5_pcp_anom.Rda
    Ignored:    data/ERA5_qnet_anom.Rda
    Ignored:    data/ERA5_shf_anom.Rda
    Ignored:    data/ERA5_swr_MLD.Rda
    Ignored:    data/ERA5_swr_anom.Rda
    Ignored:    data/ERA5_t2m_anom.Rda
    Ignored:    data/ERA5_tcc_anom.Rda
    Ignored:    data/ERA5_u_anom.Rda
    Ignored:    data/ERA5_v_anom.Rda
    Ignored:    data/GLORYS_all_anom.Rda
    Ignored:    data/OISST_all_anom.Rda
    Ignored:    data/packet.Rda
    Ignored:    data/som.Rda
    Ignored:    data/synoptic_states.Rda
    Ignored:    data/synoptic_states_other.Rda

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/polygon-prep.Rmd) and HTML (docs/polygon-prep.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
html 76b55b6 robwschlegel 2020-12-21 Build site.
html 4a00400 robwschlegel 2020-12-21 Build site.
html 65f38bf robwschlegel 2020-12-21 Build site.
html 33f4595 robwschlegel 2020-11-10 Build site.
html a438235 robwschlegel 2020-11-10 Build site.
Rmd cbc5b74 robwschlegel 2020-11-10 Re-built site.
Rmd 54144a7 robwschlegel 2020-09-28 Working on non-numeric node labels
Rmd 0447e0e robwschlegel 2020-09-28 More updates to figures and earlier region capitalisation in workflow
html 8d65758 robwschlegel 2020-09-03 Build site.
Rmd fd8772f robwschlegel 2020-08-13 Finished new GLORYS processing
Rmd c0c599d robwschlegel 2020-08-12 Combining the MHWNWA and MHWflux code bases

Introduction

This vignette contains all of the code that prepares the polygons used to define the different regions in the Northwest Atlantic. The pixels for each variable from the NOAA OISST, ERA5, and GLORYS products found within each region are then spatially averaged in the data preparation vignette. The MHW detection algorithm is run on these spatially averaged SST time series as a general representation of the MHWs in those regions, rather than on each individual pixel, which would introduce a host of technical and philosophical problems that I won’t go into here.

# Packages used in this vignette
.libPaths(c("~/R-packages", .libPaths()))
library(tidyverse) # Base suite of functions
library(R.matlab) # For dealing with MATLAB files
# library(marmap) # For bathymetry, but not currently used

Coastal region polygons

The first step in this analysis is to broadly define the coastal regions based on previous research into thermally relevant boundaries. We have chosen to use the paper by Richaud et al. (2016) to do this (https://www.sciencedirect.com/science/article/pii/S0278434316303181#f0010). Being the wonderful person that he is, Benjamin forwarded us the polygons [Richaud et al. (2016); Figure 2] from this paper. The only hiccup is that they are stored in a MATLAB file, so we must convert them to an R format. It should be noted that these areas were designed not to encompass depths deeper than 600 m, as the investigators were interested in characterising the climatologies for the shelf and upper slope regions of the northeast coast of North America. This works for our research purposes as well.

# Load the MATLAB file
NWA_polygons <- readMat("metadata/boundaries.mat")

# Remove index list items and attributes
NWA_polygons[grepl("[.]",names(NWA_polygons))] <- NULL
# attributes(NWA_polygons) <- NULL

# Function for neatly converting list items into dataframes
mat_col <- function(vec){
  df <- as.data.frame(vec)
  df$region <- substr(colnames(df)[1], 2, nchar(colnames(df)[1]))
  colnames(df)[1] <- strtrim(colnames(df)[1], 1)
  df <- df[c(2,1)]
  return(df)
}
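To make the behaviour of mat_col() concrete, here is a toy example with a hypothetical one-item list mimicking the naming convention in boundaries.mat (a single-letter variable prefix, e.g. “x”, followed by the region abbreviation; the real names come from the MATLAB file):

```r
# Hypothetical list item for illustration only
toy <- list(xgm = c(-70.1, -69.5, -68.9))
mat_col(toy)
#   region     x
# 1     gm -70.1
# 2     gm -69.5
# 3     gm -68.9
```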

# Create multiple smaller data.frames
coords_1 <- cbind(mat_col(NWA_polygons[1]), mat_col(NWA_polygons[2])[2])
coords_2 <- cbind(mat_col(NWA_polygons[3]), mat_col(NWA_polygons[4])[2])
coords_3 <- cbind(mat_col(NWA_polygons[5]), mat_col(NWA_polygons[6])[2])
coords_4 <- cbind(mat_col(NWA_polygons[7]), mat_col(NWA_polygons[8])[2])
coords_5 <- cbind(mat_col(NWA_polygons[9]), mat_col(NWA_polygons[10])[2])
coords_6 <- cbind(mat_col(NWA_polygons[11]), mat_col(NWA_polygons[12])[2])

# Combine them into one full dataframe
NWA_coords_base <- rbind(coords_1, coords_2, coords_3, coords_4, coords_5, coords_6)
colnames(NWA_coords_base) <- c("region", "lon", "lat")
NWA_coords_base$region <- toupper(NWA_coords_base$region)

With our polygons switched over from MATLAB to R we now want to visualise them to ensure that everything has gone smoothly.

# The base map
map_base <- ggplot2::fortify(maps::map(fill = TRUE, col = "grey80", plot = FALSE)) %>%
  dplyr::rename(lon = long) %>%
  mutate(group = ifelse(lon > 180, group+9999, group),
         lon = ifelse(lon > 180, lon-360, lon)) %>% 
  dplyr::select(-region, -subregion)
# saveRDS(map_base, "metadata/map_base.Rda")

# Quick map
ggplot(data = NWA_coords_base, aes(x = lon, y = lat)) +
  geom_polygon(aes(colour = region, fill = region), size = 1.5, alpha = 0.2) +
  geom_polygon(data = map_base, aes(group = group), show.legend = F) +
  coord_cartesian(xlim = c(min(NWA_coords_base$lon)-2, max(NWA_coords_base$lon)+2),
                  ylim = c(min(NWA_coords_base$lat)-2, max(NWA_coords_base$lat)+2)) +
  labs(x = NULL, y = NULL, colour = "Region", fill = "Region") +
  theme(legend.position = "bottom")

Version Author Date
a438235 robwschlegel 2020-11-10
8d65758 robwschlegel 2020-09-03

The region abbreviations are: “GM” for Gulf of Maine, “GSL” for Gulf of St. Lawrence, “LS” for Labrador Shelf, “MAB” for Mid-Atlantic Bight, “NFS” for Newfoundland Shelf, and “SS” for Scotian Shelf.

Cabot Strait

It was decided that, because we are interested in the geography of the regions and not just their temperature regimes, the Cabot Strait needed to be defined apart from the Gulf of St. Lawrence region. To do this we will simply snip the “GSL” polygon into two pieces at its narrowest point.

# Extract the gsl region only
gsl_sub <- NWA_coords_base[NWA_coords_base$region == "GSL",]

# Add a simple integer column for ease of plotting
gsl_sub$row_count <- 1:nrow(gsl_sub)

ggplot(data = gsl_sub, aes(x = lon, y = lat)) +
  geom_polygon(aes(fill = region)) +
  geom_label(aes(label = row_count)) +
  labs(x = NULL, y = NULL)


It appears from the crude figure above that we should pinch the polygon off into two separate shapes at rows 6 and 10.

# Create smaller gsl polygon
gsl_new <- NWA_coords_base[NWA_coords_base$region == "GSL",] %>% 
  slice(-c(7:9))

# Create new cbs (Cabot Strait) polygon
cbs <- NWA_coords_base[NWA_coords_base$region == "GSL",] %>% 
  slice(6:10) %>% 
  mutate(region = "CBS")

# Attach the new polygons to the original polygons
NWA_coords_cabot <- NWA_coords_base %>% 
  filter(region != "GSL") %>% 
  rbind(., gsl_new, cbs)

# Plot the new areas to ensure everything worked
ggplot(data = NWA_coords_cabot, aes(x = lon, y = lat)) +
  geom_polygon(aes(colour = region, fill = region), size = 1.5, alpha = 0.2) +
  geom_polygon(data = map_base, aes(group = group), show.legend = F) +
  coord_cartesian(xlim = c(min(NWA_coords_cabot$lon)-2, max(NWA_coords_cabot$lon)+2),
                  ylim = c(min(NWA_coords_cabot$lat)-2, max(NWA_coords_cabot$lat)+2)) +
  labs(x = NULL, y = NULL, colour = "Region", fill = "Region") +
  theme(legend.position = "bottom")
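Note that vertices 6 and 10 appear in both gsl_new and cbs, so the two new polygons share their cut edge rather than leaving a gap between them. A toy sketch of the same slicing logic (the 10-vertex polygon here is made up purely for illustration):

```r
library(dplyr)

# Toy polygon of 10 numbered vertices standing in for the GSL shape
poly <- data.frame(id = 1:10, lon = runif(10), lat = runif(10))
part_a <- slice(poly, -c(7:9)) # analogous to gsl_new: drop interior vertices
part_b <- slice(poly, 6:10)    # analogous to cbs: keep the snipped piece

# The cut vertices are retained in both pieces
intersect(part_a$id, part_b$id)
# [1]  6 10
```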


Labrador Shelf

After running through a series of self-organising map (SOM) experiments in an earlier version of this project, it was decided that the Labrador Shelf (LS) region needed to be excluded from the study. This is predominantly because including this region in the SOM study made it too difficult for the machine to make sense of the patterns it was seeing. We concluded that this was because of the strong, sometimes unrelated processes at work in the Gulf Stream vs. the Labrador Sea. Because we are primarily concerned with the Atlantic coast, we prioritised the more southern regions over the Labrador Shelf region. The code below shows what these final regions look like.

# Filter out the ls region
NWA_coords <- NWA_coords_cabot %>% 
  filter(region != "LS")

# Save these final study region coordinates
# saveRDS(NWA_coords, "metadata/NWA_coords.Rda")

# Plot the new areas to ensure everything worked
NWA_study_area <- ggplot(data = NWA_coords, aes(x = lon, y = lat)) +
  geom_polygon(aes(colour = region, fill = region), size = 1.5, alpha = 0.2) +
  geom_polygon(data = map_base, aes(group = group), show.legend = F) +
  coord_cartesian(xlim = c(min(NWA_coords$lon)-2, max(NWA_coords$lon)+2),
                  ylim = c(min(NWA_coords$lat)-2, max(NWA_coords$lat)+0.5),
                  expand = FALSE) +
  scale_x_continuous(breaks = seq(-70, -50, 10),
                     labels = c("70°W", "60°W", "50°W"),
                     position = "top") +
  scale_y_continuous(breaks = c(40, 50),
                     labels = scales::unit_format(suffix = "°N", sep = "")) +
  scale_colour_brewer(palette = "Dark2") +
  scale_fill_brewer(palette = "Dark2") +
  labs(x = NULL, y = NULL, colour = "Region", fill = "Region") +
  theme_bw() +
  theme(legend.position = c(0.6, 0.2),
        legend.background = element_rect(colour = "black"),
        legend.direction = "horizontal")
# ggsave(NWA_study_area, filename = "output/NWA_study_area.pdf", height = 5, width = 6)
# ggsave(NWA_study_area, filename = "output/NWA_study_area.png", height = 5, width = 6)

# Visualise
NWA_study_area


Study area extent

The final step in this vignette is to establish a consistent study area for this project based on our regions. We’ll simply extend the study area roughly 2 whole degrees of longitude and latitude beyond the furthest edges of the polygons, as seen in the figure above. This will encompass the broad synoptic-scale variables that may be driving MHWs in our study regions, but should not be so broad as to begin to account for teleconnections, which are currently beyond the scope of this project. Because we want to exclude as much of the Labrador Sea as possible, we will only extend the northern edge of the study area by 0.5 degrees of latitude from the northernmost point of our study regions.

# Set the max/min lon/lat values
lon_min <- round(min(NWA_coords$lon)-2)
lon_max <- round(max(NWA_coords$lon)+2)
lat_min <- round(min(NWA_coords$lat)-2)
lat_max <- round(max(NWA_coords$lat)+0.5)

# Combine and save
NWA_corners <- c(lon_min, lon_max, lat_min, lat_max)
# saveRDS(NWA_corners, file = "metadata/NWA_corners.Rda")

# Print the values
NWA_corners
[1] -80 -41  32  52
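These corners are stored in the order c(lon_min, lon_max, lat_min, lat_max). As a hedged sketch of how they might be used downstream (the pixel data frame here is made up; the real subsetting happens in the data preparation vignette):

```r
library(dplyr)

# The study area corners, as printed above
NWA_corners <- c(-80, -41, 32, 52)

# Hypothetical pixel coordinates for illustration
pixels <- data.frame(lon = c(-75, -30, -60), lat = c(40, 45, 60))

# Keep only the pixels that fall within the study area
pixels_crop <- filter(pixels,
                      between(lon, NWA_corners[1], NWA_corners[2]),
                      between(lat, NWA_corners[3], NWA_corners[4]))
pixels_crop
#   lon lat
# 1 -75  40
```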

Up next is the creation of the SST time series for each of the regions and the calculation of the marine heatwaves (MHWs). This work is continued in the data preparation vignette.

References

Richaud, B., Kwon, Y.-O., Joyce, T. M., Fratantoni, P. S., and Lentz, S. J. (2016). Surface and bottom temperature and salinity climatology along the continental shelf off the Canadian and U.S. East Coasts. Continental Shelf Research 124, 165–181.


sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.1 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] R.matlab_3.6.2  forcats_0.5.0   stringr_1.4.0   dplyr_1.0.2    
 [5] purrr_0.3.4     readr_1.4.0     tidyr_1.1.2     tibble_3.0.4   
 [9] ggplot2_3.3.3   tidyverse_1.3.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5         lubridate_1.7.9.2  assertthat_0.2.1   rprojroot_2.0.2   
 [5] digest_0.6.27      R6_2.5.0           cellranger_1.1.0   backports_1.2.1   
 [9] reprex_0.3.0       evaluate_0.14      httr_1.4.2         pillar_1.4.7      
[13] rlang_0.4.10       readxl_1.3.1       rstudioapi_0.13    whisker_0.4       
[17] R.utils_2.10.1     R.oo_1.24.0        rmarkdown_2.6      labeling_0.4.2    
[21] munsell_0.5.0      broom_0.7.3        compiler_4.0.3     httpuv_1.5.5      
[25] modelr_0.1.8       xfun_0.20          pkgconfig_2.0.3    htmltools_0.5.1   
[29] tidyselect_1.1.0   workflowr_1.6.2    fansi_0.4.1        crayon_1.3.4      
[33] dbplyr_2.0.0       withr_2.3.0        later_1.1.0.1      R.methodsS3_1.8.1 
[37] grid_4.0.3         jsonlite_1.7.2     gtable_0.3.0       lifecycle_0.2.0   
[41] DBI_1.1.1          git2r_0.28.0       magrittr_2.0.1     scales_1.1.1      
[45] cli_2.2.0          stringi_1.5.3      farver_2.0.3       fs_1.5.0          
[49] promises_1.1.1     xml2_1.3.2         ellipsis_0.3.1     generics_0.1.0    
[53] vctrs_0.3.6        RColorBrewer_1.1-2 tools_4.0.3        glue_1.4.2        
[57] maps_3.3.0         hms_1.0.0          yaml_2.2.1         colorspace_2.0-0  
[61] rvest_0.3.6        knitr_1.30         haven_2.3.1