Preparing the data

Last updated: 2020-07-21

Checks: 7 0

Knit directory: MHWflux/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

Seed: set.seed(20200117)

The command set.seed(20200117) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: 49fd753

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 49fd753. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/

Unstaged changes:
    Modified:   code/functions.R
    Modified:   data/ALL_cor.Rda
    Modified:   data/events_cor_region.Rds
    Modified:   data/events_cor_season.Rds
    Modified:   shiny/ALL_cor.Rda

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/data-prep.Rmd) and HTML (docs/data-prep.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	49fd753	robwschlegel	2020-07-21	Re-built site.
html	cf24288	robwschlegel	2020-07-17	Build site.
html	ab06b94	robwschlegel	2020-07-14	Build site.
Rmd	8a8180f	robwschlegel	2020-07-14	Performed 12 hour nudge on Wx terns. Completed RMSE calculations, comparisons, and integration into shiny app.
Rmd	c5c1b35	robwschlegel	2020-07-14	Working on RMSE code
Rmd	f4e6cf5	robwschlegel	2020-07-13	Changed Qx units from seconds to days
Rmd	dfeee7a	robwschlegel	2020-07-08	Some notes from today’s meeting
Rmd	8221586	robwschlegel	2020-07-08	Updates to figures 2 and 3
Rmd	40247a1	robwschlegel	2020-06-12	Shifted the SST from GLORYS to NOAA
html	97d0296	robwschlegel	2020-06-02	Build site.
html	0634d98	robwschlegel	2020-06-02	Build site.
html	c6087d9	robwschlegel	2020-06-02	Build site.
Rmd	c839511	robwschlegel	2020-06-02	Working back over some old thoughts
Rmd	9e749bc	robwschlegel	2020-05-28	First pass at connecting the SOM results to the correlations
Rmd	09ce925	robwschlegel	2020-05-20	Some work on comparing the OISST and GLORYS MHWs. They are somewhat different…
Rmd	e837288	robwschlegel	2020-05-19	A bunch of updates to the shiny app
html	12b4f67	robwschlegel	2020-04-29	Build site.
Rmd	bc4ee87	robwschlegel	2020-04-28	Added more functionality to app. Added cloud coverage, speds, and precip-evap.
Rmd	29eb557	robwschlegel	2020-04-27	Much progress on shiny app
html	7c04311	robwschlegel	2020-04-22	Build site.
html	99eda29	robwschlegel	2020-04-16	Build site.
Rmd	e4b9586	robwschlegel	2020-04-16	Re-built site.
Rmd	f963741	robwschlegel	2020-04-15	Some text edits and published the shiny app
Rmd	d22d6a7	robwschlegel	2020-04-14	Text edits
Rmd	7c19a6f	robwschlegel	2020-02-28	Notes from meeting with Ke.
Rmd	b10501e	robwschlegel	2020-02-27	Working on correlation code
html	50eb5a5	robwschlegel	2020-02-26	Build site.
Rmd	891e53a	robwschlegel	2020-02-26	Published site for first time.
Rmd	1be0a1e	robwschlegel	2020-02-26	Completed the data prep for the project
Rmd	bcd165b	robwschlegel	2020-02-26	Writing
Rmd	29883d6	robwschlegel	2020-02-26	Calculated the MHWs from GLORYS data. Am now wrestling with the pipeline for ERA5 loading.
Rmd	c4343c0	robwschlegel	2020-02-26	Pushing quite a few changes
Rmd	80324fe	robwschlegel	2020-02-25	Adding the foundational content to the site

Introduction

Much of the code in this vignette is taken entirely or partially from the study area prep, the MHW prep, and the gridded data prep vignettes from the drivers of MHWs in the NW Atlantic project. Because this process has already been established we are going to put it all together in this one vignette in a more streamlined manner.

All of the libraries and functions used in this vignette, and the project more broadly may be found here.

# get everything up and running in one go
source("code/functions.R")
library(ggpubr)
library(gridExtra)
  # NB: This package was removed from CRAN :(
  # It may be downloaded manually at: https://cran.r-project.org/src/contrib/Archive/SDMTools/
# library(SDMTools) # For finding points within polygons

Study area

A reminder of what the study area looks like. It has been cut into 6 regions, adapted from work by Richaud et al. (2016).

frame_base +
  geom_polygon(data = NWA_coords, alpha = 0.7, size = 2,
               aes(fill = region, colour = region)) +
  geom_polygon(data = map_base, aes(group = group))

Version	Author	Date
50eb5a5	robwschlegel	2020-02-26

Pixels per region

In this study it was decided to use the higher resolution 1/12th degree GLORYS data. This means we will need to re-calculate which pixels fall within which region so we can later determine how to create our average SST time series per region as well as the other averaged heat flux term time series.

# Load one GLORYS file to extract the lon/lat coords
GLORYS_files <- dir("../data/GLORYS", full.names = T, pattern = "MHWflux")
GLORYS_grid <- tidync(GLORYS_files[1]) %>% 
  hyper_tibble() %>% 
  dplyr::rename(lon = longitude, lat = latitude) %>% 
  dplyr::select(lon, lat) %>% 
  unique()

# Load one ERA5 file to get the lon/lat coords
ERA5_files <- dir("../../oliver/data/ERA/ERA5/LWR", full.names = T, pattern = "ERA5")
ERA5_grid <- tidync(ERA5_files[1]) %>% 
  hyper_filter(latitude = dplyr::between(latitude, min(NWA_coords$lat), max(NWA_coords$lat)),
               longitude = dplyr::between(longitude, min(NWA_coords$lon)+360, max(NWA_coords$lon)+360),
               time = index == 1) %>%
  hyper_tibble() %>% 
  dplyr::rename(lon = longitude, lat = latitude) %>% 
  dplyr::select(lon, lat) %>% 
  unique() %>% 
  mutate(lon = lon-360)

# Function for finding and cleaning up points within a given region polygon
pnts_in_region <- function(region_in, product_grid){
  region_sub <- NWA_coords %>% 
    filter(region == region_in)
  coords_in <- pnt.in.poly(pnts = product_grid[,c("lon", "lat")], poly.pnts = region_sub[,c("lon", "lat")]) %>% 
    filter(pip == 1) %>% 
    dplyr::select(-pip) %>% 
    mutate(region = region_in)
  return(coords_in)
}

# Run the function
GLORYS_regions <- plyr::ldply(unique(NWA_coords$region), pnts_in_region, 
                              .parallel = T, product_grid = GLORYS_grid)
saveRDS(GLORYS_regions, "data/GLORYS_regions.Rda")
ERA5_regions <- plyr::ldply(unique(NWA_coords$region), pnts_in_region, 
                            .parallel = T, product_grid = ERA5_grid)
saveRDS(ERA5_regions, "data/ERA5_regions.Rda")

GLORYS_regions <- readRDS("data/GLORYS_regions.Rda")
ERA5_regions <- readRDS("data/ERA5_regions.Rda")

# Combine for visual
both_regions <- rbind(GLORYS_regions, ERA5_regions) %>% 
  mutate(product = c(rep("GLORYS", nrow(GLORYS_regions)),
                     rep("ERA5", nrow(ERA5_regions))))

# Visualise to ensure success
ggplot(NWA_coords, aes(x = lon, y = lat)) +
  # geom_polygon(aes(fill = region), alpha = 0.2) +
  geom_point(data = both_regions, aes(colour = region)) +
  geom_polygon(data = map_base, aes(group = group), show.legend = F) +
    coord_cartesian(xlim = NWA_corners[1:2],
                  ylim = NWA_corners[3:4]) +
  labs(x = NULL, y = NULL) +
  facet_wrap(~product)

Version	Author	Date
c6087d9	robwschlegel	2020-06-02
50eb5a5	robwschlegel	2020-02-26

Average time series per region

With our pixels per region sorted we may now go about creating the average time series for each region from the GLORYS and ERA5 data. First we will load a brick of the data constrained roughly to the study area into memory before assigning the correct pixels to their regions. Once the pixels are assigned we will summarise them into one mean time series per variable per region. These mean time series are what the rest of the analyses will depend on.

The code for loading and processing the GLORYS data.

# Set number of cores
  # NB: This is very RAM heavy, be carfeul with core use
doParallel::registerDoParallel(cores = 25)

# The GLORYS file location
GLORYS_files <- dir("../data/GLORYS", full.names = T, pattern = "MHWflux")
system.time(
GLORYS_all_ts <- load_all_GLORYS_region(GLORYS_files) %>% 
  dplyr::arrange(region, t) %>% 
  mutate(cur_spd = round(sqrt(u^2 + v^2), 2),
         cur_dir = round((270-(atan2(v, u)*(180/pi)))%%360))
) # 187 seconds on 25 cores
saveRDS(GLORYS_all_ts, "data/GLORYS_all_ts.Rda")

The code for the ERA5 data. NB: The ERA5 data are on an hourly 0.25 degree spatiotemporal grid. This loading process constrains them to a daily 0.25 degree grid.

# See the code/workflow script for the code used for ERA5 data prep
# There is too much code to run from an RMarkdown document

MHWs per region

We will be using the SST values from GLORYS for calculating the MHWs and will use the standard Hobday definition with a base period of 1993-01-01 to 2018-12-25. We are using an uneven length year as the data do not quite extend to the end of December. It was decided that the increased accuracy of the climatology from the 2018 year outweighed the negative consideration of having a clim period that excludes a few days of winter.

# Load the data
GLORYS_all_ts <- readRDS("data/GLORYS_all_ts.Rda")

# Calculate the MHWs
GLORYS_region_MHW <- GLORYS_all_ts %>%
  dplyr::select(region:temp) %>% 
  group_by(region) %>%
  nest() %>%
  mutate(clims = map(data, ts2clm,
                     climatologyPeriod = c("1993-01-01", "2018-12-25")),
         events = map(clims, detect_event),
         cats = map(events, category, S = FALSE)) %>%
  select(-data, -clims)

# Save
saveRDS(GLORYS_region_MHW, "data/GLORYS_region_MHW.Rda")
saveRDS(GLORYS_region_MHW, "shiny/GLORYS_region_MHW.Rda")

Ke pointed out however that it may be better to use the NOAA OISST data. The reasoning being that because we are not fully closing the heat budget with GLORYS, there is no particular benefit to using the SST data from that modelled ensemble product. Rather it would be better to use the remotely observed NOAA OISST product as this is a more direct measure of the surface temperature of the ocean. Then again, there is a lot of benefit to just using two products instead of three. Particularly considering that all of the marine variables used here come from the GLORYS product. To that end the GLORYS and OISST MHWs must be compared to see if they are markedly different. If not, we will use the GLORYS SST data.

# Load the MHW calculations from the NOAA OISST data
OISST_region_MHW <- readRDS("../MHWNWA/data/OISST_region_MHW.Rda")

# Load the GLORYS MHW data
GLORYS_region_MHW <- readRDS("data/GLORYS_region_MHW.Rda")

# Extract the time series
OISST_MHW_clim <- OISST_region_MHW %>%
  select(-cats) %>%
  unnest(events) %>%
  filter(row_number() %% 2 == 1) %>%
  unnest(events) %>% 
  mutate(product = "OISST")
GLORYS_MHW_clim <- GLORYS_region_MHW %>%
  select(-cats) %>%
  unnest(events) %>%
  filter(row_number() %% 2 == 1) %>%
  unnest(events) %>% 
  mutate(product = "GLORYS")
MHW_clim <- rbind(OISST_MHW_clim, GLORYS_MHW_clim) %>% 
  mutate(anom = temp-seas)

# Extract the events
OISST_MHW_event <- OISST_region_MHW %>%
  select(-cats) %>%
  unnest(events) %>%
  filter(row_number() %% 2 == 0) %>%
  unnest(events) %>% 
  mutate(product = "OISST")
GLORYS_MHW_event <- GLORYS_region_MHW %>%
  select(-cats) %>%
  unnest(events) %>%
  filter(row_number() %% 2 == 0) %>%
  unnest(events) %>% 
  mutate(product = "GLORYS")
MHW_event <- rbind(OISST_MHW_event, GLORYS_MHW_event) %>% 
  mutate(month_peak = lubridate::month(date_peak, label = T),
         season = case_when(month_peak %in% c("Jan", "Feb", "Mar") ~ "Winter",
                            month_peak %in% c("Apr", "May", "Jun") ~ "Spring",
                            month_peak %in% c("Jul", "Aug", "Sep") ~ "Summer",
                            month_peak %in% c("Oct", "Nov", "Dec") ~ "Autumn"),
         season = factor(season, levels = c("Spring", "Summer", "Autumn", "Winter"))) %>%
  select(-month_peak)

# Compare time series
MHW_clim_wide <- MHW_clim %>% 
  dplyr::select(product, region, doy, t, anom) %>% 
  pivot_wider(names_from = product, values_from = anom) %>% 
  mutate(t_diff = OISST-GLORYS)
MHW_clim_wide_monthly <- MHW_clim_wide %>% 
  mutate(t = round_date(t, unit = "month")) %>%
  group_by(region, t) %>% 
  summarise(t_diff = mean(t_diff, na.rm = T))

# Plot regional anomaly comparison
ts_comp <- ggplot(data = MHW_clim_wide, aes(x = t, y = t_diff)) +
  geom_line(aes(colour = region), alpha = 0.5, show.legend = F) +
  geom_line(data = MHW_clim_wide_monthly, show.legend = F,
            aes(colour = region)) +
  geom_smooth(method = "lm", show.legend = F) +
  facet_wrap(~region) +
  labs(x = "Date", y = "OISST anom. - GLORYS anom.", 
       title = "Daily anomaly comparisons", 
       subtitle = paste0("Faint line shows daily differences, bold line shows monthly.",
                         "\nStraight blue line shows linear trend in daily differences."))

# Plot the comparison of the seasonal and threshold signals
seas_thresh_comp <- MHW_clim %>% 
  dplyr::select(product, region, doy, seas, thresh) %>%
  unique() %>% 
  pivot_wider(names_from = product, values_from = c(seas, thresh)) %>% 
  mutate(seas_diff = seas_OISST-seas_GLORYS,
         thresh_diff = thresh_OISST-thresh_GLORYS) %>% 
  ggplot(aes(x = doy)) +
  geom_line(aes(y = seas_diff, colour = region), linetype = "solid", show.legend = F) +
  geom_line(aes(y = thresh_diff, colour = region), linetype = "dashed", show.legend = F) +
  facet_wrap(~region) +
  labs(x = "Day-of-year (doy)", y = "OISST clim. - GLORYS clim.",
       title = "Difference per day-of-year (doy)",
       subtitle = paste0("Solid line shows seasonal climatology,",
                         "\ndashed line shows threshold."))

# Plot average doy difference histogram
doy_comp <- MHW_clim_wide %>%
  group_by(region, doy) %>% 
  summarise(doy_diff = mean(t_diff)) %>% 
  ggplot(aes(x = doy_diff)) +
  geom_histogram(aes(fill = region), bins = 20, show.legend = F) +
  facet_wrap(~region) +
  labs(x = "Mean difference (OISST - GLORYS) per doy",
       title = "Distribution of mean differences per doy")

# Combine
OISST_GLORYS_ts_comp <- ggarrange(ts_comp,
                                  ggarrange(seas_thresh_comp, doy_comp, ncol = 2, nrow = 1, align = "hv", labels = c("B", "C")),
                                  nrow = 2, labels = "A", align = "hv")
OISST_GLORYS_ts_comp

Version	Author	Date
c6087d9	robwschlegel	2020-06-02

# ggsave(plot = OISST_GLORYS_ts_comp, filename = "output/OISST_GLORYS_ts_comp.png", height = 8, width = 10)

# Compare MHW results
MHW_event_comp <- MHW_event %>% 
  group_by(product, region) %>% 
  summarise(event_count = n(),
            dur = mean(duration),
            int_mean = mean(intensity_mean),
            int_cum_mean = mean(intensity_cumulative),
            int_max = max(intensity_max),
            onset = mean(rate_onset),
            decline = mean(rate_decline)) %>% 
  ungroup() %>%
  arrange(region, product) %>% 
  mutate_if(is.numeric, round, 2) #%>% 
  # pivot_wider(names_from = product, values_from = c(event_count:decline))
  # tableGrob(rows = NULL)

event_count_table <- MHW_event_comp %>% 
  dplyr::select(product:event_count) %>% 
  pivot_wider(names_from = region, values_from = event_count) %>% 
  tableGrob(rows = NULL)

# Boxplot of key variables
box_comp <- MHW_event %>% 
  dplyr::select(product, region, duration, intensity_mean,
                intensity_cumulative, intensity_max, rate_onset, rate_decline) %>% 
  pivot_longer(cols = duration:rate_decline) %>% 
  ggplot(aes(x = region, y = value, fill = region)) +
  geom_boxplot(aes(colour = product), notch = TRUE) +
  scale_colour_manual(values = c("black", "red")) +
  facet_wrap(~name, scales = "free_y") +
  labs(fill = "Region", colour = "Product", x = NULL, y = "Value for given facet",
       title = "Boxplots showing range of values for MHWs in each region")
# box_comp

OISST_GLORYS_MHW_comp <- ggarrange(box_comp, event_count_table, ncol = 1, nrow = 2, 
                                   heights = c(8, 1), labels = c("A", "B"), align = "hv")
OISST_GLORYS_MHW_comp

Version	Author	Date
c6087d9	robwschlegel	2020-06-02

# ggsave(plot = OISST_GLORYS_MHW_comp, filename = "output/OISST_GLORYS_MHW_comp.png", height = 8, width = 10)

# Seasons within regions
MHW_event_season_comp <- MHW_event %>% 
  group_by(product, region, season) %>% 
  summarise(event_count = n(),
            dur = mean(duration),
            int_mean = mean(intensity_mean),
            int_cum_mean = mean(intensity_cumulative),
            int_max = max(intensity_max),
            onset = mean(rate_onset),
            decline = mean(rate_decline)) %>% 
  ungroup() %>%
  arrange(region, season, product) %>% 
  mutate_if(is.numeric, round, 2)
knitr::kable(MHW_event_season_comp)

product	region	season	event_count	dur	int_mean	int_cum_mean	int_max	onset	decline
GLORYS	cbs	Spring	10	17.00	1.64	31.58	3.33	0.15	0.17
OISST	cbs	Spring	7	19.86	1.82	40.54	4.58	0.14	0.23
GLORYS	cbs	Summer	16	13.81	2.02	29.01	3.55	0.23	0.25
OISST	cbs	Summer	19	11.53	2.09	25.76	3.88	0.26	0.28
GLORYS	cbs	Autumn	11	18.64	1.59	30.09	2.41	0.09	0.10
OISST	cbs	Autumn	15	14.33	1.76	25.65	3.16	0.15	0.14
GLORYS	cbs	Winter	7	16.14	1.07	17.68	1.60	0.04	0.04
OISST	cbs	Winter	15	9.20	1.21	11.08	1.69	0.12	0.11
GLORYS	gm	Spring	9	12.22	1.54	21.41	3.61	0.17	0.16
OISST	gm	Spring	6	17.67	1.88	36.19	3.55	0.20	0.11
GLORYS	gm	Summer	15	17.33	1.99	36.49	2.88	0.15	0.25
OISST	gm	Summer	16	12.06	2.09	25.76	3.39	0.21	0.25
GLORYS	gm	Autumn	12	21.92	1.74	38.70	2.44	0.06	0.10
OISST	gm	Autumn	13	16.62	1.69	28.52	2.57	0.13	0.16
GLORYS	gm	Winter	7	20.14	1.41	29.58	2.30	0.04	0.06
OISST	gm	Winter	10	19.20	1.49	30.10	2.96	0.11	0.08
GLORYS	gsl	Spring	10	18.30	1.89	30.65	3.10	0.20	0.22
OISST	gsl	Spring	13	11.23	1.93	20.62	3.90	0.21	0.32
GLORYS	gsl	Summer	18	14.67	1.78	27.19	3.11	0.16	0.23
OISST	gsl	Summer	19	11.95	1.97	24.25	3.57	0.20	0.32
GLORYS	gsl	Autumn	9	22.22	1.47	33.79	2.30	0.08	0.10
OISST	gsl	Autumn	8	15.88	1.53	25.62	2.65	0.18	0.12
GLORYS	gsl	Winter	3	6.33	0.63	3.85	0.90	0.05	0.04
OISST	gsl	Winter	8	15.38	0.61	16.19	2.45	0.12	0.07
GLORYS	mab	Spring	11	14.73	1.91	28.88	3.43	0.20	0.19
OISST	mab	Spring	11	9.09	2.17	20.40	3.84	0.24	0.25
GLORYS	mab	Summer	19	13.00	1.40	18.85	2.41	0.12	0.15
OISST	mab	Summer	20	11.25	1.65	19.26	3.36	0.17	0.18
GLORYS	mab	Autumn	10	19.00	1.63	35.26	3.28	0.08	0.10
OISST	mab	Autumn	12	19.00	1.63	32.97	3.30	0.07	0.16
GLORYS	mab	Winter	12	17.08	1.81	33.39	3.96	0.10	0.14
OISST	mab	Winter	12	19.83	1.74	39.08	4.14	0.12	0.09
GLORYS	nfs	Spring	10	17.00	1.75	30.26	3.16	0.17	0.15
OISST	nfs	Spring	11	12.27	1.82	22.97	3.32	0.20	0.16
GLORYS	nfs	Summer	14	19.86	1.96	40.57	3.41	0.15	0.16
OISST	nfs	Summer	14	17.50	2.03	37.19	3.63	0.20	0.26
GLORYS	nfs	Autumn	7	20.71	1.52	33.26	2.27	0.09	0.09
OISST	nfs	Autumn	10	15.80	1.65	27.17	2.73	0.13	0.12
GLORYS	nfs	Winter	8	30.25	1.03	36.61	2.00	0.04	0.03
OISST	nfs	Winter	9	17.11	1.04	21.91	2.05	0.05	0.10
GLORYS	ss	Spring	7	21.43	1.72	42.00	3.46	0.14	0.29
OISST	ss	Spring	9	12.56	1.91	26.98	3.39	0.21	0.22
GLORYS	ss	Summer	17	13.88	1.88	27.68	3.46	0.15	0.19
OISST	ss	Summer	15	14.87	2.12	31.97	3.15	0.22	0.24
GLORYS	ss	Autumn	8	26.12	1.87	52.53	2.75	0.08	0.07
OISST	ss	Autumn	10	21.50	2.07	47.17	3.37	0.18	0.12
GLORYS	ss	Winter	8	23.50	1.62	40.92	3.05	0.10	0.07
OISST	ss	Winter	7	25.43	1.66	47.73	3.32	0.15	0.09

# Compare top 3 events per region
MHW_event_top <- MHW_event %>% 
  dplyr::select(product, everything()) %>% 
  group_by(product, region) %>% 
  dplyr::top_n(3, intensity_cumulative) %>% 
  ungroup() %>% 
  arrange(region, product) %>% 
    dplyr::select(product, region, event_no, date_start, date_peak, date_end, duration, 
                  intensity_mean, intensity_cumulative, intensity_max, rate_onset, rate_decline)
knitr::kable(MHW_event_top)

product	region	event_no	date_start	date_peak	date_end	duration	intensity_mean	intensity_cumulative	intensity_max	rate_onset	rate_decline
GLORYS	cbs	8	1999-05-18	1999-05-24	1999-07-11	55	2.3509	129.2968	3.3311	0.3042	0.0358
GLORYS	cbs	22	2010-12-14	2010-12-28	2011-02-13	62	1.4210	88.1017	1.7336	0.0387	0.0165
GLORYS	cbs	27	2012-04-15	2012-05-25	2012-06-05	52	1.7086	88.8481	2.2792	0.0309	0.0667
OISST	cbs	5	1999-05-16	1999-06-14	1999-07-04	50	2.4918	124.5916	3.0999	0.0593	0.0618
OISST	cbs	16	2010-12-16	2010-12-22	2011-02-08	55	1.4283	78.5568	1.9537	0.1036	0.0213
OISST	cbs	25	2012-08-04	2012-08-28	2012-08-31	28	2.5828	72.3182	3.2647	0.0642	0.4498
GLORYS	gm	16	2012-04-11	2012-05-28	2012-06-04	55	1.9869	109.2778	3.6064	0.0511	0.3032
GLORYS	gm	32	2015-12-07	2015-12-27	2016-02-12	68	1.6870	114.7178	2.0200	0.0262	0.0163
GLORYS	gm	42	2018-08-05	2018-08-17	2018-09-23	50	2.2495	112.4762	2.8637	0.1101	0.0303
OISST	gm	13	2012-02-17	2012-03-22	2012-05-04	78	1.7031	132.8428	2.9630	0.0498	0.0380
OISST	gm	15	2012-06-19	2012-06-22	2012-07-23	35	2.0962	73.3654	2.9183	0.3863	0.0450
OISST	gm	32	2015-12-12	2015-12-27	2016-01-18	38	1.7902	68.0294	2.3256	0.0697	0.0449
GLORYS	gsl	6	2006-05-02	2006-05-16	2006-06-08	38	2.0900	79.4199	3.0976	0.1162	0.0598
GLORYS	gsl	15	2010-02-19	2010-04-26	2010-05-06	77	1.2746	98.1421	2.3029	0.0274	0.0765
GLORYS	gsl	18	2010-12-03	2010-12-24	2011-03-08	96	1.4750	141.6043	2.3004	0.0571	0.0249
OISST	gsl	20	2010-12-03	2011-01-03	2011-02-09	69	1.4929	103.0067	2.4543	0.0398	0.0575
OISST	gsl	28	2012-07-28	2012-08-27	2012-08-31	35	2.3861	83.5149	3.3690	0.0658	0.4422
OISST	gsl	43	2016-11-07	2016-11-20	2016-12-11	35	1.9243	67.3499	2.6531	0.1013	0.0663
GLORYS	mab	23	2012-02-18	2012-03-24	2012-04-11	54	2.2572	121.8880	3.9615	0.0718	0.1235
GLORYS	mab	24	2012-04-16	2012-04-21	2012-06-04	50	2.0551	102.7560	2.7504	0.1934	0.0296
GLORYS	mab	38	2015-11-03	2015-12-31	2016-01-19	78	2.1563	168.1888	3.2791	0.0351	0.0867
OISST	mab	23	2012-02-17	2012-03-22	2012-06-03	108	2.1392	231.0288	4.1351	0.0816	0.0307
OISST	mab	39	2015-12-10	2015-12-30	2016-01-16	38	2.2643	86.0446	3.3012	0.0967	0.0964
OISST	mab	55	2018-08-29	2018-10-10	2018-10-19	52	1.7723	92.1591	2.6505	0.0316	0.1413
GLORYS	nfs	18	2006-04-16	2006-05-09	2006-06-16	62	1.7691	109.6832	3.0425	0.0906	0.0393
GLORYS	nfs	26	2010-12-15	2011-01-10	2011-03-25	101	1.4222	143.6379	1.9973	0.0353	0.0157
GLORYS	nfs	33	2012-07-30	2012-08-12	2012-11-19	113	2.0479	231.4079	3.0709	0.0941	0.0183
OISST	nfs	22	2010-12-05	2011-01-09	2011-02-13	71	1.6190	114.9496	2.0536	0.0227	0.0388
OISST	nfs	34	2012-07-26	2012-08-08	2012-10-21	88	2.2499	197.9868	3.3667	0.1112	0.0296
OISST	nfs	40	2014-07-27	2014-07-31	2014-08-28	33	2.1731	71.7124	3.6304	0.4104	0.0912
GLORYS	ss	13	2012-04-15	2012-05-28	2012-06-26	73	1.9321	141.0404	3.3902	0.0503	0.0700
GLORYS	ss	36	2017-10-25	2017-11-07	2017-12-21	58	2.2008	127.6466	2.7477	0.0797	0.0271
GLORYS	ss	37	2018-02-19	2018-03-08	2018-04-20	61	2.0423	124.5817	3.0536	0.1007	0.0439
OISST	ss	4	1999-05-29	1999-06-15	1999-07-06	39	2.3591	92.0067	3.3359	0.0981	0.0760
OISST	ss	38	2017-10-25	2017-11-04	2017-12-23	60	2.4411	146.4679	3.3673	0.1512	0.0367
OISST	ss	39	2018-02-11	2018-03-29	2018-05-22	101	2.0386	205.9005	3.3236	0.0417	0.0313

From the figures and tables output from this comparison analysis we may see that there are some larger differences than were expected. Most importantly perhaps is the the MHWs in the OISST data are more numerous, intense, and shorter in duration. It appears that the GLORYS data assimilation methodology smooths the data more than what we see in the remotely sensed SST. In the peer-reviewed write-up this difference between OISST and GLORYS smoothness will need to be discussed. I think it still best to use the GLORYS data as the SST should match more closely to the flux terms considering they are also likely smoothed more than a different more direct sensing would report. There are arguments for and against the use of the SST from GLORYS or NOAA but looking at the SOM results with the different SST products it appears that the OISST allows for more meaningful nodes. So I’m now swapping out the GLORYS SST for NOAA and will see if the correlation results still hold up. (They did)

MHW results

As this is not really the focus of the paper, I will quickly go over the MHW results here. Specifically we are looking at the summary stats for MHWs per region and season, and how they may differ between those classifying groups. The stats created here are used in the results section of the manuscript and some of the musings will be used for the discussion.

MHW_annual_count <- OISST_MHW_event %>%
  mutate(year = year(date_peak)) %>% 
  group_by(year) %>% 
  summarise(count = n())

Clims + anoms per variable

The analyses to come are going to be performed on anomaly values, not the original time series. In order to calculate the anomalies we are first going to need the climatologies for each variable. We will use the Hobday definition of climatology creation and then subtract the expected climatology from the observed values. We are again using the 1993-01-01 to 2018-12-25 base period for these calculations to ensure consistency throughout the project.

# Load the data
GLORYS_all_ts <- readRDS("data/GLORYS_all_ts.Rda")
ERA5_all_ts <- readRDS("data/ERA5_all_ts.Rda")
OISST_all_ts <- readRDS("../MHWNWA/data/OISST_region.Rda")
ALL_ts <- left_join(ERA5_all_ts, GLORYS_all_ts, by = c("region", "t")) %>% 
  left_join(OISST_all_ts, by = c("region", "t"))

# Calculate GLORYS clims and anoms
  # Also give better names to the variables
  # Also convert the Qx terms from seconds to days by multiplying by 86,400
ALL_anom <- ALL_ts %>%
  dplyr::rename(lwr = msnlwrf, swr = msnswrf, lhf = mslhf, 
                shf = msshf, mslp = msl, sst = temp.y) %>% 
  dplyr::select(-wind_dir, -cur_dir, -temp.x) %>% 
  mutate(qnet_mld = (qnet*86400)/(mld*1024*4000),
         lwr_mld = (lwr*86400)/(mld*1024*4000),
         swr_mld = (swr*86400)/(mld*1024*4000),
         lhf_mld = (lhf*86400)/(mld*1024*4000),
         shf_mld = (shf*86400)/(mld*1024*4000),
         mld_1 = 1/mld) %>% 
  pivot_longer(cols = c(-region, -t), names_to = "var", values_to = "val") %>% 
  group_by(region, var) %>%
  nest() %>%
  mutate(clims = map(data, ts2clm, y = val, roundClm = 10,
                     climatologyPeriod = c("1993-01-01", "2018-12-25"))) %>% 
  dplyr::select(-data) %>% 
  unnest(cols = clims) %>%
  mutate(anom = val-seas) %>% 
  ungroup()

# Save
saveRDS(ALL_anom, "data/ALL_anom.Rda")
saveRDS(ALL_anom, "shiny/ALL_anom.Rda")

Cumulative heat flux terms

We also need to create cumulative heatflux terms as well as a few other choice variables. This is done by taking the first day during the MHW and adding the daily values together cumulatively until the end of the event. The daily values are first divided by the MLD on that day as seen above. The MLD value used to divide the daily variables accounts for the water density and specific heat constant: Q/(rho x Cp x hmld), where rho = 1042 and Cp ~= 4000. Th Qnet term calculated this way approximates the air-sea flux term.

The movement terms aren’t very useful and may not be worth including as they don’t really show advection. So rather one can say that the parts of the heating that aren’t explained by anything else could be attributed to advection through the process of elimination. For the moment they are still left in here.

# We're going to switch over to the NOAA OISST data for MHWs now
ALL_anom_cum <- ALL_anom %>%
  dplyr::select(region, var, t, anom) %>% 
  pivot_wider(id_cols = c("region", "var", "t"), names_from = var, 
              values_from = anom, values_fn = mean) %>% 
  dplyr::select(region:tcc, mslp, qnet, p_e, mld, mld_1, qnet_mld:shf_mld) %>% 
  # Change this line depending on GLORYS or NOAA use
  left_join(OISST_MHW_clim[,c("region", "t", "event_no")], by = c("region", "t")) %>% 
  filter(event_no > 0) %>% 
  group_by(region, event_no) %>% 
  mutate_if(is.numeric, cumsum) %>% 
  ungroup() %>% 
  dplyr::select(region, event_no, t, everything()) %>% 
  pivot_longer(cols = c(-region, -event_no, -t), names_to = "var", values_to = "anom") %>% 
  mutate(var = paste0(var,"_cum")) %>% 
  dplyr::select(region, var, event_no, t, anom)

# Save
saveRDS(ALL_anom_cum, "data/ALL_anom_cum.Rda")
saveRDS(ALL_anom_cum, "shiny/ALL_anom_cum.Rda")

In the next vignette we will take the periods of time over which MHWs occurred per region and pair those up with the GLORYS and ERA5 data. This will be used to investigate which drivers are best related to the onset and decline of MHWs.

References

Richaud, B., Kwon, Y.-O., Joyce, T. M., Fratantoni, P. S., and Lentz, S. J. (2016). Surface and bottom temperature and salinity climatology along the continental shelf off the canadian and us east coasts. Continental Shelf Research 124, 165–181.

sessionInfo()

R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gridExtra_2.3        ggpubr_0.3.0         Metrics_0.1.4       
 [4] ggraph_2.0.3         correlation_0.3.0    tidync_0.2.4        
 [7] heatwaveR_0.4.3.9000 lubridate_1.7.9      forcats_0.5.0       
[10] stringr_1.4.0        dplyr_1.0.0          purrr_0.3.4         
[13] readr_1.3.1          tidyr_1.1.0          tibble_3.0.1        
[16] ggplot2_3.3.2        tidyverse_1.3.0     

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1   ggsignif_0.6.0     ellipsis_0.3.1    
  [4] rio_0.5.16         rprojroot_1.3-2    parameters_0.8.0  
  [7] fs_1.4.1           rstudioapi_0.11    farver_2.0.3      
 [10] graphlayouts_0.7.0 ggrepel_0.8.2      fansi_0.4.1       
 [13] xml2_1.3.2         splines_4.0.2      codetools_0.2-16  
 [16] ncdf4_1.17         doParallel_1.0.15  knitr_1.29        
 [19] polyclip_1.10-0    jsonlite_1.6.1     workflowr_1.6.2   
 [22] broom_0.5.6        dbplyr_1.4.4       ggforce_0.3.2     
 [25] effectsize_0.3.1   compiler_4.0.2     httr_1.4.1        
 [28] backports_1.1.8    Matrix_1.2-18      assertthat_0.2.1  
 [31] lazyeval_0.2.2     cli_2.0.2          later_1.1.0.1     
 [34] tweenr_1.0.1       htmltools_0.5.0    tools_4.0.2       
 [37] igraph_1.2.5       gtable_0.3.0       glue_1.4.1        
 [40] Rcpp_1.0.4.6       carData_3.0-4      cellranger_1.1.0  
 [43] RNetCDF_2.3-1      vctrs_0.3.1        nlme_3.1-147      
 [46] iterators_1.0.12   insight_0.8.5      xfun_0.15         
 [49] openxlsx_4.1.5     rvest_0.3.5        lifecycle_0.2.0   
 [52] ncmeta_0.2.5       rstatix_0.6.0      MASS_7.3-51.6     
 [55] scales_1.1.1       tidygraph_1.2.0    hms_0.5.3         
 [58] promises_1.1.1     parallel_4.0.2     yaml_2.2.1        
 [61] curl_4.3           stringi_1.4.6      highr_0.8         
 [64] bayestestR_0.7.0   foreach_1.5.0      zip_2.0.4         
 [67] rlang_0.4.6        pkgconfig_2.0.3    evaluate_0.14     
 [70] lattice_0.20-41    labeling_0.3       htmlwidgets_1.5.1 
 [73] cowplot_1.0.0      tidyselect_1.1.0   magrittr_1.5      
 [76] R6_2.4.1           generics_0.0.2     DBI_1.1.0         
 [79] mgcv_1.8-31        pillar_1.4.4       haven_2.3.1       
 [82] whisker_0.4        foreign_0.8-79     withr_2.2.0       
 [85] abind_1.4-5        modelr_0.1.8       crayon_1.3.4      
 [88] car_3.0-8          plotly_4.9.2.1     rmarkdown_2.3     
 [91] viridis_0.5.1      grid_4.0.2         readxl_1.3.1      
 [94] data.table_1.12.8  blob_1.2.1         git2r_0.27.1      
 [97] reprex_0.3.0       digest_0.6.25      httpuv_1.5.4      
[100] munsell_0.5.0      viridisLite_0.3.0