Last updated: 2020-09-03
Checks: 7 passed, 0 failed
Knit directory: MHWflux/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20200117) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version d1c9bad. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: data/ALL_anom.Rda
Ignored: data/ALL_other.Rda
Ignored: data/ALL_ts_anom.Rda
Ignored: data/ERA5_evp_anom.Rda
Ignored: data/ERA5_lhf_anom.Rda
Ignored: data/ERA5_lwr_anom.Rda
Ignored: data/ERA5_mslp_anom.Rda
Ignored: data/ERA5_pcp_anom.Rda
Ignored: data/ERA5_qnet_anom.Rda
Ignored: data/ERA5_shf_anom.Rda
Ignored: data/ERA5_swr_anom.Rda
Ignored: data/ERA5_t2m_anom.Rda
Ignored: data/ERA5_tcc_anom.Rda
Ignored: data/ERA5_u_anom.Rda
Ignored: data/ERA5_v_anom.Rda
Ignored: data/GLORYS_all_anom.Rda
Ignored: data/OISST_all_anom.Rda
Ignored: data/packet.Rda
Ignored: data/som.Rda
Ignored: data/synoptic_states.Rda
Ignored: data/synoptic_states_other.Rda
Unstaged changes:
Modified: analysis/_site.yml
Modified: analysis/node-summary.Rmd
Modified: code/workflow.R
Modified: talk/WHOI_talk.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/data-prep.Rmd) and HTML (docs/data-prep.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | de9c829 | robwschlegel | 2020-09-01 | More work on follow up analyses |
Rmd | 66f3736 | robwschlegel | 2020-08-26 | More edits to the figures |
Rmd | d61c819 | robwschlegel | 2020-08-20 | The full range of SOM figures |
Rmd | 6b15849 | robwschlegel | 2020-08-18 | More integration and optimisation of the code bases |
Rmd | 84d1750 | robwschlegel | 2020-08-17 | Finished porting over the SOM code. Still needs thorough testing. |
Rmd | b029400 | robwschlegel | 2020-08-15 | Re-calculated all of the base data |
Rmd | 4b04d7a | robwschlegel | 2020-08-14 | Renamed some files in preparation for the file runs on the SOM sized data |
Rmd | fd8772f | robwschlegel | 2020-08-13 | Finished new GLORYS processing |
Rmd | 8da255f | robwschlegel | 2020-08-13 | Removed ERA5 grid and calculated the FNN match between the GLORYS and OISST grids |
Rmd | 3fca8ac | robwschlegel | 2020-08-13 | Removed the OISST and GLORYS MHW comparison code |
Rmd | f41bd24 | robwschlegel | 2020-08-12 | Experimenting with other NetCDF packages |
Rmd | c0c599d | robwschlegel | 2020-08-12 | Combining the MHWNWA and MHWflux code bases |
Rmd | 192c062 | robwschlegel | 2020-07-29 | Propogated some small fixes through the pipeline to the results. Updated the shiny app accordingly. |
html | 9304ba0 | robwschlegel | 2020-07-21 | Build site. |
Rmd | 49fd753 | robwschlegel | 2020-07-21 | Re-built site. |
html | cf24288 | robwschlegel | 2020-07-17 | Build site. |
html | ab06b94 | robwschlegel | 2020-07-14 | Build site. |
Rmd | 8a8180f | robwschlegel | 2020-07-14 | Performed 12 hour nudge on Wx terns. Completed RMSE calculations, comparisons, and integration into shiny app. |
Rmd | c5c1b35 | robwschlegel | 2020-07-14 | Working on RMSE code |
Rmd | f4e6cf5 | robwschlegel | 2020-07-13 | Changed Qx units from seconds to days |
Rmd | dfeee7a | robwschlegel | 2020-07-08 | Some notes from today’s meeting |
Rmd | 8221586 | robwschlegel | 2020-07-08 | Updates to figures 2 and 3 |
Rmd | 40247a1 | robwschlegel | 2020-06-12 | Shifted the SST from GLORYS to NOAA |
html | 97d0296 | robwschlegel | 2020-06-02 | Build site. |
html | 0634d98 | robwschlegel | 2020-06-02 | Build site. |
html | c6087d9 | robwschlegel | 2020-06-02 | Build site. |
Rmd | c839511 | robwschlegel | 2020-06-02 | Working back over some old thoughts |
Rmd | 9e749bc | robwschlegel | 2020-05-28 | First pass at connecting the SOM results to the correlations |
Rmd | 09ce925 | robwschlegel | 2020-05-20 | Some work on comparing the OISST and GLORYS MHWs. They are somewhat different… |
Rmd | e837288 | robwschlegel | 2020-05-19 | A bunch of updates to the shiny app |
html | 12b4f67 | robwschlegel | 2020-04-29 | Build site. |
Rmd | bc4ee87 | robwschlegel | 2020-04-28 | Added more functionality to app. Added cloud coverage, speds, and precip-evap. |
Rmd | 29eb557 | robwschlegel | 2020-04-27 | Much progress on shiny app |
html | 7c04311 | robwschlegel | 2020-04-22 | Build site. |
html | 99eda29 | robwschlegel | 2020-04-16 | Build site. |
Rmd | e4b9586 | robwschlegel | 2020-04-16 | Re-built site. |
Rmd | f963741 | robwschlegel | 2020-04-15 | Some text edits and published the shiny app |
Rmd | d22d6a7 | robwschlegel | 2020-04-14 | Text edits |
Rmd | 7c19a6f | robwschlegel | 2020-02-28 | Notes from meeting with Ke. |
Rmd | b10501e | robwschlegel | 2020-02-27 | Working on correlation code |
html | 50eb5a5 | robwschlegel | 2020-02-26 | Build site. |
Rmd | 891e53a | robwschlegel | 2020-02-26 | Published site for first time. |
Rmd | 1be0a1e | robwschlegel | 2020-02-26 | Completed the data prep for the project |
Rmd | bcd165b | robwschlegel | 2020-02-26 | Writing |
Rmd | 29883d6 | robwschlegel | 2020-02-26 | Calculated the MHWs from GLORYS data. Am now wrestling with the pipeline for ERA5 loading. |
Rmd | c4343c0 | robwschlegel | 2020-02-26 | Pushing quite a few changes |
Rmd | 80324fe | robwschlegel | 2020-02-25 | Adding the foundational content to the site |
The purpose of this vignette is to walk through the steps taken to prepare the data from the NOAA OISST, ERA5, and GLORYS products for the following analyses. The statistical analyses may be found in the MHWs vs. heat flux vignette, and the SOM analysis may be found in the SOM vignette.
All of the libraries and functions used in this vignette, and the project more broadly, may be found here.
# get everything up and running in one go
source("code/functions.R")
library(ggpubr)
library(gridExtra)
library(FNN)
A reminder of what the study area looks like. It has been cut into 6 regions, adapted from work by Richaud et al. (2016).
frame_base +
geom_polygon(data = NWA_coords, alpha = 0.7, size = 2,
aes(fill = region, colour = region)) +
geom_polygon(data = map_base, aes(group = group))
Version | Author | Date |
---|---|---|
50eb5a5 | robwschlegel | 2020-02-26 |
In this study it was decided to use the higher-resolution 1/12 degree GLORYS data. This means we need to re-calculate which pixels fall within which region so that we can later create the average SST time series per region, as well as the averaged time series for the other heat flux terms.
For the SOM we will want to have everything on the same 1/4 degree grid. For the ERA5 data this is an easy shift to make, but for the GLORYS 1/12 degree data we want to ensure that the smaller pixels are matched correctly to the nearest OISST pixel centre. To do this we will create an index of pairings using a fast nearest neighbour search (the FNN package loaded above).
# Load one OISST file to extract the lon/lat coords
OISST_grid <- readRDS("../data/OISST/daily/1982/daily.1982-01-01.Rda") %>%
dplyr::select(lon, lat) %>%
filter(lon >= NWA_corners[1], lon <= NWA_corners[2],
lat >= NWA_corners[3], lat <= NWA_corners[4]) %>%
unique() %>%
mutate(OISST_index = 1:n()) # For merging to GLORYS grid
saveRDS(OISST_grid[,1:2], "metadata/OISST_grid.Rda")
# Load one high-res GLORYS file to extract the lon/lat coords
GLORYS_grid_base <- tidync(GLORYS_files[1]) %>%
hyper_tibble() %>%
dplyr::rename(lon = longitude, lat = latitude) %>%
filter(lon >= NWA_corners[1], lon <= NWA_corners[2],
lat >= NWA_corners[3], lat <= NWA_corners[4]) %>%
dplyr::select(lon, lat) %>%
unique()
# Add on the nearest OISST coords
GLORYS_grid <- GLORYS_grid_base %>%
mutate(OISST_index = as.vector(knnx.index(as.matrix(OISST_grid[,c("lon", "lat")]),
as.matrix(GLORYS_grid_base[,c("lon", "lat")]), k = 1))) %>%
left_join(OISST_grid, by = "OISST_index") %>%
dplyr::select(-OISST_index) %>%
dplyr::rename(lon = lon.x, lat = lat.x, lon_OISST = lon.y, lat_OISST = lat.y)
saveRDS(GLORYS_grid, "metadata/GLORYS_grid.Rda")
# Load one ERA5 file to get the lon/lat coords
ERA5_grid <- tidync(ERA5_lhf_files[1]) %>%
hyper_filter(latitude = dplyr::between(latitude, NWA_corners[3], NWA_corners[4]),
longitude = dplyr::between(longitude, NWA_corners[1]+360, NWA_corners[2]+360),
time = index == 1) %>%
hyper_tibble() %>%
dplyr::rename(lon = longitude, lat = latitude) %>%
dplyr::select(lon, lat) %>%
mutate(lon = lon-360) %>% # Change back to +-180 scale
mutate(lon = lon+0.125, lat = lat-0.125) %>% # Regrid to match OISST coords
filter(lon >= NWA_corners[1], lon <= NWA_corners[2],
lat >= NWA_corners[3], lat <= NWA_corners[4]) %>%
unique()
saveRDS(ERA5_grid, "metadata/ERA5_grid.Rda")
# Test visuals
# Choose one
# OISST_grid %>%
# GLORYS_grid %>%
ERA5_grid %>%
ggplot(aes(x = lon, y = lat)) +
geom_raster() +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = F)
# Function for finding and cleaning up points within a given region polygon
points_in_region <- function(region_in, product_grid){
region_sub <- NWA_coords %>%
filter(region == region_in)
coords_in <- product_grid %>%
mutate(in_grid = sp::point.in.polygon(point.x = product_grid[["lon"]], point.y = product_grid[["lat"]],
pol.x = region_sub[["lon"]], pol.y = region_sub[["lat"]])) %>%
filter(in_grid >= 1) %>%
mutate(region = region_in) %>%
dplyr::select(lon, lat, region)
return(coords_in)
}
# Run the function
registerDoParallel(cores = 10)
OISST_regions <- plyr::ldply(unique(NWA_coords$region), points_in_region,
.parallel = T, product_grid = OISST_grid)
saveRDS(OISST_regions, "metadata/OISST_regions.Rda")
GLORYS_regions <- plyr::ldply(unique(NWA_coords$region), points_in_region,
.parallel = T, product_grid = GLORYS_grid)
saveRDS(GLORYS_regions, "metadata/GLORYS_regions.Rda")
ERA5_regions <- plyr::ldply(unique(NWA_coords$region), points_in_region,
.parallel = T, product_grid = OISST_grid) # We only want ocean pixels
saveRDS(ERA5_regions, "metadata/ERA5_regions.Rda")
# Test visuals
# Choose one
# OISST_regions %>%
# GLORYS_regions %>%
ERA5_regions %>%
ggplot(aes(x = lon, y = lat)) +
geom_raster(aes(fill = region)) +
scale_x_continuous(breaks = seq(-80, -40, 2)) +
scale_y_continuous(breaks = seq(32, 52, 2)) +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = T)
OISST_regions <- readRDS("metadata/OISST_regions.Rda")
GLORYS_regions <- readRDS("metadata/GLORYS_regions.Rda")
ERA5_regions <- readRDS("metadata/ERA5_regions.Rda")
# Combine for visual
ALL_regions <- rbind(OISST_regions, GLORYS_regions, ERA5_regions) %>%
mutate(product = c(rep("OISST", nrow(OISST_regions)),
rep("GLORYS", nrow(GLORYS_regions)),
rep("ERA5", nrow(ERA5_regions))))
# Visualise to ensure success
ggplot(NWA_coords, aes(x = lon, y = lat)) +
geom_polygon(data = map_base, aes(group = group), show.legend = F) +
geom_point(data = ALL_regions, aes(colour = region), size = 0.1) +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = F) +
labs(x = NULL, y = NULL, colour = "Region") +
facet_wrap(~product) +
guides(colour = guide_legend(override.aes = list(shape = 15, size = 5)))
ggsave(filename = "output/NWA_product_regions.pdf", height = 5, width = 18)
With our pixels per region determined we may now go about creating the average time series for each region from the OISST, GLORYS, and ERA5 data.
Up first for processing are the OISST data. The full brick of data within the study area will be loaded first. Then the pixels within the regions are pulled out and individual time series are made from each and saved. Finally the climatologies and anomalies for each pixel in the study area are calculated and saved. The region clims+anoms are calculated for all of the data at the same time near the end of this vignette.
# The files with data in the study area
OISST_files_sub <- data.frame(files = OISST_files,
lon = c(seq(0.125, 179.875, by = 0.25),
seq(-179.875, -0.125, by = 0.25))) %>%
filter(lon >= NWA_corners[1], lon <= NWA_corners[2]) %>%
mutate(files = as.character(files))
# Load the full grid for the study area
system.time(
OISST_all <- plyr::ldply(OISST_files_sub$files, .fun = load_OISST, .parallel = TRUE)
) # 16 seconds
# Check that the coords were subsetted correctly
NWA_corners
min(OISST_all$lon); max(OISST_all$lon)
min(OISST_all$lat); max(OISST_all$lat)
# test visual
OISST_all %>%
filter(t == "1993-01-01") %>%
ggplot(aes(x = lon, y = lat)) +
geom_raster(aes(fill = temp)) +
scale_x_continuous(breaks = seq(-80, -40, 2)) +
scale_y_continuous(breaks = seq(32, 52, 2)) +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = T)
# Create the region time series and save
system.time(
OISST_all_ts <- OISST_all %>%
right_join(OISST_regions, by = c("lon", "lat")) %>%
group_by(region, t) %>%
summarise(temp = mean(temp, na.rm = T), .groups = "drop")
) # 14 seconds
saveRDS(OISST_all_ts, "data/OISST_all_ts.Rda")
# test visual
OISST_all_ts %>%
ggplot(aes(x = t, y = temp)) +
geom_line(aes(colour = region)) +
facet_wrap(~region) +
labs(x = NULL, y = "Temp. (C)")
# Calculate the clims and anoms for each pixel
registerDoParallel(cores = 25)
system.time(
OISST_all_anom <- OISST_all %>%
mutate(var = "temp") %>%
dplyr::rename(val = temp) %>%
plyr::ddply(., c("lon", "lat", "var"), calc_clim_anom, .parallel = T, point_accuracy = 2)
) # 214 seconds
saveRDS(OISST_all_anom, "data/OISST_all_anom.Rda")
# test visual
OISST_all_anom %>%
filter(t == "1998-06-18") %>%
ggplot(aes(x = lon, y = lat)) +
geom_polygon(data = map_base, aes(group = group), show.legend = F) +
geom_tile(aes(fill = anom)) +
scale_x_continuous(breaks = seq(-80, -40, 2)) +
scale_y_continuous(breaks = seq(32, 52, 2)) +
scale_fill_gradient2(low = "blue", high = "red") +
labs(x = NULL, y = NULL) +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = T)
We are using the 1/12 degree GLORYS product for the calculations of the region time series in order to better capture the sub-mesoscale processes that may be associated with the driving of MHWs. For the broad synoptic scale data given to the SOM we will be using the 1/4 degree GLORYS product as it needs to be the same resolution as the other products being used.
# NB: This is very RAM heavy, be careful with core use
# Process and save the region time series
registerDoParallel(cores = 25)
system.time(
GLORYS_all_ts <- plyr::ldply(GLORYS_files, load_GLORYS, .parallel = T, region = T) %>%
dplyr::arrange(region, t) %>%
mutate(cur_spd = round(sqrt(u^2 + v^2), 4),
cur_dir = round((270-(atan2(v, u)*(180/pi)))%%360))
) # 202 seconds on 25 cores
saveRDS(GLORYS_all_ts, "data/GLORYS_all_ts.Rda")
# test visual
GLORYS_all_ts %>%
ggplot(aes(x = t, y = mld)) +
geom_line(aes(colour = region)) +
facet_wrap(~region) +
labs(x = NULL, y = "MLD (m)")
# Load and prep the GLORYS data for the entire study area
registerDoParallel(cores = 25)
system.time(
GLORYS_all <- plyr::ldply(GLORYS_files, load_GLORYS,
.parallel = T, .paropts = c(.inorder = FALSE))
) # 293 seconds on 25 cores
# test visual
GLORYS_all %>%
filter(t == "1993-01-01") %>%
ggplot(aes(x = lon, y = lat)) +
geom_raster(aes(fill = mld)) +
scale_x_continuous(breaks = seq(-80, -40, 2)) +
scale_y_continuous(breaks = seq(32, 52, 2)) +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = T)
# Calculate the clims+anoms and save
registerDoParallel(cores = 25)
system.time(
GLORYS_all_anom <- GLORYS_all %>%
pivot_longer(cols = c(-lon, -lat, -t), names_to = "var", values_to = "val") %>%
plyr::ddply(., c("lon", "lat", "var"), calc_clim_anom, .parallel = T,
point_accuracy = 6, .paropts = c(.inorder = FALSE))
) # 732 seconds on 25 cores
saveRDS(GLORYS_all_anom, "data/GLORYS_all_anom.Rda")
# test visual
GLORYS_all_anom %>%
filter(t == "1993-06-18",
var == "mld") %>%
ggplot(aes(x = lon, y = lat)) +
geom_polygon(data = map_base, aes(group = group), show.legend = F) +
geom_tile(aes(fill = anom)) +
scale_x_continuous(breaks = seq(-80, -40, 2)) +
scale_y_continuous(breaks = seq(32, 52, 2)) +
scale_fill_gradient2(low = "blue", high = "red") +
labs(x = NULL, y = NULL) +
coord_quickmap(xlim = NWA_corners[1:2], ylim = NWA_corners[3:4], expand = T)
Note that the ERA5 data are on an hourly 0.25 degree spatiotemporal grid. This loading process constrains them to a daily 0.25 degree grid that matches the OISST data before finally converting them to a single time series per region.
# See the code/workflow script for the code used for ERA5 data prep
# There is too much code to run from an R Markdown document as each variable must be loaded individually
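Below is a minimal sketch of what that prep looks like for a single variable, using the latent heat flux files from earlier in this vignette. It assumes daily means and the standard ERA5 time encoding of hours since 1900-01-01 (both assumptions); the production code in code/workflow.R is authoritative.
# Hedged sketch: aggregate one ERA5 variable from hourly to daily means and
# reduce it to one time series per region; assumes the ERA5 time dimension
# is encoded as 'hours since 1900-01-01'
ERA5_lhf_daily_ts <- tidync(ERA5_lhf_files[1]) %>%
  hyper_tibble() %>%
  dplyr::rename(lon = longitude, lat = latitude) %>%
  mutate(t = as.Date(as.POSIXct(time*3600, origin = "1900-01-01", tz = "UTC")),
         lon = lon-360+0.125, lat = lat-0.125) %>% # Same regridding as for ERA5_grid above
  group_by(lon, lat, t) %>%
  summarise(lhf = mean(mslhf, na.rm = T), .groups = "drop") %>% # Hourly to daily
  right_join(ERA5_regions, by = c("lon", "lat")) %>% # Keep only the ocean pixels
  group_by(region, t) %>%
  summarise(lhf = mean(lhf, na.rm = T), .groups = "drop") # One value per region per day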
We will be using the SST values from NOAA OISST for calculating the MHWs and will use the standard Hobday definition with a base period of 1993-01-01 to 2018-12-25. The base period ends a few days short of a full final year because the data do not quite extend to the end of December. It was decided that the increased accuracy gained by including 2018 in the climatology outweighed the drawback of a clim period that excludes a few days of winter.
# Load the data
OISST_all_ts <- readRDS("data/OISST_all_ts.Rda")
# Calculate the MHWs
OISST_region_MHW <- OISST_all_ts %>%
group_by(region) %>%
nest() %>%
mutate(clims = map(data, ts2clm,
climatologyPeriod = c("1993-01-01", "2018-12-25")),
events = map(clims, detect_event),
cats = map(events, category, S = FALSE)) %>%
select(-data, -clims)
# Save
saveRDS(OISST_region_MHW, "data/OISST_region_MHW.Rda")
saveRDS(OISST_region_MHW, "shiny/OISST_region_MHW.Rda")
There were some minor differences between the OISST grid used for the original SOM work and the grid used in this version of the project. So first we want to compare the results from both to see how much they differ. It should be very little.
# New and old MHW results
OISST_MHW_new <- readRDS("data/OISST_region_MHW.Rda") %>%
select(-cats) %>%
unnest(events) %>%
filter(row_number() %% 2 == 0) %>%
unnest(events)
OISST_MHW_old <- readRDS("../MHWNWA/data/OISST_region_MHW.Rda") %>%
select(-cats) %>%
unnest(events) %>%
filter(row_number() %% 2 == 0) %>%
unnest(events)
## Max count of events per region
# New
OISST_MHW_new %>%
group_by(region) %>%
summarise(count = max(event_no))
# A tibble: 6 x 2
region count
<chr> <int>
1 cbs 53
2 gm 47
3 gsl 49
4 mab 57
5 nfs 44
6 ss 41
# Old
OISST_MHW_old %>%
group_by(region) %>%
summarise(count = max(event_no))
# A tibble: 6 x 2
region count
<chr> <dbl>
1 cbs 56
2 gm 45
3 gsl 48
4 mab 55
5 nfs 44
6 ss 41
## Average metrics
# New
OISST_MHW_new %>%
dplyr::select(region, duration, intensity_mean, intensity_max, intensity_cumulative, rate_onset, rate_decline) %>%
group_by(region) %>%
summarise_all("mean", na.rm = T)
# A tibble: 6 x 7
region duration intensity_mean intensity_max intensity_cumul… rate_onset
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 cbs 13.3 1.76 2.12 24.7 0.172
2 gm 14.6 1.79 2.13 26.9 0.146
3 gsl 12.7 1.63 2.05 21.6 0.168
4 mab 13.4 1.76 2.12 25.0 0.161
5 nfs 15.8 1.69 2.07 28.4 0.153
6 ss 18.1 1.99 2.43 37.7 0.192
# … with 1 more variable: rate_decline <dbl>
# Old
OISST_MHW_old %>%
dplyr::select(region, duration, intensity_mean, intensity_max, intensity_cumulative, rate_onset, rate_decline) %>%
group_by(region) %>%
summarise_all("mean", na.rm = T)
# A tibble: 6 x 7
region duration intensity_mean intensity_max intensity_cumul… rate_onset
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 cbs 12.7 1.73 2.09 23.6 0.178
2 gm 15.7 1.81 2.18 28.9 0.166
3 gsl 13.0 1.66 2.08 22.2 0.185
4 mab 14.4 1.77 2.13 26.8 0.155
5 nfs 15.7 1.69 2.06 28.2 0.154
6 ss 17.8 1.98 2.41 37.3 0.196
# … with 1 more variable: rate_decline <dbl>
We can see above that there are some minor differences between the results from the two grids. This is regrettable, as it will almost certainly have some impact on the SOM results. But I think it is necessary that we press forward with results based on the OISST grid, as it makes the most sense w.r.t. picking pixels that fall within the region polygons. This will also allow for better subsetting of pixels within the chosen study area.
Because the MHW results themselves are not really the focus of the paper, I will quickly go over them here. Specifically we are looking at the summary stats for MHWs per region and season, and how they may differ between those classifying groups. The stats created here are used in the results section of the manuscript and some of the musings will be used for the discussion.
# Annual count of MHWs - first MHW
OISST_MHW_event %>%
mutate(year = year(date_peak)) %>%
group_by(year) %>%
summarise(count = n()) %>%
data.frame()
year count
1 1994 4
2 1998 1
3 1999 20
4 2000 6
5 2001 2
6 2002 2
7 2003 5
8 2004 1
9 2005 6
10 2006 12
11 2007 6
12 2008 8
13 2009 9
14 2010 15
15 2011 10
16 2012 61
17 2013 19
18 2014 21
19 2015 24
20 2016 28
21 2017 17
22 2018 14
# Differences between regions
summary_region <- OISST_MHW_event %>%
dplyr::select(region, duration, intensity_mean, intensity_max,
intensity_cumulative, rate_onset, rate_decline) %>%
mutate(region = toupper(region)) %>%
dplyr::rename(group = region) %>%
group_by(group) %>%
# mutate(count = n()) %>%
ungroup() %>%
pivot_longer(duration:rate_decline) %>%
group_by(group, name) %>%
summarise(value_mean = round(mean(value), 2),
value_sd = round(sd(value), 2), .groups = "drop") %>%
unite("value_summary", value_mean:value_sd, sep = "±") %>%
pivot_wider(names_from = name, values_from = value_summary) %>%
dplyr::rename(i_cum = intensity_cumulative, i_max = intensity_max,
i_mean = intensity_mean, r_decline = rate_decline, r_onset = rate_onset) %>%
dplyr::select(group, duration, i_mean, i_max, i_cum, r_onset, r_decline)
# Differences between seasons
summary_season <- OISST_MHW_event %>%
ungroup() %>%
dplyr::select(season, duration, intensity_mean, intensity_max,
intensity_cumulative, rate_onset, rate_decline) %>%
dplyr::rename(group = season) %>%
group_by(group) %>%
# mutate(count = n()) %>%
ungroup() %>%
pivot_longer(duration:rate_decline) %>%
group_by(group, name) %>%
summarise(value_mean = round(mean(value), 2),
value_sd = round(sd(value), 2), .groups = "drop") %>%
unite("value_summary", value_mean:value_sd, sep = "±") %>%
pivot_wider(names_from = name, values_from = value_summary) %>%
dplyr::rename(i_cum = intensity_cumulative, i_max = intensity_max,
i_mean = intensity_mean, r_decline = rate_decline, r_onset = rate_onset) %>%
dplyr::select(group, duration, i_mean, i_max, i_cum, r_onset, r_decline)
# Table showing the mean +- SD per region and season
summary_region_season <- rbind(summary_region, summary_season)
knitr::kable(summary_region_season)
group | duration | i_mean | i_max | i_cum | r_onset | r_decline |
---|---|---|---|---|---|---|
CBS | 13.3±11.32 | 1.76±0.46 | 2.12±0.7 | 24.72±24.31 | 0.17±0.17 | 0.2±0.2 |
GM | 14.57±9.39 | 1.79±0.32 | 2.13±0.51 | 26.92±19.1 | 0.15±0.11 | 0.16±0.13 |
GSL | 12.73±10.86 | 1.63±0.61 | 2.05±0.84 | 21.64±19.6 | 0.17±0.12 | 0.24±0.23 |
MAB | 13.44±15.21 | 1.76±0.33 | 2.12±0.59 | 25.03±32.66 | 0.16±0.09 | 0.18±0.17 |
NFS | 15.82±16.77 | 1.69±0.44 | 2.07±0.68 | 28.41±34.69 | 0.15±0.12 | 0.17±0.21 |
SS | 18.12±16.32 | 1.99±0.31 | 2.43±0.51 | 37.66±35.73 | 0.19±0.16 | 0.19±0.25 |
Spring | 13.28±10.24 | 1.93±0.4 | 2.43±0.69 | 26.76±23.56 | 0.2±0.13 | 0.24±0.16 |
Summer | 13.04±11.16 | 1.97±0.33 | 2.38±0.53 | 26.86±25.47 | 0.21±0.15 | 0.27±0.28 |
Autumn | 15.84±13.49 | 1.68±0.26 | 2.01±0.45 | 28.19±27.48 | 0.12±0.07 | 0.13±0.09 |
Winter | 16.74±19.08 | 1.32±0.45 | 1.61±0.68 | 26.04±38.34 | 0.11±0.09 | 0.09±0.05 |
The analyses to come are going to be performed on anomaly values, not the original time series. In order to calculate the anomalies we are first going to need the climatologies for each variable. We will use the Hobday definition of climatology creation and then subtract the expected climatology from the observed values. We are again using the 1993-01-01 to 2018-12-25 base period for these calculations to ensure consistency throughout the project.
# Load the data
GLORYS_all_ts <- readRDS("data/GLORYS_all_ts.Rda")
ERA5_all_ts <- readRDS("data/ERA5_all_ts.Rda")
OISST_all_ts <- readRDS("data/OISST_all_ts.Rda")
ALL_ts <- left_join(ERA5_all_ts, GLORYS_all_ts, by = c("region", "t")) %>%
left_join(OISST_all_ts, by = c("region", "t")) %>%
filter(t <= "2018-12-31")
# Calculate GLORYS clims and anoms
# Also give better names to the variables
# Also convert the Qx terms from seconds to days by multiplying by 86,400
ALL_ts_anom <- ALL_ts %>%
dplyr::rename(lwr = msnlwrf, swr = msnswrf, lhf = mslhf,
shf = msshf, mslp = msl, sst = temp.y) %>%
dplyr::select(-wind_dir, -cur_dir, -temp.x) %>%
mutate(qnet_mld = (qnet*86400)/(mld*1024*4000),
lwr_mld = (lwr*86400)/(mld*1024*4000),
swr_mld = (swr*86400)/(mld*1024*4000),
lhf_mld = (lhf*86400)/(mld*1024*4000),
shf_mld = (shf*86400)/(mld*1024*4000),
mld_1 = 1/mld) %>%
pivot_longer(cols = c(-region, -t), names_to = "var", values_to = "val") %>%
group_by(region, var) %>%
nest() %>%
mutate(clims = map(data, ts2clm, y = val, roundClm = 10,
climatologyPeriod = c("1993-01-01", "2018-12-25"))) %>%
dplyr::select(-data) %>%
unnest(cols = clims) %>%
mutate(anom = val-seas) %>%
ungroup()
# Save
saveRDS(ALL_ts_anom, "data/ALL_ts_anom.Rda")
saveRDS(ALL_ts_anom, "shiny/ALL_ts_anom.Rda")
We also need to create cumulative heat flux terms, as well as a few other choice variables. This is done by taking the first day of the MHW and adding the daily values together cumulatively until the end of the event. As seen above, the daily values are first divided by the MLD on that day. This MLD scaling accounts for the water density and specific heat: Q/(rho x Cp x h_mld), where rho = 1024 and Cp ~= 4000. The Qnet term calculated this way approximates the air-sea flux term.
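As a quick sanity check on that scaling, the toy numbers below (illustrative values, not taken from the data) show how a steady heat flux anomaly converts into a daily mixed layer temperature change:
# Illustrative check of the Q/(rho x Cp x h_mld) scaling with made-up values
qnet <- 100 # Net surface heat flux anomaly (W m^-2, i.e. J s^-1 m^-2)
mld <- 20 # Mixed layer depth (m)
rho <- 1024 # Seawater density (kg m^-3)
Cp <- 4000 # Specific heat of seawater (J kg^-1 K^-1)
# Multiply by 86400 to go from per second to per day, then divide by
# rho*Cp*mld to convert the heat input into a temperature change
(qnet*86400)/(mld*rho*Cp) # ~0.105 C per day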
The movement terms aren’t very useful and may not be worth including, as they don’t really show advection. Rather, any heating that isn’t explained by the other terms can be attributed to advection through a process of elimination. For the moment they are still left in here.
# We're going to switch over to the NOAA OISST data for MHWs now
ALL_ts_anom_cum <- ALL_ts_anom %>%
dplyr::select(region, var, t, anom) %>%
pivot_wider(id_cols = c("region", "t"), names_from = var, values_from = anom) %>%
dplyr::select(region:tcc, mslp, qnet, p_e, mld, mld_1, qnet_mld:shf_mld) %>%
left_join(OISST_MHW_clim[,c("region", "t", "event_no")], by = c("region", "t")) %>%
filter(event_no > 0) %>%
group_by(region, event_no) %>%
mutate_if(is.numeric, cumsum) %>%
ungroup() %>%
dplyr::select(region, event_no, t, everything()) %>%
pivot_longer(cols = c(-region, -event_no, -t), names_to = "var", values_to = "anom") %>%
mutate(var = paste0(var,"_cum")) %>%
dplyr::select(region, var, event_no, t, anom)
# Save
saveRDS(ALL_ts_anom_cum, "data/ALL_ts_anom_cum.Rda")
saveRDS(ALL_ts_anom_cum, "shiny/ALL_ts_anom_cum.Rda")
In the MHWs vs. heat flux vignette we will take the periods of time over which MHWs occurred per region and pair those up with the GLORYS and ERA5 data. This will be used to investigate which drivers are best related to the onset and decline of MHWs. In the SOM vignette we will look for how the mean synoptic states during MHWs can best be clustered together.
Richaud, B., Kwon, Y.-O., Joyce, T. M., Fratantoni, P. S., and Lentz, S. J. (2016). Surface and bottom temperature and salinity climatology along the continental shelf off the Canadian and U.S. East Coasts. Continental Shelf Research 124, 165–181.
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.6 LTS
Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so
locale:
[1] LC_CTYPE=en_CA.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8 LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8 LC_MESSAGES=en_CA.UTF-8
[7] LC_PAPER=en_CA.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] FNN_1.1.3 gridExtra_2.3 ggpubr_0.4.0
[4] doParallel_1.0.15 iterators_1.0.12 foreach_1.5.0
[7] Metrics_0.1.4 yasomi_0.3 proxy_0.4-24
[10] e1071_1.7-3 ggraph_2.0.3 correlation_0.3.0
[13] tidync_0.2.4 heatwaveR_0.4.4.9000 lubridate_1.7.9
[16] data.table_1.13.0 forcats_0.5.0 stringr_1.4.0
[19] dplyr_1.0.1 purrr_0.3.4 readr_1.3.1
[22] tidyr_1.1.1 tibble_3.0.3 ggplot2_3.3.2
[25] tidyverse_1.3.0
loaded via a namespace (and not attached):
[1] colorspace_1.4-1 ggsignif_0.6.0 ellipsis_0.3.1 class_7.3-17
[5] rio_0.5.16 rprojroot_1.3-2 parameters_0.8.2 fs_1.5.0
[9] rstudioapi_0.11 farver_2.0.3 graphlayouts_0.7.0 ggrepel_0.8.2
[13] fansi_0.4.1 xml2_1.3.2 codetools_0.2-16 ncdf4_1.17
[17] knitr_1.29 polyclip_1.10-0 jsonlite_1.7.0 workflowr_1.6.2
[21] broom_0.7.0 dbplyr_1.4.4 ggforce_0.3.2 effectsize_0.3.2
[25] compiler_4.0.2 httr_1.4.2 backports_1.1.8 assertthat_0.2.1
[29] lazyeval_0.2.2 cli_2.0.2 later_1.1.0.1 tweenr_1.0.1
[33] htmltools_0.5.0 tools_4.0.2 igraph_1.2.5 gtable_0.3.0
[37] glue_1.4.1 Rcpp_1.0.5 carData_3.0-4 cellranger_1.1.0
[41] RNetCDF_2.3-1 vctrs_0.3.2 insight_0.9.0 xfun_0.16
[45] openxlsx_4.1.5 rvest_0.3.6 lifecycle_0.2.0 ncmeta_0.2.5
[49] rstatix_0.6.0 MASS_7.3-51.6 scales_1.1.1 tidygraph_1.2.0
[53] hms_0.5.3 promises_1.1.1 yaml_2.2.1 curl_4.3
[57] stringi_1.4.6 highr_0.8 bayestestR_0.7.2 zip_2.0.4
[61] rlang_0.4.7 pkgconfig_2.0.3 evaluate_0.14 labeling_0.3
[65] htmlwidgets_1.5.1 tidyselect_1.1.0 magrittr_1.5 R6_2.4.1
[69] generics_0.0.2 DBI_1.1.0 pillar_1.4.6 haven_2.3.1
[73] whisker_0.4 foreign_0.8-79 withr_2.2.0 abind_1.4-5
[77] modelr_0.1.8 crayon_1.3.4 car_3.0-8 utf8_1.1.4
[81] plotly_4.9.2.1 rmarkdown_2.3 viridis_0.5.1 grid_4.0.2
[85] readxl_1.3.1 blob_1.2.1 git2r_0.27.1 reprex_0.3.0
[89] digest_0.6.25 httpuv_1.5.4 munsell_0.5.0 viridisLite_0.3.0