Last updated: 2020-11-24
Checks: 7 0
Knit directory: Cant_eMLR/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20200707)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version faf3903. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rproj.user/
Ignored: data/
Ignored: dump/
Untracked files:
Untracked: code/plotting_functions_old.R
Untracked: code/test_scripts/read_World_Ocean_Atlas_2013_Clement.Rmd
Unstaged changes:
Deleted: analysis/read_World_Ocean_Atlas_2013_Clement.Rmd
Modified: code/mapping_functions.R
Modified: code/plotting_functions.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/read_GLODAPv2_2020.Rmd
) and HTML (docs/read_GLODAPv2_2020.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | faf3903 | jens-daniel-mueller | 2020-11-24 | cleaned plotting and mapping functions |
html | b9cbd4f | jens-daniel-mueller | 2020-11-20 | Build site. |
html | e563eef | jens-daniel-mueller | 2020-11-20 | Build site. |
html | 14c98aa | jens-daniel-mueller | 2020-11-20 | Build site. |
Rmd | caca4af | jens-daniel-mueller | 2020-11-20 | rebuild with child setup document |
html | d971a86 | jens-daniel-mueller | 2020-11-20 | Build site. |
Rmd | b1e3815 | jens-daniel-mueller | 2020-11-20 | changed subsetting |
html | b164378 | jens-daniel-mueller | 2020-11-20 | Build site. |
Rmd | 13e474c | jens-daniel-mueller | 2020-11-20 | test child document |
html | da23a48 | jens-daniel-mueller | 2020-11-20 | Build site. |
Rmd | 5d76688 | jens-daniel-mueller | 2020-11-20 | test child document |
html | 2116b56 | jens-daniel-mueller | 2020-11-18 | Build site. |
html | 5ee4361 | jens-daniel-mueller | 2020-11-17 | Build site. |
Rmd | 3a67ed0 | jens-daniel-mueller | 2020-11-17 | test rebuild |
html | fc70a34 | jens-daniel-mueller | 2020-11-10 | Build site. |
html | b170b79 | jens-daniel-mueller | 2020-11-03 | Build site. |
html | 76fd767 | jens-daniel-mueller | 2020-10-14 | Build site. |
Rmd | 4e84499 | jens-daniel-mueller | 2020-10-14 | GLODAP link in new tab |
html | 411a35b | jens-daniel-mueller | 2020-10-02 | Build site. |
Rmd | d0dd42e | jens-daniel-mueller | 2020-10-02 | included tco2 f flag 2 data |
html | 0d38979 | jens-daniel-mueller | 2020-10-01 | Build site. |
html | 80e9b77 | jens-daniel-mueller | 2020-09-30 | Build site. |
Rmd | 1794dc7 | jens-daniel-mueller | 2020-09-30 | added 3d globe vis |
html | 4578cfa | jens-daniel-mueller | 2020-09-19 | Build site. |
html | d59f716 | jens-daniel-mueller | 2020-09-18 | Build site. |
Rmd | 27809b8 | jens-daniel-mueller | 2020-09-18 | restricted tco2 f flag to 2 |
html | f5a20ac | jens-daniel-mueller | 2020-09-17 | Build site. |
html | cf64460 | jens-daniel-mueller | 2020-09-17 | Build site. |
html | d43d9e2 | jens-daniel-mueller | 2020-09-16 | Build site. |
html | 1f3319a | jens-daniel-mueller | 2020-09-16 | Build site. |
html | da91d98 | jens-daniel-mueller | 2020-09-10 | Build site. |
html | a34a239 | jens-daniel-mueller | 2020-09-10 | Build site. |
html | c3ffdfd | jens-daniel-mueller | 2020-09-08 | Build site. |
Rmd | 04ee730 | jens-daniel-mueller | 2020-09-08 | rebuild after revision |
html | 10f7e8f | jens-daniel-mueller | 2020-09-08 | Build site. |
Rmd | 03e5b6b | jens-daniel-mueller | 2020-09-08 | cleaned script |
html | 95d4e9b | jens-daniel-mueller | 2020-09-08 | Build site. |
Rmd | d3ddb89 | jens-daniel-mueller | 2020-09-08 | considered qc flag |
html | a50f053 | jens-daniel-mueller | 2020-09-07 | Build site. |
Rmd | fe673db | jens-daniel-mueller | 2020-09-07 | fitted model by era and saved era coeffcients |
html | da445a6 | jens-daniel-mueller | 2020-09-04 | Build site. |
html | de4515a | jens-daniel-mueller | 2020-09-04 | Build site. |
Rmd | 8c0f6ac | jens-daniel-mueller | 2020-09-04 | Revised data cleaning |
html | 5511301 | jens-daniel-mueller | 2020-09-04 | Build site. |
Rmd | 41d3ba3 | jens-daniel-mueller | 2020-09-04 | explanation added |
html | fd28faa | jens-daniel-mueller | 2020-09-04 | Build site. |
Rmd | d998206 | jens-daniel-mueller | 2020-09-04 | era as factor |
html | 1dcd293 | jens-daniel-mueller | 2020-09-04 | Build site. |
Rmd | bac1f3a | jens-daniel-mueller | 2020-09-04 | no flagging removal, but plotted |
html | fa11a74 | jens-daniel-mueller | 2020-09-02 | Build site. |
Rmd | da4907d | jens-daniel-mueller | 2020-09-02 | all flagged variables |
html | 429aab3 | jens-daniel-mueller | 2020-09-01 | Build site. |
html | f4216dd | jens-daniel-mueller | 2020-09-01 | Build site. |
Rmd | 8f3ce45 | jens-daniel-mueller | 2020-09-01 | rebuild without PO4 star selection, oxygen only |
html | b41a87d | jens-daniel-mueller | 2020-09-01 | Build site. |
Rmd | 4dcad9a | jens-daniel-mueller | 2020-09-01 | plots subsetting maps with facets |
html | 86bada3 | jens-daniel-mueller | 2020-09-01 | Build site. |
Rmd | a0b3cf0 | jens-daniel-mueller | 2020-09-01 | only cstar flagging checked in map |
html | d895004 | jens-daniel-mueller | 2020-09-01 | Build site. |
Rmd | be824b0 | jens-daniel-mueller | 2020-09-01 | talk subsetting stats flagging resolved |
html | 3d9b20f | jens-daniel-mueller | 2020-09-01 | Build site. |
Rmd | b047c4b | jens-daniel-mueller | 2020-09-01 | subsetting stats for talk and po4 |
html | 4049eb0 | jens-daniel-mueller | 2020-09-01 | Build site. |
Rmd | 782c595 | jens-daniel-mueller | 2020-09-01 | cleaning process maps |
html | c575584 | jens-daniel-mueller | 2020-08-31 | Build site. |
Rmd | 824d7b7 | jens-daniel-mueller | 2020-08-31 | add tco2 sampling grid to map |
html | 13a76d5 | jens-daniel-mueller | 2020-08-28 | Build site. |
Rmd | 2e6a4ca | jens-daniel-mueller | 2020-08-28 | XXX |
html | 27404de | jens-daniel-mueller | 2020-08-27 | Build site. |
html | b6d0e6a | jens-daniel-mueller | 2020-08-27 | Build site. |
html | f40e48b | jens-daniel-mueller | 2020-08-26 | Build site. |
html | ec20f40 | jens-daniel-mueller | 2020-08-24 | Build site. |
html | 0772e79 | jens-daniel-mueller | 2020-08-21 | Build site. |
Rmd | e04b03d | jens-daniel-mueller | 2020-08-21 | Individual cruise plots with all relevant parameters and histograms in eMLR |
html | 5ffe187 | jens-daniel-mueller | 2020-08-20 | Build site. |
html | 1064ef8 | jens-daniel-mueller | 2020-08-19 | Build site. |
html | 3574489 | jens-daniel-mueller | 2020-08-18 | Build site. |
Rmd | 1a4d1a8 | jens-daniel-mueller | 2020-08-18 | parameters included in mapping |
html | 66382bf | jens-daniel-mueller | 2020-08-18 | Build site. |
Rmd | 7460b1f | jens-daniel-mueller | 2020-08-18 | assign A16 cruises to GO_SHIP |
html | 29a537a | jens-daniel-mueller | 2020-08-18 | Build site. |
Rmd | 7fb61d5 | jens-daniel-mueller | 2020-08-18 | rerun with all parameters in one file |
html | 4faaf29 | jens-daniel-mueller | 2020-08-18 | Build site. |
Rmd | a24a414 | jens-daniel-mueller | 2020-08-18 | re-run with all parameters in one list |
html | 9eaa2fb | jens-daniel-mueller | 2020-08-18 | Build site. |
Rmd | 079873a | jens-daniel-mueller | 2020-08-18 | rerun with parameters in list |
html | a3b6b68 | jens-daniel-mueller | 2020-08-13 | Build site. |
html | 00c2120 | jens-daniel-mueller | 2020-08-13 | Build site. |
html | dec27b2 | jens-daniel-mueller | 2020-08-13 | Build site. |
html | 6fd35c8 | jens-daniel-mueller | 2020-08-13 | Build site. |
Rmd | 634ae4e | jens-daniel-mueller | 2020-08-13 | cleaning and reference to parameters and masks |
html | bf69270 | jens-daniel-mueller | 2020-08-13 | Build site. |
html | 1176c9a | jens-daniel-mueller | 2020-08-12 | Build site. |
html | 1a078e7 | jens-daniel-mueller | 2020-08-12 | Build site. |
Rmd | 6344282 | jens-daniel-mueller | 2020-08-12 | updated lon conversion, variable names and masks |
html | f143b2d | jens-daniel-mueller | 2020-08-12 | Build site. |
html | 2d179e7 | jens-daniel-mueller | 2020-08-12 | Build site. |
html | 5d33341 | jens-daniel-mueller | 2020-08-11 | Build site. |
html | 8a010ca | jens-daniel-mueller | 2020-08-11 | Build site. |
html | a01041a | jens-daniel-mueller | 2020-08-11 | Build site. |
html | 1b3f5e2 | jens-daniel-mueller | 2020-08-11 | Build site. |
Rmd | e646f85 | jens-daniel-mueller | 2020-08-11 | parameterization in separate file |
html | e18e59a | jens-daniel-mueller | 2020-08-10 | Build site. |
html | 7d7900a | jens-daniel-mueller | 2020-08-07 | Build site. |
html | ddf44d7 | jens-daniel-mueller | 2020-07-31 | Build site. |
Rmd | b4fa5a8 | jens-daniel-mueller | 2020-07-31 | link to all cruise plots included |
html | 243f31c | jens-daniel-mueller | 2020-07-31 | Build site. |
Rmd | 977782e | jens-daniel-mueller | 2020-07-31 | minor plot update, open questions added |
html | 9dc5d7f | jens-daniel-mueller | 2020-07-29 | Build site. |
html | 21524b4 | jens-daniel-mueller | 2020-07-29 | Build site. |
html | ff17968 | jens-daniel-mueller | 2020-07-29 | Build site. |
Rmd | cb5074e | jens-daniel-mueller | 2020-07-29 | subset Cant to GLODAP before merging |
html | 8d71a56 | jens-daniel-mueller | 2020-07-29 | Build site. |
Rmd | 82db969 | jens-daniel-mueller | 2020-07-29 | format and plots revised |
html | 2e2412d | jens-daniel-mueller | 2020-07-29 | Build site. |
Rmd | a9fbf3a | jens-daniel-mueller | 2020-07-29 | formating |
html | 5274d0b | jens-daniel-mueller | 2020-07-29 | Build site. |
Rmd | 304aa74 | jens-daniel-mueller | 2020-07-29 | plot improvements |
html | 75da185 | jens-daniel-mueller | 2020-07-29 | Build site. |
Rmd | 26db597 | jens-daniel-mueller | 2020-07-29 | cleaned code and output |
html | 2e08795 | jens-daniel-mueller | 2020-07-24 | Build site. |
html | c84424c | jens-daniel-mueller | 2020-07-23 | Build site. |
Rmd | 8d47b71 | jens-daniel-mueller | 2020-07-23 | era names changed |
html | 2df2065 | jens-daniel-mueller | 2020-07-23 | Build site. |
Rmd | fa350cf | jens-daniel-mueller | 2020-07-23 | predictor correlation plots, bin2d map plots |
html | 9d1d67d | jens-daniel-mueller | 2020-07-23 | Build site. |
Rmd | 3b6658b | jens-daniel-mueller | 2020-07-23 | predictor correlation plots, bin2d map plots |
html | 2e3691a | jens-daniel-mueller | 2020-07-23 | Build site. |
Rmd | 26bdc0a | jens-daniel-mueller | 2020-07-23 | new era label, predictor correlation check started |
html | 556e6cc | jens-daniel-mueller | 2020-07-23 | Build site. |
html | c1289a2 | jens-daniel-mueller | 2020-07-23 | Build site. |
html | 2890e73 | jens-daniel-mueller | 2020-07-23 | Build site. |
html | fdfa7b9 | jens-daniel-mueller | 2020-07-22 | Build site. |
html | bb9c002 | jens-daniel-mueller | 2020-07-21 | Build site. |
Rmd | d2ed0f8 | jens-daniel-mueller | 2020-07-21 | harmonied lat lon labeling |
html | b47adc2 | jens-daniel-mueller | 2020-07-20 | Build site. |
Rmd | 366d7d5 | jens-daniel-mueller | 2020-07-20 | update plots |
html | 25812b9 | jens-daniel-mueller | 2020-07-20 | Build site. |
Rmd | e45a882 | jens-daniel-mueller | 2020-07-20 | basins included, higher grid resolution for maps |
html | 6a6431e | jens-daniel-mueller | 2020-07-20 | Build site. |
Rmd | 50be8de | jens-daniel-mueller | 2020-07-20 | formating |
html | b9769e8 | jens-daniel-mueller | 2020-07-20 | Build site. |
Rmd | 82120fe | jens-daniel-mueller | 2020-07-20 | nitrate, gamma included, 65N removed |
html | fe55909 | jens-daniel-mueller | 2020-07-19 | Build site. |
Rmd | 66540a9 | jens-daniel-mueller | 2020-07-19 | formating and color scale |
html | 61b3390 | jens-daniel-mueller | 2020-07-18 | Build site. |
Rmd | d3032c7 | jens-daniel-mueller | 2020-07-18 | formating |
html | 22b588c | jens-daniel-mueller | 2020-07-18 | Build site. |
html | 3964fd7 | jens-daniel-mueller | 2020-07-17 | Build site. |
Rmd | 4685ad7 | jens-daniel-mueller | 2020-07-17 | 150m depth limit implemented |
html | 56c3ed9 | jens-daniel-mueller | 2020-07-14 | Build site. |
Rmd | aed89af | jens-daniel-mueller | 2020-07-14 | added Rmd for MappedClimatologies |
html | 74d4abd | jens-daniel-mueller | 2020-07-14 | Build site. |
html | 1c511ce | jens-daniel-mueller | 2020-07-14 | Build site. |
Rmd | e03016e | jens-daniel-mueller | 2020-07-14 | split read in per data set |
Loading libraries specific to the the analysis performed in this section.
library(lubridate)
Main data source for this project is GLODAPv2.2020_Merged_Master_File.csv
downloaded from glodap.info in June 2020.
GLODAP <-
read_csv(
here::here(
"data/input/GLODAPv2_2020",
"GLODAPv2.2020_Merged_Master_File.csv"
),
na = "-9999",
col_types = cols(.default = col_double())
)
# select relevant columns
GLODAP <- GLODAP %>%
select(cruise:talkqc)
# create date column
GLODAP <- GLODAP %>%
mutate(date = ymd(paste(year, month, day))) %>%
relocate(date)
# harmonize column names
GLODAP <- GLODAP %>%
rename(sal = salinity,
tem = temperature)
# harmonize coordinates
GLODAP <- GLODAP %>%
rename(lon = longitude,
lat = latitude) %>%
mutate(lon = if_else(lon < 20, lon + 360, lon))
# remove irrelevant columns
GLODAP <- GLODAP %>%
select(-c(month:minute,
maxsampdepth, bottle, sigma0:sigma4,
nitrite:nitritef))
For merging with other data sets, all observations were grouped into latitude intervals of:
GLODAP <- m_grid_horizontal(GLODAP)
Samples were assigned to following eras:
JGOFS_WOCE: 1981 - 1999
GO_SHIP: 2000 - 2012
new_era: > 2013
GLODAP <- GLODAP %>%
filter(year >= parameters$year_JGOFS_start) %>%
mutate(era = "JGOFS_WOCE",
era = if_else(year > parameters$year_JGOFS_end, "GO_SHIP", era),
era = if_else(year > parameters$year_GOSHIP_end, "new_era", era))
GLODAP <- GLODAP %>%
mutate(era = factor(era, c("JGOFS_WOCE", "GO_SHIP", "new_era")))
Observations collected shallower than:
were excluded from the analysis to avoid seasonal bias.
GLODAP <- GLODAP %>%
filter(depth >= parameters$depth_min)
Following restriction was considered, but is currently not implemented:
GLODAP <- GLODAP %>%
filter(bottomdepth >= parameters$bottomdepth_min)
The basin mask from the World Ocean Atlas was used. For details consult the data base subsection for WOA18 data.
Please note that some GLODAP observations were made outside the WOA18 basin mask (i.e. in marginal seas) and will be removed for further analysis.
# use only data inside basinmask
GLODAP <- inner_join(GLODAP, basinmask)
Only rows (samples) for which all relevant parameters are available were selected, ie NA’s were removed.
According to Olsen et al (2020), flags within the merged master file identify:
f:
qc:
Following flagging criteria were taken into account:
Summary statistics were calculated during cleaning process.
The vast majority of rows is removed due to missing tco2 observations.
GLODAP <- GLODAP %>%
filter(!is.na(tco2))
##
GLODAP_stats <- GLODAP %>%
summarise(tco2_values = n())
##
GLODAP_obs_grid <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "tco2_values")
GLODAP_obs <- GLODAP %>%
group_by(lat, lon) %>%
summarise(n = n()) %>%
ungroup()
map +
geom_raster(data = basinmask, aes(lon, lat, fill = basin)) +
geom_raster(data = GLODAP_obs, aes(lon, lat)) +
scale_fill_brewer(palette = "Dark2") +
theme(legend.position = "top",
legend.title = element_blank())
rm(GLODAP_obs)
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, tco2f)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era ~ tco2f) +
theme(legend.position = "top")
rm(GLODAP_obs_grid_temp)
GLODAP <- GLODAP %>%
filter(tco2f %in% parameters$flag_f)
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, tco2qc)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era ~ tco2qc) +
theme(legend.position = "top")
##
GLODAP <- GLODAP %>%
filter(tco2qc %in% parameters$flag_qc)
GLODAP_stats_temp <- GLODAP %>%
summarise(tco2_flag = n())
GLODAP_stats <- cbind(GLODAP_stats, GLODAP_stats_temp)
rm(GLODAP_stats_temp)
##
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "tco2_flag")
GLODAP_obs_grid <-
bind_rows(GLODAP_obs_grid, GLODAP_obs_grid_temp)
rm(GLODAP_obs_grid_temp)
Quite a few tco2 observations are discarded, due to missing talk data, in particular in the JGOFS/WOCE era. However, there seems to be a high number of observations remaining from the affected cruises, so that the coverage does not seem to be reduced drastically.
GLODAP <- GLODAP %>%
mutate(talkna = if_else(is.na(talk), "NA", "Value"))
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, talkna)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era ~ talkna) +
theme(legend.position = "top")
GLODAP <- GLODAP %>%
select(-talkna) %>%
filter(!is.na(talk))
##
GLODAP_stats_temp <- GLODAP %>%
summarise(talk_values = n())
GLODAP_stats <- cbind(GLODAP_stats, GLODAP_stats_temp)
rm(GLODAP_stats_temp)
##
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "talk_values")
GLODAP_obs_grid <-
bind_rows(GLODAP_obs_grid, GLODAP_obs_grid_temp)
rm(GLODAP_obs_grid_temp)
Restricting the f flag to 2 would results in data gaps in the south-east Pacific. Interpolated or calculated data (f flag 0) are therefore included.
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, talkf)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era ~ talkf) +
theme(legend.position = "top",
legend.title = element_blank())
# ###
GLODAP <- GLODAP %>%
filter(talkf %in% parameters$flag_f)
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, talkqc)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era ~ talkqc) +
theme(legend.position = "top",
legend.title = element_blank())
###
GLODAP <- GLODAP %>%
filter(talkqc %in% parameters$flag_qc)
##
GLODAP_stats_temp <- GLODAP %>%
summarise(talk_flag = n())
GLODAP_stats <- cbind(GLODAP_stats, GLODAP_stats_temp)
rm(GLODAP_stats_temp)
##
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "talk_flag")
GLODAP_obs_grid <-
bind_rows(GLODAP_obs_grid, GLODAP_obs_grid_temp)
rm(GLODAP_obs_grid_temp)
Quite a few tco2/talk observations are discarded, due to missing phosphate data. However, there seems to be a high number of observations remaining from the affected cruises, so that the coverage does not seem to be reduced drastically.
GLODAP <- GLODAP %>%
mutate(phosphatena = if_else(is.na(phosphate), "NA", "Value"))
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, phosphatena)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era ~ phosphatena) +
theme(legend.position = "top")
GLODAP <- GLODAP %>%
select(-phosphatena) %>%
filter(!is.na(phosphate))
##
GLODAP_stats_temp <- GLODAP %>%
summarise(phosphate_values = n())
GLODAP_stats <- cbind(GLODAP_stats, GLODAP_stats_temp)
rm(GLODAP_stats_temp)
##
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "phosphate_values")
GLODAP_obs_grid <-
bind_rows(GLODAP_obs_grid, GLODAP_obs_grid_temp)
rm(GLODAP_obs_grid_temp)
Restricting the f flag to 2 would not drastically reduce available data. Interpolated or calculated data (f flag 0) are therefore included.
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, phosphatef)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era~phosphatef) +
theme(legend.position = "top",
legend.title = element_blank())
###
GLODAP <- GLODAP %>%
filter(phosphatef %in% parameters$flag_f)
Phosphate data for which secondary quality was not applied (qc flag 0) were removed.
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era, phosphateqc)
map +
geom_raster(data = GLODAP_obs_grid_temp, aes(lon, lat, fill = n)) +
scale_fill_viridis_c(option = "magma",
direction = -1,
trans = "log10") +
facet_grid(era~phosphateqc) +
theme(legend.position = "top",
legend.title = element_blank())
###
GLODAP <- GLODAP %>%
filter(phosphateqc %in% parameters$flag_qc)
##
GLODAP_stats_temp <- GLODAP %>%
summarise(phosphate_flag = n())
GLODAP_stats <- cbind(GLODAP_stats, GLODAP_stats_temp)
rm(GLODAP_stats_temp)
##
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "phosphate_flag")
GLODAP_obs_grid <-
bind_rows(GLODAP_obs_grid, GLODAP_obs_grid_temp)
rm(GLODAP_obs_grid_temp)
Rows with missing eMLR variables were removed, only the qc flag was considered.
GLODAP <- GLODAP %>%
filter(!is.na(tem))
##
GLODAP <- GLODAP %>%
filter(!is.na(sal))
GLODAP <- GLODAP %>%
filter(salinityf %in% parameters$flag_f)
GLODAP <- GLODAP %>%
filter(salinityqc %in% parameters$flag_qc)
##
GLODAP <- GLODAP %>%
filter(!is.na(silicate))
GLODAP <- GLODAP %>%
filter(silicatef %in% parameters$flag_f)
GLODAP <- GLODAP %>%
filter(silicateqc %in% parameters$flag_qc)
##
GLODAP <- GLODAP %>%
filter(!is.na(oxygen))
GLODAP <- GLODAP %>%
filter(oxygenf %in% parameters$flag_f)
GLODAP <- GLODAP %>%
filter(oxygenqc %in% parameters$flag_qc)
##
GLODAP <- GLODAP %>%
filter(!is.na(aou))
GLODAP <- GLODAP %>%
filter(aouf %in% parameters$flag_f)
##
GLODAP <- GLODAP %>%
filter(!is.na(nitrate))
GLODAP <- GLODAP %>%
filter(nitratef %in% parameters$flag_f)
GLODAP <- GLODAP %>%
filter(nitrateqc %in% parameters$flag_qc)
##
GLODAP <- GLODAP %>%
filter(!is.na(depth))
GLODAP <- GLODAP %>%
filter(!is.na(gamma))
##
GLODAP_stats_temp <- GLODAP %>%
summarise(eMLR_variables = n())
GLODAP_stats <- cbind(GLODAP_stats, GLODAP_stats_temp)
rm(GLODAP_stats_temp)
##
GLODAP_obs_grid_temp <- GLODAP %>%
count(lat, lon, era) %>%
mutate(cleaning_level = "eMLR_variables")
GLODAP_obs_grid <-
bind_rows(GLODAP_obs_grid, GLODAP_obs_grid_temp)
rm(GLODAP_obs_grid_temp)
GLODAP <- GLODAP %>%
select(-ends_with(c("f", "qc")))
For harmonization with Gruber et al. (2019), cruises 1041 (A16N) and 1042 (A16S) were grouped into the GO_SHIP area despite taking place in 2013/14.
GLODAP_cruises <- GLODAP %>%
filter(basin == "Atlantic",
year %in% c(2013, 2014)) %>%
count(lat, lon, cruise)
map +
geom_raster(data = GLODAP_cruises, aes(lon, lat, fill = as.factor(cruise))) +
scale_fill_brewer(palette = "Dark2") +
theme(legend.position = "top",
legend.title = element_blank())
rm(GLODAP_cruises)
GLODAP <- GLODAP %>%
mutate(era = as.character(era)) %>%
mutate(era = if_else(cruise %in% c(1041, 1042),
"GO_SHIP", era))
GLODAP <- GLODAP %>%
mutate(era = factor(era, c("JGOFS_WOCE", "GO_SHIP", "new_era")))
GLODAP_obs_grid_clean <- GLODAP %>%
count(lat, lon) %>%
select(-n)
GLODAP_stats_long <- GLODAP_stats %>%
pivot_longer(1:length(GLODAP_stats),
names_to = "parameter",
values_to = "n")
GLODAP_stats_long %>% write_csv(
here::here(
"data/interim",
"GLODAPv2.2020_stats.csv"
)
)
##
GLODAP_obs_grid_clean %>%
write_csv(
here::here(
"data/interim",
"GLODAPv2.2020_clean_obs_grid.csv"
)
)
##
GLODAP <- GLODAP %>%
select(year, date, era, basin, lat, lon, cruise,
bottomdepth, depth, tem, sal, gamma,
tco2, talk, phosphate,
oxygen, aou, nitrate, silicate)
GLODAP %>% write_csv(
here::here(
"data/interim",
"GLODAPv2.2020_clean.csv"
)
)
Number of observations at various steps of data cleaning.
GLODAP_stats_long <- GLODAP_stats_long %>%
mutate(parameter = fct_reorder(parameter, n))
GLODAP_stats_long %>%
ggplot(aes(parameter, n/1000)) +
geom_col() +
coord_flip() +
labs(y = "n (1000)") +
theme(axis.title.y = element_blank())
rm(GLODAP_stats_long)
For the following plots, the cleaned data set was re-opened and observations were gridded spatially to intervals of:
GLODAP <- m_grid_horizontal_coarse(GLODAP)
GLODAP_histogram_lat <- GLODAP %>%
group_by(era, lat_grid, basin) %>%
tally() %>%
ungroup()
GLODAP_histogram_lat %>%
ggplot(aes(lat_grid, n, fill = era)) +
geom_col() +
scale_fill_brewer(palette = "Dark2") +
facet_wrap( ~ basin) +
coord_flip(expand = 0) +
theme(legend.position = "top",
legend.title = element_blank())
rm(GLODAP_histogram_lat)
GLODAP_histogram_year <- GLODAP %>%
group_by(year, basin) %>%
tally() %>%
ungroup()
era_median_year <- GLODAP %>%
group_by(era) %>%
summarise(t_ref = median(year)) %>%
ungroup()
era_median_year
# A tibble: 3 x 2
era t_ref
<fct> <dbl>
1 JGOFS_WOCE 1994
2 GO_SHIP 2008
3 new_era 2016
GLODAP_histogram_year %>%
ggplot() +
geom_vline(xintercept = c(
parameters$year_JGOFS_end + 0.5,
parameters$year_GOSHIP_end + 0.5
)) +
geom_col(aes(year, n, fill = basin)) +
geom_point(
data = era_median_year,
aes(t_ref, 0, shape = "Median year"),
size = 2,
fill = "white"
) +
scale_fill_brewer(palette = "Dark2") +
scale_shape_manual(values = 24, name = "") +
scale_y_continuous() +
coord_cartesian(expand = 0) +
theme(
legend.position = "top",
legend.direction = "vertical",
legend.title = element_blank(),
axis.title.x = element_blank()
)
rm(GLODAP_histogram_year,
era_median_year)
GLODAP_hovmoeller_year <- GLODAP %>%
group_by(year, lat_grid, basin) %>%
tally() %>%
ungroup()
GLODAP_hovmoeller_year %>%
ggplot(aes(year, lat_grid, fill = log10(n))) +
geom_tile() +
geom_vline(xintercept = c(1999.5, 2012.5)) +
scale_fill_viridis_c(option = "magma", direction = -1) +
facet_wrap( ~ basin, ncol = 1) +
theme(legend.position = "top",
axis.title.x = element_blank())
rm(GLODAP_hovmoeller_year)
GLODAP_obs_grid <- GLODAP_obs_grid %>%
mutate(era = factor(era, c("JGOFS_WOCE", "GO_SHIP", "new_era")),
cleaning_level = factor(
cleaning_level,
unique(GLODAP_obs_grid$cleaning_level)
))
map +
geom_raster(data = GLODAP_obs_grid %>%
filter(cleaning_level == "tco2_values") %>%
select(-cleaning_level),
aes(lon, lat, fill = "tco2_values")) +
geom_raster(data = GLODAP_obs_grid %>%
filter(cleaning_level != "tco2_values"),
aes(lon, lat, fill = "subset")) +
scale_fill_manual(values = c("red", "grey"), name = "") +
theme(legend.position = "top",
axis.title = element_blank())
map +
geom_raster(data = GLODAP_obs_grid %>%
filter(cleaning_level == "tco2_values") %>%
select(-cleaning_level),
aes(lon, lat, fill = "tco2_values")) +
geom_raster(data = GLODAP_obs_grid %>%
filter(cleaning_level != "tco2_values"),
aes(lon, lat, fill = "subset")) +
scale_fill_manual(values = c("red", "grey"), name = "") +
facet_grid(cleaning_level ~ era) +
theme(legend.position = "top",
axis.title = element_blank())
Grey pixels refer to sampling locations filtered for availability of tco2 data only.
GLODAP <- GLODAP %>%
mutate(era = factor(era, c("JGOFS_WOCE", "GO_SHIP", "new_era")))
GLODAP_tco2_grid <- GLODAP %>%
count(lat, lon, era)
map +
geom_raster(data = GLODAP_tco2_grid, aes(lon, lat), fill = "grey80") +
geom_bin2d(data = GLODAP,
aes(lon, lat),
binwidth = c(1,1)) +
scale_fill_viridis_c(option = "magma", direction = -1, trans = "log10",
name = "log10(eMLR_variables)") +
facet_wrap(~era, ncol = 1) +
theme(legend.position = "top",
axis.title = element_blank())
source("code/eda.R")
eda(GLODAP, "GLODAPv2_2020_clean")
rm(eda)
The output of an automated Exploratory Data Analysis (EDA) performed with the package DataExplorer
can be accessed here:
Link to EDA report of GLODAPv2_2020 raw data
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_Germany.1252 LC_CTYPE=English_Germany.1252
[3] LC_MONETARY=English_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=English_Germany.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lubridate_1.7.9 metR_0.7.0 scico_1.2.0 patchwork_1.0.1
[5] collapse_1.3.2 forcats_0.5.0 stringr_1.4.0 dplyr_1.0.0
[9] purrr_0.3.4 readr_1.3.1 tidyr_1.1.0 tibble_3.0.3
[13] ggplot2_3.3.2 tidyverse_1.3.0 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] httr_1.4.2 viridisLite_0.3.0 jsonlite_1.7.0 here_0.1
[5] modelr_0.1.8 Formula_1.2-3 assertthat_0.2.1 blob_1.2.1
[9] cellranger_1.1.0 yaml_2.2.1 pillar_1.4.6 backports_1.1.8
[13] lattice_0.20-41 glue_1.4.1 digest_0.6.25 RColorBrewer_1.1-2
[17] promises_1.1.1 checkmate_2.0.0 rvest_0.3.6 colorspace_1.4-1
[21] sandwich_2.5-1 htmltools_0.5.0 httpuv_1.5.4 Matrix_1.2-18
[25] pkgconfig_2.0.3 broom_0.7.0 haven_2.3.1 xtable_1.8-4
[29] scales_1.1.1 whisker_0.4 later_1.1.0.1 git2r_0.27.1
[33] farver_2.0.3 generics_0.0.2 ellipsis_0.3.1 withr_2.2.0
[37] cli_2.0.2 magrittr_1.5 crayon_1.3.4 readxl_1.3.1
[41] evaluate_0.14 fs_1.4.2 fansi_0.4.1 xml2_1.3.2
[45] tools_4.0.2 data.table_1.13.0 hms_0.5.3 lifecycle_0.2.0
[49] munsell_0.5.0 reprex_0.3.0 compiler_4.0.2 lfe_2.8-5.1
[53] rlang_0.4.7 grid_4.0.2 rstudioapi_0.11 labeling_0.3
[57] rmarkdown_2.3 gtable_0.3.0 DBI_1.1.0 R6_2.4.1
[61] zoo_1.8-8 knitr_1.30 utf8_1.1.4 rprojroot_1.3-2
[65] stringi_1.4.6 parallel_4.0.2 Rcpp_1.0.5 vctrs_0.3.2
[69] dbplyr_1.4.4 tidyselect_1.1.0 xfun_0.16