Last updated: 2021-06-29
Knit directory: globalIRmap/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2).
The results on this page were generated with repository version 025ac83.
Previous versions of the repository in which the R Markdown (analysis/methods_workflow.Rmd) and HTML (docs/methods_workflow.html) files changed:
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 6e12f71 | messamat | 2021-06-14 | Build site. |
Rmd | 5e433c3 | messamat | 2021-06-14 | Publish new pages |
html | 9b48bde | messamat | 2021-01-06 | Build site. |
Rmd | f1d9dcf | messamat | 2021-01-06 | Start building up workflowr website, start incorporating mandrake (but wait as very unstable still), plan gauge selection documentation |
html | f1d9dcf | messamat | 2021-01-06 | Start building up workflowr website, start incorporating mandrake (but wait as very unstable still), plan gauge selection documentation |
This study leverages the respective strengths of R (for data wrangling, statistics, and figure-making) and Python (for spatial analysis and mapping). As a result, reproducing it requires going back and forth between these two languages and platforms. At the broadest level, the main steps of this analysis were the following:
1. Python: Pre-process and format global river network environmental attributes. For more information, see this tab on this website and the corresponding GitHub repository.
2. Python: Compile and pre-process the global river network; download and spatially pre-process streamflow gauging stations (reference data for model training and testing; for more information, see this tab), national hydrographic datasets, and on-the-ground visual observations of flow intermittence. See the globalIRmap_py GitHub repository.
3. R: QA/QC streamflow gauging station records; develop and validate random forest models; compare predictions to hydrographic datasets and on-the-ground observations; generate tables, non-spatial figures, and tabular predictions. See the globalIRmap GitHub repository.
4. Python: Join tabular predictions of flow intermittence to the global river network; join predictions to on-the-ground observations of flow intermittence. See the globalIRmap_py GitHub repository.
5. ArcMap: Create maps.
Below, we briefly explain how each of these steps was implemented. Fully reproducing the analysis, however, requires additional data that are not currently publicly available. Please contact mathis.messager@mail.mcgill.ca and/or bernhard.lehner(at)mcgill.ca for additional information should you want to reproduce the results from this study. In addition, please note that processing these data takes weeks of continuous computing on a normal workstation.
utility_functions.py:
- imports key modules.
- defines utility functions used throughout the analysis.
- defines the basic folder structure of the analysis.
runUplandWeighting.py:
- defines functions for routing data on the river network.
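As a rough illustration of what routing data on a river network can involve, the sketch below accumulates a per-reach value in the downstream direction. It is not the code in runUplandWeighting.py: the reach IDs, the `next_down` mapping, and the function name `accumulate_downstream` are all hypothetical, and the actual weighting scheme may differ.

```python
from collections import deque


def accumulate_downstream(local_values, next_down):
    """Sum each reach's local value with everything upstream of it.

    local_values: dict of reach ID -> local value (hypothetical input)
    next_down: dict of reach ID -> ID of the next reach downstream,
               or None for an outlet (hypothetical input)
    """
    # Count how many reaches drain directly into each reach.
    n_upstream = {reach: 0 for reach in local_values}
    for reach, down in next_down.items():
        if down is not None:
            n_upstream[down] += 1

    # Start from headwaters (no upstream reach) and move downstream.
    totals = dict(local_values)
    queue = deque(r for r, n in n_upstream.items() if n == 0)
    while queue:
        reach = queue.popleft()
        down = next_down.get(reach)
        if down is None:
            continue
        totals[down] += totals[reach]
        n_upstream[down] -= 1
        # A reach is complete once all its tributaries are accounted for.
        if n_upstream[down] == 0:
            queue.append(down)
    return totals


# Two headwater reaches (1, 2) join into reach 3, which flows to outlet 4.
example = accumulate_downstream(
    {1: 10.0, 2: 5.0, 3: 1.0, 4: 2.0},
    {1: 3, 2: 3, 3: 4, 4: None},
)
```

Processing reaches only after all of their tributaries are done means each reach is visited once, which matters on a network with millions of reaches.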
Downloading data requires creating a file called “configs.json” with login information for Earthdata and ALOS. For guidance on formatting the JSON configuration file, see here.
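The exact layout of the configuration file is given in the linked guidance and is not reproduced in this section. As a minimal sketch, assuming one top-level entry per service named `earthdata` and `alos` (hypothetical key names), the file could be read and checked like this before any download starts:

```python
import json
from pathlib import Path

# Hypothetical layout: the real key names are defined in the repository's guidance.
REQUIRED_SERVICES = ("earthdata", "alos")


def load_configs(path="configs.json"):
    """Read login information and fail early if a service entry is missing."""
    configs = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [name for name in REQUIRED_SERVICES if name not in configs]
    if missing:
        raise KeyError(f"configs.json is missing entries for: {', '.join(missing)}")
    return configs
```

Checking the file up front avoids discovering a missing login partway through a long download run. Because the file holds credentials, it should stay out of version control.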
Execute:
1. scripts for downloading data in any order
2. format_MODISmosaic.py
3. format_HydroSHEDS.py
4. format_WorldClim2.py
5. other formatting scripts in any order
6. runUplandWeighting_batch.py
7. runHydroATLASStatistics.py
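The ordering constraints in the list above can be sketched as a small helper that sorts a set of script names into a valid run order. This is only an illustration: the download scripts are not named in this section, so the sketch recognises them by a hypothetical `download_` prefix, and the example file names `download_example.py` and `format_other.py` are invented.

```python
# Fixed positions taken from the execution list above.
FORMAT_FIRST = ["format_MODISmosaic.py", "format_HydroSHEDS.py", "format_WorldClim2.py"]
RUN_LAST = ["runUplandWeighting_batch.py", "runHydroATLASStatistics.py"]


def execution_order(scripts, download_prefix="download_"):
    """Arrange script names so that they respect the order listed above.

    download_prefix is an assumption: download scripts are recognised here
    by a hypothetical file-name prefix.
    """
    fixed = set(FORMAT_FIRST) | set(RUN_LAST)
    downloads = sorted(s for s in scripts if s.startswith(download_prefix))
    others = sorted(
        s for s in scripts if s not in fixed and not s.startswith(download_prefix)
    )
    return (
        downloads
        + [s for s in FORMAT_FIRST if s in scripts]
        + others
        + [s for s in RUN_LAST if s in scripts]
    )


order = execution_order([
    "runHydroATLASStatistics.py",
    "format_other.py",        # hypothetical name
    "format_WorldClim2.py",
    "download_example.py",    # hypothetical name
    "format_MODISmosaic.py",
    "runUplandWeighting_batch.py",
    "format_HydroSHEDS.py",
])
```

The returned list could then be passed one entry at a time to `subprocess.run` or executed by hand. Only pass scripts that are meant to be run; helper modules such as utility_functions.py would otherwise be sorted in with the other formatting scripts.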
utility_functions.py:
- imports key modules.
- defines utility functions used throughout the analysis.
- defines the basic folder structure of the analysis.
setup_localIRformatting.py:
- defines the folder structure for formatting data to compare modeled estimates of global flow intermittence to national hydrographic datasets (Comparison_databases) and to in-situ/field-based observations of flow intermittence (Insitu_databases).
- defines functions used in formatting data for the comparisons.
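A folder layout like the one described above could be set up along these lines. This is a sketch and not the contents of setup_localIRformatting.py: the two folder names come from the text, while the root directory and the function name `make_comparison_folders` are hypothetical.

```python
from pathlib import Path


def make_comparison_folders(root):
    """Create the two comparison folders under a root directory.

    root is a hypothetical location; the folder names come from the text above.
    """
    folders = {
        name: Path(root) / name
        for name in ("Comparison_databases", "Insitu_databases")
    }
    for folder in folders.values():
        # exist_ok makes the call safe to repeat on an existing layout.
        folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Returning the paths in a dictionary lets downstream formatting scripts refer to each folder by name instead of rebuilding the path.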
Execute:
1. scripts for downloading data in any order
2. format_RiverATLAS.py
3. format_stations.py
4. format_FROndeEaudata.py
5. format_PNWdata.py