Last updated: 2021-06-24

Knit directory: globalIRmap/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The results in this page were generated with repository version 0e72638. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    R/.Rhistory
    Ignored:    analysis/.Rhistory
    Ignored:    renv/library/
    Ignored:    renv/staging/

Untracked files:
    Untracked:  .drake/
    Untracked:  .gitignore
    Untracked:  figtabres.docx
    Untracked:  schema.ini

Unstaged changes:
    Modified:   log/drake.log
    Deleted:    output/README.md

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/methods_riveratlas.Rmd) and HTML (docs/methods_riveratlas.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File | Version | Author   | Date       | Message
Rmd  | 8ad58ab | messamat | 2021-06-24 | Add documentation, format project for package building
html | 8ad58ab | messamat | 2021-06-24 | Add documentation, format project for package building
html | 6e12f71 | messamat | 2021-06-14 | Build site.
Rmd  | 5e433c3 | messamat | 2021-06-14 | Publish new pages

Global underpinning hydrography

We predicted the distribution of intermittent rivers and ephemeral streams (IRES) for river reaches in the global RiverATLAS database. RiverATLAS is a widely used representation of the global river network built on the hydrographic database HydroSHEDS. Rivers are delineated on the basis of drainage direction and flow accumulation maps derived from elevation data at a pixel resolution of 3 arcseconds (~90 m at the equator) and subsequently upscaled to 15 arcseconds (~500 m at the equator). In this study, we only included river reaches with a modelled mean annual flow (MAF) ≥ 0.1 m³ s⁻¹ and excluded: i) smaller streams (owing to increasing uncertainties in their geospatial location and in flow estimates derived from global datasets and models); and ii) sections of river reaches within lakes (identified based on HydroLAKES polygons).

Hydro-environmental predictor variables

The primary source of predictor variables was the global RiverATLAS database, version 1.0, which is a subset of the broader HydroATLAS product. RiverATLAS provides hydro-environmental information for all rivers of the world, both within their contributing local reach catchment and across the entire upstream drainage area of every reach. This information was derived by aggregating and reformatting original data from well-established global digital maps, and by accumulating them along the drainage network from headwaters to ocean outlets.

We complemented the RiverATLAS v1.0 data with three additional sets of variables. The first set of variables describes the inter-annual open surface water dynamics as determined by remote sensing imagery from 1999 to 2019 (Pickens et al. 2019). In the original dataset, each 30-m-resolution pixel that was covered by water at some point during this period was assigned one of seven ‘interannual dynamic classes’ (for example, permanent water, stable seasonal, high-frequency changes) on the basis of a time series analysis of the annual percentage of open water in the pixel. We computed the percent coverage of each of these interannual dynamic classes relative to the total area of surface water within the contributing local catchment and across the entire upstream drainage area of every river reach.
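
The percent-coverage calculation described above can be illustrated with a minimal Python sketch. It is not taken from the project code: the function name `class_percent_cover` and the input layout (a mapping from class name to surface water area within one catchment) are assumptions made for illustration.

```python
def class_percent_cover(area_by_class):
    """Return the percent of total surface water area covered by each class.

    area_by_class: dict mapping a class name to the water area of that class
    within one catchment (any consistent area unit). Hypothetical layout.
    """
    total = sum(area_by_class.values())
    if total <= 0:
        # No surface water in the catchment: report zero cover for every class.
        return {cls: 0.0 for cls in area_by_class}
    return {cls: 100.0 * area / total for cls, area in area_by_class.items()}


# Hypothetical catchment with three of the seven classes present.
example_cover = class_percent_cover(
    {"permanent water": 2.0, "stable seasonal": 1.0, "high-frequency changes": 1.0}
)
```

The same function would apply to either spatial extent (local catchment or entire upstream drainage area); only the areas passed in would differ.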

Second, we replaced the soil and climate characteristics in RiverATLAS v1.0 with updated datasets. Specifically, we computed the average texture of the top 100 cm of soil based on SoilGrids250m v2. We also updated the climate variables with WorldClim v2 (adding all bioclimatic variables to the existing set of variables) as well as the second version of the Global Aridity Index and Global Reference Evapotranspiration (Global-PET) datasets. Finally, we updated the Climate Moisture Index (CMI), computed from the annual precipitation and potential evapotranspiration datasets provided by the WorldClim v2 and Global-PET v2 databases, respectively.
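
The text above does not state the CMI formula. As a hedged illustration, the sketch below assumes the commonly used Willmott and Feddema (1992) formulation, in which the index ranges from -1 (very dry) to 1 (very wet). The function name and arguments are hypothetical, and the project code may differ.

```python
def climate_moisture_index(precip, pet):
    """Climate Moisture Index, assuming the Willmott and Feddema formulation.

    precip: annual precipitation; pet: annual potential evapotranspiration.
    Both must be non-negative and expressed in the same unit (e.g. mm/year).
    """
    if precip < 0 or pet < 0:
        raise ValueError("precip and pet must be non-negative")
    if precip == 0 and pet == 0:
        return 0.0  # convention for the degenerate case
    if precip < pet:
        return precip / pet - 1.0  # water-limited: negative values
    return 1.0 - pet / precip  # energy-limited: positive values (or 0)


cmi_dry = climate_moisture_index(500.0, 1000.0)
cmi_wet = climate_moisture_index(1000.0, 500.0)
```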

We derived a third set of variables by combining multiple variables already included in the model through algebraic operations. These metrics included the runoff coefficient (that is, the ratio of MAF to mean annual precipitation), specific discharge (that is, MAF per unit drainage area), and various temporal (for example, minimum annual/maximum annual discharge) and spatial (for example, mean elevation in local reach catchment/mean elevation in upstream drainage area) ratios.
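
These derived variables are simple ratios of existing attributes. The sketch below shows one way to compute them for a single reach. All names are hypothetical, and the ratios are taken on the raw values as described in the text, so any unit harmonisation (for example, converting MAF and precipitation to a common unit) is left out and would be an additional step.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def derived_metrics(maf, precip, drainage_area, q_min, q_max,
                    elev_local, elev_upstream):
    """Compute the ratio-based metrics for one river reach (hypothetical names)."""
    return {
        "runoff_coefficient": safe_ratio(maf, precip),
        "specific_discharge": safe_ratio(maf, drainage_area),
        "min_max_discharge_ratio": safe_ratio(q_min, q_max),
        "elevation_ratio": safe_ratio(elev_local, elev_upstream),
    }


# Hypothetical reach attributes, for illustration only.
example_metrics = derived_metrics(
    maf=2.0, precip=800.0, drainage_area=100.0,
    q_min=0.5, q_max=4.0, elev_local=300.0, elev_upstream=600.0,
)
```

Returning None for a zero denominator keeps undefined ratios explicit instead of raising an error partway through a batch of reaches.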

Pre-formatting of spatial data

One main GitHub repository was used to pre-format spatial data for this analysis: globalIRmap_HydroATLAS_py, which contains the Python code used for pre-processing hydro-environmental data. The outputs of this pre-processing are compiled as attributes of the global river network and streamflow gauging station datasets (available in the figshare data repository).

See the Getting started and Workflow pages of this website for more information on the folder structure used in this project and for broader guidance on the workflow.

Here, we briefly explain how this code was used; however, additional data (~100 GB) that are not currently publicly available are needed to fully reproduce the analysis. Please contact the authors for additional information should you want to reproduce the results of this study. In addition, please note that processing these data takes weeks of continuous computing on a standard workstation.

GitHub repository structure

Set-up

utility_functions.py:
- imports key modules.
- defines utility functions used throughout the analysis.
- defines the basic folder structure of the analysis.

runUplandWeighting.py:
- defines functions for routing data along the river network.

Download data

Downloading data requires the creation of a file called “configs.json” with login information for Earthdata and ALOS. For guidance on formatting the JSON configuration file, see here.

Pre-format data

  • format_HydroSHEDS.py: create
    • a coastal band raster (~10 pixels inland at ~450 m resolution)
    • HydroSHEDS regions of contiguous land surfaces in raster and polygon format
  • format_MODISmosaic.py: extract and mosaic MODIS ocean mask.
  • format_GLAD.py: format the surface water dynamics dataset, removing ocean pixels and aggregating data from 30 m resolution to 15 sec (~450 m) resolution (for example, computing the percentage area of seasonal surface water within each 15 sec pixel).
  • format_WorldClim2.py: resample WorldClim2 rasters (30 sec native resolution) to HydroSHEDS resolution (15 sec) and fill gaps.
  • format_GAIandCMIv2.py:
    • compute the Climate Moisture Index (CMI), based on WorldClim v2 precipitation and Global Aridity Index (GAI) v2 potential evapotranspiration data
    • resample GAI and CMI rasters to HydroSHEDS resolution (15 sec)
  • format_SoilGrids250m.py: mosaic tiles, compute aggregate texture values for the top 100 cm of soil (0-100 cm), reproject and aggregate rasters (250 m) to HydroSHEDS resolution (15 sec).
  • format_worldpop.py: aggregate (from 3 sec to 15 sec resolution) and mosaic country population rasters, associate each population pixel with a river reach (with long-term mean annual flow > 0.1 m³/s), and compute the population that is closest to each reach.
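
Several of the scripts above aggregate fine-resolution rasters to the 15 sec HydroSHEDS grid. As an illustration only, the sketch below performs a block aggregation by an integer factor (5 for 3 sec to 15 sec) on a small in-memory grid. It assumes that population counts are summed within each block so that totals are conserved, which the text does not state explicitly; the function name and list-of-lists representation are hypothetical and stand in for the GIS raster tools used by the actual scripts.

```python
def aggregate_sum(grid, factor):
    """Sum non-overlapping factor x factor blocks of a 2D grid (list of rows)."""
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for r in range(0, n_rows, factor):
        coarse_row = []
        for c in range(0, n_cols, factor):
            # Sum every fine pixel that falls inside the current coarse pixel.
            block_total = sum(
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            )
            coarse_row.append(block_total)
        coarse.append(coarse_row)
    return coarse


# A 5 x 10 grid of 3 sec pixels, one person each, becomes 1 x 2 pixels at 15 sec.
fine_grid = [[1] * 10 for _ in range(5)]
coarse_grid = aggregate_sum(fine_grid, 5)
```

For continuous variables such as soil texture, a mean over each block would be the analogous operation instead of a sum.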

Associate hydro-environmental attributes to RiverATLAS river reaches

  • runUplandWeighting_batch.py: route hydro-environmental characteristics along the global river network to yield rasters of the average value of a given hydro-environmental characteristic (e.g., global aridity index) across the entire upstream area of each pixel. Compute rasters for WorldClim, GAI, CMI, SoilGrids textures from 0 to 100 cm, and surface water dynamics.
  • runHydroATLASStatistics.py: create statistics tables of hydro-environmental attributes for every river reach in RiverATLAS. This code requires a fair amount of manual adjustment of local paths and must point to a local master spreadsheet containing the parameters of all statistics to compute. Please contact the authors for more information and for an example of such a table.
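
To make the idea of routing values along the network concrete, here is a minimal sketch of an upstream weighted average on a toy network. It is not the code in runUplandWeighting.py: the function name, the flat-list representation (each cell stores the index of its downstream cell, or -1 at an outlet) and the assumption that a cell's upstream area includes the cell itself are all illustrative.

```python
from collections import deque


def upstream_average(downstream, values, weights):
    """Weighted mean of `values` over each cell and everything draining to it.

    downstream[i]: index of the cell that cell i drains to, or -1 at an outlet.
    weights[i]: weight of cell i (e.g. its area). The network must be acyclic.
    """
    n = len(downstream)
    acc_weighted = [v * w for v, w in zip(values, weights)]
    acc_weight = list(weights)

    # Count direct inflows so cells are processed from headwaters downward.
    n_inflows = [0] * n
    for d in downstream:
        if d >= 0:
            n_inflows[d] += 1

    queue = deque(i for i in range(n) if n_inflows[i] == 0)
    while queue:
        i = queue.popleft()
        d = downstream[i]
        if d >= 0:
            # Pass the accumulated totals of cell i to its downstream neighbour.
            acc_weighted[d] += acc_weighted[i]
            acc_weight[d] += acc_weight[i]
            n_inflows[d] -= 1
            if n_inflows[d] == 0:
                queue.append(d)

    return [wv / w if w > 0 else None for wv, w in zip(acc_weighted, acc_weight)]


# Toy network: cells 0 and 1 drain to cell 2, which drains to outlet cell 3.
toy_average = upstream_average(
    downstream=[2, 2, 3, -1],
    values=[1.0, 3.0, 5.0, 7.0],
    weights=[1.0, 1.0, 1.0, 1.0],
)
```

Processing cells in topological order means each cell's totals are complete before they are passed downstream, so the whole network is handled in a single pass.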

Workflow

Execute:
1. scripts for downloading data in any order
2. format_MODISmosaic.py
3. format_HydroSHEDS.py
4. format_WorldClim2.py
5. other formatting scripts in any order
6. runUplandWeighting_batch.py
7. runHydroATLASStatistics.py
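
The ordering constraints above can be encoded in a small helper, sketched here as one possible approach. The helper `build_run_order` and the download script name in the example are hypothetical; only the formatting and routing script names come from this page.

```python
def build_run_order(download_scripts, other_format_scripts):
    """Return a script order that respects the workflow constraints listed above."""
    # Formatting scripts that must run first, in this order.
    fixed_format = [
        "format_MODISmosaic.py",
        "format_HydroSHEDS.py",
        "format_WorldClim2.py",
    ]
    # Routing and statistics always come last, in this order.
    final_steps = ["runUplandWeighting_batch.py", "runHydroATLASStatistics.py"]
    # Remaining formatting scripts may run in any order; keep the given order.
    others = [s for s in other_format_scripts if s not in fixed_format]
    return list(download_scripts) + fixed_format + others + final_steps


run_order = build_run_order(
    download_scripts=["download_example.py"],  # hypothetical script name
    other_format_scripts=[
        "format_GLAD.py",
        "format_GAIandCMIv2.py",
        "format_SoilGrids250m.py",
        "format_worldpop.py",
    ],
)
```

The resulting list could then be passed, one script at a time, to whichever Python interpreter the project environment uses.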