Last updated: 2019-04-09

Checks: 6 passed, 0 failed

Knit directory: rrtools-repro-research/

This reproducible R Markdown analysis was created with workflowr (version 1.2.0). The Report tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20181015) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
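As a minimal sketch of why the seed matters (the variable names here are illustrative and not part of the analysis), re-setting the same seed before a random draw replays the same pseudo-random sequence:

```r
# Re-seeding the generator replays the same pseudo-random sequence.
set.seed(20181015)
first_draw <- sample(1:100, 5)

set.seed(20181015)
second_draw <- sample(1:100, 5)

# Both draws contain exactly the same values in the same order.
identical(first_draw, second_draw)
```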

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/.DS_Store
    Ignored:    analysis/data/
    Ignored:    analysis/package.Rmd
    Ignored:    assets/
    Ignored:    docs/.DS_Store
    Ignored:    docs/assets/Boettiger-2018-Ecology_Letters.pdf
    Ignored:    docs/assets/Packaging-Data-Analytical Work-Reproducibly-Using-R-and-Friends.pdf
    Ignored:    docs/css/
    Ignored:    libs/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
html 95a9aa0 annakrystalli 2018-11-10 Build site.
html 97818bf annakrystalli 2018-11-10 Build site.
html 2c1e957 annakrystalli 2018-10-31 Build site.
html c26c936 annakrystalli 2018-10-31 Build site.
Rmd d3f45b6 annakrystalli 2018-10-31 add intro, re-publish
html 52adf4f annakrystalli 2018-10-30 Build site.
html 921a7f8 annakrystalli 2018-10-30 commit docs
Rmd f1468ac annakrystalli 2018-10-30 commit Rmd

👋 Hello and welcome

me: Dr Anna Krystalli

  • Research Software Engineer, University of Sheffield
    • twitter @annakrystalli
    • github @annakrystalli
    • email a.krystalli[at]sheffield.ac.uk
  • Editor rOpenSci

Background

  • Research is increasingly computational

  • Code and data are important research outputs
    • yet, we still focus mainly on curating papers.
  • Calls for openness
    • stick: reproducibility crisis
    • carrot: huge rewards from working open
Yet we lack conventions and technical infrastructure for such openness.


Enter the Research Compendium

The goal of a research compendium is to provide a standard and easily recognizable way for organizing the digital materials of a project to enable others to inspect, reproduce, and extend the research.

Three Generic Principles

  1. Organize files according to prevailing conventions, which:
    • helps other people recognize the structure of the project,
    • supports tool building that takes advantage of the shared structure.
  2. Separate data, method, and output, while making the relationship between them clear.

  3. Specify the computational environment that was used for the original analysis.
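To make principles 2 and 3 concrete, here is a rough base R sketch. The folder names (data, R, output) and the file name session-info.txt are assumptions chosen for illustration, not a layout prescribed by any particular tool, and everything is created in a temporary directory:

```r
# Hypothetical compendium root; a temporary directory keeps the sketch harmless.
compendium <- file.path(tempdir(), "mycompendium")

# Principle 2: keep data, method and output in separate folders.
dirs <- file.path(compendium, c("data", "R", "output"))
for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)

# Principle 3: record the computational environment alongside the analysis.
env_file <- file.path(compendium, "session-info.txt")
writeLines(capture.output(sessionInfo()), env_file)

# Inspect the resulting top-level structure.
list.files(compendium)
```

A plain text record of sessionInfo() is only one way of specifying the environment; it documents what was used without recreating it.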


R community response

R packages can be used as a research compendium for organising and sharing files!

  1. Wickham, H. (2017) Research compendia. Note prepared for the 2017 rOpenSci Unconf.

  2. Marwick, B., Boettiger, C. & Mullen, L. (2018) Packaging Data Analytical Work Reproducibly Using R (and Friends). The American Statistician, 72:1, 80-88. DOI: 10.1080/00031305.2017.1375986

Example use of the R package structure for a research compendium (source: Marwick et al., 2018)


Enter rrtools

The goal of rrtools is to provide instructions, templates, and functions for making a basic compendium suitable for writing reproducible research with R.

rrtools builds on tools & conventions for R package development to

  • organise files
  • manage dependencies
  • share code
  • document code
  • check and test code
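One of these conventions is that an R package declares its dependencies in a DESCRIPTION file. The sketch below uses only base R to write and read back such a file; the package name, title, and the packages listed under Imports are made up for illustration, and this is not how rrtools itself generates the file:

```r
# Hypothetical package metadata; names and versions are illustrative only.
desc <- data.frame(
  Package = "mycompendium",
  Title   = "Research Compendium for an Example Paper",
  Version = "0.0.1",
  Imports = "bookdown, ggplot2",
  stringsAsFactors = FALSE
)

# Write the metadata in the DCF format used by DESCRIPTION files.
desc_file <- file.path(tempdir(), "DESCRIPTION")
write.dcf(desc, file = desc_file)

# Read the file back and split the Imports field into package names.
fields  <- read.dcf(desc_file)
imports <- trimws(strsplit(fields[1, "Imports"], ",")[[1]])
imports
```

Because the dependencies live in one conventional place, tools can find and install them without reading the analysis code.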

rrtools extends and works with a number of R packages:

  • devtools: functions for package development

  • usethis: automates repetitive tasks that arise during project setup and development

  • bookdown: facilitates writing books and long-form articles/reports with R Markdown



Workshop approach

Live coding

For the majority of the workshop I will be live coding 😨 so that you can follow along. You will get a lot more out of the workshop if you do.

However, handouts of the materials we’ll cover are available if you get stuck!

Workshop materials

Data

On github: https://github.com/annakrystalli/rrtools-wkshp-materials/

  • click on Clone or download

  • click on Download ZIP

  • Unzip the file

Handouts:

<bit.ly/rrtools_handouts>


Workshop aims and objectives

In this workshop we’ll use materials associated with a published paper (text, data and code) to create a research compendium around it.


By the end of the workshop, you should be able to:

  • Create a research compendium to manage and share resources associated with an academic publication.

  • Understand the basics of managing code as an R package.

  • Produce a reproducible manuscript from a single R Markdown document.

  • Appreciate the power of convention!


It’s like agreeing that we will all drive on the left or the right. A hallmark of civilization is following conventions that constrain your behavior a little, in the name of public safety.

Jenny Bryan on Project-oriented workflows


Level

Intermediate

Prerequisites:

Familiarity with Version Control through RStudio and rmarkdown.

System Requirements:

  • Pandoc (>= 1.17.2)

  • LaTeX

If you don’t have LaTeX installed, consider installing TinyTeX, a custom LaTeX distribution based on TeX Live that is small in size but functions well in most cases, especially for R users.

Check the docs before installing.
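A quick way to see whether these tools are visible to R is sketched below. The helper has_tool() is a hypothetical name introduced here, and the check only looks for executables on the PATH; it does not verify the Pandoc version. RStudio ships its own copy of Pandoc, so a FALSE for pandoc in a plain R session does not necessarily mean RStudio is missing it.

```r
# Sys.which() returns "" when an executable is not found on the PATH.
# has_tool() is a hypothetical helper defined only for this sketch.
has_tool <- function(name) nzchar(unname(Sys.which(name)))

# TRUE means the executable was found, FALSE means it was not.
tools <- c(pandoc = has_tool("pandoc"), pdflatex = has_tool("pdflatex"))
tools
```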

Let’s dive in!



R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.3

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] workflowr_1.2.0   Rcpp_1.0.1        lubridate_1.7.4  
 [4] emo_0.0.0.9000    crayon_1.3.4      assertthat_0.2.0 
 [7] digest_0.6.18     rprojroot_1.3-2   backports_1.1.3  
[10] git2r_0.24.0.9001 magrittr_1.5      evaluate_0.13    
[13] rlang_0.3.1       stringi_1.3.1     rstudioapi_0.9.0 
[16] fs_1.2.7          whisker_0.3-2     rmarkdown_1.12   
[19] tools_3.5.2       stringr_1.4.0     glue_1.3.1       
[22] purrr_0.3.2       xfun_0.5          yaml_2.2.0       
[25] compiler_3.5.2    htmltools_0.3.6   knitr_1.22