Last updated: 2021-04-20

Checks: 1 passed, 1 warning

Knit directory: CassavaNIRS/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has staged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version fecea09. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    Hershberger_CassavaNIRS_2021.zip
    Ignored:    analysis/.DS_Store
    Ignored:    code/.DS_Store
    Ignored:    data/.DS_Store
    Ignored:    data/Cassavabase_phenotypes_20210419.csv
    Ignored:    data/Corrected_metadata/
    Ignored:    data/README.html
    Ignored:    data/README.txt
    Ignored:    data/Spectra/
    Ignored:    data/TrialNameKey.csv

Unstaged changes:
    Modified:   analysis/index.Rmd

Staged changes:
    Modified:   .gitignore
    New:        CITATION
    Modified:   README.md
    Modified:   analysis/about.Rmd
    Modified:   analysis/index.Rmd
    Deleted:    analysis/license.Rmd
    New:        analysis/manuscript_data_curation_Breedbase.Rmd
    New:        analysis/manuscript_predictions.Rmd
    New:        analysis/manuscript_subsampling.Rmd
    New:        analysis/manuscript_summary_figures.Rmd
    Modified:   code/README.md
    New:        code/run_subsampling.sh
    New:        code/server_CV.R
    New:        code/server_subsample_plsr.R
    New:        code/server_subsampling_generalized.R
    New:        code/server_within_trial_predictions_PLSR_RF_SVM.R
    New:        code/server_within_trial_predictions_RF_var_importance.R
    New:        code/subsampling_functions.R
    Modified:   data/README.md
    Modified:   output/README.md

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
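
The staged-changes warning above can be cleared by committing the R Markdown file and the files it depends on, then rebuilding the HTML. The following is a minimal sketch of that step, not the author's actual procedure. It assumes the workflowr package is installed and that the working directory is the project root containing `_workflowr.yml`. The file selection and commit message are illustrative assumptions; the paths are taken from the status listing above.

```r
# Sketch only: the file selection and commit message below are assumptions.
# Paths come from the Git status listing; adjust to what the page depends on.
files_to_commit <- c(
  "analysis/index.Rmd",
  "analysis/about.Rmd",
  "code/subsampling_functions.R"
)

# Run the workflowr steps only when the package is available and the
# working directory looks like the root of a workflowr project.
if (requireNamespace("workflowr", quietly = TRUE) &&
    file.exists("_workflowr.yml")) {

  # Report the state of the R Markdown files before committing anything.
  print(workflowr::wflow_status())

  # Preview the commit-and-build step without changing the repository.
  workflowr::wflow_publish(
    files_to_commit,
    message = "Publish updated index and supporting files",
    dry_run = TRUE
  )
}
```

If the preview lists the expected files, rerunning `wflow_publish` with `dry_run = FALSE` (the default) commits them and builds the HTML, so the results page and the code version stay linked.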


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/index.Rmd) and HTML (docs/index.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author             Date        Message
Rmd   fecea09  Jenna Hershberger  2021-04-19  Start workflowr project.

PREPRINT

This repository documents all analyses, summaries, tables, and figures associated with the following preprint: "Low-cost, handheld near-infrared spectroscopy for root dry matter content prediction in cassava."

Abstract

Over 800 million people across the tropics rely on cassava as a major source of calories. While the root dry matter content (RDMC) of this starchy root crop is important for both producers and consumers, characterization of RDMC by traditional methods is time-consuming and laborious for breeding programs. Alternate phenotyping methods have been proposed but lack the accuracy, cost, or speed ultimately needed for cassava breeding programs. For this reason, we investigated the use of a low-cost, handheld near-infrared (NIR) spectrometer for field-based RDMC prediction in cassava. Oven-dried measurements of RDMC were paired with 21,044 scans of roots of 376 diverse clones from 10 field trials in Nigeria and grouped into training and test sets based on cross-validation schemes relevant to plant breeding programs. Mean partial least squares regression model performance ranged from R²p = 0.62 to 0.89 for within-trial predictions, which is within the range achieved with laboratory-grade spectrometers in previous studies. Relative to other factors, model performance was highly impacted by the inclusion of samples from the same environment in both the training and test sets. Random forest variable importance analysis of root spectra revealed increased importance in a region previously identified as predictive of water content in plants (~950 to 990 nm). With appropriate model calibration, the tested spectrometer will allow for field-based collection of spectral data with a smartphone for accurate RDMC prediction and potentially other quality traits, a step that could be easily integrated into existing harvesting workflows of cassava breeding programs.

Manuscript