Preparation

Darius Goergen

Last updated: 2019-08-21

Checks: 6 1

Knit directory: polymeRID/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0.9001). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20190729) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rprofile
    Ignored:    .Rproj.user/
    Ignored:    analysis/library.bib
    Ignored:    docs/figure/
    Ignored:    fun/
    Ignored:    output/20190810_1538/
    Ignored:    output/20190810_1546/
    Ignored:    output/20190810_1609/
    Ignored:    output/20190813_1044/
    Ignored:    output/logs/
    Ignored:    output/natural/
    Ignored:    output/nnet/
    Ignored:    output/svm/
    Ignored:    output/testRunII/
    Ignored:    output/testRunIII/
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/BH/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/FactoMineR/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/IDPmisc/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/KernSmooth/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/MASS/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/Matrix/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/MatrixModels/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ModelMetrics/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/R6/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/RColorBrewer/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/RCurl/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/Rcpp/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/RcppArmadillo/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/RcppEigen/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/RcppGSL/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/RcppZiggurat/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/Rfast/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/Rgtsvm/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/Rmisc/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/SQUAREM/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/SparseM/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/abind/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/askpass/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/assertthat/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/backports/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/base64enc/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/baseline/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/bit/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/bit64/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/bitops/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/boot/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/callr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/car/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/carData/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/caret/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/cellranger/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/class/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/cli/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/clipr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/cluster/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/codetools/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/colorspace/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/config/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/cowplot/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/crayon/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/crosstalk/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/curl/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/data.table/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/dendextend/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/digest/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/doParallel/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/dplyr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/e1071/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ellipse/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ellipsis/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/evaluate/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/factoextra/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/fansi/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/flashClust/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/forcats/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/foreach/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/foreign/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/fs/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/generics/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/getPass/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ggplot2/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ggpubr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ggrepel/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ggsci/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ggsignif/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/git2r/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/glue/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/gower/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/gridExtra/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/gtable/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/haven/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/hexbin/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/highr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/hms/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/htmltools/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/htmlwidgets/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/httpuv/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/httr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ipred/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/iterators/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/jsonlite/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/keras/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/kerasR/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/knitr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/labeling/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/later/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/lattice/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/lava/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/lazyeval/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/leaps/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/lme4/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/lubridate/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/magrittr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/maptools/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/markdown/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/mgcv/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/mime/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/minqa/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/munsell/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/nlme/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/nloptr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/nnet/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/numDeriv/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/openssl/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/openxlsx/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/packrat/tests/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/pbkrtest/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/pillar/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/pkgconfig/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/plogr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/plotly/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/plyr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/polynom/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/prettyunits/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/processx/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/prodlim/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/progress/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/promises/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/prospectr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/ps/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/purrr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/quantreg/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/randomForest/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/readr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/readxl/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/recipes/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rematch/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/reshape2/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/reticulate/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rio/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rlang/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rmarkdown/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rpart/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rprojroot/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rsconnect/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/rstudioapi/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/scales/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/scatterplot3d/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/shiny/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/sourcetools/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/sp/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/stringi/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/stringr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/survival/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/sys/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/tensorflow/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/tfruns/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/tibble/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/tidyr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/tidyselect/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/timeDate/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/tinytex/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/utf8/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/vctrs/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/viridis/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/viridisLite/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/whisker/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/withr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/workflowr/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/xfun/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/xtable/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/yaml/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/zeallot/
    Ignored:    packrat/lib/x86_64-pc-linux-gnu/3.6.1/zip/
    Ignored:    packrat/src/
    Ignored:    polymeRID.Rproj
    Ignored:    smp/20190812_1723_NNET/files/
    Ignored:    smp/20190812_1723_NNET/plots/
    Ignored:    smp/20190812_1729_NNET/files/
    Ignored:    smp/20190812_1729_NNET/plots/
    Ignored:    smp/20190812_1731_NNET/files/
    Ignored:    smp/20190812_1731_NNET/plots/
    Ignored:    smp/20190812_1733_NNET/files/
    Ignored:    smp/20190812_1733_NNET/plots/
    Ignored:    smp/20190815_1847_FUSION/
    Ignored:    website/

Unstaged changes:
    Modified:   analysis/preparation.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
html f2ee83c goergen95 2019-08-19 Build site.
html d960dc2 goergen95 2019-08-19 included calibration
html b846f0b goergen95 2019-08-19 Build site.
Rmd de84a71 goergen95 2019-08-19 large update for website
html de84a71 goergen95 2019-08-19 large update for website
Rmd 66caad0 goergen95 2019-08-19 new shiny link
html 66caad0 goergen95 2019-08-19 new shiny link
Rmd f43f417 goergen95 2019-08-19 added plotly shiny app
html f43f417 goergen95 2019-08-19 added plotly shiny app
Rmd adc9d8e goergen95 2019-08-19 prepraratio plotly
html adc9d8e goergen95 2019-08-19 prepraratio plotly
Rmd 6a86688 goergen95 2019-08-19 prepraration without messages and warnings
html 6a86688 goergen95 2019-08-19 prepraration without messages and warnings
Rmd a90881b goergen95 2019-08-19 prepraration without shiny servers II
html a90881b goergen95 2019-08-19 prepraration without shiny servers II
Rmd fee623f goergen95 2019-08-19 prepraration without shiny servers
html fee623f goergen95 2019-08-19 prepraration without shiny servers
Rmd 807b758 goergen95 2019-08-19 test of rendering shiny app in preparation.html
html 807b758 goergen95 2019-08-19 test of rendering shiny app in preparation.html
html b125bc5 goergen95 2019-08-16 fixed error with pca in classification - now based of training data pca
html 2385fbc goergen95 2019-08-14 republish for layout change
Rmd 5d28ce0 goergen95 2019-08-14 changed citation note
html 5d28ce0 goergen95 2019-08-14 changed citation note
Rmd afd89c2 goergen95 2019-08-14 fixed error in preparation concering FUR class
html afd89c2 goergen95 2019-08-14 fixed error in preparation concering FUR class
Rmd c3f088e goergen95 2019-08-13 started exploration tab
html c3f088e goergen95 2019-08-13 started exploration tab
html c52182b goergen95 2019-08-13 rebuid website
html 6e92d01 goergen95 2019-08-13 Build site.
Rmd 9ca3d89 goergen95 2019-08-13 added website directory mirror
html 9ca3d89 goergen95 2019-08-13 added website directory mirror
html 6cfd689 goergen95 2019-08-13 Build site.
Rmd 5774923 goergen95 2019-08-13 included preparation

Reference Data

For this project we used a data base published by Primpke et al. (2018) online. The data base can be downloaded here. The authors state that the samples were collected based on the FTIR-spectrometer Bruker Tensor 27 System for the spectral range 4000 to 40 1/cm. Additionally, some data of polymer-based fibers and spectra of biological origins were received from the Bremer Faserinstitut. During pre-processing, they applied a concave rubberband correction based on ten iterations and 64 baseline points. They also excluded the C02 band between 2200 to 2420 1/cm by setting the data points to 0. This should be kept in mind, since the inclusion of additional reference samples requires the same procedure for the database to stay in a consistent state. The data shows a spectral resolution of 2.1 1/cm. Additional reference samples need to be resampled to the same spectral resolution and the same baseline correction should be applied.

To ensure consistency, the data base was read into R and the wavenumbers were saved in a separate file for the future use of resampling additional reference spectra.

library(openxlsx)
url = "https://static-content.springer.com/esm/art%3A10.1007%2Fs00216-018-1156-x/MediaObjects/216_2018_1156_MOESM2_ESM.xlsx"

data = openxlsx::read.xlsx(url)
# extract wavenumbers from first row
wavenumbers = as.numeric(names(data)[2:1864])
# saving wavenumbers to reference sample directory
saveRDS(wavenumbers, paste0(ref, "wavenumbers.rds"))

An important feature of any data base is the distribution of different classes. Here, we only print the 20 most common classes, because there are a lot of reference samples only found once or twice within the database.

data$Abbreviation = as.factor(data$Abbreviation)
summary(data$Abbreviation)[1:10]
  PES    PP  LDPE  HDPE   PET    PE Nylon    PA    PS   PUR 
   15    12    11    10     9     8     7     7     7     7 

Construction of the Database

We are interested in assigning the correct class to potential plastic particles. The most important classes to us found in the database are the ones of artificial polymer origin. However, sometimes particles of biological origins will also be subject to a spectral analysis, because they resemble the appearance of microplastic in environmental samples. Any machine-learning algorithm trained only with reference samples from plastics would eventually assign one of these classes to the particles of biological origins. It will only assign the class with the greatest similarity to the classes it has learned. This can lead to so-called false positive errors. To reduce the occurence of false positives we include some of the samples of biological origin as well. We summarize these samples to broader classes.

# furs and wools
indexFur = grep("fur", data$Abbreviation)
indexWool = grep("wool", data$Abbreviation)
furs = data[c(indexFur, indexWool), ]
furs = furs[ , c(2:1864)] # leave out index column
names(furs) = paste("wvn", wavenumbers, sep="")
furs$class = "FUR"

# fibres
indexFibre = grep("fibre", data$Abbreviation)
fibre = data[indexFibre, ]
fibre = fibre[ , c(2:1864)] # leave out index column
names(fibre) = paste("wvn", wavenumbers, sep="")
fibre$class = "FIBRE"

# wood
indexWood = grep("wood", data$Abbreviation)
wood = data[indexWood, ]
wood = wood[ , c(2:1864)] # leave out index colums
names(wood) = paste("wvn", wavenumbers, sep="")
wood$class = "WOOD"

# synthetic polymers
polyIndex = which(data$`Natural./Synthetic` =="synthetic polymer")
syntPolymer = data[polyIndex,]
counts = summary(syntPolymer$Abbreviation)
polyNames = names(counts)[1:10] # only major polymers
syntPolymer = syntPolymer[which(syntPolymer$Abbreviation %in%  polyNames) , ]
classes = droplevels(syntPolymer$Abbreviation)
syntPolymer = syntPolymer[ , c(2:1864)] # leave out index column
names(syntPolymer) = paste("wvn",wavenumbers,sep="")
syntPolymer$class = as.character(classes)

# lets group together some synthetic polymer classes
syntPolymer$class[grep("Nylon",syntPolymer$class)] = "PA"

Class Distribution

We now bind the reference samples together and take a look at the distribution of classes in the resulting data frame, which is the concrete database used for the following calculations.

data = rbind(furs,wood,fibre,syntPolymer) 
data$class = as.factor(data$class)
summary(data$class)
FIBRE   FUR  HDPE  LDPE    PA    PE   PES   PET    PP    PS   PUR  WOOD 
   27    23    10    11    14     8    15     9    12     7     7     4 

In total, 93 (53%) reference samples of plastic polymers are present in the database and 44 (47%) of biological origin. Within the plastic samples, we found that the data is very balanced with no single class showing less than seven samples. For the samples of biological origin, however, the class FIBRE dominates the distribution. This could prove a disadvantage if a machine-learning algorithm picks up this unbalance by minimizing its error-rate simply by more frequently predicting the FIBRE class. At this point, we will leave the resulting data-base as it is and save it to disk. We save the data in individual files as well as, in a comprehensive file in .csv format. This way we ensure that later extensions to the database are more easily manageable.

write.csv(data, file = paste0(ref, "reference_database.csv"), row.names=FALSE)

# writing class control file
classIndex = as.character(unique(data$class))

for (class in classIndex){
  tmp = data[data$class==class , ]
  write.csv(tmp, file = paste0(ref, "reference_", class, ".csv"), row.names=FALSE)
}

write(classIndex, paste0(ref, "classes.txt"))

Spectra of Reference Samples

One can find below the spectra found within the database. By selecting an entry in the drop-down menu the plot of the respective class will be shown. The solid grey line in the center of the plot indicates the mean value for all samples of the respective wavenumbers. The grey ribbon indicates the standard deviation from that mean value, while the dashed lines show the minimum and the maximum values, respectively.

Citations on this page

Primpke, Sebastian, Marisa Wirth, Claudia Lorenz, and Gunnar Gerdts. 2018. “Reference database design for the automated analysis of microplastic samples based on Fourier transform infrared (FTIR) spectroscopy.” Analytical and Bioanalytical Chemistry 410 (21). Analytical; Bioanalytical Chemistry: 5131–41. https://doi.org/10.1007/s00216-018-1156-x.


sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.1

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tensorflow_1.14.0         abind_1.4-5              
 [3] e1071_1.7-2               keras_2.2.4.1            
 [5] workflowr_1.4.0.9001      baseline_1.2-1           
 [7] gridExtra_2.3             stringr_1.4.0            
 [9] prospectr_0.1.3           RcppArmadillo_0.9.600.4.0
[11] openxlsx_4.1.0.1          magrittr_1.5             
[13] ggplot2_3.2.0             reshape2_1.4.3           
[15] dplyr_0.8.3              

loaded via a namespace (and not attached):
 [1] reticulate_1.13  tidyselect_0.2.5 xfun_0.8         purrr_0.3.2     
 [5] lattice_0.20-38  colorspace_1.4-1 generics_0.0.2   htmltools_0.3.6 
 [9] base64enc_0.1-3  yaml_2.2.0       rlang_0.4.0      later_0.8.0     
[13] pillar_1.4.2     glue_1.3.1       withr_2.1.2      foreach_1.4.7   
[17] plyr_1.8.4       munsell_0.5.0    gtable_0.3.0     zip_2.0.3       
[21] codetools_0.2-16 evaluate_0.14    knitr_1.24       SparseM_1.77    
[25] tfruns_1.4       httpuv_1.5.1     class_7.3-15     highr_0.8       
[29] Rcpp_1.0.2       xtable_1.8-4     promises_1.0.1   scales_1.0.0    
[33] backports_1.1.4  jsonlite_1.6     mime_0.7         fs_1.3.1        
[37] digest_0.6.20    stringi_1.4.3    shiny_1.3.2      grid_3.6.1      
[41] rprojroot_1.3-2  tools_3.6.1      lazyeval_0.2.2   tibble_2.1.3    
[45] crayon_1.3.4     whisker_0.3-2    pkgconfig_2.0.2  zeallot_0.1.0   
[49] Matrix_1.2-17    assertthat_0.2.1 rmarkdown_1.14   iterators_1.0.12
[53] R6_2.4.0         git2r_0.26.1     compiler_3.6.1