Last updated: 2019-08-19
Knit directory: polymeRID/
The command set.seed(20190729) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 6a86688 | goergen95 | 2019-08-19 | prepraration without messages and warnings |
html | 6a86688 | goergen95 | 2019-08-19 | prepraration without messages and warnings |
Rmd | a90881b | goergen95 | 2019-08-19 | prepraration without shiny servers II |
html | a90881b | goergen95 | 2019-08-19 | prepraration without shiny servers II |
Rmd | fee623f | goergen95 | 2019-08-19 | prepraration without shiny servers |
html | fee623f | goergen95 | 2019-08-19 | prepraration without shiny servers |
Rmd | 807b758 | goergen95 | 2019-08-19 | test of rendering shiny app in preparation.html |
html | 807b758 | goergen95 | 2019-08-19 | test of rendering shiny app in preparation.html |
html | b125bc5 | goergen95 | 2019-08-16 | fixed error with pca in classification - now based of training data pca |
html | 2385fbc | goergen95 | 2019-08-14 | republish for layout change |
Rmd | 5d28ce0 | goergen95 | 2019-08-14 | changed citation note |
html | 5d28ce0 | goergen95 | 2019-08-14 | changed citation note |
Rmd | afd89c2 | goergen95 | 2019-08-14 | fixed error in preparation concering FUR class |
html | afd89c2 | goergen95 | 2019-08-14 | fixed error in preparation concering FUR class |
Rmd | c3f088e | goergen95 | 2019-08-13 | started exploration tab |
html | c3f088e | goergen95 | 2019-08-13 | started exploration tab |
html | c52182b | goergen95 | 2019-08-13 | rebuid website |
html | 6e92d01 | goergen95 | 2019-08-13 | Build site. |
Rmd | 9ca3d89 | goergen95 | 2019-08-13 | added website directory mirror |
html | 9ca3d89 | goergen95 | 2019-08-13 | added website directory mirror |
html | 6cfd689 | goergen95 | 2019-08-13 | Build site. |
Rmd | 5774923 | goergen95 | 2019-08-13 | included preparation |
For this project we used a data base published online by Primpke et al. (2018). The data base can be downloaded from the electronic supplementary material of the article. The authors state that the spectra were recorded with a Bruker Tensor 27 FTIR spectrometer for the spectral range 4000 to 40 1/cm. Additionally, some data of polymer-based fibers and spectra of biological origin were received from the Bremer Faserinstitut. During preprocessing, they applied a concave rubberband correction based on 10 iterations and 64 baseline points. They also excluded the CO2 band between 2420 and 2200 1/cm by setting the data points to 0. This should be kept in mind, since any additional reference samples must undergo the same procedure for the data base to stay in a consistent state. The data provided by Primpke et al. (2018) have a spectral resolution of 2.1 1/cm. Additional reference samples therefore need to be resampled to the same spectral resolution, and the same baseline correction should be applied.
To ensure consistency, the data base was read into R and the wavenumbers were saved in a separate file for later use when resampling additional reference spectra.
library(openxlsx)
url = "https://static-content.springer.com/esm/art%3A10.1007%2Fs00216-018-1156-x/MediaObjects/216_2018_1156_MOESM2_ESM.xlsx"
data = openxlsx::read.xlsx(url)
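# extract wavenumbers from first row (the header); columns 2 to 1864 hold the spectra
wavenumbers = as.numeric(names(data)[2:1864])
# saving wavenumbers to reference sample directory
# (`ref`, the path to that directory, is not defined in this file and is assumed to be set elsewhere in the project)
saveRDS(wavenumbers, paste0(ref, "wavenumbers.rds"))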
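The resampling step mentioned above could look roughly like the following. This is a minimal sketch and not the procedure used by Primpke et al. (2018): the helper `resampleSpectrum` is hypothetical, linear interpolation with base R's `approx()` is an assumption, the toy grids stand in for real measurements and for the saved wavenumbers, and the rubberband baseline correction is not covered here.

```r
# Hypothetical helper: bring a new spectrum onto the reference wavenumber grid
# and blank the CO2 band, mirroring the preprocessing described in the text.
resampleSpectrum = function(wvn, intensity, target, co2 = c(2200, 2420)) {
  # sort by wavenumber so that interpolation runs over increasing x values
  ord = order(wvn)
  # linear interpolation at the target wavenumbers (NA outside the measured range)
  out = approx(x = wvn[ord], y = intensity[ord], xout = target)$y
  # set the data points inside the CO2 band to 0
  out[target >= co2[1] & target <= co2[2]] = 0
  out
}

# toy example: a synthetic spectrum measured on a coarser grid
wvnNew = seq(4000, 400, by = -4)
intNew = sin(wvnNew / 300) + 2
# stand-in for the grid stored in "wavenumbers.rds"
target = seq(3990, 410, by = -2)
resampled = resampleSpectrum(wvnNew, intNew, target)
```

In the project itself, `target` would presumably be the vector read back with `readRDS()` from the file saved above.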
An important feature of any data base is the distribution of the different classes. Here, we only print the 10 most common classes, since many reference samples are found only once or twice within the data base.
data$Abbreviation = as.factor(data$Abbreviation)
summary(data$Abbreviation)[1:10]
PES PP LDPE HDPE PET PE Nylon PA PS PUR
15 12 11 10 9 8 7 7 7 7
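Note that `summary()` on a factor returns the counts in level order unless the number of levels exceeds its `maxsum` argument (100 by default), so the frequency ordering seen above depends on the size of the data base. A more explicit alternative is to sort a frequency table. The following sketch uses a toy factor as a stand-in for `data$Abbreviation`:

```r
# toy factor standing in for data$Abbreviation
abbr = factor(c("PP", "PES", "PES", "PS", "PP", "PES", "wood"))

# count each class and order the counts from most to least frequent
counts = sort(table(abbr), decreasing = TRUE)

# show the two most common classes
head(counts, 2)
```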
We are interested in assigning the correct class to potential plastic particles. The most important classes in the data base are thus, for us, the ones of artificial polymer origin. However, particles of biological origin will sometimes also be subject to a spectral analysis, because they resemble microplastics in environmental samples. Any machine learning algorithm trained only with reference samples from plastics would inevitably assign a plastic class to particles of biological origin as well. It would simply assign the class with the greatest similarity among the classes it learned. This can lead to so-called false positive errors. To reduce the number of false positives, we include some of the samples of biological origin as well. We summarize these samples into broader classes.
# furs and wools
indexFur = grep("fur", data$Abbreviation)
indexWool = grep("wool", data$Abbreviation)
furs = data[c(indexFur, indexWool), ]
furs = furs[ , c(2:1864)] # leave out index column
names(furs) = paste("wvn", wavenumbers, sep="")
furs$class = "FUR"
# fibres
indexFibre = grep("fibre", data$Abbreviation)
fibre = data[indexFibre, ]
fibre = fibre[ , c(2:1864)] # leave out index column
names(fibre) = paste("wvn", wavenumbers, sep="")
fibre$class = "FIBRE"
# wood
indexWood = grep("wood", data$Abbreviation)
wood = data[indexWood, ]
wood = wood[ , c(2:1864)] # leave out index column
names(wood) = paste("wvn", wavenumbers, sep="")
wood$class = "WOOD"
# synthetic polymers
polyIndex = which(data$`Natural./Synthetic` =="synthetic polymer")
syntPolymer = data[polyIndex,]
counts = summary(syntPolymer$Abbreviation)
polyNames = names(counts)[1:10] # only major polymers
syntPolymer = syntPolymer[which(syntPolymer$Abbreviation %in% polyNames) , ]
classes = droplevels(syntPolymer$Abbreviation)
syntPolymer = syntPolymer[ , c(2:1864)] # leave out index column
names(syntPolymer) = paste("wvn",wavenumbers,sep="")
syntPolymer$class = as.character(classes)
# let's group together some synthetic polymer classes
syntPolymer$class[grep("Nylon",syntPolymer$class)] = "PA"
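The three blocks for furs, fibres and wood follow the same pattern: select rows by a keyword in the abbreviation, keep the spectral columns, rename them and attach a class label. As a sketch only, this could be wrapped in a small function. `extractClass` is a hypothetical helper that is not part of the original analysis, and unlike the code above it matches case-insensitively, which is an assumption that may or may not be desired:

```r
# Hypothetical helper: select samples whose abbreviation matches a pattern
# and return their spectra with a common class label.
extractClass = function(df, pattern, label, specCols, wavenumbers) {
  # rows whose abbreviation matches the pattern (case-insensitive by assumption)
  idx = grep(pattern, df$Abbreviation, ignore.case = TRUE)
  # keep only the spectral columns, leaving out index and meta columns
  out = df[idx, specCols, drop = FALSE]
  names(out) = paste0("wvn", wavenumbers)
  # rep() keeps the assignment valid even when no row matches
  out$class = rep(label, nrow(out))
  out
}

# toy data: an index column, two spectral columns and an abbreviation
toy = data.frame(id = 1:4,
                 a = c(0.1, 0.2, 0.3, 0.4),
                 b = c(0.5, 0.6, 0.7, 0.8),
                 Abbreviation = c("sheep wool", "Fox fur", "PP", "pine wood"),
                 stringsAsFactors = FALSE)

# furs and wools end up in one class, as in the analysis above
furs = extractClass(toy, "fur|wool", "FUR",
                    specCols = 2:3, wavenumbers = c(4000, 3998))
```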
We now bind the reference samples together and take a look at the distribution of classes in the resulting data frame, which is the concrete data base for the following calculations.
data = rbind(furs,wood,fibre,syntPolymer)
data$class = as.factor(data$class)
summary(data$class)
FIBRE FUR HDPE LDPE PA PE PES PET PP PS PUR WOOD
27 23 10 11 14 8 15 9 12 7 7 4
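The share of synthetic and biological samples can be derived directly from this summary. The sketch below re-enters the printed counts by hand; the grouping of FIBRE, FUR and WOOD as biological follows the class construction above, while the variable names are only illustrative:

```r
# class counts as printed in the summary above
counts = c(FIBRE = 27, FUR = 23, HDPE = 10, LDPE = 11, PA = 14, PE = 8,
           PES = 15, PET = 9, PP = 12, PS = 7, PUR = 7, WOOD = 4)

# classes built from samples of biological origin
biological = c("FIBRE", "FUR", "WOOD")
origin = ifelse(names(counts) %in% biological, "biological", "synthetic")

# number of samples per origin and their share in percent
totals = tapply(counts, origin, sum)
shares = round(100 * totals / sum(counts))
print(rbind(totals, shares))
```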
In total, 93 (63%) reference samples of plastic polymers are present in the data base and 54 (37%) of biological origin. Within the plastic samples, the data is fairly balanced, with no class showing fewer than 7 samples. For the samples of biological origin, however, the class FIBRE dominates the distribution. This could prove a disadvantage if a machine learning algorithm exploits this imbalance and minimizes its error rate simply by predicting the FIBRE class more frequently. At this point, we will leave the resulting data base as it is and save it to disk. We save the data in individual files per class as well as in one comprehensive data base in csv format. This way we ensure that later extensions to the data base are easy to manage.
write.csv(data, file = paste0(ref, "reference_database.csv"), row.names=FALSE)
# writing class control file
classIndex = as.character(unique(data$class))
for (class in classIndex){
tmp = data[data$class==class , ]
write.csv(tmp, file = paste0(ref, "reference_", class, ".csv"), row.names=FALSE)
}
write(classIndex, paste0(ref, "classes.txt"))
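With this layout, the complete data base can be rebuilt from the class control file and the individual class files, for example after a single class file has been extended. The following is a minimal sketch under stated assumptions: a temporary directory (`refDir`) stands in for the project's `ref` path, and the toy data frame stands in for the real reference data.

```r
# stand-in for the project's reference directory `ref`
refDir = file.path(tempdir(), "reference")
dir.create(refDir, showWarnings = FALSE)
refDir = paste0(refDir, "/")

# toy data base with two classes, written in the same layout as above
toy = data.frame(wvn4000 = c(0.1, 0.2, 0.3),
                 wvn3998 = c(0.4, 0.5, 0.6),
                 class = c("PP", "PP", "WOOD"))
classIndex = as.character(unique(toy$class))
for (cl in classIndex) {
  write.csv(toy[toy$class == cl, ],
            file = paste0(refDir, "reference_", cl, ".csv"), row.names = FALSE)
}
write(classIndex, paste0(refDir, "classes.txt"))

# rebuild the data base: read the class names, then bind the class files
classes = readLines(paste0(refDir, "classes.txt"))
parts = lapply(classes, function(cl) {
  read.csv(paste0(refDir, "reference_", cl, ".csv"))
})
rebuilt = do.call(rbind, parts)
```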
Primpke, Sebastian, Marisa Wirth, Claudia Lorenz, and Gunnar Gerdts. 2018. “Reference database design for the automated analysis of microplastic samples based on Fourier transform infrared (FTIR) spectroscopy.” Analytical and Bioanalytical Chemistry 410 (21): 5131–41. https://doi.org/10.1007/s00216-018-1156-x.
sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 19.1
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] plotly_4.9.0 shiny_1.3.2
[3] tensorflow_1.14.0 abind_1.4-5
[5] e1071_1.7-2 keras_2.2.4.1
[7] workflowr_1.4.0.9001 baseline_1.2-1
[9] gridExtra_2.3 stringr_1.4.0
[11] prospectr_0.1.3 RcppArmadillo_0.9.600.4.0
[13] openxlsx_4.1.0.1 magrittr_1.5
[15] ggplot2_3.2.0 reshape2_1.4.3
[17] dplyr_0.8.3
loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 lattice_0.20-38 tidyr_0.8.3
[4] class_7.3-15 assertthat_0.2.1 zeallot_0.1.0
[7] rprojroot_1.3-2 digest_0.6.20 foreach_1.4.7
[10] mime_0.7 R6_2.4.0 plyr_1.8.4
[13] backports_1.1.4 evaluate_0.14 httr_1.4.1
[16] pillar_1.4.2 tfruns_1.4 rlang_0.4.0
[19] lazyeval_0.2.2 data.table_1.12.2 SparseM_1.77
[22] whisker_0.3-2 Matrix_1.2-17 reticulate_1.13
[25] rmarkdown_1.14 htmlwidgets_1.3 munsell_0.5.0
[28] compiler_3.6.1 httpuv_1.5.1 xfun_0.8
[31] pkgconfig_2.0.2 base64enc_0.1-3 htmltools_0.3.6
[34] tidyselect_0.2.5 tibble_2.1.3 codetools_0.2-16
[37] viridisLite_0.3.0 crayon_1.3.4 withr_2.1.2
[40] later_0.8.0 grid_3.6.1 jsonlite_1.6
[43] xtable_1.8-4 gtable_0.3.0 git2r_0.26.1
[46] scales_1.0.0 zip_2.0.3 stringi_1.4.3
[49] fs_1.3.1 promises_1.0.1 generics_0.0.2
[52] iterators_1.0.12 tools_3.6.1 glue_1.3.1
[55] purrr_0.3.2 yaml_2.2.0 colorspace_1.4-1
[58] knitr_1.24