Last updated: 2019-08-07
Checks: 2 passed, 0 failed
Knit directory: polymeRID/
This reproducible R Markdown analysis was created with workflowr (version 1.4.0.9000). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use `wflow_publish` or `wflow_git_commit`). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: fun/
Ignored: output/natural/
Ignored: output/svm/
Ignored: output/testRunII/
Ignored: output/testRunIII/
Ignored: smp/20190802_1224_FUSION/files/
Ignored: smp/20190802_1224_FUSION/plots/
Ignored: smp/20190805_1118_FUSION/
Untracked files:
Untracked: Rplots.pdf
Untracked: mod/20190805_1042/
Untracked: packrat/
Untracked: smp/20190805_1048_FUSION/
Untracked: smp/20190805_1057_SG/
Untracked: smp/20190805_1120_FUSION/
Unstaged changes:
Modified: .Rprofile
Modified: .gitignore
Modified: calibration.R
Modified: classification.R
Modified: code/functions.R
Modified: code/plotTestRun.R
Modified: code/setup.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see `?wflow_git_remote`), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | caf89e2 | goergen95 | 2019-08-07 | wflow_publish(c("analysis/index.Rmd")) |
html | 348ad0a | goergen95 | 2019-08-05 | Build site. |
Rmd | 5b8a2e6 | goergen95 | 2019-08-05 | wflow_publish(c("analysis/index.Rmd")) |
Rmd | 6c813f4 | goergen95 | 2019-07-29 | implemented workflowr |
Rmd | d525cc2 | goergen95 | 2019-07-29 | Start workflowr project. |
Welcome to my research website. Here I present the results of my work for a master’s seminar at the University of Marburg on microplastics in the environment.
The seminar was set in the context of a broader research project between the Soil and Water Ecosystems working group at the Institute of Geography and the Semiconductor Photonics working group at the Institute of Physics. It was conducted jointly by Prof. Dr. Peter Chifflard and MSc Julia Prume and focused on the analysis of environmental samples to answer different geographical questions about the how and when of microplastic particles moving in water ecosystems.
While most projects during the seminar focused on these important geographical questions, my project set out to ease the cumbersome process of categorizing FTIR spectra of particles within the samples into polymer classes. The idea was that up-to-date machine learning models applied to the high-dimensional spectral data of particles found in samples could minimize the need for human intervention in the classification process and thus significantly speed up the assignment of particles to reference polymers.
To achieve this aim, the project consisted of four distinguishable work steps:

* Preparation: First, a comprehensive database of reference spectra had to be established to enable the application of machine learning models. The preparation steps included spectral resampling and labeling of reference polymers and natural particles. Later, this database underwent baseline correction as well as different levels of pre-processing such as normalisation and Savitzky-Golay filtering (see the first sketch after this list).
* Exploration: The different pre-processing techniques were assessed by a brute-force approach in which all levels of the data were presented to different machine learning models and their capability to correctly classify the dataset was recorded. Additionally, different levels of noise were added to the data so that the models and pre-processing types which classify the spectra most robustly could be identified (the second sketch after this list illustrates the idea).
* Calibration: After the explorative stage, an algorithm that can be recalibrated to a potentially changing database needed to be established. This was important so that the code remains usable in the future, e.g. when the reference database is extended or the wavenumbers of interest change.
* Classification: This last stage is the core part of the project in the sense that real environmental samples are classified in a user-friendly way to ease the categorization process. This means that the accuracy values of a classification need to be easily accessible, as do possibilities for a human agent to assess them.