Evaluation of performances

Last updated: 2019-09-01

Checks: 7 passed, 0 failed

Knit directory: 2019-feature-selection/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

Seed: set.seed(20190522)

The command set.seed(20190522) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
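
As a small illustration of why the seed matters, the sketch below draws a random subsample twice after setting the same seed, and a third time without resetting it. The vector `1:100` and the sample size are made up for illustration; only the seed value is taken from this report.

```r
# Illustrative only: the data (1:100) and the sample size are assumptions;
# the seed value is the one recorded in this report.
set.seed(20190522)
first_draw <- sample(1:100, size = 5)

# Resetting the same seed reproduces the identical subsample.
set.seed(20190522)
second_draw <- sample(1:100, size = 5)

# Without resetting the seed, the random number generator continues from
# its current state, so this draw will generally differ from the first two.
third_draw <- sample(1:100, size = 5)

identical(first_draw, second_draw)
```

The first two draws match because the generator starts from the same state each time, which is what makes subsampling or permutation results repeatable across runs.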

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: 2fb8f69

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .Ruserdata/
    Ignored:    .drake/
    Ignored:    analysis/rosm.cache/
    Ignored:    data/
    Ignored:    inst/Benchmark for Filter Methods for Feature Selection in High-Dimensional  Classification Data.pdf
    Ignored:    inst/study-area-map/study-area.qgs~
    Ignored:    log/
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/
    Ignored:    reviews/
    Ignored:    rosm.cache/
    Ignored:    tests/

Untracked files:
    Untracked:  .drake_history/

Unstaged changes:
    Modified:   _drake.R
    Modified:   code/98-paper/ieee/pdf/correlation-filter-nri-1.pdf
    Modified:   code/98-paper/ieee/pdf/correlation-nbins-1.pdf
    Modified:   code/98-paper/ieee/pdf/defoliation-distribution-plot-1.pdf
    Modified:   code/98-paper/ieee/pdf/spectral-signatures-1.pdf
    Modified:   code/98-paper/journal/defoliation-distribution-plot-1.pdf
    Deleted:    docs/figure/eda.Rmd/defoliation-distribution-plot-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/filter-effect-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/performance-results-1.pdf
    Deleted:    docs/figure/filter-correlation.Rmd/correlation-filter-nri-1.pdf
    Deleted:    docs/figure/filter-correlation.Rmd/correlation-nbins-1.pdf
    Deleted:    docs/figure/spectral-signatures.Rmd/spectral-signatures-1.pdf

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File	Version	Author	Date	Message
Rmd	518d0cb	pat-s	2019-09-01	style files using tidyverse style
html	8e7e4fe	pat-s	2019-09-01	Build site.
Rmd	8941bca	pat-s	2019-09-01	wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
Rmd	297ed93	pat-s	2019-08-31	add filter vs no filter comparison plot
html	7582c67	pat-s	2019-08-31	Build site.
html	abd531f	pat-s	2019-08-31	Build site.
Rmd	9117eee	pat-s	2019-08-31	wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html	1ec8768	pat-s	2019-08-17	Build site.
html	df85aba	pat-s	2019-07-12	Build site.
html	3a44a95	pat-s	2019-07-10	Build site.
html	c238ce4	pat-s	2019-07-10	Build site.
Rmd	e98cb01	pat-s	2019-07-10	wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
Rmd	24e318f	pat-s	2019-07-01	update reports
Rmd	ca5c5bc	pat-s	2019-06-28	add eval-performance report

(Table) All learner/filter/task combinations ordered by performance.

Overall leaderboard across all settings, sorted from best to worst performance (ascending RMSE).

Learner Group	Task	Filter	RMSE
XGBoost	HR-NRI-VI	No Filter	33.950
RF	NRI	Car	33.958
Ridge-CV	NRI	No Filter	33.980
SVM	HR-VI	Borda	34.254
SVM	VI	Car	34.328
SVM	VI	Relief	34.473
SVM	VI	Borda	34.478
SVM	NRI	Info	34.480
XGBoost	HR-NRI-VI	CMIM	34.541
SVM	HR-NRI	Info	34.562
SVM	HR-VI	Info	34.596
SVM	HR-VI	Pearson	34.601
SVM	HR-NRI-VI	Car	34.606
SVM	VI	Info	34.608
SVM	HR-NRI	Car	34.609
SVM	HR-NRI-VI	Pearson	34.614
SVM	HR-VI	MRMR	34.615
SVM	VI	PCA	34.620
SVM	HR-NRI-VI	PCA	34.620
SVM	HR-NRI-VI	MRMR	34.621
SVM	HR-NRI	PCA	34.621
SVM	VI	No Filter	34.622
SVM	HR-VI	No Filter	34.622
SVM	HR-NRI-VI	No Filter	34.622
SVM	HR-NRI	No Filter	34.622
SVM	HR	No Filter	34.622
SVM	HR-VI	CMIM	34.622
SVM	HR	Pearson	34.622
SVM	HR-NRI-VI	CMIM	34.622
SVM	HR	MRMR	34.622
SVM	HR-NRI	Relief	34.622
SVM	HR-NRI	CMIM	34.622
SVM	HR-VI	PCA	34.623
SVM	HR	Info	34.623
SVM	HR	Borda	34.623
SVM	NRI	PCA	34.624
SVM	HR	CMIM	34.632
SVM	HR	PCA	34.635
SVM	VI	MRMR	34.639
SVM	HR-VI	Car	34.640
SVM	VI	Pearson	34.667
SVM	HR-NRI	MRMR	34.672
XGBoost	HR-NRI	No Filter	34.681
XGBoost	NRI	No Filter	34.707
SVM	HR	Relief	34.738
XGBoost	NRI	CMIM	34.769
RF	HR-NRI-VI	Car	34.879
XGBoost	HR-NRI-VI	Relief	34.882
XGBoost	HR-NRI-VI	Car	35.056
XGBoost	HR-NRI	Relief	35.263
XGBoost	HR-NRI-VI	MRMR	35.310
XGBoost	NRI	MRMR	35.364
XGBoost	HR-NRI	Car	35.414
XGBoost	NRI	Car	35.418
XGBoost	HR-NRI	CMIM	35.507
XGBoost	NRI	Info	35.743
XGBoost	HR-NRI	MRMR	35.876
XGBoost	HR-NRI-VI	Info	36.163
RF	HR-NRI	Car	36.243
XGBoost	HR	CMIM	36.368
SVM	HR-NRI-VI	Info	36.460
SVM	HR-NRI	Pearson	36.615
RF	HR-NRI	PCA	36.756
RF	HR-NRI-VI	PCA	36.772
SVM	VI	CMIM	36.891
RF	NRI	Relief	36.944
XGBoost	HR-NRI	Info	36.965
RF	NRI	PCA	36.977
RF	HR-NRI-VI	Relief	36.999
XGBoost	NRI	Pearson	37.008
RF	HR-NRI-VI	CMIM	37.222
RF	HR	Info	37.338
RF	NRI	CMIM	37.383
RF	HR-NRI	CMIM	37.455
SVM	HR	Car	37.485
RF	HR-NRI-VI	Pearson	37.512
XGBoost	HR	Pearson	37.568
RF	NRI	Info	37.589
RF	NRI	No Filter	37.609
XGBoost	HR-NRI	PCA	37.616
RF	HR-NRI-VI	No Filter	37.661
RF	HR-NRI	Relief	37.669
RF	HR-NRI	No Filter	37.796
RF	HR-NRI	MRMR	37.854
RF	HR-NRI	Info	37.890
RF	HR-NRI-VI	Info	37.904
XGBoost	HR	Car	37.998
RF	NRI	MRMR	37.999
RF	HR-NRI-VI	MRMR	38.047
RF	HR-NRI-VI	Borda	38.080
RF	HR-NRI	Borda	38.259
XGBoost	NRI	PCA	38.293
RF	HR	Pearson	38.321
RF	NRI	Borda	38.366
XGBoost	HR-NRI	Pearson	38.382
RF	HR-NRI	Pearson	38.473
SVM	HR-NRI-VI	Borda	38.487
XGBoost	NRI	Relief	38.504
SVM	HR-NRI	Borda	38.521
RF	HR	CMIM	38.528
RF	NRI	Pearson	38.544
RF	HR	MRMR	38.661
RF	VI	Borda	38.716
XGBoost	HR-NRI-VI	PCA	38.831
RF	HR	Borda	39.001
RF	VI	Car	39.049
RF	HR-VI	Info	39.479
RF	VI	No Filter	39.541
RF	VI	Info	39.685
XGBoost	HR	Info	39.730
XGBoost	HR-NRI-VI	Pearson	39.890
XGBoost	HR-VI	Relief	39.992
RF	HR-VI	No Filter	40.330
SVM	NRI	MRMR	40.331
SVM	NRI	Car	40.516
RF	HR-VI	Relief	40.527
RF	VI	CMIM	40.564
XGBoost	VI	Car	40.596
RF	HR	Car	40.600
RF	VI	MRMR	40.626
SVM	NRI	No Filter	40.632
RF	VI	Pearson	40.658
SVM	NRI	Pearson	40.692
RF	HR-VI	Borda	40.713
XGBoost	VI	No Filter	40.718
RF	VI	Relief	40.744
Ridge-CV	HR-NRI	No Filter	40.883
RF	HR-VI	MRMR	40.996
RF	HR-VI	CMIM	41.019
RF	HR-VI	Pearson	41.090
XGBoost	HR-VI	PCA	41.125
Ridge-CV	HR	No Filter	41.136
XGBoost	HR-VI	Pearson	41.196
XGBoost	VI	Info	41.237
XGBoost	VI	Pearson	41.321
XGBoost	HR	MRMR	41.350
RF	HR-VI	Car	41.449
RF	VI	PCA	41.458
SVM	NRI	CMIM	41.476
RF	HR	Relief	41.492
XGBoost	VI	CMIM	41.528
XGBoost	HR-VI	No Filter	41.563
SVM	NRI	Borda	41.651
XGBoost	VI	PCA	41.790
XGBoost	VI	Relief	41.860
XGBoost	VI	MRMR	41.914
XGBoost	HR-VI	Info	41.953
XGBoost	HR-VI	MRMR	41.962
RF	HR	No Filter	42.114
XGBoost	HR	Relief	42.127
SVM	NRI	Relief	42.300
RF	HR-VI	PCA	42.791
SVM	HR-VI	Relief	42.805
XGBoost	HR-VI	CMIM	42.912
RF	HR	PCA	43.776
XGBoost	HR	No Filter	43.898
XGBoost	HR-VI	Car	44.434
XGBoost	HR	PCA	45.335
SVM	HR-NRI-VI	Relief	46.288
Lasso-CV	HR	No Filter	50.453
Lasso-CV	HR-NRI	No Filter	50.855
Lasso-CV	NRI	No Filter	51.184
Lasso-CV	HR-NRI-VI	No Filter	58.263
Lasso-CV	VI	No Filter	58.329
Lasso-CV	HR-VI	No Filter	58.329
Ridge-MBO	HR-VI	No Filter	58.555
Ridge-MBO	VI	No Filter	58.555
Ridge-MBO	HR-NRI-VI	No Filter	58.555
Lasso-MBO	HR	No Filter	58.555
Lasso-MBO	NRI	No Filter	58.555
Lasso-MBO	HR-NRI	No Filter	58.555
Ridge-MBO	HR	No Filter	58.555
Ridge-MBO	NRI	No Filter	58.555
Ridge-MBO	HR-NRI	No Filter	58.555
Lasso-MBO	VI	No Filter	58.555
Lasso-MBO	HR-VI	No Filter	58.555
Lasso-MBO	HR-NRI-VI	No Filter	58.555
Ridge-CV	HR-NRI-VI	No Filter	50502559.642
Ridge-CV	HR-VI	No Filter	55034896.733
Ridge-CV	VI	No Filter	55669702.590
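
A leaderboard of this kind can be produced by ordering the results by RMSE. The following is a minimal sketch in base R, not the code used for this report: the data frame `results` and its column names are hypothetical, and the four rows are copied from the table above.

```r
# A few rows copied from the table above; the object and column names
# are illustrative assumptions, not taken from the original analysis.
results <- data.frame(
  learner = c("SVM", "XGBoost", "Ridge-CV", "RF"),
  task    = c("HR-VI", "HR-NRI-VI", "NRI", "NRI"),
  filter  = c("Borda", "No Filter", "No Filter", "Car"),
  rmse    = c(34.254, 33.950, 33.980, 33.958),
  stringsAsFactors = FALSE
)

# Lower RMSE is better, so ascending order puts the best setting first.
leaderboard <- results[order(results$rmse), ]
rownames(leaderboard) <- NULL
leaderboard
```

Because RMSE is an error measure, sorting in ascending order is what places the best-performing combination at the top.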

(Table) Best learner/filter/task combination

Learners: On which task, and with which filter, did each learner achieve its best result?

-CV: Penalized regression (L1 for Lasso, L2 for Ridge) tuned with the internal 10-fold cross-validation of the glmnet package.
-MBO: Penalized regression (L1 for Lasso, L2 for Ridge) tuned with model-based optimization (MBO) of the hyperparameters.

Task	Learner Group	Filter	RMSE
defoliation-all-plots-HR-NRI-VI	XGBoost	No Filter	33.950
defoliation-all-plots-NRI	RF	Car	33.958
defoliation-all-plots-NRI	Ridge-CV	No Filter	33.980
defoliation-all-plots-HR-VI	SVM	Borda	34.254
defoliation-all-plots-HR	Lasso-CV	No Filter	50.453
defoliation-all-plots-HR-VI	Ridge-MBO	No Filter	58.555
defoliation-all-plots-HR	Lasso-MBO	No Filter	58.555
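
The per-learner winners can be derived from the full results by keeping, for each learner, the row with the lowest RMSE. The sketch below shows one way to do this in base R; `results` and its column names are hypothetical, and the rows are a small subset copied from the leaderboard.

```r
# Subset of rows copied from the leaderboard; names are illustrative.
results <- data.frame(
  learner = c("XGBoost", "XGBoost", "RF", "RF", "SVM", "SVM"),
  task    = c("HR-NRI-VI", "HR-NRI-VI", "NRI", "NRI", "HR-VI", "VI"),
  filter  = c("No Filter", "CMIM", "Car", "No Filter", "Borda", "Car"),
  rmse    = c(33.950, 34.541, 33.958, 37.609, 34.254, 34.328),
  stringsAsFactors = FALSE
)

# For each learner, keep the single row with the lowest RMSE.
best_rows <- lapply(split(results, results$learner), function(d) {
  d[which.min(d$rmse), ]
})
best <- do.call(rbind, best_rows)

# Order the per-learner winners from best to worst.
best <- best[order(best$rmse), ]
rownames(best) <- NULL
best
```

Note that `which.min()` returns only the first minimum, so tied settings (such as the many identical 58.555 results) would be represented by a single row.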

(Plot) Best learner/filter combinations for all tasks

Version	Author	Date
8e7e4fe	pat-s	2019-09-01
7582c67	pat-s	2019-08-31
abd531f	pat-s	2019-08-31

(Plot) Best filter combination of each learner vs. no filter per task

This plot shows the net effect of applying feature selection to a learner for each task. The further left a filter appears relative to the purple dot (No Filter) for a given task, the more effective feature selection is for that learner on that task.

Version	Author	Date
8e7e4fe	pat-s	2019-09-01
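
The comparison can also be expressed numerically. Assuming the horizontal axis of the plot is RMSE (an assumption, since the figure is not reproduced here), the effect of a filter is the difference between the no-filter RMSE and the best filtered RMSE. The sketch below uses three learner/task pairs copied from the leaderboard; the object and column names are hypothetical.

```r
# Best filtered result and the "No Filter" baseline for three
# learner/task pairs, copied from the leaderboard table.
effect <- data.frame(
  learner        = c("RF", "SVM", "XGBoost"),
  task           = c("NRI", "HR-VI", "HR-NRI-VI"),
  best_filter    = c("Car", "Borda", "CMIM"),
  rmse_filter    = c(33.958, 34.254, 34.541),
  rmse_no_filter = c(37.609, 34.622, 33.950),
  stringsAsFactors = FALSE
)

# Positive values mean the filter lowered the RMSE relative to no filter.
effect$rmse_gain <- effect$rmse_no_filter - effect$rmse_filter
effect$filter_helps <- effect$rmse_gain > 0
effect
```

For these rows the gain is positive for RF on NRI and SVM on HR-VI, and negative for XGBoost on HR-NRI-VI, where the unfiltered model scored best.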

R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /opt/R/3.5.2/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/libopenblaso-r0.3.3.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xtable_1.8-3      flextable_0.5.5   ggbeeswarm_0.7.0 
 [4] ggpubr_0.1.6      ggsci_2.9         ggrepel_0.8.0    
 [7] ggplot2_3.1.0     dplyr_0.8.0.1     magrittr_1.5     
[10] mlr_2.15.0        ParamHelpers_1.12 tidyselect_0.2.5 

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0        txtq_0.1.4        lattice_0.20-38  
 [4] tidyr_0.8.2       assertthat_0.2.0  rprojroot_1.3-2  
 [7] digest_0.6.18     R6_2.4.0          plyr_1.8.4       
[10] backports_1.1.3   drake_7.5.2       evaluate_0.13    
[13] pillar_1.3.1      gdtools_0.1.7     rlang_0.3.4      
[16] lazyeval_0.2.1    uuid_0.1-2        data.table_1.12.0
[19] whisker_0.3-2     R.utils_2.8.0     R.oo_1.22.0      
[22] Matrix_1.2-15     checkmate_1.9.1   rmarkdown_1.13   
[25] labeling_0.3      splines_3.5.2     stringr_1.4.0    
[28] igraph_1.2.4      munsell_0.5.0     compiler_3.5.2   
[31] vipor_0.4.5       xfun_0.5          pkgconfig_2.0.2  
[34] base64enc_0.1-3   BBmisc_1.11       htmltools_0.3.6  
[37] tibble_2.0.1      workflowr_1.4.0   crayon_1.3.4     
[40] withr_2.1.2       R.methodsS3_1.7.1 grid_3.5.2       
[43] gtable_0.2.0      git2r_0.24.0      storr_1.2.1      
[46] scales_1.0.0      zip_2.0.0         cli_1.1.0        
[49] stringi_1.3.1     fs_1.2.6          parallelMap_1.4  
[52] xml2_1.2.0        filelock_1.0.2    fastmatch_1.1-0  
[55] tools_3.5.2       glue_1.3.0        beeswarm_0.2.3   
[58] officer_0.3.5     purrr_0.3.0       parallel_3.5.2   
[61] survival_2.43-3   yaml_2.2.0        colorspace_1.4-0 
[64] base64url_1.4     knitr_1.23

Patrick Schratz