Last updated: 2019-10-02

Checks: 7 passed, 0 failed

Knit directory: 2019-feature-selection/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20190522) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
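As a small illustration of why the seed matters, the following sketch re-seeds the random number generator with the value reported above and checks that two draws are identical. The variable names and the call to sample() are only illustrative and are not taken from the analysis itself.

```r
# Re-seeding with the same value reproduces the same random draws.
set.seed(20190522)
first_draw <- sample(1:100, 5)

set.seed(20190522)
second_draw <- sample(1:100, 5)

# TRUE: both draws start from the same generator state.
identical(first_draw, second_draw)
```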

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .Ruserdata/
    Ignored:    .drake/
    Ignored:    analysis/rosm.cache/
    Ignored:    data/
    Ignored:    inst/Benchmark for Filter Methods for Feature Selection in High-Dimensional  Classification Data.pdf
    Ignored:    inst/study-area-map/study-area.qgs~
    Ignored:    log/
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/
    Ignored:    reviews/
    Ignored:    rosm.cache/
    Ignored:    tests/

Untracked files:
    Untracked:  .drake_history/

Unstaged changes:
    Modified:   .Rprofile
    Modified:   R/06-mlr-paper.R
    Modified:   _drake.R
    Modified:   code/06-benchmark-matrix.R
    Modified:   code/061-aggregate.R
    Modified:   code/98-paper/ieee/pdf/correlation-filter-nri-1.pdf
    Modified:   code/98-paper/ieee/pdf/correlation-nbins-1.pdf
    Modified:   code/98-paper/ieee/pdf/defoliation-distribution-plot-1.pdf
    Modified:   code/98-paper/ieee/pdf/spectral-signatures-1.pdf
    Modified:   code/98-paper/journal/defoliation-distribution-plot-1.pdf
    Modified:   code/99-packages.R
    Modified:   code/move-figures.R
    Deleted:    docs/figure/eval-performance.Rmd/filter-effect-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/filter-perf-all-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/performance-results-1.pdf
    Deleted:    docs/figure/spectral-signatures.Rmd/spectral-signatures-1.pdf
    Deleted:    docs/logo/life.jpg
    Modified:   inst/study-area-map/study-area.qgs
    Deleted:    packrat/src/Matrix/Matrix_1.2-14.tar.gz
    Deleted:    packrat/src/dplyr/dplyr_0.7.4.tar.gz
    Deleted:    packrat/src/igraph/igraph_1.2.2.tar.gz
    Modified:   packrat/src/igraph/igraph_1.2.4.tar.gz
    Deleted:    packrat/src/mlr/2c578dc4a2bf43041d5101df881f4d21dddd35bf.tar.gz
    Deleted:    packrat/src/mlr/57bb7819aee16b317e725b103b8a1ba0d26dc5b2.tar.gz
    Deleted:    packrat/src/tidyr/tidyr_0.8.0.tar.gz
    Modified:   packrat/src/tidyr/tidyr_0.8.2.tar.gz
    Modified:   slurm_clustermq.tmpl

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
html 49da171 pat-s 2019-09-22 Build site.
html c6317a8 pat-s 2019-09-19 Build site.
Rmd d7c72a8 pat-s 2019-09-19 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html 7fd40ca pat-s 2019-09-18 Build site.
Rmd 44ff84b pat-s 2019-09-18 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html 41aae14 pat-s 2019-09-12 Build site.
html ff340b8 pat-s 2019-09-03 Build site.
Rmd a524819 pat-s 2019-09-03 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html b181c52 pat-s 2019-09-02 Build site.
Rmd cf6e820 pat-s 2019-09-02 wflow_publish(“analysis/eval-performance.Rmd”)
Rmd 1bec10d pat-s 2019-09-01 no timestamps in latex tables
html 4e363ac pat-s 2019-09-01 Build site.
Rmd 518d0cb pat-s 2019-09-01 style files using tidyverse style
html 8e7e4fe pat-s 2019-09-01 Build site.
Rmd 8941bca pat-s 2019-09-01 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
Rmd 297ed93 pat-s 2019-08-31 add filter vs no filter comparison plot
html 7582c67 pat-s 2019-08-31 Build site.
html abd531f pat-s 2019-08-31 Build site.
Rmd 9117eee pat-s 2019-08-31 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html 1ec8768 pat-s 2019-08-17 Build site.
html df85aba pat-s 2019-07-12 Build site.
html 3a44a95 pat-s 2019-07-10 Build site.
html c238ce4 pat-s 2019-07-10 Build site.
Rmd e98cb01 pat-s 2019-07-10 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
Rmd 24e318f pat-s 2019-07-01 update reports
Rmd ca5c5bc pat-s 2019-06-28 add eval-performance report

(Table) All learner/filter/task combinations ordered by performance.

Overall leaderboard across all settings, sorted from best to worst performance (ascending RMSE).

Learner Group  Task       Filter     RMSE
XGBoost        HR-NRI-VI  Borda      32.373
XGBoost        HR         Relief     32.927
XGBoost        HR         CMIM       33.293
RF             NRI        Car        33.498
XGBoost        HR-NRI     Borda      33.615
SVM            HR-NRI     Car        33.950
XGBoost        HR         MRMR       34.041
Ridge-CV       NRI        No Filter  34.050
XGBoost        NRI        Info       34.133
XGBoost        HR-NRI-VI  Info       34.277
XGBoost        NRI        No Filter  34.302
SVM            HR-VI      Car        34.352
SVM            VI         Car        34.361
XGBoost        NRI        CMIM       34.368
SVM            HR-NRI-VI  Car        34.396
XGBoost        HR-NRI     Pearson    34.417
XGBoost        HR-NRI     Info       34.450
SVM            HR-VI      Info       34.474
SVM            HR-NRI-VI  MRMR       34.505
SVM            VI         CMIM       34.534
SVM            HR-NRI-VI  Borda      34.536
SVM            VI         Info       34.537
SVM            HR-NRI-VI  Info       34.544
XGBoost        HR-NRI-VI  Relief     34.557
SVM            VI         Pearson    34.587
SVM            HR-VI      MRMR       34.590
XGBoost        HR-NRI-VI  CMIM       34.612
SVM            HR-NRI-VI  CMIM       34.620
SVM            NRI        PCA        34.621
SVM            HR-NRI-VI  PCA        34.621
SVM            VI         PCA        34.621
SVM            HR-NRI     PCA        34.621
SVM            HR-NRI     No Filter  34.622
SVM            HR-VI      No Filter  34.622
SVM            HR-NRI-VI  No Filter  34.622
SVM            HR         No Filter  34.622
SVM            VI         No Filter  34.622
SVM            HR         Pearson    34.622
SVM            HR-VI      PCA        34.622
SVM            HR-VI      Borda      34.625
SVM            HR         PCA        34.626
SVM            HR         Borda      34.627
SVM            HR         Car        34.631
SVM            HR         MRMR       34.631
SVM            HR         CMIM       34.631
SVM            HR         Info       34.633
SVM            HR-NRI     CMIM       34.644
SVM            HR-VI      CMIM       34.653
SVM            VI         MRMR       34.666
XGBoost        HR-NRI-VI  MRMR       34.683
SVM            HR-NRI     Relief     34.683
SVM            HR         Relief     34.731
SVM            HR-NRI     MRMR       34.736
SVM            HR-VI      Pearson    34.746
XGBoost        NRI        MRMR       34.747
XGBoost        NRI        Borda      34.845
XGBoost        NRI        Relief     34.887
SVM            VI         Relief     34.935
SVM            VI         Borda      34.968
XGBoost        HR-NRI     MRMR       35.107
XGBoost        HR-NRI-VI  No Filter  35.237
XGBoost        HR-NRI     No Filter  35.385
RF             HR-NRI     Car        35.633
XGBoost        HR-NRI     Car        35.636
XGBoost        HR-NRI-VI  Car        35.754
XGBoost        NRI        Car        35.841
SVM            NRI        Car        36.032
XGBoost        HR-NRI     Relief     36.103
XGBoost        HR-NRI-VI  Pearson    36.410
XGBoost        NRI        Pearson    36.451
RF             VI         Info       36.675
XGBoost        HR-NRI     CMIM       36.692
RF             HR-NRI-VI  Car        36.893
RF             HR-NRI-VI  PCA        36.908
RF             VI         Borda      36.921
RF             HR-NRI     PCA        36.964
RF             HR-NRI-VI  CMIM       37.069
RF             HR-NRI     CMIM       37.244
RF             NRI        CMIM       37.398
RF             HR-NRI     Pearson    37.592
RF             NRI        MRMR       37.613
RF             HR-VI      MRMR       37.623
RF             HR         Info       37.632
RF             HR-NRI     No Filter  37.744
RF             HR-NRI-VI  Pearson    37.776
RF             NRI        No Filter  37.780
RF             HR-NRI-VI  No Filter  37.806
RF             HR-NRI-VI  Borda      37.843
RF             HR-NRI-VI  MRMR       37.845
RF             HR-VI      Borda      37.853
XGBoost        HR-VI      Car        37.917
XGBoost        HR-VI      Pearson    37.930
RF             HR-NRI     Info       37.949
RF             NRI        Info       37.971
RF             HR-NRI     Borda      37.989
RF             HR-NRI     MRMR       38.000
XGBoost        HR-VI      MRMR       38.058
RF             HR-NRI-VI  Info       38.161
XGBoost        HR         Borda      38.175
XGBoost        VI         CMIM       38.213
SVM            HR-VI      Relief     38.213
RF             NRI        Borda      38.215
RF             HR-NRI     Relief     38.253
SVM            HR-NRI     Pearson    38.313
RF             NRI        Pearson    38.389
XGBoost        HR         Info       38.447
XGBoost        VI         Pearson    38.546
RF             HR         Pearson    38.555
RF             HR-NRI-VI  Relief     38.577
XGBoost        VI         No Filter  38.651
XGBoost        VI         Relief     38.670
RF             HR         Car        38.689
SVM            HR-NRI     Borda      38.737
SVM            HR-NRI     Info       38.802
XGBoost        HR-VI      CMIM       38.840
XGBoost        VI         Info       38.854
RF             HR         Borda      38.860
XGBoost        VI         Borda      38.863
XGBoost        VI         MRMR       38.891
RF             NRI        Relief     38.904
XGBoost        HR         Pearson    38.907
XGBoost        HR-VI      Relief     39.007
SVM            HR-NRI-VI  Pearson    39.007
XGBoost        HR-VI      Borda      39.307
XGBoost        HR         Car        39.439
RF             HR         MRMR       39.487
XGBoost        HR-VI      Info       39.515
RF             VI         No Filter  39.573
RF             VI         MRMR       39.673
RF             NRI        PCA        39.680
XGBoost        HR-NRI     PCA        39.897
RF             VI         Car        40.007
RF             VI         Pearson    40.153
RF             HR-VI      No Filter  40.484
RF             VI         CMIM       40.711
RF             HR-VI      Car        40.794
RF             HR-VI      CMIM       40.823
Ridge-CV       HR-NRI     No Filter  40.883
RF             VI         Relief     40.999
RF             HR-VI      Info       41.028
SVM            NRI        No Filter  41.048
RF             HR-VI      Relief     41.113
SVM            NRI        MRMR       41.142
SVM            NRI        Relief     41.142
RF             HR         CMIM       41.433
SVM            NRI        Pearson    41.434
SVM            NRI        CMIM       41.459
Ridge-CV       HR         No Filter  41.463
SVM            NRI        Borda      41.520
XGBoost        VI         Car        41.584
RF             HR-VI      Pearson    41.585
RF             HR         Relief     41.595
SVM            NRI        Info       41.614
RF             VI         PCA        41.633
XGBoost        NRI        PCA        41.637
XGBoost        HR-VI      PCA        42.014
XGBoost        HR         PCA        42.053
XGBoost        HR-VI      No Filter  42.146
RF             HR         No Filter  42.186
XGBoost        VI         PCA        42.474
RF             HR-VI      PCA        42.774
XGBoost        HR-NRI-VI  PCA        43.312
RF             HR         PCA        43.636
SVM            HR-NRI-VI  Relief     43.838
XGBoost        HR         No Filter  44.350
Lasso-CV       HR         No Filter  50.453
Lasso-CV       HR-NRI     No Filter  50.855
Lasso-CV       NRI        No Filter  51.184
Lasso-CV       HR-NRI-VI  No Filter  58.263
Lasso-CV       VI         No Filter  58.329
Lasso-CV       HR-VI      No Filter  58.329
Ridge-MBO      VI         No Filter  58.555
Ridge-MBO      HR-VI      No Filter  58.555
Ridge-MBO      HR-NRI-VI  No Filter  58.555
Lasso-MBO      HR         No Filter  58.555
Lasso-MBO      NRI        No Filter  58.555
Lasso-MBO      HR-NRI     No Filter  58.555
Ridge-MBO      HR         No Filter  58.555
Ridge-MBO      NRI        No Filter  58.555
Ridge-MBO      HR-NRI     No Filter  58.555
Lasso-MBO      VI         No Filter  58.555
Lasso-MBO      HR-VI      No Filter  58.555
Lasso-MBO      HR-NRI-VI  No Filter  58.555
Ridge-CV       HR-NRI-VI  No Filter  50502559.642
Ridge-CV       HR-VI      No Filter  59986649.556
Ridge-CV       VI         No Filter  60698033.323
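The ordering of the leaderboard can be sketched in a few lines of base R. The example uses only a handful of rows copied from the table above; the data frame and its column names (learner, task, filter, rmse) are assumptions made for illustration and do not reflect the objects used in the actual analysis.

```r
# A handful of rows copied from the leaderboard (not the full results).
results <- data.frame(
  learner = c("RF", "XGBoost", "SVM", "Lasso-CV", "XGBoost"),
  task    = c("NRI", "HR-NRI-VI", "HR-NRI", "HR", "HR"),
  filter  = c("Car", "Borda", "Car", "No Filter", "Relief"),
  rmse    = c(33.498, 32.373, 33.950, 50.453, 32.927),
  stringsAsFactors = FALSE
)

# Lower RMSE is better, so an ascending sort puts the best combination first.
leaderboard <- results[order(results$rmse), ]
rownames(leaderboard) <- NULL
leaderboard
```

With these five rows, the XGBoost/HR-NRI-VI/Borda combination ends up in the first row, matching its position in the full table.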

(Table) Best learner/filter/task combination

Learners: On which task and with which filter did each learner achieve its best result?

-CV: Penalized regression (L2 for Ridge, L1 for Lasso) tuned with the internal 10-fold CV of the glmnet package.
-MBO: Penalized regression (L2 for Ridge, L1 for Lasso) tuned with model-based optimization (MBO) of the hyperparameters.

Task                             Learner Group  Filter     RMSE
defoliation-all-plots-HR-NRI-VI  XGBoost        Borda      32.373
defoliation-all-plots-NRI        RF             Car        33.498
defoliation-all-plots-HR-NRI     SVM            Car        33.950
defoliation-all-plots-NRI        Ridge-CV       No Filter  34.050
defoliation-all-plots-HR         Lasso-CV       No Filter  50.453
defoliation-all-plots-VI         Ridge-MBO      No Filter  58.555
defoliation-all-plots-HR         Lasso-MBO      No Filter  58.555
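Selecting the best row per learner can be sketched as a split by learner followed by picking the minimum RMSE. The rows are copied from the leaderboard; the data frame layout and column names are assumptions for illustration only, not the code that produced this table.

```r
# Two leaderboard rows per learner, copied from the results above.
results <- data.frame(
  learner = c("XGBoost", "XGBoost", "RF", "RF", "SVM", "SVM"),
  task    = c("HR-NRI-VI", "HR", "NRI", "HR-NRI", "HR-NRI", "HR-VI"),
  filter  = c("Borda", "Relief", "Car", "Car", "Car", "Car"),
  rmse    = c(32.373, 32.927, 33.498, 35.633, 33.950, 34.352),
  stringsAsFactors = FALSE
)

# For each learner, keep the single row with the lowest RMSE.
best_rows <- lapply(
  split(results, results$learner),
  function(df) df[which.min(df$rmse), ]
)
best <- do.call(rbind, best_rows)

# Order the per-learner winners from best to worst.
best <- best[order(best$rmse), ]
rownames(best) <- NULL
best
```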

defoliation-all-plots-HR

Lasso-MBO

No Filter

58.555

(Plot) Best learner/filter combinations for all tasks

Version Author Date
49da171 pat-s 2019-09-22
41aae14 pat-s 2019-09-12
b181c52 pat-s 2019-09-02
8e7e4fe pat-s 2019-09-01
7582c67 pat-s 2019-08-31
abd531f pat-s 2019-08-31

(Plot) Best filter combination of each learner vs. no filter per task vs. Borda

Shows the final effect of applying feature selection to a learner for each task. The further left a filter appears for a given task relative to the purple dot (No Filter), the lower its RMSE and the more effective feature selection was for that learner on that task.
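The comparison behind this plot amounts to the difference between the No Filter RMSE and the RMSE of the best filter for the same learner and task. A minimal sketch with three learner/task pairs taken from the leaderboard follows; the column names, including rmse_gain, are hypothetical.

```r
# Best filter vs. No Filter for three learner/task pairs from the leaderboard.
comparison <- data.frame(
  learner     = c("XGBoost", "RF", "SVM"),
  task        = c("HR-NRI-VI", "NRI", "HR-NRI"),
  best_filter = c("Borda", "Car", "Car"),
  rmse_filter = c(32.373, 33.498, 33.950),
  rmse_none   = c(35.237, 37.780, 34.622),
  stringsAsFactors = FALSE
)

# Positive values mean the filter lowered the RMSE compared to using all features.
comparison$rmse_gain <- comparison$rmse_none - comparison$rmse_filter
comparison
```

For these rows the gain is largest for RF on the NRI task (4.282) and smallest for SVM on HR-NRI (0.672).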

Version Author Date
49da171 pat-s 2019-09-22
c6317a8 pat-s 2019-09-19
7fd40ca pat-s 2019-09-18
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02
8e7e4fe pat-s 2019-09-01

(Plot) Best filter combination of each learner vs. Borda filter

Shows the final effect of the ensemble Borda filter vs. the best-scoring simple filter.
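To make the idea of an ensemble filter concrete, the sketch below shows a generic Borda count over the rankings of several simple filters. It is an illustration of the general technique only and is not necessarily how the Borda filter in this benchmark is implemented; the feature names and scores are invented for the example.

```r
# Hypothetical scores of four features under three simple filters
# (higher score = more relevant). These values are made up.
scores <- data.frame(
  pearson = c(0.9, 0.2, 0.5, 0.1),
  info    = c(0.7, 0.3, 0.8, 0.2),
  relief  = c(0.6, 0.1, 0.9, 0.4),
  row.names = c("f1", "f2", "f3", "f4")
)

# rank() gives 1 to the lowest score, so the top feature of each
# filter receives the most points.
rank_points <- sapply(scores, rank)
rownames(rank_points) <- rownames(scores)

# Borda count: sum the points across filters and sort descending.
borda <- sort(rowSums(rank_points), decreasing = TRUE)
borda
```

Here f3 comes out on top with 11 points, followed by f1 (10), f2 (5) and f4 (4), even though f1 is the top feature under one of the three filters.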

Version Author Date
49da171 pat-s 2019-09-22
7fd40ca pat-s 2019-09-18
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02

(Plot) Performances of all filter methods

Version Author Date
49da171 pat-s 2019-09-22
7fd40ca pat-s 2019-09-18
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02

R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /opt/spack/opt/spack/linux-centos7-x86_64/gcc-9.2.0/openblas-0.3.7-epeitvjwewaa3avb3brxkgbim4rh3qwb/lib/libopenblas_zen-r0.3.7.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xtable_1.8-3      flextable_0.5.5   ggbeeswarm_0.7.0 
 [4] ggpubr_0.1.6      ggsci_2.9         ggrepel_0.8.0    
 [7] ggplot2_3.1.0     dplyr_0.8.0.1     magrittr_1.5     
[10] mlr_2.15.0        ParamHelpers_1.12 tidyselect_0.2.5 

loaded via a namespace (and not attached):
 [1] httr_1.4.0         tidyr_0.8.2        jsonlite_1.6      
 [4] viridisLite_0.3.0  splines_3.5.2      R.utils_2.8.0     
 [7] assertthat_0.2.0   base64url_1.4      vipor_0.4.5       
[10] yaml_2.2.0         gdtools_0.1.7      mlrMBO_1.1.2      
[13] pillar_1.3.1       backports_1.1.3    lattice_0.20-38   
[16] glue_1.3.0         uuid_0.1-2         digest_0.6.18     
[19] RColorBrewer_1.1-2 checkmate_1.9.1    colorspace_1.4-0  
[22] htmltools_0.3.6    Matrix_1.2-15      R.oo_1.22.0       
[25] plyr_1.8.4         pkgconfig_2.0.2    lhs_1.0.1         
[28] misc3d_0.8-4       drake_7.5.2        purrr_0.3.0       
[31] scales_1.0.0       parallelMap_1.4    whisker_0.3-2     
[34] officer_0.3.5      mco_1.0-15.1       git2r_0.24.0      
[37] tibble_2.0.1       txtq_0.1.4         withr_2.1.2       
[40] lazyeval_0.2.1     cli_1.1.0          survival_2.43-3   
[43] RJSONIO_1.3-1.1    crayon_1.3.4       evaluate_0.13     
[46] storr_1.2.1        R.methodsS3_1.7.1  fs_1.2.6          
[49] xml2_1.2.0         beeswarm_0.2.3     tools_3.5.2       
[52] data.table_1.12.0  BBmisc_1.11        stringr_1.4.0     
[55] plotly_4.8.0       munsell_0.5.0      zip_2.0.0         
[58] compiler_3.5.2     rlang_0.3.4        plot3D_1.1.1      
[61] grid_3.5.2         htmlwidgets_1.3    igraph_1.2.4      
[64] labeling_0.3       base64enc_0.1-3    rmarkdown_1.13    
[67] gtable_0.2.0       smoof_1.5.1        R6_2.4.0          
[70] knitr_1.23         fastmatch_1.1-0    filelock_1.0.2    
[73] workflowr_1.4.0    rprojroot_1.3-2    stringi_1.3.1     
[76] parallel_3.5.2     Rcpp_1.0.0         xfun_0.5