Last updated: 2019-09-22

Checks: 7 passed, 0 failed

Knit directory: 2019-feature-selection/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20190522) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
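A minimal sketch of what the seed provides, using only base R. Only the seed value comes from this report; the draw itself (five values from 1 to 100) is an arbitrary illustration.

```r
# Seed value taken from the report; the sampling call is illustrative only.
set.seed(20190522)
first_draw <- sample(1:100, 5)

# Re-seeding with the same value reproduces exactly the same draw.
set.seed(20190522)
second_draw <- sample(1:100, 5)

identical(first_draw, second_draw)
```

Any step that relies on randomness, such as resampling splits, behaves the same way as long as the seed is set before that step runs.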

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .Ruserdata/
    Ignored:    .drake/
    Ignored:    analysis/rosm.cache/
    Ignored:    data/
    Ignored:    inst/Benchmark for Filter Methods for Feature Selection in High-Dimensional  Classification Data.pdf
    Ignored:    inst/study-area-map/study-area.qgs~
    Ignored:    log/
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/
    Ignored:    reviews/
    Ignored:    rosm.cache/
    Ignored:    tests/

Untracked files:
    Untracked:  .drake_history/

Unstaged changes:
    Modified:   R/06-mlr-paper.R
    Modified:   _drake.R
    Modified:   code/05-modeling/paper/feature-importance.R
    Modified:   code/06-benchmark-matrix.R
    Modified:   code/061-aggregate.R
    Modified:   code/98-paper/ieee/pdf/correlation-filter-nri-1.pdf
    Modified:   code/98-paper/ieee/pdf/correlation-nbins-1.pdf
    Modified:   code/98-paper/ieee/pdf/defoliation-distribution-plot-1.pdf
    Modified:   code/98-paper/ieee/pdf/spectral-signatures-1.pdf
    Modified:   code/98-paper/ieee/performance-best-per-learner.tex
    Modified:   code/98-paper/ieee/performance-top-20.tex
    Modified:   code/98-paper/journal/defoliation-distribution-plot-1.pdf
    Modified:   code/move-figures.R
    Deleted:    docs/figure/eval-performance.Rmd/filter-effect-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/filter-perf-all-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/performance-results-1.pdf
    Deleted:    docs/figure/spectral-signatures.Rmd/spectral-signatures-1.pdf
    Deleted:    docs/logo/life.jpg
    Modified:   inst/study-area-map/study-area.qgs

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
html c6317a8 pat-s 2019-09-19 Build site.
Rmd d7c72a8 pat-s 2019-09-19 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html 7fd40ca pat-s 2019-09-18 Build site.
Rmd 44ff84b pat-s 2019-09-18 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html 41aae14 pat-s 2019-09-12 Build site.
html ff340b8 pat-s 2019-09-03 Build site.
Rmd a524819 pat-s 2019-09-03 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html b181c52 pat-s 2019-09-02 Build site.
Rmd cf6e820 pat-s 2019-09-02 wflow_publish(“analysis/eval-performance.Rmd”)
Rmd 1bec10d pat-s 2019-09-01 no timestamps in latex tables
html 4e363ac pat-s 2019-09-01 Build site.
Rmd 518d0cb pat-s 2019-09-01 style files using tidyverse style
html 8e7e4fe pat-s 2019-09-01 Build site.
Rmd 8941bca pat-s 2019-09-01 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
Rmd 297ed93 pat-s 2019-08-31 add filter vs no filter comparison plot
html 7582c67 pat-s 2019-08-31 Build site.
html abd531f pat-s 2019-08-31 Build site.
Rmd 9117eee pat-s 2019-08-31 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
html 1ec8768 pat-s 2019-08-17 Build site.
html df85aba pat-s 2019-07-12 Build site.
html 3a44a95 pat-s 2019-07-10 Build site.
html c238ce4 pat-s 2019-07-10 Build site.
Rmd e98cb01 pat-s 2019-07-10 wflow_publish(knitr_in(“analysis/eval-performance.Rmd”), view = FALSE,
Rmd 24e318f pat-s 2019-07-01 update reports
Rmd ca5c5bc pat-s 2019-06-28 add eval-performance report

(Table) All learner/filter/task combinations ordered by performance.

Overall leaderboard across all settings, sorted from best (lowest RMSE) to worst.

Learner Group  Task       Filter     RMSE
XGBoost        HR-NRI     Pearson    33.844
XGBoost        HR-NRI     MRMR       33.945
XGBoost        HR-NRI     CMIM       33.994
XGBoost        HR         Relief     34.005
Ridge-CV       NRI        No Filter  34.050
XGBoost        HR-NRI     No Filter  34.246
SVM            HR-NRI     Pearson    34.330
RF             HR-NRI     Car        34.335
XGBoost        NRI        CMIM       34.385
XGBoost        HR-NRI-VI  Car        34.502
XGBoost        HR-NRI-VI  CMIM       34.541
XGBoost        HR-NRI-VI  No Filter  34.544
SVM            VI         Relief     34.550
SVM            HR-NRI-VI  Pearson    34.554
SVM            HR-NRI-VI  Borda      34.586
SVM            HR-NRI-VI  Info       34.588
SVM            VI         Car        34.614
SVM            HR-NRI-VI  Car        34.616
SVM            HR-NRI     PCA        34.620
SVM            VI         PCA        34.621
SVM            HR-NRI-VI  Relief     34.621
SVM            HR-NRI-VI  MRMR       34.621
SVM            HR-NRI-VI  PCA        34.621
SVM            HR-NRI     Relief     34.621
SVM            HR-NRI-VI  CMIM       34.621
SVM            VI         Borda      34.621
SVM            HR         No Filter  34.622
SVM            HR         Pearson    34.622
SVM            HR-VI      No Filter  34.622
SVM            HR-NRI-VI  No Filter  34.622
SVM            HR-VI      Borda      34.622
SVM            HR-NRI     No Filter  34.622
SVM            HR         Relief     34.622
SVM            VI         No Filter  34.622
SVM            HR-VI      PCA        34.622
SVM            VI         Info       34.624
SVM            HR-NRI     Car        34.624
SVM            HR         Info       34.624
SVM            HR-VI      Info       34.625
SVM            HR-VI      MRMR       34.625
SVM            VI         MRMR       34.626
SVM            HR         PCA        34.626
SVM            HR-VI      Pearson    34.631
SVM            HR         Borda      34.631
SVM            HR-VI      CMIM       34.631
SVM            HR         Car        34.631
SVM            HR         CMIM       34.631
XGBoost        HR-NRI-VI  Borda      34.640
SVM            HR-NRI     CMIM       34.642
SVM            HR-NRI     Borda      34.651
SVM            HR-NRI     Info       34.651
SVM            HR-VI      Car        34.661
SVM            HR         MRMR       34.663
XGBoost        NRI        Relief     34.678
SVM            VI         CMIM       34.696
XGBoost        HR-NRI     Car        34.709
SVM            VI         Pearson    34.718
RF             NRI        Car        34.749
SVM            NRI        PCA        34.899
XGBoost        HR-NRI     Borda      34.920
XGBoost        NRI        Car        35.191
XGBoost        NRI        Info       35.220
XGBoost        HR-NRI     Relief     35.235
XGBoost        HR-NRI-VI  Pearson    35.242
XGBoost        NRI        Borda      35.246
XGBoost        HR-NRI-VI  Relief     35.302
XGBoost        HR-NRI-VI  MRMR       35.347
XGBoost        HR-NRI-VI  Info       35.800
XGBoost        NRI        MRMR       36.056
RF             HR-NRI     CMIM       36.632
RF             HR         Relief     36.743
RF             HR-NRI-VI  PCA        36.857
RF             NRI        CMIM       36.968
XGBoost        NRI        No Filter  37.012
XGBoost        NRI        Pearson    37.088
RF             HR-NRI     PCA        37.195
XGBoost        HR         Car        37.242
RF             HR         Borda      37.268
XGBoost        HR         MRMR       37.272
RF             HR         Info       37.289
XGBoost        HR         CMIM       37.366
RF             HR-NRI-VI  Car        37.423
RF             HR-NRI-VI  CMIM       37.475
RF             NRI        MRMR       37.490
RF             HR-NRI     MRMR       37.512
RF             HR-NRI-VI  Pearson    37.671
RF             HR-NRI     No Filter  37.714
RF             HR-NRI-VI  No Filter  37.825
RF             NRI        No Filter  37.919
RF             NRI        Borda      37.942
RF             NRI        Relief     37.961
RF             HR-NRI-VI  Info       38.080
RF             HR-NRI     Pearson    38.137
RF             HR-NRI-VI  Borda      38.206
RF             NRI        Info       38.207
XGBoost        HR-NRI     Info       38.245
RF             HR-NRI-VI  Relief     38.256
RF             HR-NRI     Borda      38.263
XGBoost        HR-NRI     PCA        38.291
RF             NRI        Pearson    38.299
RF             HR         Pearson    38.383
RF             HR-NRI     Info       38.389
XGBoost        HR         Borda      38.425
XGBoost        VI         No Filter  38.509
RF             VI         Relief     38.538
RF             HR-NRI-VI  MRMR       38.588
RF             HR         CMIM       38.644
RF             VI         Car        38.659
RF             HR-NRI     Relief     38.963
XGBoost        VI         Car        39.190
XGBoost        HR-NRI-VI  PCA        39.193
XGBoost        NRI        PCA        39.246
XGBoost        HR-VI      PCA        39.512
RF             NRI        PCA        39.667
RF             VI         No Filter  39.742
RF             VI         Borda      40.000
RF             HR-VI      MRMR       40.049
SVM            HR-NRI     MRMR       40.125
RF             HR         Car        40.153
RF             HR-VI      Car        40.269
RF             VI         CMIM       40.276
XGBoost        HR         Info       40.335
RF             HR-VI      Info       40.431
RF             VI         Info       40.470
XGBoost        HR-VI      No Filter  40.541
RF             VI         Pearson    40.649
XGBoost        VI         CMIM       40.753
RF             HR-VI      No Filter  40.768
RF             VI         MRMR       40.770
Ridge-CV       HR-NRI     No Filter  40.883
RF             HR-VI      Borda      40.980
SVM            NRI        No Filter  41.046
RF             HR-VI      CMIM       41.057
SVM            NRI        MRMR       41.097
XGBoost        VI         MRMR       41.158
RF             HR-VI      Pearson    41.223
SVM            NRI        Car        41.250
SVM            NRI        Relief     41.306
XGBoost        HR-VI      Relief     41.337
XGBoost        HR-VI      MRMR       41.354
SVM            NRI        CMIM       41.388
SVM            NRI        Info       41.436
RF             VI         PCA        41.448
Ridge-CV       HR         No Filter  41.463
RF             HR-VI      Relief     41.530
SVM            NRI        Pearson    41.553
SVM            NRI        Borda      41.554
XGBoost        HR-VI      Info       41.680
XGBoost        HR         Pearson    41.728
XGBoost        HR-VI      CMIM       41.739
XGBoost        VI         Pearson    41.808
XGBoost        HR-VI      Pearson    42.094
XGBoost        HR-VI      Car        42.131
RF             HR         MRMR       42.302
RF             HR         No Filter  42.367
XGBoost        VI         Info       42.497
RF             HR-VI      PCA        42.819
XGBoost        VI         Relief     42.834
XGBoost        HR-VI      Borda      43.359
XGBoost        VI         PCA        43.631
RF             HR         PCA        43.846
XGBoost        VI         Borda      44.671
SVM            HR-VI      Relief     44.713
XGBoost        HR         PCA        46.163
XGBoost        HR         No Filter  46.567
Lasso-CV       HR         No Filter  50.453
Lasso-CV       HR-NRI     No Filter  50.855
Lasso-CV       NRI        No Filter  51.184
Lasso-CV       HR-NRI-VI  No Filter  58.263
Lasso-CV       VI         No Filter  58.329
Lasso-CV       HR-VI      No Filter  58.329
Ridge-MBO      HR-VI      No Filter  58.555
Ridge-MBO      VI         No Filter  58.555
Ridge-MBO      HR-NRI-VI  No Filter  58.555
Lasso-MBO      HR         No Filter  58.555
Lasso-MBO      NRI        No Filter  58.555
Lasso-MBO      HR-NRI     No Filter  58.555
Ridge-MBO      HR         No Filter  58.555
Ridge-MBO      NRI        No Filter  58.555
Ridge-MBO      HR-NRI     No Filter  58.555
Lasso-MBO      VI         No Filter  58.555
Lasso-MBO      HR-VI      No Filter  58.555
Lasso-MBO      HR-NRI-VI  No Filter  58.555
Ridge-CV       HR-NRI-VI  No Filter  50502559.642
Ridge-CV       HR-VI      No Filter  59986649.556
Ridge-CV       VI         No Filter  60698033.323
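A minimal sketch of how such a leaderboard can be ordered, using base R only. The column names (`learner_group`, `task`, `filter`, `rmse`) are assumptions made for illustration, and the data frame holds just five rows copied from the table above rather than the full benchmark results.

```r
# A handful of rows copied from the leaderboard (not the full results).
# Column names are hypothetical; the report's own objects may differ.
results <- data.frame(
  learner_group = c("RF", "XGBoost", "Ridge-CV", "SVM", "XGBoost"),
  task          = c("HR-NRI", "HR-NRI", "NRI", "HR-NRI", "HR-NRI"),
  filter        = c("Car", "Pearson", "No Filter", "Pearson", "No Filter"),
  rmse          = c(34.335, 33.844, 34.050, 34.330, 34.246),
  stringsAsFactors = FALSE
)

# RMSE is an error measure, so the best combination has the smallest value.
leaderboard <- results[order(results$rmse), ]
rownames(leaderboard) <- NULL
leaderboard
```

Sorting ascending by RMSE puts XGBoost on HR-NRI with the Pearson filter first, matching the top row of the table.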

(Table) Best learner/filter/task combination

Learners: On which task, and with which filter, did each learner achieve its best result?

-CV: penalized regression (L1 for Lasso, L2 for Ridge) tuned with the internal 10-fold CV of the glmnet package.
-MBO: penalized regression (L1 for Lasso, L2 for Ridge) tuned with MBO (model-based optimization) for hyperparameter optimization.

Task                          Learner Group  Filter     RMSE
defoliation-all-plots-HR-NRI  XGBoost        Pearson    33.844
defoliation-all-plots-NRI     Ridge-CV       No Filter  34.050
defoliation-all-plots-HR-NRI  SVM            Pearson    34.330
defoliation-all-plots-HR-NRI  RF             Car        34.335
defoliation-all-plots-HR      Lasso-CV       No Filter  50.453
defoliation-all-plots-HR-VI   Ridge-MBO      No Filter  58.555
defoliation-all-plots-HR      Lasso-MBO      No Filter  58.555
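The per-learner selection behind this table can be sketched in base R as follows. The column names are hypothetical, and the six rows are copied from the overall leaderboard for three learner groups only; the report itself may compute this differently.

```r
# Two leaderboard rows per learner group, copied from the results above.
# Column names are assumptions made for this illustration.
results <- data.frame(
  learner_group = c("XGBoost", "XGBoost", "SVM", "SVM", "RF", "RF"),
  task   = c("HR-NRI", "HR-NRI", "HR-NRI", "HR-NRI-VI", "HR-NRI", "NRI"),
  filter = c("Pearson", "MRMR", "Pearson", "Pearson", "Car", "Car"),
  rmse   = c(33.844, 33.945, 34.330, 34.554, 34.335, 34.749),
  stringsAsFactors = FALSE
)

# Split by learner group and keep the row with the smallest RMSE in each group.
best_rows <- lapply(split(results, results$learner_group), function(d) {
  d[which.min(d$rmse), ]
})
best <- do.call(rbind, best_rows)

# Order the winners themselves from best to worst.
best <- best[order(best$rmse), ]
rownames(best) <- NULL
best
```

Note that `which.min()` returns only the first minimum, so tied results (such as the many 58.555 entries for the MBO learners) would be resolved by row order.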

(Plot) Best learner/filter combinations for all tasks

Version Author Date
41aae14 pat-s 2019-09-12
b181c52 pat-s 2019-09-02
8e7e4fe pat-s 2019-09-01
7582c67 pat-s 2019-08-31
abd531f pat-s 2019-08-31

(Plot) Best filter combination of each learner vs. no filter per task vs. Borda

This plot shows the effect of applying feature selection to each learner on each task. The further left a filter appears relative to the purple dot (No Filter) for a given task, the lower its error and the more effective feature selection was for that learner on that task.

Version Author Date
c6317a8 pat-s 2019-09-19
7fd40ca pat-s 2019-09-18
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02
8e7e4fe pat-s 2019-09-01
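The comparison shown in the plot above (best filter vs. No Filter) can also be expressed as a simple difference in RMSE. The sketch below does this for the HR-NRI task with values copied from the leaderboard; the column names and the `gain` measure are assumptions for illustration, not part of the original analysis.

```r
# Best filter vs. No Filter on the HR-NRI task, values copied from the leaderboard.
# Column names and the "gain" measure are hypothetical.
effect <- data.frame(
  learner_group  = c("XGBoost", "SVM", "RF"),
  best_filter    = c("Pearson", "Pearson", "Car"),
  rmse_filter    = c(33.844, 34.330, 34.335),
  rmse_no_filter = c(34.246, 34.622, 37.714),
  stringsAsFactors = FALSE
)

# Positive gain: the filter lowered RMSE relative to No Filter.
effect$gain <- round(effect$rmse_no_filter - effect$rmse_filter, 3)

# Largest improvement first.
effect[order(-effect$gain), ]
```

On this task RF benefits most from filtering (a reduction of 3.379), while the gains for XGBoost (0.402) and SVM (0.292) are small.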

(Plot) Best filter combination of each learner vs. Borda filter

This plot compares the ensemble Borda filter with the best-scoring simple filter for each learner.

Version Author Date
7fd40ca pat-s 2019-09-18
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02
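For readers unfamiliar with Borda-style ensemble filters, the following is a minimal sketch of rank aggregation in base R. The feature names and scores are entirely hypothetical, and the rank-sum form shown here is an assumption: the implementation used in this analysis may differ in details such as tie handling or weighting.

```r
# Hypothetical filter scores for four features (higher = more relevant).
# Neither the feature names nor the values come from the report.
scores <- data.frame(
  feature = c("f1", "f2", "f3", "f4"),
  pearson = c(0.9, 0.4, 0.7, 0.1),
  relief  = c(0.5, 0.6, 0.9, 0.2),
  stringsAsFactors = FALSE
)

# Turn each filter's scores into ranks (1 = best), then sum the ranks per feature.
ranks <- sapply(scores[c("pearson", "relief")], function(s) rank(-s))
scores$borda <- rowSums(ranks)

# Features with the smallest rank sum are preferred by the ensemble.
scores[order(scores$borda), c("feature", "borda")]
```

Here `f3` wins the aggregate ranking even though it is the top feature for only one of the two filters, which is the kind of consensus effect an ensemble filter is meant to capture.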

(Plot) Performances of all filter methods

Version Author Date
7fd40ca pat-s 2019-09-18
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02

R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /opt/R/3.5.2/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/libopenblaso-r0.3.3.so

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xtable_1.8-3      flextable_0.5.5   ggbeeswarm_0.7.0 
 [4] ggpubr_0.1.6      ggsci_2.9         ggrepel_0.8.0    
 [7] ggplot2_3.1.0     dplyr_0.8.3       magrittr_1.5     
[10] mlr_2.15.0.9000   ParamHelpers_1.12 tidyselect_0.2.5 

loaded via a namespace (and not attached):
 [1] fs_1.2.6           filelock_1.0.2     RColorBrewer_1.1-2
 [4] httr_1.4.0         rprojroot_1.3-2    tools_3.5.2       
 [7] backports_1.1.3    R6_2.4.0           vipor_0.4.5       
[10] lazyeval_0.2.1     colorspace_1.4-0   withr_2.1.2       
[13] mco_1.0-15.1       compiler_3.5.2     git2r_0.24.0      
[16] parallelMap_1.4    cli_1.1.0          xml2_1.2.0        
[19] plotly_4.8.0       officer_0.3.5      labeling_0.3      
[22] scales_1.0.0       checkmate_1.9.1    plot3D_1.1.1      
[25] stringr_1.4.0      digest_0.6.18      txtq_0.1.4        
[28] rmarkdown_1.13     R.utils_2.8.0      smoof_1.5.1       
[31] base64enc_0.1-3    pkgconfig_2.0.2    htmltools_0.3.6   
[34] lhs_1.0.1          htmlwidgets_1.3    rlang_0.4.0       
[37] BBmisc_1.11        drake_7.5.2        mlrMBO_1.1.2      
[40] jsonlite_1.6       zip_2.0.0          R.oo_1.22.0       
[43] Matrix_1.2-15      Rcpp_1.0.2         munsell_0.5.0     
[46] gdtools_0.1.7      lifecycle_0.1.0    R.methodsS3_1.7.1 
[49] stringi_1.3.1      whisker_0.3-2      yaml_2.2.0        
[52] storr_1.2.1        RJSONIO_1.3-1.1    plyr_1.8.4        
[55] grid_3.5.2         misc3d_0.8-4       parallel_3.5.2    
[58] crayon_1.3.4       lattice_0.20-38    splines_3.5.2     
[61] zeallot_0.1.0      knitr_1.23         pillar_1.3.1      
[64] igraph_1.2.4       uuid_0.1-2         base64url_1.4     
[67] fastmatch_1.1-0    glue_1.3.0         evaluate_0.13     
[70] data.table_1.12.0  vctrs_0.2.0        gtable_0.2.0      
[73] purrr_0.3.0        tidyr_1.0.0        assertthat_0.2.0  
[76] xfun_0.5           survival_2.43-3    viridisLite_0.3.0 
[79] tibble_2.1.3       beeswarm_0.2.3     workflowr_1.4.0   
[82] ellipsis_0.2.0.1