Last updated: 2019-09-18

Checks: 7 passed, 0 failed

Knit directory: 2019-feature-selection/

This reproducible R Markdown analysis was created with workflowr (version 1.4.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20190522) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
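As a minimal base-R sketch of what the seed buys you: re-seeding with the same value before a random draw reproduces that draw exactly. The variable names and the call to `sample()` are illustrative only and are not taken from the analysis code.

```r
# Seed the random number generator, then draw a random subsample.
set.seed(20190522)
first_draw <- sample(100, 5)

# Re-seeding with the same value reproduces exactly the same draw.
set.seed(20190522)
second_draw <- sample(100, 5)

identical(first_draw, second_draw)  # TRUE
```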

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .Ruserdata/
    Ignored:    .drake/
    Ignored:    analysis/rosm.cache/
    Ignored:    data/
    Ignored:    inst/Benchmark for Filter Methods for Feature Selection in High-Dimensional  Classification Data.pdf
    Ignored:    inst/study-area-map/study-area.qgs~
    Ignored:    log/
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/
    Ignored:    reviews/
    Ignored:    rosm.cache/
    Ignored:    tests/

Untracked files:
    Untracked:  .drake_history/

Unstaged changes:
    Modified:   R/06-mlr-paper.R
    Modified:   _drake.R
    Modified:   code/05-modeling/paper/feature-importance.R
    Modified:   code/06-benchmark-matrix.R
    Modified:   code/061-aggregate.R
    Modified:   code/98-paper/ieee/pdf/correlation-filter-nri-1.pdf
    Modified:   code/98-paper/ieee/pdf/correlation-nbins-1.pdf
    Modified:   code/98-paper/ieee/pdf/defoliation-distribution-plot-1.pdf
    Modified:   code/98-paper/ieee/pdf/spectral-signatures-1.pdf
    Modified:   code/98-paper/ieee/performance-best-per-learner.tex
    Modified:   code/98-paper/ieee/performance-top-20.tex
    Modified:   code/98-paper/journal/defoliation-distribution-plot-1.pdf
    Modified:   code/move-figures.R
    Deleted:    docs/figure/eval-performance.Rmd/filter-effect-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/filter-perf-all-1.pdf
    Deleted:    docs/figure/eval-performance.Rmd/performance-results-1.pdf
    Deleted:    docs/figure/spectral-signatures.Rmd/spectral-signatures-1.pdf
    Deleted:    docs/logo/life.jpg

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | 44ff84b | pat-s | 2019-09-18 | wflow_publish(knitr_in("analysis/eval-performance.Rmd"), view = FALSE, |
| html | 41aae14 | pat-s | 2019-09-12 | Build site. |
| html | ff340b8 | pat-s | 2019-09-03 | Build site. |
| Rmd | a524819 | pat-s | 2019-09-03 | wflow_publish(knitr_in("analysis/eval-performance.Rmd"), view = FALSE, |
| html | b181c52 | pat-s | 2019-09-02 | Build site. |
| Rmd | cf6e820 | pat-s | 2019-09-02 | wflow_publish("analysis/eval-performance.Rmd") |
| Rmd | 1bec10d | pat-s | 2019-09-01 | no timestamps in latex tables |
| html | 4e363ac | pat-s | 2019-09-01 | Build site. |
| Rmd | 518d0cb | pat-s | 2019-09-01 | style files using tidyverse style |
| html | 8e7e4fe | pat-s | 2019-09-01 | Build site. |
| Rmd | 8941bca | pat-s | 2019-09-01 | wflow_publish(knitr_in("analysis/eval-performance.Rmd"), view = FALSE, |
| Rmd | 297ed93 | pat-s | 2019-08-31 | add filter vs no filter comparison plot |
| html | 7582c67 | pat-s | 2019-08-31 | Build site. |
| html | abd531f | pat-s | 2019-08-31 | Build site. |
| Rmd | 9117eee | pat-s | 2019-08-31 | wflow_publish(knitr_in("analysis/eval-performance.Rmd"), view = FALSE, |
| html | 1ec8768 | pat-s | 2019-08-17 | Build site. |
| html | df85aba | pat-s | 2019-07-12 | Build site. |
| html | 3a44a95 | pat-s | 2019-07-10 | Build site. |
| html | c238ce4 | pat-s | 2019-07-10 | Build site. |
| Rmd | e98cb01 | pat-s | 2019-07-10 | wflow_publish(knitr_in("analysis/eval-performance.Rmd"), view = FALSE, |
| Rmd | 24e318f | pat-s | 2019-07-01 | update reports |
| Rmd | ca5c5bc | pat-s | 2019-06-28 | add eval-performance report |

(Table) All learner/filter/task combinations ordered by performance.

Overall leaderboard across all settings, sorted from best to worst performance (ascending RMSE).

| Learner Group | Task | Filter | RMSE |
|---|---|---|---|
| Ridge-CV | NRI | No Filter | 34.050 |
| XGBoost | NRI | MRMR | 34.099 |
| XGBoost | HR-NRI-VI | CMIM | 34.152 |
| XGBoost | HR-NRI-VI | Relief | 34.215 |
| XGBoost | HR-NRI | No Filter | 34.238 |
| XGBoost | NRI | Relief | 34.332 |
| SVM | HR-NRI | Car | 34.410 |
| SVM | HR-NRI | Info | 34.439 |
| XGBoost | HR-NRI | MRMR | 34.446 |
| XGBoost | HR-NRI-VI | Pearson | 34.457 |
| SVM | HR-NRI | Relief | 34.505 |
| XGBoost | HR-NRI-VI | No Filter | 34.569 |
| XGBoost | HR-NRI-VI | MRMR | 34.571 |
| SVM | HR-NRI-VI | Relief | 34.571 |
| SVM | HR-NRI-VI | Pearson | 34.585 |
| SVM | HR-NRI-VI | Borda | 34.586 |
| SVM | HR-NRI | Pearson | 34.586 |
| SVM | HR-NRI-VI | Info | 34.588 |
| SVM | VI | Car | 34.609 |
| SVM | HR-NRI-VI | MRMR | 34.615 |
| SVM | HR-NRI | PCA | 34.619 |
| SVM | HR-NRI-VI | Car | 34.620 |
| SVM | HR-NRI-VI | PCA | 34.620 |
| SVM | VI | PCA | 34.621 |
| SVM | VI | Info | 34.621 |
| SVM | HR-NRI-VI | CMIM | 34.621 |
| SVM | VI | Borda | 34.621 |
| SVM | HR-NRI | No Filter | 34.622 |
| SVM | HR | No Filter | 34.622 |
| SVM | HR-VI | No Filter | 34.622 |
| SVM | HR-NRI-VI | No Filter | 34.622 |
| SVM | HR-VI | Borda | 34.622 |
| SVM | HR | Pearson | 34.622 |
| SVM | VI | CMIM | 34.622 |
| SVM | HR-VI | Info | 34.622 |
| SVM | VI | No Filter | 34.622 |
| SVM | HR-VI | PCA | 34.622 |
| SVM | VI | MRMR | 34.622 |
| SVM | HR-NRI | CMIM | 34.624 |
| SVM | HR | Info | 34.625 |
| SVM | HR | PCA | 34.626 |
| SVM | HR-NRI | MRMR | 34.629 |
| SVM | HR-VI | Pearson | 34.631 |
| SVM | HR | Borda | 34.631 |
| SVM | HR-VI | CMIM | 34.631 |
| SVM | HR | CMIM | 34.631 |
| SVM | HR | Car | 34.631 |
| SVM | HR | MRMR | 34.631 |
| SVM | HR-VI | MRMR | 34.631 |
| SVM | VI | Pearson | 34.635 |
| XGBoost | HR-NRI-VI | Borda | 34.640 |
| SVM | HR-NRI | Borda | 34.651 |
| SVM | HR-VI | Car | 34.673 |
| SVM | VI | Relief | 34.686 |
| XGBoost | NRI | CMIM | 34.788 |
| RF | NRI | Car | 34.865 |
| SVM | NRI | PCA | 34.867 |
| XGBoost | HR-NRI-VI | Car | 34.886 |
| XGBoost | HR-NRI | Borda | 34.920 |
| XGBoost | HR-NRI | Info | 35.009 |
| XGBoost | NRI | Borda | 35.246 |
| XGBoost | HR-NRI-VI | Info | 35.276 |
| XGBoost | HR-NRI | CMIM | 35.317 |
| XGBoost | NRI | No Filter | 35.420 |
| XGBoost | NRI | Info | 35.436 |
| XGBoost | HR-NRI | Pearson | 35.610 |
| XGBoost | HR-NRI | Car | 35.793 |
| SVM | HR | Relief | 35.853 |
| XGBoost | NRI | Car | 36.124 |
| XGBoost | NRI | Pearson | 36.210 |
| RF | HR-NRI-VI | Car | 36.223 |
| RF | HR-NRI | Car | 36.500 |
| XGBoost | HR | CMIM | 36.767 |
| RF | NRI | CMIM | 36.805 |
| XGBoost | HR-NRI | Relief | 36.981 |
| RF | HR-NRI-VI | CMIM | 37.012 |
| RF | HR-NRI | PCA | 37.046 |
| RF | HR-NRI-VI | PCA | 37.112 |
| RF | HR | Info | 37.179 |
| RF | HR | Borda | 37.268 |
| XGBoost | HR | Pearson | 37.393 |
| RF | HR-NRI | CMIM | 37.465 |
| RF | HR-NRI-VI | No Filter | 37.744 |
| RF | HR-NRI | MRMR | 37.762 |
| RF | NRI | No Filter | 37.828 |
| RF | HR-NRI-VI | Info | 37.876 |
| RF | NRI | Borda | 37.942 |
| RF | HR-NRI | Pearson | 37.998 |
| RF | NRI | MRMR | 38.000 |
| RF | HR-NRI | No Filter | 38.049 |
| RF | NRI | Info | 38.132 |
| RF | HR-NRI-VI | Pearson | 38.140 |
| RF | NRI | Pearson | 38.200 |
| RF | HR-NRI-VI | Borda | 38.206 |
| RF | HR-NRI | Borda | 38.263 |
| RF | HR-VI | Relief | 38.265 |
| RF | HR-NRI | Relief | 38.277 |
| RF | VI | Car | 38.336 |
| RF | HR-NRI | Info | 38.411 |
| RF | HR | Pearson | 38.414 |
| XGBoost | HR | Borda | 38.425 |
| RF | HR | CMIM | 38.481 |
| RF | HR-NRI-VI | Relief | 38.536 |
| RF | HR-NRI-VI | MRMR | 38.588 |
| RF | NRI | Relief | 38.601 |
| RF | HR | Car | 38.667 |
| RF | HR | MRMR | 38.711 |
| XGBoost | HR | Car | 39.024 |
| XGBoost | HR | MRMR | 39.067 |
| XGBoost | VI | No Filter | 39.131 |
| XGBoost | VI | Car | 39.236 |
| RF | VI | Info | 39.493 |
| RF | VI | No Filter | 39.514 |
| XGBoost | HR-VI | CMIM | 39.794 |
| XGBoost | HR-VI | PCA | 39.959 |
| RF | VI | Borda | 40.000 |
| RF | NRI | PCA | 40.008 |
| XGBoost | HR-NRI-VI | PCA | 40.123 |
| RF | HR-VI | Info | 40.149 |
| RF | HR-VI | MRMR | 40.151 |
| RF | HR-VI | Car | 40.266 |
| XGBoost | VI | MRMR | 40.286 |
| RF | VI | CMIM | 40.330 |
| XGBoost | VI | Info | 40.425 |
| XGBoost | HR-VI | Relief | 40.454 |
| RF | VI | MRMR | 40.575 |
| XGBoost | HR-NRI | PCA | 40.596 |
| RF | VI | Pearson | 40.710 |
| RF | HR-VI | CMIM | 40.721 |
| XGBoost | HR | Info | 40.766 |
| RF | HR-VI | No Filter | 40.803 |
| Ridge-CV | HR-NRI | No Filter | 40.917 |
| XGBoost | VI | Relief | 40.966 |
| RF | HR-VI | Borda | 40.980 |
| XGBoost | HR-VI | No Filter | 41.008 |
| RF | VI | Relief | 41.036 |
| SVM | NRI | No Filter | 41.047 |
| XGBoost | HR-VI | Car | 41.098 |
| XGBoost | HR-VI | MRMR | 41.167 |
| SVM | NRI | MRMR | 41.220 |
| RF | HR | Relief | 41.246 |
| SVM | NRI | Relief | 41.305 |
| SVM | NRI | CMIM | 41.413 |
| RF | VI | PCA | 41.440 |
| RF | HR-VI | Pearson | 41.474 |
| Ridge-CV | HR | No Filter | 41.486 |
| SVM | NRI | Car | 41.534 |
| SVM | NRI | Borda | 41.554 |
| SVM | NRI | Info | 41.603 |
| XGBoost | VI | CMIM | 41.619 |
| XGBoost | NRI | PCA | 41.654 |
| SVM | NRI | Pearson | 41.709 |
| SVM | HR-VI | Relief | 41.790 |
| XGBoost | HR | Relief | 42.078 |
| XGBoost | VI | Pearson | 42.085 |
| RF | HR | No Filter | 42.395 |
| XGBoost | HR | PCA | 42.610 |
| RF | HR-VI | PCA | 42.782 |
| XGBoost | HR-VI | Info | 42.849 |
| XGBoost | HR-VI | Borda | 43.359 |
| RF | HR | PCA | 43.634 |
| XGBoost | VI | PCA | 43.643 |
| XGBoost | HR-VI | Pearson | 44.661 |
| XGBoost | VI | Borda | 44.671 |
| XGBoost | HR | No Filter | 46.335 |
| Lasso-CV | HR | No Filter | 50.453 |
| Lasso-CV | HR-NRI | No Filter | 50.855 |
| Lasso-CV | NRI | No Filter | 51.184 |
| Lasso-CV | HR-NRI-VI | No Filter | 58.263 |
| Lasso-CV | VI | No Filter | 58.329 |
| Lasso-CV | HR-VI | No Filter | 58.329 |
| Ridge-MBO | HR-VI | No Filter | 58.555 |
| Ridge-MBO | VI | No Filter | 58.555 |
| Ridge-MBO | HR-NRI-VI | No Filter | 58.555 |
| Lasso-MBO | HR | No Filter | 58.555 |
| Lasso-MBO | NRI | No Filter | 58.555 |
| Lasso-MBO | HR-NRI | No Filter | 58.555 |
| Ridge-MBO | HR | No Filter | 58.555 |
| Ridge-MBO | NRI | No Filter | 58.555 |
| Ridge-MBO | HR-NRI | No Filter | 58.555 |
| Lasso-MBO | VI | No Filter | 58.555 |
| Lasso-MBO | HR-VI | No Filter | 58.555 |
| Lasso-MBO | HR-NRI-VI | No Filter | 58.555 |
| Ridge-CV | HR-NRI-VI | No Filter | 50502559.642 |
| Ridge-CV | HR-VI | No Filter | 55034896.733 |
| Ridge-CV | VI | No Filter | 55669702.590 |
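The leaderboard ordering can be reproduced in base R by sorting on RMSE in ascending order, since a lower RMSE means better performance. The sketch below uses four rows copied from the table above; the data frame layout and the column names (`learner`, `task`, `filter`, `rmse`) are assumptions made for illustration, not the report's actual objects.

```r
# Four rows copied from the leaderboard, deliberately out of order.
results <- data.frame(
  learner = c("SVM", "Ridge-CV", "XGBoost", "RF"),
  task    = c("HR-NRI", "NRI", "NRI", "NRI"),
  filter  = c("Car", "No Filter", "MRMR", "Car"),
  rmse    = c(34.410, 34.050, 34.099, 34.865),
  stringsAsFactors = FALSE
)

# Lower RMSE is better, so the best combination comes first.
leaderboard <- results[order(results$rmse), ]
rownames(leaderboard) <- NULL
leaderboard$rank <- seq_len(nrow(leaderboard))

leaderboard
```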

(Table) Best learner/filter/task combination

For each learner: on which task and with which filter did it achieve its best result?

-CV: L2 penalized regression tuned with the internal 10-fold CV of the glmnet package.

-MBO: L2 penalized regression tuned with MBO (model-based optimization) for hyperparameter optimization.

| Task | Learner Group | Filter | RMSE |
|---|---|---|---|
| defoliation-all-plots-NRI | Ridge-CV | No Filter | 34.050 |
| defoliation-all-plots-NRI | XGBoost | MRMR | 34.099 |
| defoliation-all-plots-HR-NRI | SVM | Car | 34.410 |
| defoliation-all-plots-NRI | RF | Car | 34.865 |
| defoliation-all-plots-HR | Lasso-CV | No Filter | 50.453 |
| defoliation-all-plots-HR-VI | Ridge-MBO | No Filter | 58.555 |
| defoliation-all-plots-HR | Lasso-MBO | No Filter | 58.555 |
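One way to derive such a per-learner summary from the full leaderboard is to split the results by learner and keep the row with the smallest RMSE in each group. This is a base-R sketch on six rows copied from the leaderboard; the column names are assumptions, and the report itself may well compute the summary differently (for example with dplyr, which is attached in the session).

```r
# Two leaderboard rows per learner, copied from the full table.
results <- data.frame(
  learner = c("XGBoost", "XGBoost", "SVM", "SVM", "RF", "RF"),
  task    = c("NRI", "HR-NRI-VI", "HR-NRI", "HR-NRI", "NRI", "HR-NRI-VI"),
  filter  = c("MRMR", "CMIM", "Car", "Info", "Car", "Car"),
  rmse    = c(34.099, 34.152, 34.410, 34.439, 34.865, 36.223),
  stringsAsFactors = FALSE
)

# For every learner, keep the single row with the lowest RMSE.
best_per_learner <- do.call(
  rbind,
  lapply(split(results, results$learner), function(d) d[which.min(d$rmse), ])
)

# Present the winners from best to worst.
best_per_learner <- best_per_learner[order(best_per_learner$rmse), ]
best_per_learner
```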

(Plot) Best learner/filter combinations for all tasks

Version Author Date
41aae14 pat-s 2019-09-12
b181c52 pat-s 2019-09-02
8e7e4fe pat-s 2019-09-01
7582c67 pat-s 2019-08-31
abd531f pat-s 2019-08-31

(Plot) Best filter combination of each learner vs. no filter per task vs. Borda

This plot shows the final effect of applying feature selection to a learner for each task. The further left a filter appears relative to the purple dot (No Filter) for a given task, the more effective feature selection was for that learner on that task.
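The same comparison can be expressed numerically as the RMSE without a filter minus the RMSE with the best filter, assuming the plot's horizontal axis is RMSE. The values below are copied from the leaderboard for three learner/task pairs; the `gain` column and the data frame layout are illustrative names, not part of the original analysis.

```r
# Best filter vs. no filter for three learner/task pairs from the leaderboard.
effect <- data.frame(
  learner          = c("XGBoost", "RF", "SVM"),
  task             = c("NRI", "NRI", "HR-NRI"),
  best_filter      = c("MRMR", "Car", "Car"),
  rmse_best_filter = c(34.099, 34.865, 34.410),
  rmse_no_filter   = c(35.420, 37.828, 34.622),
  stringsAsFactors = FALSE
)

# A positive gain means the filtered model has the lower RMSE,
# i.e. its dot would lie to the left of the "No Filter" dot.
effect$gain <- effect$rmse_no_filter - effect$rmse_best_filter
effect[order(-effect$gain), ]
```

In these three cases the gain is largest for RF on NRI (about 2.96 RMSE units) and smallest for SVM on HR-NRI (about 0.21).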

Version Author Date
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02
8e7e4fe pat-s 2019-09-01

(Plot) Best filter combination of each learner vs. Borda filter

This plot shows the final effect of the ensemble Borda filter compared with the best-scoring simple filter.
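For readers unfamiliar with the idea, a Borda-style ensemble can be sketched as classic rank aggregation: each simple filter ranks the features, and the ranks are summed per feature. The report does not spell out its implementation, so the rank-sum rule, the feature names, and the scores below are all hypothetical and serve only to illustrate the principle.

```r
# Hypothetical filter scores (higher = more relevant) for four made-up features.
scores <- data.frame(
  feature = c("f1", "f2", "f3", "f4"),
  pearson = c(0.9, 0.2, 0.5, 0.1),
  relief  = c(0.3, 0.8, 0.6, 0.1),
  mrmr    = c(0.7, 0.4, 0.9, 0.2),
  stringsAsFactors = FALSE
)

# Rank the features within each filter; rank 1 is the most relevant feature.
ranks <- sapply(scores[c("pearson", "relief", "mrmr")], function(s) rank(-s))

# Borda score: sum of ranks across filters (lower = better overall).
scores$borda <- rowSums(ranks)

ranked <- scores[order(scores$borda), c("feature", "borda")]
ranked
```

In this toy example `f3` comes out on top although only one filter ranks it first, which is the kind of compromise an ensemble filter is meant to capture.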

Version Author Date
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02

(Plot) Performances of all filter methods

Version Author Date
41aae14 pat-s 2019-09-12
ff340b8 pat-s 2019-09-03
b181c52 pat-s 2019-09-02

R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /opt/R/3.5.2/lib64/R/lib/libRblas.so
LAPACK: /usr/lib64/libopenblaso-r0.3.3.so

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] xtable_1.8-3      flextable_0.5.5   ggbeeswarm_0.7.0 
 [4] ggpubr_0.1.6      ggsci_2.9         ggrepel_0.8.0    
 [7] ggplot2_3.1.0     dplyr_0.8.0.1     magrittr_1.5     
[10] mlr_2.15.0.9000   ParamHelpers_1.12 tidyselect_0.2.5 

loaded via a namespace (and not attached):
 [1] httr_1.4.0         tidyr_0.8.2        jsonlite_1.6      
 [4] viridisLite_0.3.0  splines_3.5.2      R.utils_2.8.0     
 [7] assertthat_0.2.0   base64url_1.4      vipor_0.4.5       
[10] yaml_2.2.0         gdtools_0.1.7      mlrMBO_1.1.2      
[13] pillar_1.3.1       backports_1.1.3    lattice_0.20-38   
[16] glue_1.3.0         uuid_0.1-2         digest_0.6.18     
[19] RColorBrewer_1.1-2 checkmate_1.9.1    colorspace_1.4-0  
[22] htmltools_0.3.6    Matrix_1.2-15      R.oo_1.22.0       
[25] plyr_1.8.4         pkgconfig_2.0.2    lhs_1.0.1         
[28] misc3d_0.8-4       drake_7.5.2        purrr_0.3.0       
[31] scales_1.0.0       parallelMap_1.4    whisker_0.3-2     
[34] officer_0.3.5      mco_1.0-15.1       git2r_0.24.0      
[37] tibble_2.0.1       txtq_0.1.4         withr_2.1.2       
[40] lazyeval_0.2.1     cli_1.1.0          survival_2.43-3   
[43] RJSONIO_1.3-1.1    crayon_1.3.4       evaluate_0.13     
[46] storr_1.2.1        R.methodsS3_1.7.1  fs_1.2.6          
[49] xml2_1.2.0         beeswarm_0.2.3     tools_3.5.2       
[52] data.table_1.12.0  BBmisc_1.11        stringr_1.4.0     
[55] plotly_4.8.0       munsell_0.5.0      zip_2.0.0         
[58] compiler_3.5.2     rlang_0.3.4        plot3D_1.1.1      
[61] grid_3.5.2         htmlwidgets_1.3    igraph_1.2.4      
[64] labeling_0.3       base64enc_0.1-3    rmarkdown_1.13    
[67] gtable_0.2.0       smoof_1.5.1        R6_2.4.0          
[70] knitr_1.23         fastmatch_1.1-0    filelock_1.0.2    
[73] workflowr_1.4.0    rprojroot_1.3-2    stringi_1.3.1     
[76] parallel_3.5.2     Rcpp_1.0.0         xfun_0.5