Last updated: 2023-10-05
Checks: 7 0
Knit directory: muse/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20200712)
was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version a63df84. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish
or
wflow_git_commit
). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: r_packages_4.3.0/
Untracked files:
Untracked: analysis/cell_ranger.Rmd
Untracked: analysis/complex_heatmap.Rmd
Untracked: analysis/sleuth.Rmd
Untracked: analysis/tss_xgboost.Rmd
Untracked: code/multiz100way/
Untracked: data/HG00702_SH089_CHSTrio.chr1.vcf.gz
Untracked: data/HG00702_SH089_CHSTrio.chr1.vcf.gz.tbi
Untracked: data/ncrna_NONCODE[v3.0].fasta.tar.gz
Untracked: data/ncrna_noncode_v3.fa
Untracked: data/netmhciipan.out.gz
Untracked: data/test
Untracked: export/davetang039sblog.WordPress.2023-06-30.xml
Untracked: export/output/
Untracked: women.json
Unstaged changes:
Modified: analysis/graph.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (analysis/seriation.Rmd
) and HTML
(docs/seriation.html
) files. If you’ve configured a remote
Git repository (see ?wflow_git_remote
), click on the
hyperlinks in the table below to view the files as they were in that
past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | a63df84 | Dave Tang | 2023-10-05 | Ordering objects using seriation |
Install stable CRAN version.
if(! "seriation" %in% installed.packages()[, 1]){
install.packages("seriation", repos = c("https://mhahsler.r-universe.dev", "https://cloud.r-project.org/"))
}
packageVersion("seriation")
[1] '1.5.1.1'
Use the example
dataset SupremeCourt
, which:
Contains a (a subset of the) decisions for the stable 8-yr period 1995-2002 of the second Rehnquist Supreme Court. Decisions are aggregated to the joint probability for disagreement between judges.
library(seriation)
data("SupremeCourt")
SupremeCourt
Breyer Ginsburg Kennedy OConnor Rehnquist Scalia Souter Stevens
Breyer 0.00000 0.11966 0.25000 0.20940 0.29915 0.35256 0.11752 0.16239
Ginsburg 0.11966 0.00000 0.26790 0.25214 0.30769 0.36966 0.09615 0.14530
Kennedy 0.25000 0.26709 0.00000 0.15598 0.12179 0.18803 0.24786 0.32692
OConnor 0.20940 0.25214 0.15598 0.00000 0.16239 0.20726 0.22009 0.32906
Rehnquist 0.29915 0.30769 0.12179 0.16239 0.00000 0.14316 0.29274 0.40171
Scalia 0.35256 0.36966 0.18803 0.20726 0.14316 0.00000 0.33761 0.43803
Souter 0.11752 0.09615 0.24790 0.22009 0.29274 0.33761 0.00000 0.16880
Stevens 0.16239 0.14530 0.32692 0.32906 0.40171 0.43803 0.16880 0.00000
Thomas 0.35897 0.36752 0.17735 0.20513 0.13675 0.06624 0.33120 0.43590
Thomas
Breyer 0.35897
Ginsburg 0.36752
Kennedy 0.17735
OConnor 0.20513
Rehnquist 0.13675
Scalia 0.06624
Souter 0.33120
Stevens 0.43590
Thomas 0.00000
Convert to distance matrix.
d <- as.dist(SupremeCourt)
d
Breyer Ginsburg Kennedy OConnor Rehnquist Scalia Souter Stevens
Ginsburg 0.11966
Kennedy 0.25000 0.26709
OConnor 0.20940 0.25214 0.15598
Rehnquist 0.29915 0.30769 0.12179 0.16239
Scalia 0.35256 0.36966 0.18803 0.20726 0.14316
Souter 0.11752 0.09615 0.24790 0.22009 0.29274 0.33761
Stevens 0.16239 0.14530 0.32692 0.32906 0.40171 0.43803 0.16880
Thomas 0.35897 0.36752 0.17735 0.20513 0.13675 0.06624 0.33120 0.43590
Perform the default seriation method to reorder the objects.
my_order <- seriate(d)
get_order(my_order)
Scalia Thomas Rehnquist Kennedy OConnor Souter Breyer Ginsburg
6 9 5 3 4 7 1 2
Stevens
8
Plot heatmap.
p1 <- ggpimage(d, upper_tri = TRUE) +
ggtitle("Judges (original alphabetical order)")
p2 <- ggpimage(d, my_order, upper_tri = TRUE) +
ggtitle("Judges (reordered by seriation)")
p1 + p2 & scale_fill_gradientn(colours = c("darkgrey", "skyblue"))
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Return linear configuration where more similar objects are located closer to each other.
get_config(my_order)
Breyer Ginsburg Kennedy OConnor Rehnquist Scalia Souter
0.2362454 0.2814388 -0.1501451 -0.1051162 -0.2656098 -0.4159504 0.2139295
Stevens Thomas
0.6129974 -0.4077896
Plot linear configuration.
plot_config(my_order)
The Wood
dataset consists of:
A data matrix containing a sample of the normalized gene expression data for 6 locations in the stem of Popla trees published in the study by Herzberg et al (2001). The sample of 136 genes selected by Caraux and Pinloche (2005).
data("Wood")
dim(Wood)
[1] 136 6
Methods of interest for heatmaps are dendrogram leaf order-based
methods applied to rows and columns. This is done using
method = "heatmap"
. The actual seriation method can be
passed on as parameter seriaton_method, but it has a suitable default if
it is omitted.
wood_hc_complete <- seriate(Wood, method = "Heatmap", seriation_method = "HC_complete")
wood_olo_complete <- seriate(Wood, method = "Heatmap", seriation_method = "OLO_complete")
get_order(wood_hc_complete)
AI162370 AI166095 AI164136 AI165011 AI165492 AI166057 AI162004 AI163151
26 131 69 91 106 128 15 44
AI164970 AI166086 AI162593 AI163756 AI164684 AI164612 AI165835 AI163485
88 130 32 59 80 78 119 51
AI163315 AI161513 AI162809 AI162561 AI162652 AI165426 AI162216 AI163303
47 3 38 31 34 105 23 46
AI164585 AI164635 AI165041 AI163821 AI162521 AI163812 AI166111 AI161629
76 79 93 62 30 61 133 6
AI164686 AI164884 AI165668 AI163528 AI165913 AI163131 AI166101 AI165006
81 86 111 52 123 43 132 90
AI165990 AI163617 AI164793 AI163249 AI163994 AI162997 AI163650 AI163758
126 56 85 45 66 41 58 60
AI161573 AI162600 AI165529 AI165730 AI162928 AI165868 AI162429 AI164730
5 33 108 116 39 121 28 83
AI165247 AI165289 AI166182 AI163328 AI164060 AI165272 AI165803 AI165655
101 103 136 48 67 102 117 110
AI162157 AI165826 AI161500 AI165575 AI163608 AI164450 AI162158 AI165175
20 118 2 109 55 74 21 97
AI165687 AI166167 AI164370 AI162148 AI165162 AI163474 AI161694 AI161899
113 135 72 19 96 50 9 14
AI165836 AI164101 AI166068 AI164435 AI164711 AI165215 AI161730 AI165062
120 68 129 73 82 100 11 94
AI164964 AI162452 AI165206 AI163012 AI165107 AI162249 AI163991 AI163880
87 29 99 42 95 24 65 63
AI165691 AI165189 AI161827 AI163941 AI165320 AI164753 AI163624 AI162402
115 98 13 64 104 84 57 27
AI161452 AI165690 AI164359 AI165949 AI163594 AI166034 AI162710 AI162940
1 114 71 124 54 127 36 40
AI162318 AI164979 AI164231 AI165974 AI163580 AI161823 AI165520 AI165903
25 89 70 125 53 12 107 122
AI162684 AI161697 AI162729 AI161572 AI162094 AI163338 AI166128 AI162215
35 10 37 4 18 49 134 22
AI165673 AI161638 AI161674 AI164604 AI165021 AI162092 AI162060 AI164546
112 7 8 77 92 17 16 75
Ordering.
get_order(wood_hc_complete, 2)
P A B E C D
1 2 3 6 4 5
get_order(wood_olo_complete, 2)
P A B C D E
1 2 3 4 5 6
Heatmap.
p1 <- ggpimage(Wood) +
ggtitle("Wood (no order)") +
theme(legend.position = "none")
p2 <- ggpimage(Wood, wood_hc_complete) +
ggtitle("Wood (HC complete)") +
theme(legend.position = "none")
p3 <- ggpimage(Wood, wood_olo_complete) +
ggtitle("Wood (OLO complete)")
p1 + p2 + p3 & scale_fill_gradientn(colours = c("skyblue", "red"))
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
Scale for fill is already present.
Adding another scale for fill, which will replace the existing scale.
sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: Etc/UTC
tzcode source: system (glibc)
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] seriation_1.5.1-1 patchwork_1.1.3.9000 lubridate_1.9.2
[4] forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2
[7] purrr_1.0.1 readr_2.1.4 tidyr_1.3.0
[10] tibble_3.2.1 ggplot2_3.4.2 tidyverse_2.0.0
[13] workflowr_1.7.0
loaded via a namespace (and not attached):
[1] gtable_0.3.3 xfun_0.39 bslib_0.5.0 processx_3.8.1
[5] callr_3.7.3 tzdb_0.4.0 vctrs_0.6.2 tools_4.3.0
[9] ps_1.7.5 generics_0.1.3 ca_0.71.1 fansi_1.0.4
[13] highr_0.10 pkgconfig_2.0.3 lifecycle_1.0.3 compiler_4.3.0
[17] farver_2.1.1 git2r_0.32.0 munsell_0.5.0 getPass_0.2-2
[21] codetools_0.2-19 httpuv_1.6.11 htmltools_0.5.5 sass_0.4.6
[25] yaml_2.3.7 later_1.3.1 pillar_1.9.0 jquerylib_0.1.4
[29] whisker_0.4.1 cachem_1.0.8 iterators_1.0.14 TSP_1.2-4
[33] foreach_1.5.2 tidyselect_1.2.0 digest_0.6.31 stringi_1.7.12
[37] labeling_0.4.2 rprojroot_2.0.3 fastmap_1.1.1 grid_4.3.0
[41] colorspace_2.1-0 cli_3.6.1 magrittr_2.0.3 utf8_1.2.3
[45] withr_2.5.0 scales_1.2.1 promises_1.2.0.1 registry_0.5-1
[49] timechange_0.2.0 rmarkdown_2.22 httr_1.4.6 hms_1.1.3
[53] evaluate_0.21 knitr_1.43 rlang_1.1.1 Rcpp_1.0.10
[57] glue_1.6.2 rstudioapi_0.14 jsonlite_1.8.5 R6_2.5.1
[61] fs_1.6.2