Last updated: 2021-10-07
Checks: 7 0
Knit directory: Amphibolis_Posidonia_Comparison/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20210414)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 08b28ea. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/OTT.nb.html
Ignored: analysis/plotGenes.nb.html
Ignored: analysis/plotRgenes.nb.html
Untracked files:
Untracked: Lost_GO_terms_in_five_species.PlantSpecific.xlsx
Untracked: Lost_GO_terms_in_five_species.xlsx
Untracked: data/BACKGROUND.txt
Untracked: data/Lost_GO_terms_in_five_species.PlantSpecific.xlsx
Untracked: data/Lost_GO_terms_in_five_species.xlsx
Untracked: data/R_genes.xlsx
Untracked: data/lost_in_amphi_GO.txt
Untracked: data/lost_in_posi_GO.txt
Untracked: data/lost_in_zmar_GO.txt
Untracked: data/lost_in_zmuel_GO.txt
Untracked: data/missing_amphi_vs_all_GO.txt
Untracked: data/missing_aquatics_GO.txt
Untracked: data/missing_arabidopsis_vs_all_GO.txt
Untracked: data/missing_posi_vs_all_GO.txt
Untracked: data/missing_seagrasses_GO.txt
Untracked: data/missing_zmar_vs_all_GO.txt
Untracked: data/missing_zmuel_vs_all_GO.txt
Untracked: data/only_in_amphi_GO.txt
Untracked: data/only_in_posi_GO.txt
Untracked: data/only_in_zmar_GO.txt
Untracked: data/only_in_zmuel_GO.txt
Untracked: data/only_seagrasses_GO.txt
Untracked: data/shared_lost_genes.xlsx
Untracked: data/species.txt
Untracked: species.csv
Untracked: whatever/
Unstaged changes:
Modified: data/Orthogroups.tsv
Deleted: data/Orthogroups_SpeciesOverlaps.tsv
Modified: data/Species_timetree.svg
Modified: data/species.csv
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/OTT.Rmd
) and HTML (docs/OTT.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 08b28ea | Philipp Bayer | 2021-10-07 | wflow_publish(files = c("analysis/*")) |
html | 76c8918 | Philipp Bayer | 2021-04-15 | Build site. |
Rmd | 49dcbb8 | Philipp Bayer | 2021-04-15 | wflow_publish(files = list.files("analysis/", pattern = "*Rmd", |
Rmd | 9e449bb | Philipp Bayer | 2021-04-14 | Add missing files |
html | 9e449bb | Philipp Bayer | 2021-04-14 | Add missing files |
html | 9e91425 | Philipp Bayer | 2021-04-14 | Build site. |
html | 1d42715 | Philipp Bayer | 2021-04-14 | Build site. |
Rmd | f3d4014 | Philipp Bayer | 2021-04-14 | wflow_publish("analysis/OTT.Rmd") |
Here I make a subtree from the Open Tree Of Life API for the species we have.
library(rotl)
library(ggtree)
ggtree v3.0.4 For help: https://yulab-smu.top/treedata-book/
If you use ggtree in published research, please cite the most appropriate paper(s):
1. Guangchuang Yu. Using ggtree to visualize data on tree-like structures. Current Protocols in Bioinformatics, 2020, 69:e96. doi:10.1002/cpbi.96
2. Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Molecular Biology and Evolution 2018, 35(12):3041-3043. doi:10.1093/molbev/msy194
3. Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution 2017, 8(1):28-36. doi:10.1111/2041-210X.12628
library(cowplot)
library(ggplot2)
library(tidyverse)
-- Attaching packages --------------------------------------- tidyverse 1.3.1 --
v tibble 3.1.5 v dplyr 1.0.7
v tidyr 1.1.4 v stringr 1.4.0
v readr 2.0.2 v forcats 0.5.1
v purrr 0.3.4
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x tidyr::expand() masks ggtree::expand()
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
species <- c('Zostera marina', 'Zostera muelleri', 'Arabidopsis thaliana', 'Thellungiella parvula', 'Populus trichocarpa', 'Vitis vinifera', 'Amborella trichopoda', 'Oryza sativa', 'Zea mays', 'Brachypodium distachyon', 'Spirodela polyrhiza', 'Selaginella moellendorffi', 'Physcomitrella patens', 'Chlamydomonoas reinhardtii', 'Ostreococcus lucimarinus', 'Lemna gibba', 'Posidonia australis' ,'Amphibolis antarctica', 'Wolffia australiana')
Let’s call the API to get the OTT IDs for these species, and to double-check we got the right names :)
We should get a tree for 19 species.
taxon_search <- tnrs_match_names(names = species, context_name = "All life")
knitr::kable(taxon_search)
search_string | unique_name | approximate_match | ott_id | is_synonym | flags | number_matches |
---|---|---|---|---|---|---|
zostera marina | Zostera marina | FALSE | 814202 | FALSE | sibling_higher | 1 |
zostera muelleri | Zostera muelleri | FALSE | 766348 | FALSE | sibling_higher | 1 |
arabidopsis thaliana | Arabidopsis thaliana | FALSE | 309263 | FALSE | 1 | |
thellungiella parvula | Schrenkiella parvula | FALSE | 991614 | TRUE | incertae_sedis_inherited | 1 |
populus trichocarpa | Populus trichocarpa | FALSE | 8861 | FALSE | 2 | |
vitis vinifera | Vitis vinifera | FALSE | 756728 | FALSE | 2 | |
amborella trichopoda | Amborella trichopoda | FALSE | 303950 | FALSE | 1 | |
oryza sativa | Oryza sativa | FALSE | 709894 | FALSE | 1 | |
zea mays | Zea mays | FALSE | 605194 | FALSE | 1 | |
brachypodium distachyon | Brachypodium distachyon | FALSE | 413237 | FALSE | 1 | |
spirodela polyrhiza | Spirodela polyrhiza | FALSE | 814207 | FALSE | 1 | |
selaginella moellendorffi | Selaginella moellendorffii | TRUE | 799880 | FALSE | 3 | |
physcomitrella patens | Physcomitrella patens | FALSE | 821359 | FALSE | 1 | |
chlamydomonoas reinhardtii | Chlamydomonas reinhardtii | TRUE | 33153 | FALSE | 3 | |
ostreococcus lucimarinus | Ostreococcus sp. ‘lucimarinus’ | FALSE | 851102 | TRUE | incertae_sedis | 1 |
lemna gibba | Lemna gibba | FALSE | 431383 | FALSE | 1 | |
posidonia australis | Posidonia australis | FALSE | 91976 | FALSE | 1 | |
amphibolis antarctica | Amphibolis antarctica | FALSE | 460583 | FALSE | 1 | |
wolffia australiana | Wolffia australiana | FALSE | 1059895 | FALSE | 1 |
Let’s write the species names out again for timetree.org
cat(capture.output(cat(taxon_search$unique_name, sep='\n'), file="data/species.csv"))
That looks good to me!
ott_in_tree <- ott_id(taxon_search)[is_in_tree(ott_id(taxon_search))]
tr <- tol_induced_subtree(ott_ids = ott_in_tree)
Warning in collapse_single_cpp(ances = tree$edge[, 1], desc = tree$edge[, :
printing of extremely long output is truncated
Warning in collapse_single_cpp(ances = tree$edge[, 1], desc = tree$edge[, :
printing of extremely long output is truncated
Warning in collapse_singles(tr, show_progress): Dropping singleton nodes
with labels: Streptophyta ott916750, mrcaott2ott50189, mrcaott2ott108668,
mrcaott2ott59852, mrcaott2ott8171, mrcaott2ott70394, Euphyllophyta ott1007992,
Spermatophyta ott10218, mrcaott2ott2645, mrcaott2ott35778, Mesangiospermae
ott5298374, mrcaott2ott10930, mrcaott2ott2441, mrcaott2ott969, mrcaott2ott62529,
mrcaott2ott8379, eudicotyledons ott431495, Gunneridae ott853757, Pentapetalae
ott5316182, mrcaott2ott2464, fabids ott565281, mrcaott2ott371, mrcaott2ott1479,
mrcaott2ott345, Malpighiales ott429482, mrcaott345ott3853, mrcaott3853ott8858,
mrcaott8858ott33097, mrcaott8858ott12186, mrcaott8858ott98085, Salicaceae
ott530183, mrcaott8858ott270454, mrcaott8858ott703490, mrcaott8858ott33085,
mrcaott8858ott102531, mrcaott8858ott502530, mrcaott8858ott737360,
mrcaott8858ott474976, Saliceae ott509390, Populus ott530178, mrcaott8858ott8861,
mrcaott8861ott815941, mrcaott8861ott320722, mrcaott8861ott8867, malvids
ott565277, mrcaott96ott14140, mrcaott96ott50744, mrcaott96ott378,
mrcaott378ott29446, mrcaott378ott1697, Brassicales ott8844, mrcaott378ott307071,
mrcaott378ott32461, mrcaott378ott509555, mrcaott378ott318175,
mrcaott378ott9635, mrcaott378ott125843, mrcaott378ott509568, mrcaott378ott28763,
mrcaott378ott83547, Brassicaceae ott309271, mrcaott378ott299734,
mrcaott378ott4671, mrcaott4671ott58909, mrcaott4671ott6278, mrcaott6278ott15318,
mrcaott6278ott158438, mrcaott6278ott10585, mrcaott6278ott34460,
mrcaott6278ott9083, mrcaott9083ott19798, mrcaott19798ott31487,
mrcaott31487ott88883, mrcaott31487ott152275, mrcaott11023ott24850,
mrcaott11023ott56298, mrcaott11023ott95692, mrcaott11023ott25468,
mrcaott11023ott25476, Schrenkiella (genus in kingdom Archaeplastida)
ott5740975, Vitales ott1069308, mrcaott8384ott10050, mrcaott10050ott337881,
mrcaott10050ott194302, mrcaott10050ott302228, mrcaott10050ott24244,
mrcaott24244ott240260, mrcaott24244ott39768, mrcaott39768ott211705,
mrcaott39768ott175159, mrcaott175159ott602495, mrcaott175159ott246514,
mrcaott175159ott903683, mrcaott175159ott568892, mrcaott175159ott235835,
mrcaott175159ott941927, mrcaott175159ott568880, mrcaott175159ott175164,
mrcaott175159ott509603, mrcaott509603ott805071, Liliopsida ott1058517,
Petrosaviidae ott5308424, mrcaott121ott4474, mrcaott121ott1439,
mrcaott121ott334, commelinids ott225270, mrcaott252ott213153, Poales ott921871,
mrcaott252ott128594, mrcaott252ott1477, mrcaott252ott7120, mrcaott252ott285512,
mrcaott252ott3717, mrcaott252ott272812, mrcaott252ott334529,
mrcaott252ott427739, Poaceae ott508090, mrcaott252ott196505, mrcaott252ott43427,
mrcaott252ott751372, mrcaott252ott11561, Oryzoideae ott641467, Oryzeae
ott415723, mrcaott36487ott890607, Oryzinae ott5744622, Oryza ott135764,
mrcaott67120ott709900, mrcaott67120ott272949, mrcaott67120ott709898,
mrcaott67120ott135756, mrcaott135756ott709892, mrcaott135756ott709887,
mrcaott656ott1473, mrcaott1473ott33975, Brachypodieae ott693951, Brachypodium
ott413242, mrcaott1787ott2051, mrcaott1787ott1985, Panicoideae ott641461,
mrcaott6065ott9925, Andropogonodae ott5737327, Andropogoneae ott475213,
Tripsacinae ott5759522, Zea ott605186, Araceae ott481972, mrcaott290ott3983,
Lemna ott1008186, mrcaott338992ott338994, mrcaott338994ott431388,
mrcaott338994ott431383, mrcaott66494ott407568, mrcaott407568ott432842,
mrcaott407568ott1059896, mrcaott407568ott1059895, mrcaott1059895ott1059900,
Spirodela ott431386, mrcaott5202ott159280, mrcaott5202ott30666,
mrcaott30666ott106181, mrcaott30666ott208913, mrcaott30666ott814198,
mrcaott30666ott40117, Zosteraceae ott637476, mrcaott40117ott91951,
mrcaott40117ott197961, mrcaott197961ott817762, mrcaott197961ott529863,
mrcaott197961ott766348, mrcaott766348ott766350, mrcaott436201ott436206,
mrcaott436201ott436203, mrcaott436201ott814202, mrcaott87589ott256106,
mrcaott256106ott460583, mrcaott460583ott514367, mrcaott460583ott613178,
Amphibolis ott460577, mrcaott91974ott351467, Posidoniaceae ott460581, Posidonia
(genus in kingdom Archaeplastida) ott91949, mrcaott91976ott928713, Amborellales
ott927960, Amborellaceae ott295910, Amborella ott303952, Lycopodiopsida
ott144795, mrcaott3661ott18736, Selaginellales ott144821, Selaginellaceae
ott144818, Selaginella ott144816, mrcaott10602ott10604, mrcaott10604ott10608,
mrcaott10604ott78166, mrcaott10604ott10618, mrcaott101177ott144813,
mrcaott532022ott583202, mrcaott532022ott799880, mrcaott541ott1066,
Bryophyta ott246594, mrcaott1066ott204205, Bryophytina ott471195, Bryopsida
ott821346, mrcaott1066ott86152, mrcaott1066ott1769, mrcaott1769ott7813,
mrcaott7813ott148635, mrcaott7813ott86148, mrcaott7813ott454749,
mrcaott7813ott86154, mrcaott7813ott31044, mrcaott31044ott844044,
mrcaott31044ott1060862, Physcomitrella ott821349, Chlorophyta ott979501,
mrcaott185ott42071, mrcaott185ott1426, mrcaott1426ott1544, mrcaott1544ott8659,
mrcaott1544ott15345, mrcaott1544ott9282, mrcaott9389ott818260,
mrcaott9389ott23557, mrcaott23557ott527099
plot(tr)
Let’s get rid of those OTT IDs from the tree, I don’t like that. There’s a helper function for that
tr$tip.label <- strip_ott_ids(tr$tip.label, remove_underscores = TRUE)
ggtree(tr) +
geom_tree() +
geom_tiplab(size=3,
fontface='italic') +
theme_tree() +
xlim(0, 15)
Cool! I took the above species list and uploaded to timetree (see Species_timetree.svg)
plot(ggdraw()+cowplot::draw_image('./data/species_timetree.svg'))
Warning: Package `magick` is required to draw images. Image not drawn.
Timetree also lets you export the tree timed with MYA
tree <- read.tree('./data/timetree_species.nwk')
tree$tip.label <- str_replace_all(tree$tip.label, '_', ' ')
p <- ggplot(tree) +
geom_tree() +
theme_tree2() +
scale_x_continuous(labels = abs) +
geom_tiplab(size=3, fontface='italic')
revts(p) + xlim(-1200, 300)
Scale for 'x' is already present. Adding another scale for 'x', which will
replace the existing scale.
sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)
Matrix products: default
locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
[5] LC_TIME=English_Australia.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.5.1 stringr_1.4.0 dplyr_1.0.7 purrr_0.3.4
[5] readr_2.0.2 tidyr_1.1.4 tibble_3.1.5 tidyverse_1.3.1
[9] ggplot2_3.3.5 cowplot_1.1.1 ggtree_3.0.4 rotl_3.0.11
[13] workflowr_1.6.2
loaded via a namespace (and not attached):
[1] nlme_3.1-152 fs_1.5.0 lubridate_1.7.10 progress_1.2.2
[5] httr_1.4.2 rprojroot_2.0.2 tools_4.1.1 backports_1.2.1
[9] utf8_1.2.2 R6_2.5.1 DBI_1.1.1 lazyeval_0.2.2
[13] colorspace_2.0-2 withr_2.4.2 tidyselect_1.1.1 prettyunits_1.1.1
[17] curl_4.3.2 compiler_4.1.1 git2r_0.28.0 cli_3.0.1
[21] rvest_1.0.1 xml2_1.3.2 labeling_0.4.2 scales_1.1.1
[25] digest_0.6.28 yulab.utils_0.0.2 rmarkdown_2.11 rentrez_1.2.3
[29] pkgconfig_2.0.3 htmltools_0.5.2 dbplyr_2.1.1 fastmap_1.1.0
[33] highr_0.9 rlang_0.4.11 readxl_1.3.1 rstudioapi_0.13
[37] farver_2.1.0 gridGraphics_0.5-1 jquerylib_0.1.4 generics_0.1.0
[41] jsonlite_1.7.2 magrittr_2.0.1 ggplotify_0.1.0 patchwork_1.1.1
[45] Rcpp_1.0.7 munsell_0.5.0 fansi_0.5.0 ape_5.5
[49] lifecycle_1.0.1 stringi_1.7.5 whisker_0.4 yaml_2.2.1
[53] grid_4.1.1 parallel_4.1.1 promises_1.2.0.1 crayon_1.4.1
[57] rncl_0.8.4 lattice_0.20-44 haven_2.4.3 hms_1.1.1
[61] knitr_1.36 pillar_1.6.3 reprex_2.0.1 XML_3.99-0.8
[65] glue_1.4.2 evaluate_0.14 ggfun_0.0.4 modelr_0.1.8
[69] vctrs_0.3.8 treeio_1.16.2 tzdb_0.1.2 httpuv_1.6.3
[73] cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1 xfun_0.26
[77] broom_0.7.9 tidytree_0.3.5 later_1.3.0 aplot_0.1.1
[81] ellipsis_0.3.2