A_03_Taxon_matching_across_datasets.R

Taxonomic matching across datasets

Every dataset uses its own taxonomy. To assess as many species as possible from field data, we decided to keep the original taxonomies for all analyses. This means that some species that are now recognized as belonging to different clades will be assessed based on their old taxonomy. Subspecies and hybrids will also be based on the old taxonomy, although more up-to-date nomenclature is available.

Three data sources may lead to information loss if species names are not standardized to match the taxonomy used in those datasets:

  1. BirdLife global range maps for the calculation of global climate niche breadth (version: 2024)

  2. BirdTree phylogeny for the calculation of phylogenetic distinctiveness and assessment of phylogenetic autocorrelation in the model residuals. (version: 2012)

  3. AVONET trait database for the extraction of species traits (version 2022)

(4. IUCN data, but these are part of BirdLife 2024 and use the same taxonomy)

Source 00_Configuration.R

Code
source(here::here("Code/00_Configuration.R"))
x <- lapply(package_list, require, character.only = TRUE) # spell out the argument instead of relying on partial matching
rm(x)

1. BirdLife 2024 + dataset verbatim names

The data was matched to BirdLife 2024 taxonomy (there was only one change in scientific name)

“scientificName” should match the taxonomy of the BirdLife range maps (2024) and IUCN data

Code
tax_lookup <- 
  readRDS(here::here("Data/output/1_data_filtered.rds")) %>%
  distinct(verbatimIdentification, scientificName) %>% 
  na.omit() # BirdLife 2024 taxonomy

head(tax_lookup)
   verbatimIdentification          scientificName
1 Nucifraga caryocatactes Nucifraga caryocatactes
2      Anas platyrhynchos      Anas platyrhynchos
3         Aythya fuligula         Aythya fuligula
4           Ciconia nigra           Ciconia nigra
5     Carduelis carduelis     Carduelis carduelis
6         Passer montanus         Passer montanus
Code
length(unique(tax_lookup$scientificName)) # 726 standardized species
[1] 726
Code
length(unique(tax_lookup$verbatimIdentification)) # 762 from raw data
[1] 762
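
The gap between 762 verbatim names and 726 standardized names means that several verbatim names collapse onto one standardized name. A minimal sketch for listing such cases is given here; `toy_lookup` is a hypothetical stand-in for `tax_lookup`, filled with name pairs that appear in tables of this document.

```r
library(dplyr)

# Hypothetical stand-in for tax_lookup (name pairs copied from this document)
toy_lookup <- data.frame(
  verbatimIdentification = c("Luscinia svecica svecica",
                             "Luscinia svecica cyanecula",
                             "Cyanecula svecica",
                             "Ciconia nigra"),
  scientificName = c("Luscinia svecica", "Luscinia svecica",
                     "Luscinia svecica", "Ciconia nigra")
)

# Standardized names that collect more than one verbatim name
many_to_one <- toy_lookup %>%
  count(scientificName, name = "n_verbatim") %>%
  filter(n_verbatim > 1)

many_to_one
```

Applied to the real lookup, the same two steps would show which subspecies or synonyms are pooled under one BirdLife 2024 name.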

2. Crosswalk for BL2010-BL2018

Code
# Project-relative path via here::here(), as for the other inputs
crosswalk <- readxl::read_excel(here::here("Data/input/BL_taxonomy_crosswalk.xlsx"),
                                sheet = "Crosswalk") %>%
  dplyr::select(1, 41) %>% # columns 1 and 41: 2018 & 2010 taxonomy
  setNames(c("ScientificName2018", "ScientificName2010"))
Code
crosswalk %>% 
  filter(ScientificName2018 == "Not Recognised") %>% 
  kable()
ScientificName2018 ScientificName2010
Not Recognised Larus thayeri
Not Recognised Corapipo altera
Not Recognised Serpophaga munda
Not Recognised Epinecrophylla fjeldsaai
Not Recognised Myrmotherula fluminensis
Not Recognised Formicivora littoralis
Not Recognised Upucerthia jelskii
Not Recognised Asthenes sclateri
Not Recognised Hylexetastes uniformis
Not Recognised Hylexetastes brigidai
Not Recognised Philemon novaeguineae
Not Recognised Lichmera limbata
Not Recognised Sericornis virgatus
Not Recognised Telophorus quadricolor
Not Recognised Lanius marwitzi
Not Recognised Monarcha erythrostictus
Not Recognised Aphelocoma insularis
Not Recognised Corvus levaillantii
Not Recognised Parotia helenae
Not Recognised Hirundo domicola
Not Recognised Mirafra cantillans
Not Recognised Heteromirafra sidamoensis
Not Recognised Certhilauda barlowi
Not Recognised Calandrella cheleensis
Not Recognised Alauda japonica
Not Recognised Cisticola discolor
Not Recognised Phyllastrephus leucolepis
Not Recognised Cettia vulcania
Not Recognised Bradypterus timorensis
Not Recognised Nesillas longicaudata
Not Recognised Phylloscopus makirensis
Not Recognised Sylvia minula
Not Recognised Sylvia althaea
Not Recognised Stachyris ambigua
Not Recognised Polioptila clementsi
Not Recognised Lamprotornis elisabeth
Not Recognised Zoothera imbricata
Not Recognised Oenanthe lugentoides
Not Recognised Dicaeum aeruginosum
Not Recognised Arachnothera everetti
Not Recognised Lagonosticta landanae
Not Recognised Erythrura regia
Not Recognised Lonchura nigriceps
Not Recognised Lonchura nigerrima
Not Recognised Prunella fagani
Not Recognised Anthus longicaudatus
Not Recognised Sporophila melanops
Not Recognised Amaurospiza concolor
Not Recognised Amaurospiza carrizalensis
Not Recognised Lophura hatinhensis
Not Recognised Lophura hoogerwerfi
Not Recognised Butorides virescens
Not Recognised Falco pelegrinoides
Not Recognised Fulica caribaea
Not Recognised Haematopus finschi
Not Recognised Haematopus bachmani
Not Recognised Himantopus leucocephalus
Not Recognised Himantopus mexicanus
Not Recognised Larus scopulinus
Not Recognised Catharacta lonnbergi
Not Recognised Dysmoropelia dekarchiskos
Not Recognised Gallicolumba norfolciensis
Not Recognised Cyanoramphus cookii
Not Recognised Cyanoramphus saisseti
Not Recognised Ara gossei
Not Recognised Ara atwoodi
Not Recognised Ara erythrocephala
Not Recognised Ara guadeloupensis
Not Recognised Aratinga brevipes
Not Recognised Cuculus optatus
Not Recognised Cacomantis sepulcralis
Not Recognised Chrysococcyx russatus
Not Recognised Eudynamys melanorhynchus
Not Recognised Neomorphus squamiger
Not Recognised Tyto manusi
Not Recognised Tyto sororcula
Not Recognised Bubo vosseleri
Not Recognised Batrachostomus affinis
Not Recognised Caprimulgus nigriscapularis
Not Recognised Caprimulgus ruwenzorii
Not Recognised Collocalia rogersi
Not Recognised Collocalia amelis
Not Recognised Collocalia palawanensis
Not Recognised Collocalia ocista
Not Recognised Collocalia germani
Not Recognised Chaetura viridipennis
Not Recognised Campylopterus excellens
Not Recognised Chlorostilbon melanorhynchus
Not Recognised Chlorostilbon alice
Not Recognised Thalurania fannyi
Not Recognised Trogon aurantiiventris
Not Recognised Not Recognised
Not Recognised Trogon caligatus
Not Recognised Ceyx rufidorsa
Not Recognised Penelopides samarensis
Not Recognised Ramphastos swainsonii
Not Recognised Picoides dorsalis
Not Recognised Hypositta perdita
Not Recognised Argusianus bipunctatus
Not Recognised Apus toulsoni
Not Recognised Zoothera tanganjicae
Not Recognised Serinus whytii
Not Recognised Carduelis hornemanni
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Code
crosswalk %>% 
  filter(ScientificName2010 == "Not Recognised") %>% 
  kable()
ScientificName2018 ScientificName2010
Ortalis columbiana Not Recognised
Melanitta stejnegeri Not Recognised
Melanitta deglandi Not Recognised
Chenonetta finschi Not Recognised
Anas zonorhyncha Not Recognised
Columba thiriouxi Not Recognised
Nesoenas cicur Not Recognised
Alectroenas payandeei Not Recognised
Antrostomus arizonae Not Recognised
Phaethornis aethopygus Not Recognised
Mentocrex beankaensis Not Recognised
Dryolimnas augusti Not Recognised
Porphyrio paepae Not Recognised
Tribonyx hodgenorum Not Recognised
Oceanites pincoyae Not Recognised
Puffinus bryani Not Recognised
Nyctanassa carcinocatactes Not Recognised
Tyto almae Not Recognised
Aegolius gradyi Not Recognised
Bermuteo avivorus Not Recognised
Buteo socotraensis Not Recognised
Merops mentalis Not Recognised
Capito fitzpatricki Not Recognised
Picus sharpei Not Recognised
Colaptes oceanicus Not Recognised
Psittacus timneh Not Recognised
Eclectus infectus Not Recognised
Hydrornis schwaneri Not Recognised
Hydrornis irena Not Recognised
Formicivora grantsaui Not Recognised
Formicivora paludicola Not Recognised
Herpsilochmus praedictus Not Recognised
Herpsilochmus stotzi Not Recognised
Willisornis vidua Not Recognised
Sipia palliata Not Recognised
Grallaria fenwickorum Not Recognised
Grallaricula cumanensis Not Recognised
Scytalopus diamantinensis Not Recognised
Cichlocolaptes mazarbarnetti Not Recognised
Clibanornis rufipectus Not Recognised
Automolus lammi Not Recognised
Thripophaga amacurensis Not Recognised
Synallaxis beverlyae Not Recognised
Phibalura boliviana Not Recognised
Tityra leucura Not Recognised
Hemitriccus cohnhafti Not Recognised
Zimmerius chicomendesi Not Recognised
Pseudocolopteryx citreola Not Recognised
Batis erlangeri Not Recognised
Dicrurus menagei Not Recognised
Chasiempis sclateri Not Recognised
Chasiempis ibidis Not Recognised
Cissa jefferyi Not Recognised
Cyanocorax hafferi Not Recognised
Orthotomus chaktomuk Not Recognised
Locustella chengi Not Recognised
Thryophilus sernai Not Recognised
Aplonis ulietensis Not Recognised
Turdus xanthorhynchus Not Recognised
Alethe castanea Not Recognised
Oenanthe heuglinii Not Recognised
Passer zarudnyi Not Recognised
Rhodopechys alienus Not Recognised
Crithagra reichenowi Not Recognised
Pipilo naufragus Not Recognised
Icterus northropi Not Recognised
Icterus melanopsis Not Recognised
Icterus portoricensis Not Recognised
Icterus pyrrhopterus Not Recognised
Setophaga flavescens Not Recognised
Sporophila iberaensis Not Recognised
Paroaria nigrogenis Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Not Recognised Not Recognised
Code
skim(crosswalk)
Data summary
Name crosswalk
Number of rows 11241
Number of columns 2
_______________________
Column type frequency:
character 2
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
ScientificName2018 0 1 9 36 0 11127 0
ScientificName2010 0 1 9 33 0 10028 0

Remove rows where both columns contain “Not Recognised”: they carry no name on either side and would only produce wrong matches.

Code
crosswalk <- 
  crosswalk %>%
  filter(!(ScientificName2018 == "Not Recognised" & 
             ScientificName2010 == "Not Recognised"))
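
The effect of this filter can be sketched on a toy crosswalk. `toy_crosswalk` is a hypothetical four-row table, not the real spreadsheet. Only the row with the placeholder on both sides is dropped; rows with a name on one side are kept because they still document a split or lump.

```r
library(dplyr)

# Hypothetical toy crosswalk (not the real BirdLife spreadsheet)
toy_crosswalk <- data.frame(
  ScientificName2018 = c("Not Recognised", "Picus sharpei",
                         "Not Recognised", "Ciconia nigra"),
  ScientificName2010 = c("Cuculus optatus", "Not Recognised",
                         "Not Recognised", "Ciconia nigra")
)

# Drop only rows where BOTH columns hold the placeholder
toy_clean <- toy_crosswalk %>%
  filter(!(ScientificName2018 == "Not Recognised" &
             ScientificName2010 == "Not Recognised"))

nrow(toy_clean)
```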

Differences between BL2024 & BL2018

Code
nm <- 
  setdiff(tax_lookup$scientificName, 
          crosswalk$ScientificName2018) # 20 not matched

setdiff(tax_lookup$verbatimIdentification, 
        crosswalk$ScientificName2018) #98 not matched
 [1] "Parus palustris"                  "Carduelis chloris"               
 [3] "Carduelis cannabina"              "Dendrocopos minor"               
 [5] "Parus ater"                       "Parus montanus"                  
 [7] "Carduelis spinus"                 "Corvus corone cornix"            
 [9] "Miliaria calandra"                "Delichon urbica"                 
[11] "Parus cristatus"                  "Parus caeruleus"                 
[13] "Corvus corone corone"             "Saxicola torquata"               
[15] "Regulus ignicapillus"             "Anas querquedula"                
[17] "Carduelis flammea"                "Tetrao tetrix"                   
[19] "Anas strepera"                    "Dendrocopos medius"              
[21] "Carduelis flammea cabaret"        "Anas clypeata"                   
[23] "Luscinia svecica svecica"         "Charadrius morinellus"           
[25] "Aquila pomarina"                  "Porzana parva"                   
[27] "Luscinia svecica cyanecula"       "Dendroica pensylvanica"          
[29] "Carpodacus mexicanus"             "Dryocopus pileatus"              
[31] "Vermivora pinus"                  "Dendroica petechia"              
[33] "Picoides pubescens"               "Dendroica fusca"                 
[35] "Dendroica magnolia"               "Wilsonia citrina"                
[37] "Carpodacus purpureus"             "Seiurus motacilla"               
[39] "Oporornis philadelphia"           "Dendroica caerulescens"          
[41] "Dendroica coronata"               "Dendroica virens"                
[43] "Butorides virescens"              "Picoides villosus"               
[45] "Seiurus noveboracensis"           "Ammodramus henslowii"            
[47] "Coccothraustes vespertinus"       "Wilsonia canadensis"             
[49] "Anas discors"                     "Vermivora ruficapilla"           
[51] "Parula americana"                 "Dendroica pinus"                 
[53] "Dendroica cerulea"                "Hydropogne caspia"               
[55] "Anas americana"                   "Dendroica discolor"              
[57] "Oporornis formosus"               "Phalacrocorax auritus"           
[59] "Caprimulgus vociferus"            "Dendroica dominica"              
[61] "Picoides dorsalis"                "Dendroica castanea"              
[63] "Dendroica striata"                "Vermivora peregrina"             
[65] "Dendroica palmarum"               "Dendroica tigrina"               
[67] "Wilsonia pusilla"                 "Ammodramus caudacutus"           
[69] "Caprimulgus carolinensis"         "Leucophaeus atricilla"           
[71] "Ammodramus maritimus"             "Anas platyrhynchos x A. rubripes"
[73] "Vermivora chrysoptera x V. pinus" "Egretta intermedia"              
[75] "Cettia diphone"                   "Parus minor"                     
[77] "Porzana fusca"                    "Sterna albifrons"                
[79] "Poecile varius"                   "Hirundo daurica"                 
[81] "Luscinia komadori"                "Dendrocopos kizuki"              
[83] "Sapheopipo noguchii"              "Luscinia akahige"                
[85] "Cuculus optatus"                  "Luscinia calliope"               
[87] "Anas falcata"                     "Zoothera sibirica"               
[89] "Passer rutilans"                  "Luscinia cyane"                  
[91] "Uragus sibiricus"                 "Phalacrocorax pelagicus"         
[93] "Oceanodroma monorhis"             "Porzana pusilla"                 
[95] "Oceanodroma leucorhoa"            "Oceanodroma castro"              
[97] "Tetrastes bonasia"                "Phalacrocorax urile"             
Code
setdiff(tax_lookup$verbatimIdentification, 
        crosswalk$ScientificName2010) #93 not matched
 [1] "Corvus corone cornix"             "Delichon urbica"                 
 [3] "Corvus corone corone"             "Saxicola torquata"               
 [5] "Regulus ignicapillus"             "Carduelis flammea cabaret"       
 [7] "Luscinia svecica svecica"         "Charadrius morinellus"           
 [9] "Luscinia svecica cyanecula"       "Spinus tristis"                  
[11] "Poecile atricapillus"             "Gallinago delicata"              
[13] "Spinus pinus"                     "Hydropogne caspia"               
[15] "Ardea alba"                       "Poecile hudsonicus"              
[17] "Falcipennis canadensis"           "Leucophaeus atricilla"           
[19] "Tringa semipalmata"               "Anas platyrhynchos x A. rubripes"
[21] "Sternula antillarum"              "Gelochelidon nilotica"           
[23] "Vermivora chrysoptera x V. pinus" "Egretta intermedia"              
[25] "Hypsipetes amaurotis"             "Anas zonorhyncha"                
[27] "Parus minor"                      "Poecile varius"                  
[29] "Luscinia komadori"                "Acrocephalus orientalis"         
[31] "Spodiopsar cineraceus"            "Otus lempiji"                    
[33] "Sapheopipo noguchii"              "Hierococcyx hyperythrus"         
[35] "Periparus ater"                   "Chloris sinica"                  
[37] "Phylloscopus xanthodryas"         "Luscinia akahige"                
[39] "Pernis ptilorhynchus"             "Poecile montanus"                
[41] "Agropsar philippensis"            "Tetrastes bonasia"               
[43] "Poecile palustris"                "Calonectris borealis"            
[45] "Hydrobates castro"                "Mareca penelope"                 
[47] "Acanthis flammea"                 "Gulosus aristotelis"             
[49] "Mareca strepera"                  "Spinus spinus"                   
[51] "Hydrobates leucorhous"            "Spatula clypeata"                
[53] "Hydrocoloeus minutus"             "Emberiza calandra"               
[55] "Cyanopica cooki"                  "Cecropis daurica"                
[57] "Cyanistes caeruleus"              "Picus sharpei"                   
[59] "Lophophanes cristatus"            "Sternula albifrons"              
[61] "Aquila fasciata"                  "Linaria cannabina"               
[63] "Dryobates minor"                  "Lanius meridionalis"             
[65] "Zapornia pusilla"                 "Ptyonoprogne rupestris"          
[67] "Alaudala rufescens"               "Cercotrichas galactotes"         
[69] "Iduna opaca"                      "Spatula querquedula"             
[71] "Zapornia parva"                   "Cyanecula svecica"               
[73] "Leiopicus medius"                 "Thalasseus sandvicensis"         
[75] "Linaria flavirostris"             "Calidris pugnax"                 
[77] "Lyrurus tetrix"                   "Bubo scandiacus"                 
[79] "Hydroprogne caspia"               "Clanga pomarina"                 
[81] "Passer italiae"                   "Calidris falcinellus"            
[83] "Poecile cinctus"                  "Spilopelia senegalensis"         
[85] "Microcarbo pygmaeus"              "Sylvia crassirostris"            
[87] "Iduna pallida"                    "Pastor roseus"                   
[89] "Poecile lugubris"                 "Phylloscopus orientalis"         
[91] "Clanga clanga"                    "Cyanistes cyanus"                
[93] "Anthropoides virgo"              
Code
setdiff(
  tax_lookup %>% filter(scientificName %in% nm) %>% pull(verbatimIdentification),
  crosswalk$ScientificName2010) 
[1] "Luscinia svecica svecica"         "Luscinia svecica cyanecula"      
[3] "Falcipennis canadensis"           "Anas platyrhynchos x A. rubripes"
[5] "Vermivora chrysoptera x V. pinus" "Tetrastes bonasia"               
[7] "Cyanecula svecica"                "Sylvia crassirostris"            
Code
# 8 out of 20 not matched (but 12 more matched)
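
`setdiff()` returns only the unmatched names. When the full lookup rows are needed, for example to see the standardized name next to each unmatched verbatim name, `dplyr::anti_join()` is one alternative. The sketch below uses hypothetical toy tables (`toy_lookup`, `toy_crosswalk`), not the real data.

```r
library(dplyr)

# Hypothetical toy tables
toy_lookup <- data.frame(
  verbatimIdentification = c("Falcipennis canadensis", "Ciconia nigra"),
  scientificName = c("Canachites canadensis", "Ciconia nigra")
)
toy_crosswalk <- data.frame(
  ScientificName2010 = c("Ciconia nigra", "Aythya fuligula")
)

# Keep the lookup rows whose verbatim name has no partner in the 2010 column
unmatched <- toy_lookup %>%
  anti_join(toy_crosswalk,
            by = join_by(verbatimIdentification == ScientificName2010))

unmatched
```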

Merge taxonomies

Step 1: merge scientificName & ScientificName2018

Step 2: merge verbatimIdentification & ScientificName2010

Code
tax_lookup2 <- tax_lookup %>%
  left_join(
    crosswalk,
    by = join_by(scientificName == ScientificName2018), keep = TRUE) %>%
  left_join(
    crosswalk,
    by = join_by(verbatimIdentification == ScientificName2010),
    keep = TRUE) %>%
  mutate(ScientificName2010 = coalesce(ScientificName2010.x, ScientificName2010.y)) %>%
  select(-ScientificName2010.x, -ScientificName2010.y) %>%
  mutate(ScientificName2018 = coalesce(ScientificName2018.x, ScientificName2018.y)) %>%
  select(-ScientificName2018.x, -ScientificName2018.y) %>%
  mutate(ScientificName2010 = case_when(ScientificName2010 == "Not Recognised" ~ NA,.default = ScientificName2010),
         ScientificName2018 = case_when(ScientificName2018 == "Not Recognised" ~ NA,.default = ScientificName2018))
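
The two-step merge can be traced on toy data. In this sketch the second row uses invented placeholder names ("Oldgenus alpha", "Midgenus alpha", "Newgenus alpha") to represent a species whose standardized name is absent from the 2018 column, so only the verbatim name finds a partner (in the 2010 column). `coalesce()` then fills the gaps left by step 1 with the result of step 2.

```r
library(dplyr)

# Hypothetical toy tables; the "alpha" names are invented placeholders
toy_lookup <- data.frame(
  verbatimIdentification = c("Parus ater", "Oldgenus alpha"),
  scientificName = c("Periparus ater", "Newgenus alpha")
)
toy_crosswalk <- data.frame(
  ScientificName2018 = c("Periparus ater", "Midgenus alpha"),
  ScientificName2010 = c("Parus ater", "Oldgenus alpha")
)

toy_merged <- toy_lookup %>%
  # step 1: standardized name against the 2018 column
  left_join(toy_crosswalk,
            by = join_by(scientificName == ScientificName2018), keep = TRUE) %>%
  # step 2: verbatim name against the 2010 column
  left_join(toy_crosswalk,
            by = join_by(verbatimIdentification == ScientificName2010), keep = TRUE) %>%
  # prefer the step 1 result (.x), fall back to step 2 (.y)
  mutate(ScientificName2018 = coalesce(ScientificName2018.x, ScientificName2018.y),
         ScientificName2010 = coalesce(ScientificName2010.x, ScientificName2010.y)) %>%
  select(verbatimIdentification, scientificName,
         ScientificName2010, ScientificName2018)

toy_merged
```

Because the crosswalk is joined twice with `keep = TRUE`, both of its columns appear with `.x` and `.y` suffixes after step 2, which is why the coalescing step is needed.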
Code
skim(tax_lookup2)
Data summary
Name tax_lookup2
Number of rows 863
Number of columns 4
_______________________
Column type frequency:
character 4
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
verbatimIdentification 0 1.00 9 32 0 762 0
scientificName 0 1.00 9 39 0 726 0
ScientificName2010 10 0.99 9 29 0 711 0
ScientificName2018 9 0.99 9 29 0 723 0
Code
tax_lookup2 %>% filter(is.na(ScientificName2010)) %>% 
  kable()
verbatimIdentification scientificName ScientificName2010 ScientificName2018
Luscinia svecica svecica Luscinia svecica NA NA
Luscinia svecica cyanecula Luscinia svecica NA NA
Falcipennis canadensis Canachites canadensis NA NA
Anas platyrhynchos x A. rubripes Anas platyrhynchos x Anas rubripes NA NA
Vermivora chrysoptera x V. pinus Vermivora chrysoptera x Vermivora pinus NA NA
Anas zonorhyncha Anas zonorhyncha NA Anas zonorhyncha
Tetrastes bonasia Tetrastes bonasia NA NA
Picus sharpei Picus sharpei NA Picus sharpei
Cyanecula svecica Luscinia svecica NA NA
Sylvia crassirostris Curruca crassirostris NA NA
Code
tax_lookup2 %>% filter(is.na(ScientificName2018)) %>% 
  kable()
verbatimIdentification scientificName ScientificName2010 ScientificName2018
Luscinia svecica svecica Luscinia svecica NA NA
Luscinia svecica cyanecula Luscinia svecica NA NA
Falcipennis canadensis Canachites canadensis NA NA
Anas platyrhynchos x A. rubripes Anas platyrhynchos x Anas rubripes NA NA
Vermivora chrysoptera x V. pinus Vermivora chrysoptera x Vermivora pinus NA NA
Cuculus optatus Cuculus optatus Cuculus optatus NA
Tetrastes bonasia Tetrastes bonasia NA NA
Cyanecula svecica Luscinia svecica NA NA
Sylvia crassirostris Curruca crassirostris NA NA

Note: there are NAs in the ScientificName2018 and ScientificName2010 columns. We have to be careful not to match NAs to NAs in later joins, which would add wrong data.
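
The risk can be shown on toy data: by default, dplyr joins treat `NA` keys as equal to each other, whereas `na_matches = "never"` prevents this. The tables `toy_x` and `toy_y` and the value "Wrong species" are hypothetical and serve only as an illustration.

```r
library(dplyr)

# Hypothetical toy tables with an NA key on both sides
toy_x <- data.frame(
  verbatimIdentification = c("Tetrastes bonasia", "Ciconia nigra"),
  ScientificName2018 = c(NA, "Ciconia nigra")
)
toy_y <- data.frame(
  ScientificName2018 = c(NA, "Ciconia nigra"),
  BirdTree = c("Wrong species", "Ciconia nigra")
)

# Default behaviour: the NA key in toy_x picks up the NA row of toy_y
default_join <- left_join(toy_x, toy_y, by = "ScientificName2018")

# NA keys never match, so the unmatched row keeps BirdTree = NA
safe_join <- left_join(toy_x, toy_y, by = "ScientificName2018",
                       na_matches = "never")

default_join$BirdTree
safe_join$BirdTree
```

Filtering out NA keys before joining, as done with `na.omit()` elsewhere in this script, is another way to reach the same result.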

3. Avonet crosswalk: Birdlife / BirdTree (Species 1 = ScientificName2018)

Code
# Project-relative path via here::here(), as for the other inputs
BL_BT_crosswalk <- 
  read_excel(
    here::here("Data/input/AVONET Supplementary dataset 1.xlsx"),
    sheet = "BirdLife–BirdTree crosswalk")

dd_BL_notMatched <- 
  setdiff(tax_lookup2$ScientificName2018 %>% na.omit(),
          BL_BT_crosswalk$Species1 %>% na.omit())

intersect(dd_BL_notMatched, 
          BL_BT_crosswalk$Species3) 
character(0)
Code
# the intersection is empty here (character(0)); the join below still looks up "Psittacula krameri" via Species3
Code
tax_lookup3 <- 
  tax_lookup2 %>%
  select(ScientificName2018) %>% unique() %>%
  na.omit() %>%
  left_join(BL_BT_crosswalk %>% filter(!is.na(Species1)) %>%
              select(-Match.type, -Match.notes),
            by = join_by("ScientificName2018" == "Species1"),
                         keep = FALSE) %>%
  setNames(c("ScientificName2018", "BirdTree")) %>%
  left_join(BL_BT_crosswalk %>% 
              filter(Species3 == "Psittacula krameri") %>% 
              select(Species1, Species3), 
            by = join_by("ScientificName2018" == "Species3"), 
            keep = TRUE) %>%
  right_join(tax_lookup2) %>%
  select(verbatimIdentification, scientificName, 
         ScientificName2010, ScientificName2018, BirdTree)

skim(tax_lookup3)
Data summary
Name tax_lookup3
Number of rows 890
Number of columns 5
_______________________
Column type frequency:
character 5
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
verbatimIdentification 0 1.00 9 32 0 762 0
scientificName 0 1.00 9 39 0 726 0
ScientificName2010 10 0.99 9 29 0 711 0
ScientificName2018 9 0.99 9 29 0 723 0
BirdTree 9 0.99 9 29 0 733 0

check NAs:

Code
tax_lookup3 %>% 
  filter(if_any(everything(), is.na))
             verbatimIdentification                          scientificName
1                  Anas zonorhyncha                        Anas zonorhyncha
2                     Picus sharpei                           Picus sharpei
3          Luscinia svecica svecica                        Luscinia svecica
4        Luscinia svecica cyanecula                        Luscinia svecica
5            Falcipennis canadensis                   Canachites canadensis
6  Anas platyrhynchos x A. rubripes      Anas platyrhynchos x Anas rubripes
7  Vermivora chrysoptera x V. pinus Vermivora chrysoptera x Vermivora pinus
8                   Cuculus optatus                         Cuculus optatus
9                 Tetrastes bonasia                       Tetrastes bonasia
10                Cyanecula svecica                        Luscinia svecica
11             Sylvia crassirostris                   Curruca crassirostris
   ScientificName2010 ScientificName2018            BirdTree
1                <NA>   Anas zonorhyncha Anas poecilorhyncha
2                <NA>      Picus sharpei       Picus viridis
3                <NA>               <NA>                <NA>
4                <NA>               <NA>                <NA>
5                <NA>               <NA>                <NA>
6                <NA>               <NA>                <NA>
7                <NA>               <NA>                <NA>
8     Cuculus optatus               <NA>                <NA>
9                <NA>               <NA>                <NA>
10               <NA>               <NA>                <NA>
11               <NA>               <NA>                <NA>

10 species missing from phylo

8 species missing from AVONET (9 in ScientificName2018)
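
Such counts can be cross-checked by tallying the NAs per column. A base R sketch on a toy table follows; `toy` is hypothetical, with three rows copied from the NA check above.

```r
# Hypothetical three-row excerpt of the lookup table
toy <- data.frame(
  ScientificName2010 = c(NA, NA, "Cuculus optatus"),
  ScientificName2018 = c("Anas zonorhyncha", NA, NA),
  BirdTree = c("Anas poecilorhyncha", NA, NA)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
na_counts
```

Note that this counts rows, not species: a species entered under several verbatim names contributes one NA per row.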

BirdTree consensus tree (based on the recommendations in https://doi.org/10.1093/czoolo/61.6.959), created in Python

Code
library(ape)
tips <- ape::read.tree(
  here::here("Data/input/MRC_consensus_BirdTree.tre"))$tip.label
tree_sp <- data.frame(tip.label = tips)
tree_sp$tip_label <- gsub("_", " ", tree_sp$tip.label)

setdiff(na.omit(unique(tax_lookup3$BirdTree)), tree_sp$tip_label) # drop the NA by value, not by position
character(0)

All names matched (the NA entry was dropped from the BirdTree column before the comparison).
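
A self-contained sketch of the same check on toy tip labels is given below. The vectors are hypothetical. The NA is removed by value, which keeps the check valid if the number of unique names changes.

```r
# Hypothetical tip labels in the underscore format used by tree files
tip_labels <- c("Ciconia_nigra", "Anas_platyrhynchos", "Picus_viridis")
tip_names <- gsub("_", " ", tip_labels)

# Hypothetical BirdTree column containing a duplicate and an NA
lookup_names <- c("Ciconia nigra", "Picus viridis", NA, "Picus viridis")

# Remove NAs by value, then compare against the tree
wanted <- unique(lookup_names[!is.na(lookup_names)])
unmatched <- setdiff(wanted, tip_names)
unmatched
```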

Merge with actual bird tree tip labels (not just the ones from Avonet)

Code
tax_lookup4 <- 
  tax_lookup3 %>%
  left_join(tree_sp, 
            by = join_by("BirdTree" == "tip_label"), 
            keep = FALSE)

tax_lookup4 %>% 
  filter(is.na(BirdTree)) %>% 
  kable() #9
verbatimIdentification scientificName ScientificName2010 ScientificName2018 BirdTree tip.label
Luscinia svecica svecica Luscinia svecica NA NA NA NA
Luscinia svecica cyanecula Luscinia svecica NA NA NA NA
Falcipennis canadensis Canachites canadensis NA NA NA NA
Anas platyrhynchos x A. rubripes Anas platyrhynchos x Anas rubripes NA NA NA NA
Vermivora chrysoptera x V. pinus Vermivora chrysoptera x Vermivora pinus NA NA NA NA
Cuculus optatus Cuculus optatus Cuculus optatus NA NA NA
Tetrastes bonasia Tetrastes bonasia NA NA NA NA
Cyanecula svecica Luscinia svecica NA NA NA NA
Sylvia crassirostris Curruca crassirostris NA NA NA NA

Write to file (csv)

Code
write.csv(tax_lookup4, here::here("Data/input/Tax_lookup.csv"))
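
One detail to keep in mind when reading this file back: `write.csv()` writes row names by default, which show up as an extra first column (named `X` by `read.csv()`). The sketch below shows the difference using temporary files and a hypothetical one-row table. Whether downstream scripts expect that extra column is not known here, so this is an illustration rather than a recommended change.

```r
# Hypothetical one-row table
toy <- data.frame(verbatimIdentification = "Ciconia nigra",
                  BirdTree = "Ciconia nigra")

path_default <- tempfile(fileext = ".csv")
path_clean <- tempfile(fileext = ".csv")

write.csv(toy, path_default)                  # row names written as first column
write.csv(toy, path_clean, row.names = FALSE) # data columns only

cols_default <- names(read.csv(path_default))
cols_clean <- names(read.csv(path_clean))

cols_default
cols_clean
```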