Last updated: 2025-09-28

Checks: 7 passed, 0 failed

Knit directory: genomics_ancest_disease_dispar/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20220216) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
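
As a small illustration of why the seed matters, the sketch below repeats the same random draw twice under the seed reported above. `draw_once` is a hypothetical helper written for this example and is not part of the analysis.

```r
# Hypothetical helper: reset the seed, then draw a random permutation of 1..10.
draw_once <- function(seed) {
  set.seed(seed)
  sample(10)
}

first  <- draw_once(20220216)
second <- draw_once(20220216)

# Same seed, same permutation.
identical(first, second)
```

Without the call to `set.seed()` inside the helper, the two draws would generally differ.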

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version e777ee8. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rproj.user/
    Ignored:    data/.DS_Store
    Ignored:    data/gbd/.DS_Store
    Ignored:    data/gbd/IHME-GBD_2021_DATA-d8cf695e-1.csv
    Ignored:    data/gbd/ihme_gbd_2019_global_disease_burden_rate_all_ages.csv
    Ignored:    data/gbd/ihme_gbd_2019_global_paf_rate_percent_all_ages.csv
    Ignored:    data/gbd/ihme_gbd_2021_global_disease_burden_rate_all_ages.csv
    Ignored:    data/gbd/ihme_gbd_2021_global_paf_rate_percent_all_ages.csv
    Ignored:    data/gwas_catalog/
    Ignored:    data/icd/.DS_Store
    Ignored:    data/icd/IHME_GBD_2019_COD_CAUSE_ICD_CODE_MAP_Y2020M10D15.XLSX
    Ignored:    data/icd/IHME_GBD_2019_NONFATAL_CAUSE_ICD_CODE_MAP_Y2020M10D15.XLSX
    Ignored:    data/icd/IHME_GBD_2021_COD_CAUSE_ICD_CODE_MAP_Y2024M05D16.XLSX
    Ignored:    data/icd/IHME_GBD_2021_NONFATAL_CAUSE_ICD_CODE_MAP_Y2024M05D16.XLSX
    Ignored:    data/icd/UK_Biobank_master_file.tsv
    Ignored:    data/icd/cdc_valid_icd10_Sep_23_2025.xlsx
    Ignored:    data/icd/cdc_valid_icd9_Sep_23_2025.xlsx
    Ignored:    data/icd/phecode_international_version_unrolled.csv
    Ignored:    data/icd/semiautomatic_ICD-pheno.txt
    Ignored:    data/icd/~$IHME_GBD_2021_COD_CAUSE_ICD_CODE_MAP_Y2024M05D16.XLSX
    Ignored:    data/icd/~$IHME_GBD_2021_NONFATAL_CAUSE_ICD_CODE_MAP_Y2024M05D16.XLSX
    Ignored:    data/who/
    Ignored:    diseases.txt
    Ignored:    not_found_diseases.txt
    Ignored:    orig_phecode_map.csv
    Ignored:    original_phecodes_pheinfo.csv
    Ignored:    output/gwas_cat/
    Ignored:    output/gwas_study_info_cohort_corrected.csv
    Ignored:    output/gwas_study_info_trait_corrected.csv
    Ignored:    output/gwas_study_info_trait_ontology_info.csv
    Ignored:    output/gwas_study_info_trait_ontology_info_l1.csv
    Ignored:    output/gwas_study_info_trait_ontology_info_l2.csv
    Ignored:    output/trait_ontology/
    Ignored:    renv/
    Ignored:    sup_table.xlsx
    Ignored:    zooma.tsv
    Ignored:    zooma_res.tsv

Untracked files:
    Untracked:  analysis/garbage_icd_codes.Rmd
    Untracked:  analysis/map_trait_to_icd10.Rmd
    Untracked:  disease_mapping.R

Unstaged changes:
    Modified:   analysis/disease_inves_by_ancest.Rmd
    Modified:   analysis/exclude_infectious_diseases.Rmd
    Modified:   analysis/gbd_data_plots.Rmd
    Modified:   analysis/index.Rmd
    Modified:   analysis/level_1_disease_group_non_cancer.Rmd
    Modified:   analysis/level_2_disease_group.Rmd
    Modified:   analysis/trait_ontology_categorization.Rmd
    Modified:   data/icd/README.md

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/group_non_cancer_diseases.Rmd) and HTML (docs/group_non_cancer_diseases.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author     Date        Message
Rmd   e777ee8  IJbeasley  2025-09-28  workflowr::wflow_publish("analysis/group_non_cancer_diseases.Rmd")
html  3640ce0  IJbeasley  2025-09-28  Build site.
Rmd   d8f04de  IJbeasley  2025-09-28  Using ICD / PheCode mapping

Set up

library(dplyr)
library(stringr)
library(data.table)

Ontology help - for getting disease subtypes

source(here::here("code/get_term_descendants.R"))

Load Data

gwas_study_info <- fread(here::here("output/gwas_cat/gwas_study_info_group.csv"))

How many unique traits are there now?

diseases <- stringr::str_split(pattern = ", ",
                               gwas_study_info$collected_all_disease_terms[gwas_study_info$collected_all_disease_terms != ""])  |>
  unlist() |>
  stringr::str_trim()

diseases <- unique(diseases)

print(length(diseases))
[1] 1718
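
The counting step above can be sketched on made-up data. This version uses base R only (`strsplit`, `trimws`) so it runs without extra packages; the toy values are assumptions and do not come from the GWAS Catalog.

```r
# Toy stand-in for gwas_study_info$collected_all_disease_terms (made-up values).
terms <- c("asthma, eczema", "", "asthma", "eczema , psoriasis")

# Drop empty entries, split on the ", " separator, flatten, and trim whitespace.
split_terms <- unlist(strsplit(terms[terms != ""], ", ", fixed = TRUE))
split_terms <- trimws(split_terms)

# Count each distinct term once.
unique_terms <- unique(split_terms)
length(unique_terms)
```

The trimming step matters here: without it, "eczema " (with a trailing space) would be counted as a separate term from "eczema".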

Filtering some traits

Abnormality of skin pigmentation

gwas_study_info |> 
  filter(grepl("abnormality of skin pigmentation", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
character(0)
# as can be seen, most of these traits are not about true skin pigmentation abnormalities,
# just about skin pigmentation in general

# if `DISEASE/TRAIT` contains "UKB data field 1717" (which is just a skin colour measurement field)
# then remove "abnormality of skin pigmentation" from collected_all_disease_terms
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("UKB data field 1717", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "abnormality of skin pigmentation"
         ),
         collected_all_disease_terms
         )
  )

# also remove if `DISEASE/TRAIT` contains "Tattoo pigmentation"
# (the pattern also accepts the original "tatto" spelling)
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("tattoo? pigmentation", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "abnormality of skin pigmentation"
         ),
         collected_all_disease_terms
         )
  )

# also remove if `DISEASE/TRAIT` is any of the following:
# - Perceived skin darkness 
# - Skin red/green component
# - Skin yellow/blue component
# - Skin pigmentation traits
# - Skin pigmentation 
# - Skin luminance
# - Skin colour saturation

remove_terms <- c("Perceived skin darkness",
                  "Skin red/green component",
                  "Skin yellow/blue component",
                  "Skin pigmentation traits",
                  "Skin pigmentation",
                  "Skin luminance",
                  "Skin colour saturation"
                  )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(`DISEASE/TRAIT` %in% remove_terms,
                stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "abnormality of skin pigmentation"
         ),
         collected_all_disease_terms
         )
  )

# this leaves PUBMED_ID 23548203, which according to supplementary table 1
# is `Number of non-melanoma skin cancer (5+)` 
# so change to "non-melanoma skin cancer"

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(PUBMED_ID == "23548203",
                stringr::str_replace_all(collected_all_disease_terms,
                          pattern = "abnormality of skin pigmentation",
                          replacement = "non-melanoma skin cancer"
         ),
         collected_all_disease_terms
         )
  )
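
The conditional removals in this section share one shape: find rows whose trait label matches a pattern, then drop a term from those rows only. A minimal base R sketch of that shape follows. `drop_term_where` and the toy rows are hypothetical, and the term is matched as a literal string rather than through the project's own helpers.

```r
# Hypothetical helper: remove `term` from `terms` only in rows whose trait
# label matches `trait_pattern` (case-insensitive regular expression).
drop_term_where <- function(terms, traits, trait_pattern, term) {
  hit <- grepl(trait_pattern, traits, ignore.case = TRUE)
  terms[hit] <- gsub(term, "", terms[hit], fixed = TRUE)
  terms
}

# Made-up rows for illustration.
toy <- data.frame(
  trait = c("Skin colour (UKB data field 1717)", "Vitiligo"),
  terms = c("abnormality of skin pigmentation", "abnormality of skin pigmentation"),
  stringsAsFactors = FALSE
)

toy$terms <- drop_term_where(toy$terms, toy$trait,
                             "UKB data field 1717",
                             "abnormality of skin pigmentation")
toy$terms
```

Only the first row loses the term; the second row is untouched because its trait label does not match the pattern.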

Handedness

gwas_study_info |> 
  filter(grepl("handedness", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
character(0)
# remove handedness
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "handedness"
         ))

# also remove functional laterality if Handedness chirality laterality is in DISEASE/TRAIT
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Handedness", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "functional laterality"
         ),
         collected_all_disease_terms
         )
  )

# also remove functional laterality if Usual side of head for mobile phone use is in DISEASE/TRAIT
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Usual side of head for mobile phone use", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "functional laterality"
         ),
         collected_all_disease_terms
         )
  )

# also remove functional laterality if PUBMED_ID=20585627
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(PUBMED_ID == "20585627",
                stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "functional laterality"
         ),
         collected_all_disease_terms
         )
  ) 

Attached earlobe

gwas_study_info |> 
  filter(grepl("attached earlobe", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
character(0)
# remove attached earlobe
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "attached earlobe"
         ))
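
When a term is removed with a bare pattern and it sits in the middle of a comma-separated list, its separator stays behind. The sketch below shows the effect on a made-up string and one possible tidy-up. `tidy_terms` is a hypothetical helper that handles a single string; whether later steps in this project already tolerate empty slots is not shown on this page.

```r
# Made-up example of a comma-separated term string.
terms <- "asthma, attached earlobe, eczema"

# Plain removal leaves an empty slot between two separators.
removed <- gsub("attached earlobe", "", terms, fixed = TRUE)

# Hypothetical tidy-up for one string: split, trim, drop empty slots, re-join.
tidy_terms <- function(x) {
  parts <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  paste(parts[parts != ""], collapse = ", ")
}

removed
tidy_terms(removed)
```

To apply the tidy-up to a whole column, one option would be `vapply(column, tidy_terms, character(1), USE.NAMES = FALSE)`.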

Facial wrinkling

gwas_study_info |> 
  filter(grepl("facial wrinkling", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
character(0)
# remove facial wrinkling
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "facial wrinkling"
         ))

Sneezing in response to bright light (autosomal dominant compelling helio-ophthalmic outburst syndrome)

gwas_study_info |> 
  filter(grepl("autosomal dominant compelling helio-ophthalmic outburst syndrome", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
character(0)
# remove autosomal dominant compelling helio-ophthalmic outburst syndrome
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                          pattern = "autosomal dominant compelling helio-ophthalmic outburst syndrome"
         ))

Skin sensitivity to sun

gwas_study_info |> 
  filter(grepl("skin sensitivity to sun", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
 [1] "Skin sensitivity to sun"                                                                           
 [2] "Ease of sunburn"                                                                                   
 [3] "Ease of skin tanning"                                                                              
 [4] "Ease of skin tanning - Get mildly or occasionally tanned (UKB data field 1727)"                    
 [5] "Ease of skin tanning - Get moderately tanned (UKB data field 1727)"                                
 [6] "Ease of skin tanning - Get very tanned (UKB data field 1727)"                                      
 [7] "Ease of skin tanning - Never tan only burn (UKB data field 1727)"                                  
 [8] "Ease of skin tanning - Get mildly or occasionally tanned (UKB data field 1727) (Gene-based burden)"
 [9] "Ease of skin tanning - Get moderately tanned (UKB data field 1727) (Gene-based burden)"            
[10] "Ease of skin tanning - Never tan only burn (UKB data field 1727) (Gene-based burden)"              
[11] "Ease of skin tanning - Get very tanned (UKB data field 1727) (Gene-based burden)"                  
# remove skin sensitivity to sun
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern("skin sensitivity to sun")
         ))

# also remove suntan
gwas_study_info |> 
  filter(grepl("suntan", collected_all_disease_terms)) |>
  pull(`DISEASE/TRAIT`) |>
  unique()
[1] "Tanning"          "Low tan response"
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern("suntan"
                          )
         ))
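
`vec_to_grep_pattern()` is called throughout this page, but its definition is not shown here. Purely as an illustration of one way such a helper could work, and not the project's actual implementation, the sketch below builds a pattern that matches whole terms in a ", "-separated string. The name `whole_term_pattern_sketch` and the matching rules are assumptions.

```r
# Hypothetical sketch: build a PCRE pattern that matches any of `terms` only
# when it is a complete entry in a ", "-separated list.
whole_term_pattern_sketch <- function(terms) {
  # \Q...\E quotes each term so regex metacharacters are taken literally.
  quoted <- paste0("\\Q", terms, "\\E", collapse = "|")
  # Entry must start at the string start or right after ", ",
  # and end at a comma or the string end.
  paste0("(?:^|(?<=, ))(?:", quoted, ")(?=,|$)")
}

x <- "anxiety, anxiety disorder, social anxiety"
res <- gsub(whole_term_pattern_sketch("anxiety"), "anxiety disorder", x, perl = TRUE)
res
```

Under these assumptions only the stand-alone "anxiety" entry is rewritten; "anxiety disorder" and "social anxiety" are left alone, which avoids results such as "anxiety disorder disorder".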

Group some non-cancer diseases together

Abnormal brain morphology

# if DISEASE/TRAIT contains Unidentified bright object on brain MRI, 
# then replace abnormal brain morphology with Other abnormal findings on diagnostic imaging of central nervous system 
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Unidentified bright object on brain MRI", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("abnormal brain morphology"),
                          "other abnormal findings on diagnostic imaging of central nervous system"
         ),
         collected_all_disease_terms
         )
  )

Abnormality of gait

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "gait imbalance",
                            "decreased walking ability")
                            ),
                          "abnormality of gait"
         ))

Abdominal abscess

gwas_study_info |>
  filter(grepl(vec_to_grep_pattern("abdominal abscess"), 
               collected_all_disease_terms, perl = TRUE)) 
Empty data.table (0 rows and 32 cols): DATE_ADDED_TO_CATALOG,PUBMED_ID,FIRST_AUTHOR,DATE,JOURNAL,LINK...
# just one study (PUBMED_ID 35173190) has abdominal abscess
# the study defines it unusually, as 'Abdominal infections' covering codes
# D73.3, K35-37, K57, K61, K63.0, K65, K75.0, K81, K83.0

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(PUBMED_ID == "35173190",
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("abdominal abscess"),
                          "abdominal infections code"
         ),
         collected_all_disease_terms
         )
  )

Abnormal mammogram

# if DISEASE/TRAIT contains "Abnormal findings on mammogram or breast exam", then change abnormality of the breast to abnormal mammogram

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Abnormal findings on mammogram or breast exam", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("abnormality of the breast"),
                          "abnormal mammogram"
         ),
         collected_all_disease_terms
         )
  )

Achalasia of cardia

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("achalasia"),
                          "achalasia of cardia"
         ))

Acute pancreatitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("asparaginase-induced acute pancreatitis"),
                          "acute pancreatitis"
         ))

Altitude sickness

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("chronic mountain sickness"),
                          "altitude sickness"
         ))

Alopecia

alopecia_terms <- c("frontal fibrosing alopecia" 
                   )


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(alopecia_terms),
                          "alopecia"
         )
  )
         
         
drug_induced_alopecia_terms <- c(
                               "chemotherapy-induced alopecia"
                               )


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(drug_induced_alopecia_terms),
                          "drug-induced androgenic alopecia"
         )
  )

Alcohol and nicotine codependence

# alcohol and nicotine codependence -> alcohol dependence, nicotine dependence

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("alcohol and nicotine codependence"),
                          "alcohol dependence, nicotine dependence"
         )) 

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("alcohol dependence"),
                          "alcohol-related disorders"
         ))

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("nicotine dependence"),
                          "tobacco use disorder"
         ))
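
The three replacements above depend on their order: the combined term has to be split first so that the two renames can see its components. A base R sketch on a made-up string, using literal matching instead of `vec_to_grep_pattern()`, shows both orders.

```r
terms <- "alcohol and nicotine codependence, asthma"

# Intended order: split the combined term, then rename each component.
step1 <- gsub("alcohol and nicotine codependence",
              "alcohol dependence, nicotine dependence",
              terms, fixed = TRUE)
step2 <- gsub("alcohol dependence", "alcohol-related disorders", step1, fixed = TRUE)
step3 <- gsub("nicotine dependence", "tobacco use disorder", step2, fixed = TRUE)

# Reversed order: the renames run first and find nothing to rename,
# so the components produced by the later split keep their old labels.
early <- gsub("alcohol dependence", "alcohol-related disorders", terms, fixed = TRUE)
early <- gsub("nicotine dependence", "tobacco use disorder", early, fixed = TRUE)
late  <- gsub("alcohol and nicotine codependence",
              "alcohol dependence, nicotine dependence",
              early, fixed = TRUE)

step3
late
```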

Amyloidosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("al amyloidosis"),
                          "amyloidosis"
         ))

Anxiety disorder

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("anxiety"),
                          "anxiety disorder"
         )) 

Aplastic anemia

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern("severe aplastic anemia"),
                          "aplastic anemia"
         ))

Androgenic alopecia

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern("androgenetic alopecia"),
                          "androgenic alopecia"
         ))  

Angioneurotic oedema (angioedema)

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("angioedema"),
                          "angioneurotic oedema"
         ))

Astigmatism

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("corneal astigmatism"),
                          "astigmatism"
         ))
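
Several of the unconditional renames above map one specific term to one grouped label. As a sketch of a more compact alternative, they could be driven from a single lookup vector. `rename_map` and `rename_terms` are hypothetical names, and matching is literal here rather than going through `vec_to_grep_pattern()`.

```r
# Hypothetical lookup: original term -> grouped label (pairs taken from the renames above).
rename_map <- c(
  "asparaginase-induced acute pancreatitis" = "acute pancreatitis",
  "chronic mountain sickness"               = "altitude sickness",
  "severe aplastic anemia"                  = "aplastic anemia",
  "corneal astigmatism"                     = "astigmatism"
)

# Apply every rename in turn to a character vector of term strings.
rename_terms <- function(x, map) {
  for (old in names(map)) {
    x <- gsub(old, map[[old]], x, fixed = TRUE)
  }
  x
}

renamed <- rename_terms("corneal astigmatism, severe aplastic anemia", rename_map)
renamed
```

Keeping the pairs in one place makes the mapping easier to review, at the cost of losing the per-disease headings used in this report.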

Asthma

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("atopic asthma", "chronic obstructive asthma")),
                          "asthma"
         )) |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("childhood onset asthma",
                                                "adult onset asthma",
                                                "aspirin-induced asthma"
                                                )
                          ),
                          "asthma"
         ))

Atrial fibrillation and flutter

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("atrial flutter"),
                          "atrial fibrillation and flutter"
         ))

Atopic dermatitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("recalcitrant atopic dermatitis"),
                          "atopic dermatitis"
         ))

Eczema

# if `DISEASE/TRAIT` contains eczema, change eczematoid dermatitis to eczema
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("eczema", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("eczematoid dermatitis"),
                          "eczema"
         ),
         collected_all_disease_terms
         )
  )


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("hand eczema"),
                          "eczema"
         )
  )

Post-operative

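# strip a leading "post-operative " / "postoperative " prefix
# note: `^` anchors to the start of the whole string, so only the first listed term is affected
gwas_study_info = gwas_study_info |> 
    mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                         pattern = "^post-operative |^postoperative ?"
         ))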
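
Because `^` matches only at the very start of a string, a prefix removal anchored this way touches just the first term of a comma-separated list. The sketch below shows this on a made-up string, along with one possible variant that strips the prefix from every term. The variant is an assumption about what might be wanted, not a description of the analysis.

```r
x <- "post-operative pain, post-operative nausea"

# Anchored at the string start: only the first term loses its prefix.
first_only <- gsub("^post-operative ", "", x, perl = TRUE)

# Hypothetical variant: also match right after a ", " separator and put the
# separator back via the captured group.
every_term <- gsub("(^|, )post-?operative ", "\\1", x, perl = TRUE)

first_only
every_term
```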

Bipolar disorder

gwas_study_info = gwas_study_info |> 
    mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("bipolar ii disorder",
                              "bipolar i disorder"
                              )
                            ),
                          "bipolar disorder"
         ))

Bacterial infection

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "bacterial infection"
                             )
                             ),
                          "bacterial infection nos"
         ))

Benign mammary dysplasia

# if `DISEASE/TRAIT` contains Benign mammary dysplasia, then change abnormality of the breast to benign mammary dysplasia
gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Benign mammary dysplasia", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("abnormality of the breast"),
                          "benign mammary dysplasia"
         ),
         collected_all_disease_terms
         )
  )

Blindness and low vision

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "blindness",
                             "progressive visual loss",
                             "visual loss"
                             )
                             ),
                          "blindness and low vision"
         ))

# Disorders of optic nerve and visual pathways
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "visual pathway disorder",
                             "optic nerve disorder"
                             )
                             ),
                          "disorders of optic nerve and visual pathways"
         ))

# visuospatial impairment
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "visuospatial impairment"
                             )
                             ),
                          "other and unspecified symptoms and signs involving cognitive functions and awareness"
         ))

Cafe-au-lait spot

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("cafe-au-lait spot"),
                          "café au lait spots"
         ))

Cancer of eye

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "ocular melanoma")
                             ),
                          "cancer of eye"
         ))

Candidiasis of vulva and vagina

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "vaginal yeast infection")
                             ),
                          "candidiasis of vulva and vagina"
         ))

Carbuncle and furuncle

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "carbuncle", "furuncle")
                             ),
                          "carbuncle and furuncle"
         ))

Cardiac arrhythmia

arrhythmia_terms <-
c("ventricular arrhythmia",
  "torsades de pointes"
  )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(arrhythmia_terms),
                          "cardiac arrhythmia"
         ))

Cardiomyopathy

cardiomyopathy_terms <- c("nonischemic cardiomyopathy")

gwas_study_info =
gwas_study_info |> 
 mutate(collected_all_disease_terms  = 
          stringr::str_replace_all(collected_all_disease_terms,
                                  pattern = vec_to_grep_pattern(cardiomyopathy_terms),
                                   "cardiomyopathy"
                          )  
        )

Celiac disease

gwas_study_info = 
gwas_study_info |> 
 mutate(collected_all_disease_terms  = 
          stringr::str_replace_all(collected_all_disease_terms,
                                  pattern = vec_to_grep_pattern("refractory celiac disease"),
                                   "celiac disease"
                          )  
        )

Central and peripheral vertigo

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("central nervous system origin vertigo"),
                          "central origin vertigo"
         ))


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("peripheral vertigo"),
                          "peripheral or central vertigo"
         ))

Cerebral atherosclerosis

# if Brain vascular atherosclerosis in DISEASE/TRAIT, then change vascular brain injury to cerebral atherosclerosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Brain vascular atherosclerosis", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("vascular brain injury"),
                          "cerebral atherosclerosis"
         ),
         collected_all_disease_terms
         )
  )

Cerebrovascular disease

# if Vascular brain injury in DISEASE/TRAIT, then change vascular brain injury to cerebrovascular disease

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Vascular brain injury", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("vascular brain injury"),
                          "cerebrovascular disease"
         ),
         collected_all_disease_terms
         )
  )

Occlusion of cerebral arteries

# if Brain vascular stenosis in DISEASE/TRAIT, then change vascular brain injury to occlusion of cerebral arteries

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Brain vascular stenosis", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("vascular brain injury"),
                          "occlusion of cerebral arteries"
         ),
         collected_all_disease_terms
         )
  )
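
The three "vascular brain injury" rules above each test the trait label and then rewrite the same term. A sketch of the same logic driven from a small rule table follows. `rules` and `apply_rules` are hypothetical, the toy rows are made up, and the target term is matched literally rather than through `vec_to_grep_pattern()`.

```r
# Hypothetical rule table, in the order the rules are applied above:
# trait pattern -> replacement for "vascular brain injury".
rules <- data.frame(
  trait_pattern = c("Brain vascular atherosclerosis",
                    "Vascular brain injury",
                    "Brain vascular stenosis"),
  replacement   = c("cerebral atherosclerosis",
                    "cerebrovascular disease",
                    "occlusion of cerebral arteries"),
  stringsAsFactors = FALSE
)

# Apply each rule in turn; once a row's target term is replaced,
# later rules no longer find it in that row.
apply_rules <- function(terms, traits, rules, target = "vascular brain injury") {
  for (i in seq_len(nrow(rules))) {
    hit <- grepl(rules$trait_pattern[i], traits, ignore.case = TRUE)
    terms[hit] <- gsub(target, rules$replacement[i], terms[hit], fixed = TRUE)
  }
  terms
}

traits <- c("Brain vascular stenosis", "Brain vascular atherosclerosis", "Stroke")
terms  <- rep("vascular brain injury", 3)
remapped <- apply_rules(terms, traits, rules)
remapped
```

The third row keeps its original term because its trait label matches none of the rules.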

Cerebral infarction

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                             "mri defined brain infarct",
                             "cardioembolic stroke")
                             ),
                          "cerebral infarction"
         ))

Chronic kidney disease

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("stage 5 chronic kidney disease"),
                          "chronic kidney disease"
         ))

Chronic non-alcoholic pancreatitis

# if Non-alcoholic chronic pancreatitis in `DISEASE/TRAIT`,
# then change non-alcoholic pancreatitis to chronic pancreatitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(grepl("Non-alcoholic chronic pancreatitis", 
                     `DISEASE/TRAIT`,
                     ignore.case = TRUE),
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("non-alcoholic pancreatitis"),
                          "chronic pancreatitis"
         ),
         collected_all_disease_terms
         )
  )

Chronic rhinosinusitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("chronic rhinosinusitis with nasal polyps"),
                          "chronic rhinosinusitis"
         ))

Chronic sinusitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("chronic sinus infection"),
                          "chronic sinusitis"
         ))

Crohn’s Disease

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("perianal crohns disease"),
                          "crohns disease"
         ))

Cholangitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("sclerosing cholangitis"),
                          "cholangitis"
         ))

Cluster headache syndrome

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("cluster headache"),
                          "cluster headache syndrome"
         ))

Circumscribed brain atrophy

brain_atrophy <- c("frontotemporal dementia",
                   "grn-related frontotemporal lobar degeneration with tdp43 inclusions",
                   "pick disease")

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(brain_atrophy),
                          "circumscribed brain atrophy"
         ))

Contact dermatitis

pattern <- vec_to_grep_pattern("contact dermatitis due to nickel")


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = pattern,
                          "contact dermatitis"
         ))

Congenital heart disease/s

congenital_heart_disease_terms <- c(
  "heart septal defect",  
  "atrial heart septal defect",
  
  "congenital left-sided heart lesions",
  "congenital right-sided heart lesions",
  
  "congenital anomaly of the great arteries", # equiv = "Congenital malformation of great arteries, unspecified"
  
  # malformation of cardiac septum, 
    "abnormal cardiac septum morphology",
    "atrioventricular canal defect"
)
  

url <- "http://www.ebi.ac.uk/ols4/api/ontologies/efo/terms/http%253A%252F%252Fwww.ebi.ac.uk%252Fefo%252FEFO_0005207/descendants"

congenital_heart_disease_terms <- c(congenital_heart_disease_terms,
                                  get_descendants(url)
                                  ) |>
  unique()
[1] "Number of terms collected:"
[1] 102
[1] "\n Some example terms"
[1] "double outlet right ventricle with subaortic or doubly committed ventricular septal defect with pulmonary stenosis"
[2] "gata6-related congenital heart disease with or without pancreatic agenesis or neonatal diabetes"                   
[3] "double outlet right ventricle with non-committed subpulmonary ventricular septal defect"                           
[4] "congenitally uncorrected transposition of the great arteries with cardiac malformation"                            
[5] "congenitally uncorrected transposition of the great arteries with coarctation"                                     
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(congenital_heart_disease_terms),
                          "congenital heart disease"
         )
  )


url <- "http://www.ebi.ac.uk/ols4/api/ontologies/mondo/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMONDO_0024239/descendants"

more_congenital_heart_disease_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 229
[1] "\n Some example terms"
[1] "double outlet right ventricle with subaortic or doubly committed ventricular septal defect with pulmonary stenosis"
[2] "double outlet right ventricle with atrioventricular septal defect, pulmonary stenosis, heterotaxy"                 
[3] "gata6-related congenital heart disease with or without pancreatic agenesis or neonatal diabetes"                   
[4] "double outlet right ventricle with subaortic or doubly committed ventricular septal defect"                        
[5] "pulmonary valve agenesis-ventricular septal defect-persistent ductus arteriosus syndrome"                          
congenital_heart_disease_terms <- c(congenital_heart_disease_terms,
                                  more_congenital_heart_disease_terms
                                  ) |>
  unique() |>
  str_length_sort()


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(congenital_heart_disease_terms),
                          "congenital anomaly of cardiovascular system"
         )
  )
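
The OLS4 URLs used with `get_descendants()` embed the term IRI percent-encoded twice: `%253A` is the encoding of `%3A`, which is itself the encoding of `:`. The sketch below shows one way to build such a URL from a plain IRI instead of pasting the encoded string by hand. `ols_descendants_url()` is a hypothetical helper, not part of this analysis, and the base URL is simply copied from the chunks above.

```r
# Hypothetical helper (an assumption, not defined in the original analysis):
# build an OLS4 "descendants" URL from an ontology id and a plain term IRI.
ols_descendants_url <- function(ontology, iri) {
  # First pass encodes reserved characters such as ':' and '/'
  once <- utils::URLencode(iri, reserved = TRUE)
  # Second pass must set repeated = TRUE, otherwise URLencode() returns
  # an already-encoded string unchanged
  twice <- utils::URLencode(once, reserved = TRUE, repeated = TRUE)
  paste0("http://www.ebi.ac.uk/ols4/api/ontologies/", ontology,
         "/terms/", twice, "/descendants")
}

efo_url <- ols_descendants_url("efo", "http://www.ebi.ac.uk/efo/EFO_0005207")
print(efo_url)
```

Keeping the readable IRI in the source makes it easier to see which ontology term each lookup refers to.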

Cardiac congenital anomalies

cardiac_congenital_anomalies_terms <- c(
  "bicuspid aortic valve"
)

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(cardiac_congenital_anomalies_terms),
                          "cardiac congenital anomalies"
         )
  )

Congenital hypertrophic pyloric stenosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("infantile hypertrophic pyloric stenosis"),
                          "congenital hypertrophic pyloric stenosis"
         ))

Congenital hypothyroidism

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("congenital hypothyroidism due to developmental anomaly"),
                          "congenital hypothyroidism"
         ))

Congestive heart failure (CHF) NOS

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "diastolic heart failure",
                            "systolic heart failure")
                            ),
                          "congestive heart failure \\(chf\\) nos"
         ))

Coronary artery aneurysm and dissection

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "coronary aneurysm",
                            "coronary artery dissection")
                            ),
                          "coronary artery aneurysm and dissection"
         ))

# Aortic aneurysm

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "thoracic aortic aneurysm")
                            ),
                          "aortic aneurysm"
         ))

Creutzfeldt-Jakob disease

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                          c("sporadic creutzfeld jacob disease",
                            "creutzfeldt jacob disease",
                            "creutzfeldt-jacob disease")
                          ),
                          "creutzfeldt-jakob disease"
         ))

Cryoglobulinemia

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("cryoglobulinemia"),
                          "cryoglobulinaemia"
         ))

Chronic pain

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "chronic widespread pain",
                            "multisite chronic pain",
                            "chronic musculoskeletal pain",
                            "chronic pain syndrome")
                            ),
                          "chronic pain"
         ))

Cystic fibrosis with intestinal manifestations

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("cystic fibrosis associated meconium ileus"),
                          "cystic fibrosis with intestinal manifestations"
         ))

Dental caries

dental_caries_terms <- c("pit and fissure surface dental caries",
                         "smooth surface dental caries",
                         "primary dental caries",
                         "enamel caries",
                         "permanent dental caries"
                         )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(dental_caries_terms),
                          "dental caries"
         )
)

Dermatomyositis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("clinically amyopathic dermatomyositis"),
                          "dermatomyositis"
         ))

Dilated cardiomyopathy

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("idiopathic dilated cardiomyopathy"),
                          "dilated cardiomyopathy"
         ))

Disorders of tooth development

tooth_dev_terms <- c("dental enamel hypoplasia",
                     "tooth agenesis",
                     "molar-incisor hypomineralization"
                     )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(tooth_dev_terms),
                          "disorders of tooth development"
         )
  )

Depressive episode

depress_epi <- c("depressive",
                 "depression",
                 "depressive disorder"
                 )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(depress_epi),
                          "depressive episode"
         ))

Disorders of purine and pyrimidine metabolism

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("hyperuricemia"),
                          "disorders of purine and pyrimidine metabolism"
         ))

Disorders of refraction and accommodation

refractive_terms <- c("hyperopia",
                      "refractive error"
                      )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(refractive_terms),
                          "disorders of refraction and accommodation"
         )
  )

Diseases of white blood cells

wbc_terms <- c("leukopenia")

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(wbc_terms),
                          "diseases of white blood cells"
         )
  )

Disorders of amino-acid metabolism

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("amino acid metabolism disease"),
                          "disorders of amino-acid metabolism"
         ))

Drug allergy

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("beta-lactam allergy"),
                          "drug allergy"
         ))

Degeneration of macula and posterior pole

macular <- c("macular degeneration",
             "atrophic macular degeneration",
             "retinal drusen"
             )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(macular),
                          "degeneration of macula and posterior pole"
         ))

Essential hypertension

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c(
                              "early onset hypertension",
                              "treatment-resistant hypertension")),
                          "essential hypertension"
         ))

Exstrophy of urinary bladder

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("bladder exstrophy"),
                          "exstrophy of urinary bladder"
         ))

Febrile seizures

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                                                "mmr-related febrile seizures",
                                                "febrile seizure")
                                              ),
                          "febrile convulsions"
         ))

Food allergy

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                                  vec_to_grep_pattern(
                                    c("peanut allergy",
                                      "milk allergy",
                                      "egg allergy",
                                      "wheat allergic reaction"
                                     )),
                          "food allergy"
         ))

Glaucoma

url <- "http://www.ebi.ac.uk/ols4/api/ontologies/efo/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMONDO_0005041/descendants"

glaucoma_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 19
[1] "\n Some example terms"
[1] "cyp1b1-related glaucoma with or without anterior segment dysgenesis"
[2] "glaucoma secondary to spherophakia/ectopia lentis and megalocornea" 
[3] "hereditary glaucoma, primary closed-angle"                          
[4] "primary angle closure glaucoma"                                     
[5] "secondary dysgenetic glaucoma"                                      
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(glaucoma_terms),
                          "glaucoma"
         ))

Hordeolum and other deep inflammation of eyelid

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("hordeolum"),
                          "hordeolum and other deep inflammation of eyelid"
         ))

Graft vs host disease

graft_vs_host_terms <- c("chronic graft versus host disease",
                         "chronic graft vs. host disease",
                         "acute graft versus host disease",
                         "acute graft vs. host disease"
                         )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(graft_vs_host_terms),
                          "graft versus host disease"
         ))

Gingival and periodontal diseases

gingival_and_periodontal_terms <- c("periodontal pocket",
                                    "periodontal disorder",
                                    "gingival disease",
                                    "gingival bleeding"
                                )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(gingival_and_periodontal_terms),
                          "gingival and periodontal diseases"
         )
  )

Hemiplegia

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("hemiparesis")),
                          "hemiplegia"
         ))

Hearing loss

gwas_study_info  = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("age-related hearing impairment"),
                          "presbycusis"
         ))

gwas_study_info  = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "deafness",
                            "noise-induced hearing loss")
                            ),
                          "hearing loss"
         ))

Hereditary hemochromatosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("hereditary hemochromatosis type 1"),
                          "hereditary hemochromatosis"
         ))

Hypercholesterolemia

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "familial hypercholesterolemia"
                            )),
                          "hypercholesterolemia"
         ))

Hyperlipidemia

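# NOTE: "familial hypercholesterolemia" was already rewritten to
# "hypercholesterolemia" by the previous chunk, so this call likely matches
# nothing; the intended source term for "hyperlipidemia" may differ.
gwas_study_info = gwas_study_info |>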
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "familial hypercholesterolemia"
                            )),
                          "hyperlipidemia"
         ))

HIV disease resulting in encephalopathy

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                                               "aids dementia",
                                               "hiv-associated neurocognitive disorder")
                                              ),
                          "hiv disease resulting in encephalopathy"
         ))

Inherited retinal dystrophy

url <-  "http://www.ebi.ac.uk/ols4/api/ontologies/mondo/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMONDO_0019118/descendants"

inherited_retinal_dystrophy_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 341
[1] "\n Some example terms"
[1] "spondyloepiphyseal dysplasia, sensorineural hearing loss, impaired intellectual development, and leber congenital amaurosis"
[2] "x-linked intellectual disability-limb spasticity-retinal dystrophy-diabetes insipidus syndrome"                             
[3] "microcephaly with or without chorioretinopathy, lymphedema, or intellectual disability"                                     
[4] "retinal vasculopathy with cerebral leukoencephalopathy and systemic manifestations"                                         
[5] "retinal dystrophy with inner retinal dysfunction and ganglion cell anomalies"                                               
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(inherited_retinal_dystrophy_terms),
                          "hereditary retinal dystrophy"
         ))

Intentional self-harm by unspecified means

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("self-injurious behavior",
                                                "self-injurious ideation")),
                          "intentional self-harm by unspecified means"
         ))

Infertility (male)

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
                  stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("azoospermia",
                              "sertoli cell-only syndrome")
                            ),
                          "male infertility"
                          )  )

Induratio penis plastica

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
                  stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("peyronie disease")
                            ),
                          "induratio penis plastica"
                          )  )

Intracerebral hemorrhage

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
                  stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("non-lobar intracerebral hemorrhage",
                              "lobar intracerebral hemorrhage")
                            ),
                          "intracerebral hemorrhage"
                          )  )

Idiopathic generalized epilepsy

url <- "http://www.ebi.ac.uk/ols4/api/ontologies/snomed/terms/http%253A%252F%252Fsnomed.info%252Fid%252F36803009/descendants"

idiopathic_generalized_epilepsy_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 4
[1] "\n Some example terms"
[1] "epilepsy with generalized tonic-clonic seizures alone (disorder)"
[2] "juvenile myoclonic epilepsy"                                     
[3] "childhood absence epilepsy"                                      
[4] "juvenile absence epilepsy"                                       
[5] NA                                                                
idiopathic_generalized_epilepsy_terms = stringr::str_remove_all(
                                        idiopathic_generalized_epilepsy_terms, 
                                        " \\(disorder\\)$"
                                        )

idiopathic_generalized_epilepsy_terms = c("epilepsy with generalized tonic-clonic seizures",
                                          idiopathic_generalized_epilepsy_terms
                                         )


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(idiopathic_generalized_epilepsy_terms),
                          "generalized idiopathic epilepsy and epileptic syndromes"
         ))
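
One thing to check in the chunk above is term order: the shorter "epilepsy with generalized tonic-clonic seizures" is placed before the longer "... alone" variant. Perl-style and ICU regex alternation tries alternatives left to right, so a shorter term listed first can win and leave a stray suffix. The sketch below illustrates this under the assumption that the pattern is a plain `|`-joined alternation. The real `vec_to_grep_pattern()` may anchor or sort terms itself, and `sort_longest_first()` is a hypothetical stand-in for `str_length_sort()`.

```r
# Hypothetical stand-in for str_length_sort() (an assumption about its behaviour):
# order terms from longest to shortest.
sort_longest_first <- function(terms) {
  terms[order(nchar(terms), decreasing = TRUE)]
}

terms <- c("epilepsy with generalized tonic-clonic seizures",
           "epilepsy with generalized tonic-clonic seizures alone")
target <- "generalized idiopathic epilepsy and epileptic syndromes"
x <- "epilepsy with generalized tonic-clonic seizures alone"

# Plain alternation; perl = TRUE gives ordered (first alternative wins) matching
unsorted_pattern <- paste(terms, collapse = "|")
sorted_pattern   <- paste(sort_longest_first(terms), collapse = "|")

out_unsorted <- gsub(unsorted_pattern, target, x, perl = TRUE)
out_sorted   <- gsub(sorted_pattern, target, x, perl = TRUE)

print(out_unsorted)  # shorter term matched first, " alone" is left behind
print(out_sorted)    # longer term matched first, whole string replaced
```

If the project helper does behave like a plain alternation, piping the term vector through `str_length_sort()` before building the pattern, as done for the congenital heart disease terms, would avoid the leftover suffix.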

Idiopathic thrombocytopenic purpura

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("autoimmune thrombocytopenic purpura"),
                          "idiopathic thrombocytopenic purpura"
         ))

Juvenile idiopathic arthritis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("rheumatoid factor-negative juvenile idiopathic arthritis"),
                          "juvenile idiopathic arthritis"
         ))

Keratoconjunctivitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("keratoconjunctivitis sicca"),
                          "keratoconjunctivitis"
         ))

Learning disorder

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("dyslexia",
                                                "mathematics disorder",
                                                "disorder of written expression"
                                                )
                                              ),
                          "learning disorder"
         ))

Lewy body dementia

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("lewy body attribute"),
                          "lewy body dementia"
         ))

Rheumatoid arthritis

ra_terms <- c("acpa-positive rheumatoid arthritis",
               "acpa-negative rheumatoid arthritis",
              "adult-onset stills disease")

gwas_study_info = 
gwas_study_info |> 
 mutate(collected_all_disease_terms  = 
          stringr::str_replace_all(collected_all_disease_terms,
                                  pattern = vec_to_grep_pattern(ra_terms),
                                   "rheumatoid arthritis"
                          )  
        )

Migraine

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("migraine with aura",
                              "migraine without aura",
                              "migraine disorder"
                              )
                            ),
                          "migraine"
         ))

Multiple sclerosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("relapsing-remitting multiple sclerosis"),
                          "multiple sclerosis"
         ))

Myasthenia gravis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("late-onset myasthenia gravis"),
                          "myasthenia gravis"
         ))

narcolepsy-cataplexy syndrome -> Narcolepsy and cataplexy

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("narcolepsy-cataplexy syndrome"),
                          "narcolepsy and cataplexy"
         ))

Nephrolithiasis

nephro_terms <- c("uric acid nephrolithiasis",
                   "calcium phosphate nephrolithiasis",
                    "calcium oxalate nephrolithiasis",
                   "struvite nephrolithiasis")


gwas_study_info =
gwas_study_info |> 
 mutate(collected_all_disease_terms  = 
          stringr::str_replace_all(collected_all_disease_terms,
                                  pattern = vec_to_grep_pattern(nephro_terms),
                                   "nephrolithiasis"
                          )  
        )

Nephritis and nephropathy with pathological lesion

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("acute tubulointerstitial nephritis",
                                              "iga glomerulonephritis",
                                              "membranous glomerulonephritis",
                                              "lupus nephritis")
                                              ),
                          "nephritis and nephropathy with pathological lesion"
         ))

Neurofibromatosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("neurofibromatosis type 1",
                              "neurofibromatosis type 2")
                          ),
                          "neurofibromatosis"
         ))

Neuromyelitis optica

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("aquaporin-4 antibody positive neuromyelitis optica",
                              "aquaporin-4 antibody negative neuromyelitis optica",
                              "aqp4-igg-positive neuromyelitis optica",
                              "aqp4-igg-negative neuromyelitis optica"
                            )
                          ),
                          "neuromyelitis optica"
         ))

Noninflammatory disorders of vagina

non_inflam_terms <- c("abnormal vaginal discharge itching",
                      "abnormal vaginal discharge smell",
                      "vaginal discharge")

gwas_study_info = 
  gwas_study_info |>
   mutate(collected_all_disease_terms  = 
          stringr::str_replace_all(collected_all_disease_terms,
                                  pattern = 
                                    vec_to_grep_pattern(
                                      non_inflam_terms
                                      ),
                                   "noninflammatory disorders of vagina"
                          )  
        )

Obesity

gwas_study_info = 
gwas_study_info |> 
 mutate(collected_all_disease_terms  = 
          stringr::str_replace_all(collected_all_disease_terms,
                                  pattern = 
                                    vec_to_grep_pattern(
                                      c("morbid obesity",
                                         "metabolically healthy obesity"
                                        )
                                      ),
                                   "obesity"
                          )  
        )

Obsessive-compulsive disorder

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("obsessive-compulsive trait",
                                                "obsessive-compulsive")
                                              ),
                          "obsessive-compulsive disorder"
         ))

Osteonecrosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("idiopathic osteonecrosis of the femoral head"),
                          "osteonecrosis"
         ))

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("osteoradionecrosis"),
                          "osteonecrosis"
         ))

Other epilepsy

gwas_study_info =
  gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("mesial temporal lobe epilepsy with hippocampal sclerosis",
                              "rolandic epilepsy")
                            ),
                          "epilepsy"
         )) 

Chromosomal anomalies

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c(
                            "22q11.2 deletion syndrome",
                            "fragile x syndrome")
                            ),
                          "chromosomal anomalies"
         ))

Opioid dependence

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("heroin dependence",
                                              "opioid use disorder")),
                          "opioid dependence"
         ))

Other and unspecified cirrhosis of liver

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("hepatitis c induced liver cirrhosis",
                                                "biliary liver cirrhosis")
                                              ),
                          "other and unspecified cirrhosis of liver"
         ))

Other cardiac conduction disorders

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("familial long qt syndrome")
                                              ),
                          "other cardiac conduction disorders"
         ))

Other cerebral degenerations

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("brain atrophy"
                                                )
                                              ),
                          "other cerebral degenerations"
         ))

Other specified degenerative diseases of nervous system

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("corticobasal degeneration disorder"
                                                )
                                              ),
                          "other specified degenerative diseases of nervous system"
         ))

# progressive supranuclear palsy -> Dementia with cerebral degenerations

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("progressive supranuclear palsy"),
                          "dementia with cerebral degenerations"
         ))

Other chronic nonalcoholic liver disease

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("non-alcoholic fatty liver disease"
                                                )),
                          "other chronic nonalcoholic liver disease"
         ))

Other disorders of bone and cartilage

other_bone_cartilage_terms <- c("tietze syndrome"
                               )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(other_bone_cartilage_terms),
                          "other disorders of bone and cartilage"
         )
  )

Other eating disorders

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("binge eating")
                                              ),
                          "other eating disorders"
         ))

Other specified inflammatory liver diseases

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("primary biliary cholangitis",
                                                "primary sclerosing cholangitis"
                                                )
                                              ),
                          "non-alcoholic steatohepatitis"
         ))

Other haemoglobinopathies

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("hemoglobin e disease"),
                          "other haemoglobinopathies"
         ))

Other paralytic syndromes

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("paraplegia",
                              "quadriplegia")
                            ),
                          "other paralytic syndromes"
         ))

Parkinsons disease

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("young adult-onset parkinsonism"),
                          "parkinsons disease"
         ))

Perinatal jaundice

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("neonatal jaundice"),
                          "perinatal jaundice"
         ))

Periodontitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("aggressive periodontitis"),
                          "periodontitis"
         ))

Phlebitis and thrombophlebitis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("thrombophlebitis")),
                          "phlebitis and thrombophlebitis"
         ))

Phobias

# social anxiety disorder -> social phobias
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("social anxiety disorder"),
                          "social phobias"
         ))

# specific phobia -> Specific \\(isolated\\) phobias

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("specific phobia"),
                          "specific \\(isolated\\) phobias"
         ))

Primary ovarian failure

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("primary ovarian insufficiency"),
                          "primary ovarian failure"
         ))

Premature menopause and other ovarian failure

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("premature menopause"),
                          "premature menopause and other ovarian failure"
         ))

Primary hyperaldosteronism

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("primary aldosteronism"),
                          "primary hyperaldosteronism"
         ))

Proteinuria

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("albuminuria",
                                                "moderate albuminuria")
                                              ),
                          "proteinuria"
         ))

Prurigo

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("prurigo"),
                          "other prurigo"
         ))

Precocious puberty

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("central precocious puberty"),
                          "precocious puberty"
         ))

Psoriasis

psoriasis_terms <- c("cutaneous psoriasis")


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(psoriasis_terms),
                          "psoriasis")
  )

Psychosis (psychotic, psychotic symptoms)

psychosis_terms <- c("psychotic",
                     "psychotic symptoms"
                     )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(psychosis_terms),
                          "psychosis")
  )

Pulmonary fibrosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("idiopathic pulmonary fibrosis"),
                          "pulmonary fibrosis"
         ))

Retinoschisis and retinal cysts

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("retinoschisis"),
                          "retinoschisis and retinal cysts"
         ))

Separation of retinal layers

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("central serous retinopathy",
                                              "chronic central serous retinopathy")
                                              ),
                          "separation of retinal layers"
         ))

Rash and other nonspecific skin eruption

rash_terms <- c(
                "maculopapular eruption"
                )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(rash_terms),
                          "rash and other nonspecific skin eruption"
         )
  )

Rhinitis

rhinitis_terms <- c("non-allergic rhinitis")

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(rhinitis_terms),
                          "rhinitis"
         )
  )

Sciatica

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("ldh-related sciatica")
                                              ),
                          "sciatica"
         )) 

Schizoaffective disorder

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("schizoaffective disorder-bipolar type"),
                          "schizoaffective disorder")
         )

Schizophrenia

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("treatment refractory schizophrenia"),
                          "schizophrenia")
         )

Scoliosis

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("adolescent idiopathic scoliosis"),
                          "scoliosis")
         )

Sleep apnea

sleep_apnea_terms <- c("sleep apnea during non-rem sleep",
                      "sleep apnea during rem sleep",
                      "obstructive sleep apnea")

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(sleep_apnea_terms),
                          "sleep apnea")
  )

Sleep disorders

sleep_disorder_terms <- c("sleepiness",
                          "somnambulism",
                          "rem sleep behavior disorder"
                         )


gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(sleep_disorder_terms),
                          "sleep disorders"
         )
  )

Speech and language disorder

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("specific language impairment"
                                                )
                                              ),
                          "speech and language disorder"
         ))

Staphylococcus infections

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(
                            c("staphylococcus aureus infection",
                              "skin and soft tissue staphylococcus aureus infection",
                              "methicillin-resistant staphylococcus aureus infection")),
                          "staphylococcus infections"
         ))

Strabismus

url <-  "http://www.ebi.ac.uk/ols4/api/ontologies/doid/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FDOID_540/descendants"

strabismus_terms <- get_descendants(url)

[1] "Number of terms collected:"
[1] 26
[1] "\n Some example terms"
[1] "abnormal retinal correspondence" "brown's tendon sheath syndrome" 
[3] "internuclear ophthalmoplegia"    "duane retraction syndrome 3"    
[5] "duane retraction syndrome 2"    

strabismus_terms = c("non-accomodative esotropia",
                     strabismus_terms)

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(strabismus_terms),
                          "strabismus"
         ))

Stroke

other_nonspec_stroke <- c("large artery stroke",
                         "small vessel stroke",
                         "stroke outcome",
                         "stroke disorder"
                         )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(other_nonspec_stroke),
                          "stroke"
         ))

Thyrotoxicosis with or without goiter

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("hyperthyroidism"),
                          "thyrotoxicosis with or without goiter"
         ))

Type 1 diabetes

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("latent autoimmune diabetes in adults"),
                          "type 1 diabetes mellitus"
         ))

Type 2 diabetes with ophthalmic manifestations

type_2_eye_terms <- c("diabetes mellitus type 2 associated cataract",
                      "diabetic maculopathy, type 2 diabetes mellitus",
                      "diabetic macular edema, type 2 diabetes mellitus",
                      "proliferative diabetic retinopathy, type 2 diabetes mellitus"
                     )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(type_2_eye_terms),
                          "type 2 diabetes with ophthalmic manifestations"
         ))

# for 30487263, all discovery samples are type 2 diabetes - so 
# replace proliferative diabetic retinopathy with type 2 diabetes with ophthalmic manifestations
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(PUBMED_ID == 30487263,
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("proliferative diabetic retinopathy"),
                          "type 2 diabetes with ophthalmic manifestations"
         ),
         collected_all_disease_terms
         )
  )

# for pubmed id: 31482010
# all samples are type 2 diabetes
# so replace diabetic retinopathy with type 2 diabetes with ophthalmic manifestations
gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         ifelse(PUBMED_ID == 31482010,
                stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern("diabetic retinopathy"),
                          "type 2 diabetes with ophthalmic manifestations"
         ),
         collected_all_disease_terms
         )
)
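
The two study-specific replacements above differ only in the PubMed ID and the term being replaced. A minimal sketch of the same conditional recode as a reusable function follows. `recode_for_studies()` and the toy data are hypothetical, and `fixed()` performs a plain substring match here rather than the whole-term match that vec_to_grep_pattern() presumably provides.

```r
library(dplyr)
library(stringr)

# Hypothetical helper: apply a replacement only to rows from given studies.
recode_for_studies <- function(df, ids, term, replacement) {
  df |>
    mutate(collected_all_disease_terms = if_else(
      PUBMED_ID %in% ids,
      # Literal (non-regex) substring replacement for the selected rows
      str_replace_all(collected_all_disease_terms, fixed(term), replacement),
      # All other rows keep their terms unchanged
      collected_all_disease_terms
    ))
}

# Toy data; these PubMed IDs are made up for illustration
toy <- data.frame(
  PUBMED_ID = c(111, 222, 333),
  collected_all_disease_terms = rep("proliferative diabetic retinopathy", 3),
  stringsAsFactors = FALSE
)

recoded <- recode_for_studies(
  toy,
  ids = c(111, 222),
  term = "proliferative diabetic retinopathy",
  replacement = "type 2 diabetes with ophthalmic manifestations"
)
print(recoded)
```

Passing several IDs at once keeps each study-level decision in one place, although separate blocks with their own comments, as used above, document the reasoning per study more explicitly.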

Treatment resistant

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_remove_all(collected_all_disease_terms,
                       "^treatment-resistant |^treatment resistant |^treatment-resistant "
         ))
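
Note that `^` anchors the pattern to the start of the whole string, so in a comma-separated list only the first term can lose its prefix. The sketch below contrasts that behaviour with a per-term variant on made-up strings. Whether later terms should also be stripped is an assumption, not something the analysis states.

```r
library(stringr)

# Toy strings, made up for illustration
x <- c("treatment-resistant depression",
       "asthma, treatment resistant schizophrenia")

# Start-of-string anchor: only a leading prefix is removed
start_only <- str_remove_all(x, "^treatment[- ]resistant ")

# Per-term variant: capture the delimiter (start of string or ", ")
# and put it back, so the prefix is removed from every term
per_term <- str_replace_all(x, "(^|, )treatment[- ]resistant ", "\\1")

print(start_only)
print(per_term)
```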

Unspecified condition associated with female genital organs and menstrual cycle

gwas_study_info = 
  gwas_study_info |>
  mutate(collected_all_disease_terms = 
    ifelse(grepl("Menstruation", 
                `DISEASE/TRAIT`,
                 ignore.case = TRUE),
           str_replace_all(collected_all_disease_terms,
                           pattern = vec_to_grep_pattern("decreased attention"),
                           "unspecified condition associated with female genital organs and menstrual cycle"),
           collected_all_disease_terms
    )
    )

Uveitis

uveitis_terms <- c("anterior uveitis",
                   "iritis",
                   "vogt-koyanagi-harada disease",
                   "birdshot chorioretinopathy",
                   "multifocal choroiditis"
                   )

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          pattern = vec_to_grep_pattern(uveitis_terms),
                          "uveitis"
         ))

Other specified retinal disorders

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("retinal edema"
                                              )),
                          "other specified retinal disorders"
         ))

Other disorders of eyelids

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("dermatochalasis",
                                                "filarial elephantiasis"
                                                )
                                              ),
                          "other disorders of eyelids"
         ))

Other disorders of iris and ciliary body

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("iris disorder"
                                                )
                                              ),
                          "other disorders of iris and ciliary body"
         ))

Background retinopathy and retinal vascular changes

gwas_study_info = gwas_study_info |>
  mutate(collected_all_disease_terms  = 
         stringr::str_replace_all(collected_all_disease_terms,
                          vec_to_grep_pattern(c("macular telangiectasia type 2"
                                                )
                                              ),
                          "background retinopathy and retinal vascular changes"
         ))

Save

How many unique traits are there now?

diseases <- stringr::str_split(pattern = ", ",
                               gwas_study_info$collected_all_disease_terms[gwas_study_info$collected_all_disease_terms != ""])  |>
  unlist() |>
  stringr::str_trim()

diseases <- unique(diseases)

print(length(diseases))

[1] 1708

fwrite(
  gwas_study_info,
  here::here("output/gwas_cat/gwas_study_info_group.csv")
  )
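
The count of unique traits can be checked on a small example. The sketch below applies the same split, trim and unique steps to toy strings, and adds a frequency table as one possible way to see which grouped terms dominate; the table is an addition for illustration and is not part of the analysis.

```r
library(stringr)

# Toy stand-in for gwas_study_info$collected_all_disease_terms
toy_terms <- c("asthma, obesity", "", "obesity, stroke")

# Drop empty strings, split on ", ", flatten and trim whitespace
terms <- toy_terms[toy_terms != ""] |>
  str_split(pattern = ", ") |>
  unlist() |>
  str_trim()

n_unique <- length(unique(terms))

# Number of occurrences per term, most frequent first
term_counts <- sort(table(terms), decreasing = TRUE)

print(n_unique)
print(term_counts)
```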

sessionInfo()

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] jsonlite_2.0.0    httr_1.4.7        data.table_1.17.8 stringr_1.5.1    
[5] dplyr_1.1.4       workflowr_1.7.1  

loaded via a namespace (and not attached):
 [1] compiler_4.3.1    renv_1.0.3        promises_1.3.3    tidyselect_1.2.1 
 [5] Rcpp_1.1.0        git2r_0.36.2      callr_3.7.6       later_1.4.2      
 [9] jquerylib_0.1.4   yaml_2.3.10       fastmap_1.2.0     here_1.0.1       
[13] R6_2.6.1          generics_0.1.4    curl_6.4.0        knitr_1.50       
[17] tibble_3.3.0      rprojroot_2.1.0   bslib_0.9.0       pillar_1.11.0    
[21] rlang_1.1.6       cachem_1.1.0      stringi_1.8.7     httpuv_1.6.16    
[25] xfun_0.52         getPass_0.2-4     fs_1.6.6          sass_0.4.10      
[29] cli_3.6.5         withr_3.0.2       magrittr_2.0.3    ps_1.9.1         
[33] digest_0.6.37     processx_3.8.6    rstudioapi_0.17.1 lifecycle_1.0.4  
[37] vctrs_0.6.5       evaluate_1.0.4    glue_1.8.0        whisker_0.4.1    
[41] rmarkdown_2.29    tools_4.3.1       pkgconfig_2.0.3   htmltools_0.5.8.1