Last updated: 2025-09-16
Knit directory: genomics_ancest_disease_dispar/
This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20220216) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
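As a small illustration of why the seed matters, the sketch below resets the same seed before two random draws and confirms that they match. It is not part of the original analysis: the seed value is the one reported above, while the sampled range and sample size are arbitrary assumptions.

```r
# Illustrative only: the sampled range and size are arbitrary choices for this sketch
set.seed(20220216)
first_draw <- sample(1:100, size = 5)

set.seed(20220216)
second_draw <- sample(1:100, size = 5)

# Same seed followed by the same call: the two draws are identical
identical(first_draw, second_draw)
```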
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 02a0b9d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rproj.user/
Ignored: data/.DS_Store
Ignored: data/gbd/.DS_Store
Ignored: data/gbd/ihme_gbd_2019_global_disease_burden_rate_all_ages.csv
Ignored: data/gbd/ihme_gbd_2019_global_paf_rate_percent_all_ages.csv
Ignored: data/gbd/ihme_gbd_2021_global_disease_burden_rate_all_ages.csv
Ignored: data/gbd/ihme_gbd_2021_global_paf_rate_percent_all_ages.csv
Ignored: data/gwas_catalog/
Ignored: data/who/
Ignored: output/gwas_cat/
Ignored: output/gwas_study_info_cohort_corrected.csv
Ignored: output/gwas_study_info_trait_corrected.csv
Ignored: output/gwas_study_info_trait_ontology_info.csv
Ignored: output/gwas_study_info_trait_ontology_info_l1.csv
Ignored: output/gwas_study_info_trait_ontology_info_l2.csv
Ignored: output/trait_ontology/
Ignored: renv/
Unstaged changes:
Modified: code/get_term_descendants.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/level_2_disease_group.Rmd) and HTML (docs/level_2_disease_group.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 02a0b9d | IJbeasley | 2025-09-16 | Improving cancer grouping |
html | 6f66696 | IJbeasley | 2025-09-16 | Build site. |
Rmd | 66cff1c | IJbeasley | 2025-09-16 | Even more disease term grouping |
html | 21b6c02 | IJbeasley | 2025-09-15 | Build site. |
html | 5ec3111 | IJbeasley | 2025-09-15 | Build site. |
html | 30d773e | IJbeasley | 2025-09-15 | Build site. |
html | 8d64a38 | IJbeasley | 2025-09-15 | Build site. |
Rmd | b3088d8 | IJbeasley | 2025-09-15 | workflowr::wflow_publish("analysis/level_2_disease_group.Rmd") |
html | b89d661 | IJbeasley | 2025-09-10 | Build site. |
Rmd | c0fcab7 | IJbeasley | 2025-09-10 | workflowr::wflow_publish("analysis/level_2_disease_group.Rmd") |
html | ead4d8e | IJbeasley | 2025-09-10 | Build site. |
Rmd | 3964f77 | IJbeasley | 2025-09-10 | workflowr::wflow_publish("analysis/level_2_disease_group.Rmd") |
html | 8fb639d | IJbeasley | 2025-09-10 | Build site. |
Rmd | edeb6f5 | IJbeasley | 2025-09-10 | workflowr::wflow_publish("analysis/level_2_disease_group.Rmd") |
html | fe91704 | IJbeasley | 2025-09-09 | Build site. |
Rmd | 9c64867 | IJbeasley | 2025-09-09 | Minor fixing of disease trait categorisation |
html | fa509c0 | IJbeasley | 2025-09-08 | Build site. |
Rmd | c9602c7 | IJbeasley | 2025-09-08 | More grouping to match GBD |
library(dplyr)
library(data.table)
library(ggplot2)
library(stringr)
source(here::here("code/get_term_descendants.R"))
gwas_study_info <- fread(here::here("output/gwas_cat/gwas_study_info_group_l1_v2.csv"))
gwas_study_info = gwas_study_info |>
  mutate(l2_all_disease_terms = l1_all_disease_terms)
gwas_study_info |>
  filter(grepl("lip and oral cavity cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: oral cavity cancer
2: mouth neoplasm
3: tongue cancer
4: major salivary gland cancer
5: human papilloma virus infection, oral cavity cancer
6: tongue neoplasm
7: major salivary gland carcinoma
8: lip cancer
9: oral squamous cell carcinoma
l2_all_disease_terms
<char>
1: lip and oral cavity cancer
2: lip and oral cavity cancer
3: lip and oral cavity cancer
4: lip and oral cavity cancer
5: human papilloma virus infection, lip and oral cavity cancer
6: lip and oral cavity cancer
7: lip and oral cavity cancer
8: lip and oral cavity cancer
9: lip and oral cavity cancer
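The same filter, select, distinct check is repeated for each disease group on this page. One way to cut the repetition is a small helper, sketched below. The function name `show_l2_group` and the toy table are hypothetical and not part of the original analysis; `fixed = TRUE` is an added assumption so that the label is matched literally rather than as a regular expression.

```r
library(dplyr)

# Hypothetical helper: list the distinct original terms mapped to a group label
show_l2_group <- function(study_info, group_label) {
  study_info |>
    filter(grepl(group_label, l2_all_disease_terms, fixed = TRUE)) |>
    select(all_disease_terms, l2_all_disease_terms) |>
    distinct()
}

# Toy stand-in for gwas_study_info (term pairs taken from outputs on this page)
toy_study_info <- data.frame(
  all_disease_terms = c("lip cancer", "lip cancer", "tongue cancer", "gastric carcinoma"),
  l2_all_disease_terms = c("lip and oral cavity cancer", "lip and oral cavity cancer",
                           "lip and oral cavity cancer", "stomach cancer")
)

oral_rows <- show_l2_group(toy_study_info, "lip and oral cavity cancer")
oral_rows
```

Because `grepl()` matches substrings, a short label such as "cancer" would also return every more specific cancer label.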
gwas_study_info =
  gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = "nasopharyngeal cancer",
                                    "nasopharynx cancer"
           )
  )
gwas_study_info |>
  filter(grepl("nasopharynx cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: nasopharyngeal neoplasm nasopharynx cancer
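Several one-to-one renames of this form appear on this page (for example "nasopharyngeal cancer", "colorectal cancer" and "soft tissue sarcoma"). As a sketch of one possible consolidation, `stringr::str_replace_all()` also accepts a named character vector of `pattern = replacement` pairs. Collecting the renames in a single vector called `gbd_renames` is an assumption made for illustration, and the input is toy data rather than the real `gwas_study_info` column.

```r
library(stringr)

# Rename pairs taken from this page; gathering them in one vector is the assumption here
gbd_renames <- c(
  "nasopharyngeal cancer" = "nasopharynx cancer",
  "colorectal cancer"     = "colon and rectum cancer",
  "soft tissue sarcoma"   = "soft tissue and other extraosseous sarcomas"
)

# Toy input, not the real l2_all_disease_terms column
toy_terms <- c(
  "nasopharyngeal cancer",
  "colorectal cancer, skin disease",
  "sarcoma, soft tissue sarcoma"
)

# Each name is treated as a regular expression and replaced by its value
renamed_terms <- str_replace_all(toy_terms, gbd_renames)
renamed_terms
```

The pairs are applied one after another, so a later pattern can match text produced by an earlier replacement; the order of the vector matters when labels overlap.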
gwas_study_info |>
  filter(grepl("other pharynx cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: oropharynx cancer
2: laryngeal squamous cell carcinoma, hypopharynx cancer
3: human papilloma virus infection, oropharynx cancer
4: tonsil cancer
5: hypopharyngeal carcinoma
6: pharynx cancer, laryngeal carcinoma
l2_all_disease_terms
<char>
1: other pharynx cancer
2: larynx cancer, other pharynx cancer
3: human papilloma virus infection, other pharynx cancer
4: other pharynx cancer
5: other pharynx cancer
6: larynx cancer, other pharynx cancer
gwas_study_info |>
  filter(grepl("esophageal cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: esophageal adenocarcinoma, barretts esophagus
2: esophageal adenocarcinoma
3: esophageal adenocarcinoma, gastroesophageal reflux disease
4: esophageal squamous cell carcinoma
5: esophageal carcinoma, gastric carcinoma
6: squamous cell carcinoma, esophageal carcinoma
7: esophageal carcinoma
8: esophageal adenocarcinoma, digestive system disease, barretts esophagus
9: neoplasm of esophagus
10: esophageal cancer
l2_all_disease_terms
<char>
1: barretts esophagus, esophageal cancer
2: esophageal cancer
3: esophageal cancer, gastroesophageal reflux disease
4: esophageal cancer
5: esophageal cancer, stomach cancer
6: esophageal cancer
7: esophageal cancer
8: barretts esophagus, digestive system disease, esophageal cancer
9: esophageal cancer
10: esophageal cancer
gwas_study_info |>
  filter(grepl("stomach cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: gastric carcinoma
2: esophageal carcinoma, gastric carcinoma
3: gastric cardia carcinoma
4: gastric adenocarcinoma
5: lung carcinoma, squamous cell carcinoma, gastric carcinoma
6: gastric cancer
7: stomach neoplasm
8: gastric intestinal type adenocarcinoma
9: diffuse gastric adenocarcinoma
10: cardia cancer
l2_all_disease_terms
<char>
1: stomach cancer
2: esophageal cancer, stomach cancer
3: stomach cancer
4: stomach cancer
5: lung cancer, stomach cancer
6: stomach cancer
7: stomach cancer
8: stomach cancer
9: stomach cancer
10: stomach cancer
gwas_study_info = gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = "colorectal cancer",
                                    "colon and rectum cancer"
           )
  )
gwas_study_info |>
  filter(grepl("colon and rectum cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: colorectal cancer
2: sclerosing cholangitis, colorectal cancer
3: colorectal cancer, colorectal adenoma
4: metastatic colorectal cancer
5: lung carcinoma, estrogen-receptor negative breast cancer, ovarian endometrioid carcinoma, colorectal cancer, prostate carcinoma, ovarian serous carcinoma, breast carcinoma, ovarian carcinoma, lung adenocarcinoma, squamous cell lung carcinoma, cancer
6: rectum cancer
7: colonic neoplasm
8: colorectal adenocarcinoma
9: colon carcinoma
10: colorectal cancer, colorectal mucinous adenocarcinoma
11: colorectal carcinoma
12: colorectal cancer, peripheral neuropathy
13: colorectal cancer, stomatitis
14: colorectal cancer, neutropenia
15: colorectal cancer, hand-foot syndrome
16: colorectal cancer, exanthem
17: colorectal cancer, sleepiness
18: anal carcinoma
19: cecum cancer
20: sigmoid neoplasm
21: rectum cancer, colonic neoplasm
22: colorectal cancer, inflammatory bowel disease
23: colon carcinoma, sensory peripheral neuropathy
24: colon carcinoma, drug allergy
25: colorectal cancer, lung cancer
26: colorectal cancer, squamous cell lung carcinoma
27: colorectal cancer, skin disease
28: skin disease, colon carcinoma
29: age of onset of colorectal cancer
30: cecal neoplasm
31: colorectal cancer, breast carcinoma
32: metastatic colorectal cancer, disease progression measurement
33: polyp of large intestine, colorectal cancer
all_disease_terms
l2_all_disease_terms
<char>
1: colon and rectum cancer
2: colon and rectum cancer, sclerosing cholangitis
3: benign neoplasm, colon and rectum cancer
4: colon and rectum cancer
5: breast cancer, cancer, colon and rectum cancer, lung cancer, ovarian cancer, prostate cancer
6: colon and rectum cancer
7: colon and rectum cancer
8: colon and rectum cancer
9: colon and rectum cancer
10: colon and rectum cancer
11: colon and rectum cancer
12: colon and rectum cancer, peripheral neuropathy
13: colon and rectum cancer, stomatitis
14: colon and rectum cancer, neutropenia
15: colon and rectum cancer, hand-foot syndrome
16: colon and rectum cancer, exanthem
17: colon and rectum cancer, sleepiness
18: colon and rectum cancer
19: colon and rectum cancer
20: colon and rectum cancer
21: colon and rectum cancer
22: colon and rectum cancer, inflammatory bowel disease
23: colon and rectum cancer, sensory peripheral neuropathy
24: colon and rectum cancer, drug allergy
25: colon and rectum cancer, lung cancer
26: colon and rectum cancer, lung cancer
27: colon and rectum cancer, skin disease
28: colon and rectum cancer, skin disease
29: colon and rectum cancer
30: colon and rectum cancer
31: breast cancer, colon and rectum cancer
32: colon and rectum cancer
33: benign neoplasm, colon and rectum cancer
l2_all_disease_terms
gwas_study_info |>
  filter(grepl("liver cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: hepatitis b virus infection, hepatocellular carcinoma
2: sclerosing cholangitis, cholangiocarcinoma
3: cholangiocarcinoma, sclerosing cholangitis
4: sclerosing cholangitis, hepatocellular carcinoma
5: hepatitis c virus infection, hepatocellular carcinoma
6: hepatocellular carcinoma, non-alcoholic steatohepatitis
7: hepatocellular carcinoma
8: biliary tract cancer
9: liver neoplasm
10: intrahepatic bile duct cancer, liver cancer
11: liver cancer
12: liver cancer, bile duct cancer
13: bile duct cancer
14: extrahepatic bile duct carcinoma
15: intrahepatic cholangiocarcinoma
16: alcohol-related disorders, hepatocellular carcinoma
17: hepatitis virus-related hepatocellular carcinoma
18: alcoholic liver cirrhosis, hepatocellular carcinoma
19: cholangiocarcinoma
l2_all_disease_terms
<char>
1: hepatitis b infection, liver cancer
2: liver cancer, sclerosing cholangitis
3: liver cancer, sclerosing cholangitis
4: liver cancer, sclerosing cholangitis
5: hepatitis c infection, liver cancer
6: liver cancer, non-alcoholic fatty liver disease
7: liver cancer
8: liver cancer
9: liver cancer
10: liver cancer
11: liver cancer
12: liver cancer
13: liver cancer
14: liver cancer
15: liver cancer
16: alcohol-related disorders, liver cancer
17: liver cancer
18: alcoholic liver disease, liver cancer
19: liver cancer
gwas_study_info |>
  filter(grepl("gallbladder and biliary tract cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: sclerosing cholangitis, gallbladder neoplasm
2: gallbladder neoplasm
l2_all_disease_terms
<char>
1: gallbladder and biliary tract cancer, sclerosing cholangitis
2: gallbladder and biliary tract cancer
gwas_study_info |>
  filter(grepl("pancreatic cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: pancreatic carcinoma
2: pancreatic ductal adenocarcinoma
3: pancreatic carcinoma, neutropenia
4: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, hypertension
5: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, proteinuria
6: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, hypertension, proteinuria
l2_all_disease_terms
<char>
1: pancreatic cancer
2: pancreatic cancer
3: neutropenia, pancreatic cancer
4: breast cancer, hypertension, pancreatic cancer, prostate cancer
5: breast cancer, pancreatic cancer, prostate cancer, proteinuria
6: breast cancer, hypertension, pancreatic cancer, prostate cancer, proteinuria
gwas_study_info |>
  filter(grepl("larynx cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: laryngeal squamous cell carcinoma
2: laryngeal squamous cell carcinoma, hypopharynx cancer
3: laryngeal carcinoma
4: laryngeal neoplasm
5: glottis neoplasm
6: pharynx cancer, laryngeal carcinoma
l2_all_disease_terms
<char>
1: larynx cancer
2: larynx cancer, other pharynx cancer
3: larynx cancer
4: larynx cancer
5: larynx cancer
6: larynx cancer, other pharynx cancer
resp_cancer_terms = c("lung cancer",
                      "bronchus cancer",
                      "tracheal cancer",
                      "respiratory system cancer"
)
# Build one alternation from the terms. paste0() places the separator only
# between terms, so the first term gets no leading \\b and the last term gets
# no trailing lookahead. \\b also matches before "lung cancer" inside a longer term.
gwas_study_info =
  gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = paste0(resp_cancer_terms, collapse = "(?=,|$)|\\b"),
                                    "tracheal bronchus and lung cancer"
           )
  )
gwas_study_info |>
  filter(grepl("tracheal bronchus and lung cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: non-small cell lung carcinoma
2: lung adenocarcinoma
3: squamous cell lung carcinoma
4: lung carcinoma, family history of lung cancer
5: lung adenocarcinoma, family history of lung cancer
6: squamous cell lung carcinoma, family history of lung cancer
7: lung carcinoma
8: small cell lung carcinoma
9: lung carcinoma, schizophrenia
10: lung carcinoma, squamous cell carcinoma, lung adenocarcinoma
11: lung carcinoma, estrogen-receptor negative breast cancer, ovarian endometrioid carcinoma, colorectal cancer, prostate carcinoma, ovarian serous carcinoma, breast carcinoma, ovarian carcinoma, lung adenocarcinoma, squamous cell lung carcinoma, cancer
12: non-small cell lung carcinoma, drug-induced liver injury
13: lung carcinoma, chronic obstructive pulmonary disease
14: lung carcinoma, squamous cell carcinoma, gastric carcinoma
15: lung cancer
16: respiratory system cancer
17: bronchus cancer
18: lung cancer, bronchus cancer
19: family history of lung cancer
20: breast cancer, lung cancer
21: head and neck carcinoma, lung cancer
22: small cell carcinoma
23: colorectal cancer, lung cancer
24: lung cancer, gastroesophageal reflux disease
25: peptic ulcer disease, lung cancer
26: colorectal cancer, squamous cell lung carcinoma
27: peptic ulcer disease, squamous cell lung carcinoma
28: non-small cell lung carcinoma, disease progression measurement
29: lung neoplasm
30: cancer
31: lung cancer, radiation-induced disorder
all_disease_terms
l2_all_disease_terms
<char>
1: tracheal bronchus and lung cancer
2: tracheal bronchus and lung cancer
3: tracheal bronchus and lung cancer
4: tracheal bronchus and lung cancer
5: tracheal bronchus and lung cancer
6: tracheal bronchus and lung cancer
7: tracheal bronchus and lung cancer
8: tracheal bronchus and lung cancer
9: tracheal bronchus and lung cancer, schizophrenia
10: tracheal bronchus and lung cancer
11: breast cancer, cancer, colon and rectum cancer, tracheal bronchus and lung cancer, ovarian cancer, prostate cancer
12: drug-induced liver injury, tracheal bronchus and lung cancer
13: chronic obstructive pulmonary disease, tracheal bronchus and lung cancer
14: tracheal bronchus and lung cancer, stomach cancer
15: tracheal bronchus and lung cancer
16: tracheal bronchus and lung cancer
17: tracheal bronchus and lung cancer
18: tracheal bronchus and lung cancer, tracheal bronchus and lung cancer
19: tracheal bronchus and lung cancer
20: breast cancer, tracheal bronchus and lung cancer
21: head and neck cancer, tracheal bronchus and lung cancer
22: tracheal bronchus and lung cancer
23: colon and rectum cancer, tracheal bronchus and lung cancer
24: gastroesophageal reflux disease, tracheal bronchus and lung cancer
25: tracheal bronchus and lung cancer, peptic ulcer disease
26: colon and rectum cancer, tracheal bronchus and lung cancer
27: tracheal bronchus and lung cancer, peptic ulcer disease
28: tracheal bronchus and lung cancer
29: tracheal bronchus and lung cancer
30: tracheal bronchus and tracheal bronchus and lung cancer
31: tracheal bronchus and lung cancer, radiation-induced disorder
l2_all_disease_terms
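Rows 18 and 30 of the output above contain a repeated label and a label nested inside itself. Substring replacement can produce both: when two terms in one entry map to the same label, and when a term is replaced inside a longer term. A minimal sketch of an alternative, assuming the terms within each entry are separated by a comma, is to split each entry, replace only whole terms, and drop duplicates. The helper name `replace_whole_terms` and the toy input are hypothetical, not part of the original analysis.

```r
# Hypothetical helper: replace whole terms only, then de-duplicate within each entry.
# NA entries are not handled in this sketch.
replace_whole_terms <- function(entries, from, to) {
  vapply(
    strsplit(entries, ",\\s*"),
    function(terms) {
      terms[terms %in% from] <- to           # exact term matches only
      paste(unique(terms), collapse = ", ")  # keep order, drop repeats
    },
    character(1)
  )
}

resp_cancer_terms <- c("lung cancer", "bronchus cancer",
                       "tracheal cancer", "respiratory system cancer")

# Toy entries, not the real l2_all_disease_terms column
toy_entries <- c(
  "lung cancer, bronchus cancer",
  "tracheal bronchus and lung cancer",
  "breast cancer, lung cancer"
)

cleaned_entries <- replace_whole_terms(
  toy_entries, resp_cancer_terms, "tracheal bronchus and lung cancer"
)
cleaned_entries
```

Splitting on the separator avoids regular-expression anchoring altogether, at the cost of assuming that no individual term contains a comma.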
gwas_study_info =
  gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = "malignant melanoma of skin",
                                    "malignant skin melanoma"
           )
  )
gwas_study_info |>
  filter(grepl("malignant skin melanoma", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: cutaneous melanoma malignant skin melanoma
2: melanoma malignant skin melanoma
3: neuroblastoma, cutaneous melanoma malignant skin melanoma, neuroblastoma
4: non-melanoma skin carcinoma non-malignant skin melanoma skin cancer
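Row 4 above shows a label ("non-malignant skin melanoma skin cancer") that does not look like an intended group name. A quick way to surface such labels is to split every entry into its terms and compare them with the labels one expects. The sketch below does this on toy data; the `expected_labels` vector is a hypothetical allow-list, not something defined in the original analysis.

```r
# Toy entries copied from the output above
toy_l2_terms <- c(
  "malignant skin melanoma",
  "malignant skin melanoma, neuroblastoma",
  "non-malignant skin melanoma skin cancer"
)

# Hypothetical allow-list of labels considered valid for this sketch
expected_labels <- c("malignant skin melanoma", "neuroblastoma",
                     "non-melanoma skin cancer")

# Split each entry on commas, trim whitespace, and keep the distinct terms
observed_labels <- unique(trimws(unlist(strsplit(toy_l2_terms, ","))))

# Terms present in the data but missing from the allow-list
unexpected_labels <- setdiff(observed_labels, expected_labels)
unexpected_labels
```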
gwas_study_info |>
  filter(grepl("non-melanoma skin cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: squamous cell carcinoma, basal cell carcinoma non-melanoma skin cancer
2: keratinocyte carcinoma non-melanoma skin cancer
3: basal cell carcinoma non-melanoma skin cancer
4: non-melanoma skin carcinoma non-melanoma skin cancer
5: skin neoplasm non-melanoma skin cancer
6: skin carcinoma in situ non-melanoma skin cancer
gwas_study_info =
  gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = "soft tissue sarcoma",
                                    "soft tissue and other extraosseous sarcomas"
           )
  )
gwas_study_info |>
  filter(grepl("soft tissue and other extraosseous sarcomas", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: sarcoma, fibrosarcoma sarcoma, soft tissue and other extraosseous sarcomas
2: kaposis sarcoma soft tissue and other extraosseous sarcomas
gwas_study_info =
  gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = "bone cancer|osteosarcoma",
                                    "malignant neoplasm of bone and articular cartilage"
           )
  )
gwas_study_info |>
  filter(grepl("malignant neoplasm of bone and articular cartilage", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: osteosarcoma
2: acute myeloid leukemia
3: myeloid leukemia
4: malignant bone neoplasm
5: acute myeloid leukemia, myelodysplastic syndrome
6: bone neoplasm
7: myelofibrosis
8: acute lymphoblastic leukemia, acute myeloid leukemia, myelodysplastic syndrome
l2_all_disease_terms
<char>
1: malignant neoplasm of bone and articular cartilage
2: malignant neoplasm of bone and articular cartilage
3: malignant neoplasm of bone and articular cartilage
4: malignant neoplasm of bone and articular cartilage
5: malignant neoplasm of bone and articular cartilage, myelodysplastic syndrome
6: malignant neoplasm of bone and articular cartilage
7: malignant neoplasm of bone and articular cartilage
8: malignant neoplasm of bone and articular cartilage, leukemia, myelodysplastic syndrome
gwas_study_info |>
  filter(grepl("breast cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: estrogen-receptor negative breast cancer
2: breast carcinoma
3: estrogen-receptor positive breast cancer
4: breast carcinoma,
5: estrogen-receptor positive breast cancer, breast carcinoma
6: estrogen-receptor negative breast cancer, breast carcinoma
7: breast carcinoma, peripheral neuropathy
8: prostate carcinoma, breast carcinoma, ovarian carcinoma
9: male breast carcinoma
10: invasive lobular carcinoma
11: lung carcinoma, estrogen-receptor negative breast cancer, ovarian endometrioid carcinoma, colorectal cancer, prostate carcinoma, ovarian serous carcinoma, breast carcinoma, ovarian carcinoma, lung adenocarcinoma, squamous cell lung carcinoma, cancer
12: tp53 positive breast carcinoma
13: breast carcinoma, congestive heart failure
14: triple-negative breast cancer
15: breast carcinoma, chemotherapy-induced hypertension
16: estrogen-receptor negative breast cancer, estrogen-receptor positive breast cancer, breast carcinoma
17: childhood cancer, breast carcinoma
18: breast cancer
19: luminal a breast carcinoma
20: luminal b breast carcinoma
21: basal-like breast carcinoma
22: her2 positive breast carcinoma
23: breast carcinoma, chemotherapy-induced alopecia
24: progesterone-receptor negative breast cancer
25: her2 negative breast carcinoma
26: breast carcinoma, amenorrhea
27: breast carcinoma, post operative nausea and vomiting
28: breast cancer, covid-19
29: estrogen-receptor positive breast cancer, breast carcinoma, her2 negative breast carcinoma, progesterone-receptor positive breast cancer
30: her2 positive breast carcinoma, estrogen-receptor positive breast cancer, breast carcinoma, progesterone-receptor positive breast cancer
31: progesterone-receptor negative breast cancer, estrogen-receptor negative breast cancer, her2 positive breast carcinoma, breast carcinoma
32: breast carcinoma, triple-negative breast cancer
33: her2 positive breast carcinoma, musculoskeletal system disease
34: lobular breast carcinoma in situ
35: breast carcinoma in situ
36: breast neoplasm
37: breast cancer, ovarian carcinoma
38: breast cancer, lung cancer
39: estrogen-receptor negative breast cancer, estrogen-receptor positive breast cancer
40: estrogen-receptor positive breast cancer, triple-negative breast cancer
41: her2 positive breast carcinoma, triple-negative breast cancer
42: breast carcinoma, cardiotoxicity
43: breast carcinoma, uterine leiomyoma
44: estrogen-receptor positive breast cancer, uterine leiomyoma
45: estrogen-receptor negative breast cancer, uterine leiomyoma
46: ductal breast carcinoma in situ and lobular carcinoma in situ
47: luminal b breast carcinoma, luminal a breast carcinoma
48: her2 positive breast carcinoma, luminal a breast carcinoma
49: triple-negative breast cancer, luminal a breast carcinoma
50: luminal b breast carcinoma, triple-negative breast cancer
51: luminal b breast carcinoma, her2 negative breast carcinoma, triple-negative breast cancer
52: luminal b breast carcinoma, her2 positive breast carcinoma
53: breast cancer, radiation-induced disorder
54: colorectal cancer, breast carcinoma
55: breast carcinoma, dermatological toxicity
56: breast carcinoma, edema
57: breast carcinoma, telangiectasia of the skin
58: breast carcinoma, lymphedema
59: luminal b breast carcinoma, her2 negative breast carcinoma
60: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, hypertension
61: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, proteinuria
62: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, hypertension, proteinuria
63: brcax breast cancer
64: schizophrenia, breast carcinoma
65: schizophrenia, estrogen-receptor positive breast cancer
66: estrogen-receptor negative breast cancer, schizophrenia
67: breast cancer, neutropenia, leukopenia
all_disease_terms
l2_all_disease_terms
<char>
1: breast cancer
2: breast cancer
3: breast cancer
4: breast cancer
5: breast cancer
6: breast cancer
7: breast cancer, peripheral neuropathy
8: breast cancer, ovarian cancer, prostate cancer
9: breast cancer
10: breast cancer
11: breast cancer, cancer, colon and rectum cancer, tracheal bronchus and lung cancer, ovarian cancer, prostate cancer
12: breast cancer
13: breast cancer, congestive heart failure
14: breast cancer
15: breast cancer, hypertension
16: breast cancer
17: breast cancer, childhood cancer
18: breast cancer
19: breast cancer
20: breast cancer
21: breast cancer
22: breast cancer
23: breast cancer, chemotherapy-induced alopecia
24: breast cancer
25: breast cancer
26: amenorrhea, breast cancer
27: breast cancer, post operative nausea and vomiting
28: breast cancer, covid-19
29: breast cancer
30: breast cancer
31: breast cancer
32: breast cancer
33: breast cancer, musculoskeletal system disease
34: breast cancer
35: breast cancer
36: breast cancer
37: breast cancer, ovarian cancer
38: breast cancer, tracheal bronchus and lung cancer
39: breast cancer
40: breast cancer
41: breast cancer
42: breast cancer, cardiotoxicity
43: benign neoplasm, breast cancer
44: benign neoplasm, breast cancer
45: benign neoplasm, breast cancer
46: breast cancer
47: breast cancer
48: breast cancer
49: breast cancer
50: breast cancer
51: breast cancer
52: breast cancer
53: breast cancer, radiation-induced disorder
54: breast cancer, colon and rectum cancer
55: breast cancer, dermatological toxicity
56: breast cancer, edema
57: breast cancer, telangiectasia of the skin
58: breast cancer, lymphedema
59: breast cancer
60: breast cancer, hypertension, pancreatic cancer, prostate cancer
61: breast cancer, pancreatic cancer, prostate cancer, proteinuria
62: breast cancer, hypertension, pancreatic cancer, prostate cancer, proteinuria
63: breast cancer
64: breast cancer, schizophrenia
65: breast cancer, schizophrenia
66: breast cancer, schizophrenia
67: breast cancer, leukopenia, neutropenia
l2_all_disease_terms
gwas_study_info |>
  filter(grepl("cervical cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: cervical carcinoma
2: cervical cancer
3: dysplasia of cervix, cervical cancer
4: dysplasia, cervical cancer
5: uterine cervix carcinoma in situ
6: cervical carcinoma, human papilloma virus infection
7: cervical intraepithelial neoplasia grade 2/3
8: cervical carcinoma, cervical intraepithelial neoplasia grade 2/3
l2_all_disease_terms
<char>
1: cervical cancer
2: cervical cancer
3: cervical cancer, dysplasia of cervix
4: cervical cancer, dysplasia
5: cervical cancer
6: cervical cancer, human papilloma virus infection
7: cervical cancer
8: cervical cancer
# ? Is endometrial cancer a subset of uterine cancer in GBD?
# It is in the ontology: http://purl.obolibrary.org/obo/MONDO_0002715
gwas_study_info =
  gwas_study_info |>
  mutate(l2_all_disease_terms =
           stringr::str_replace_all(l2_all_disease_terms,
                                    pattern = "endometrial cancer",
                                    "uterine cancer"
           )
  )
gwas_study_info |>
  filter(grepl("uterine cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: endometrial endometrioid carcinoma uterine cancer
2: endometrial carcinoma uterine cancer
3: endometrial neoplasm uterine cancer
4: uterine carcinoma uterine cancer
5: endometrial cancer, covid-19 covid-19, uterine cancer
6: uterine corpus cancer uterine cancer
7: ovarian endometrioid adenocarcinoma uterine cancer
8: uterine cancer uterine cancer
9: uterine adnexa cancer, ovarian cancer ovarian cancer, uterine cancer
10: endometrial cancer uterine cancer
11: endometrial carcinoma, endometriosis uterine cancer, endometriosis
gwas_study_info |>
  filter(grepl("ovarian cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: ovarian carcinoma
2: malignant epithelial tumor of ovary
3: prostate carcinoma, breast carcinoma, ovarian carcinoma
4: lung carcinoma, estrogen-receptor negative breast cancer, ovarian endometrioid carcinoma, colorectal cancer, prostate carcinoma, ovarian serous carcinoma, breast carcinoma, ovarian carcinoma, lung adenocarcinoma, squamous cell lung carcinoma, cancer
5: ovarian mucinous adenocarcinoma
6: ovarian serous carcinoma
7: ovarian clear cell adenocarcinoma
8: ovarian endometrioid carcinoma
9: ovarian carcinoma, covid-19
10: high grade ovarian serous adenocarcinoma
11: ovarian serous adenocarcinoma
12: ovarian clear cell cancer
13: uterine adnexa cancer, ovarian cancer
14: ovarian cancer
15: ovarian neoplasm
16: breast cancer, ovarian carcinoma
17: ovarian carcinoma, cancer aggressiveness measurement
18: ovarian serous carcinoma, cancer aggressiveness measurement
l2_all_disease_terms
<char>
1: ovarian cancer
2: ovarian cancer
3: breast cancer, ovarian cancer, prostate cancer
4: breast cancer, cancer, colon and rectum cancer, tracheal bronchus and lung cancer, ovarian cancer, prostate cancer
5: ovarian cancer
6: ovarian cancer
7: ovarian cancer
8: ovarian cancer
9: covid-19, ovarian cancer
10: ovarian cancer
11: ovarian cancer
12: ovarian cancer
13: ovarian cancer, uterine cancer
14: ovarian cancer
15: ovarian cancer
16: breast cancer, ovarian cancer
17: ovarian cancer
18: ovarian cancer
gwas_study_info |>
  filter(grepl("prostate cancer", l2_all_disease_terms)) |>
  select(all_disease_terms, l2_all_disease_terms) |>
  distinct()
all_disease_terms
<char>
1: prostate carcinoma
2: cancer aggressiveness measurement, prostate carcinoma
3: prostate carcinoma, breast carcinoma, ovarian carcinoma
4: metastatic prostate cancer, peripheral neuropathy
5: metastatic prostate cancer
6: prostate carcinoma, erectile dysfunction
7: lung carcinoma, estrogen-receptor negative breast cancer, ovarian endometrioid carcinoma, colorectal cancer, prostate carcinoma, ovarian serous carcinoma, breast carcinoma, ovarian carcinoma, lung adenocarcinoma, squamous cell lung carcinoma, cancer
8: prostate carcinoma, adverse effect
9: prostate cancer
10: grade iii prostatic intraepithelial neoplasia
11: prostate carcinoma, type 2 diabetes mellitus
12: prostate cancer, disease progression measurement
13: prostate cancer, radiation-induced disorder
14: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, hypertension
15: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, proteinuria
16: breast carcinoma, estrogen-receptor positive breast cancer, metastatic prostate cancer, pancreatic carcinoma, hypertension, proteinuria
17: metastatic prostate cancer, disease progression measurement
l2_all_disease_terms
<char>
1: prostate cancer
2: prostate cancer
3: breast cancer, ovarian cancer, prostate cancer
4: peripheral neuropathy, prostate cancer
5: prostate cancer
6: erectile dysfunction, prostate cancer
7: breast cancer, cancer, colon and rectum cancer, tracheal bronchus and lung cancer, ovarian cancer, prostate cancer
8: complication, prostate cancer
9: prostate cancer
10: prostate cancer
11: prostate cancer, type 2 diabetes mellitus
12: prostate cancer
13: prostate cancer, radiation-induced disorder
14: breast cancer, hypertension, pancreatic cancer, prostate cancer
15: breast cancer, pancreatic cancer, prostate cancer, proteinuria
16: breast cancer, hypertension, pancreatic cancer, prostate cancer, proteinuria
17: prostate cancer
gwas_study_info |>
filter(grepl("testicular cancer", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: testicular carcinoma
2: testicular carcinoma, cardiovascular disease
3: testicular neoplasm
l2_all_disease_terms
<char>
1: testicular cancer
2: cardiovascular disease, testicular cancer
3: testicular cancer
gwas_study_info |>
filter(grepl("kidney cancer", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: renal cell carcinoma kidney cancer
2: nephroblastoma kidney cancer
3: kidney cancer kidney cancer
4: clear cell renal carcinoma kidney cancer
5: renal carcinoma kidney cancer
6: papillary renal cell carcinoma kidney cancer
7: kidney neoplasm kidney cancer
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "urinary bladder cancer",
"bladder cancer"
)
)
gwas_study_info |>
filter(grepl("bladder cancer", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: urinary bladder carcinoma
2: urinary bladder carcinoma, disease progression measurement
3: urinary bladder cancer,
4: urinary bladder cancer
l2_all_disease_terms
<char>
1: bladder cancer
2: bladder cancer
3: bladder cancer
4: bladder cancer
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "\\bcentral nervous system cancer\\b",
"brain and central nervous system cancer"
)
)
gwas_study_info |>
filter(grepl("brain and central nervous system cancer", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: glioblastoma multiforme
2: central nervous system cancer
3: central nervous system cancer, glioma
4: central nervous system cancer, glioblastoma multiforme
5: brain neoplasm
6: glioma
7: oligodendroglioma
8: nervous system cancer, brain cancer
9: brain cancer
10: malignant glioma
11: central nervous system non-hodgkin lymphoma
l2_all_disease_terms
<char>
1: brain and central nervous system cancer
2: brain and central nervous system cancer
3: brain and central nervous system cancer
4: brain and central nervous system cancer
5: brain and central nervous system cancer
6: brain and central nervous system cancer
7: brain and central nervous system cancer
8: brain and central nervous system cancer, nervous system cancer
9: brain and central nervous system cancer
10: brain and central nervous system cancer
11: brain and central nervous system cancer
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "\\bocular melanoma\\b|\\bocular cancer\\b",
"eye cancer"
)
)
gwas_study_info |>
filter(grepl("eye cancer", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: uveal melanoma eye cancer
2: choroidal melanoma eye cancer
3: epithelioid cell uveal melanoma eye cancer
4: uveal melanoma, epithelioid cell uveal melanoma eye cancer
5: uveal melanoma disease severity eye cancer
6: ocular cancer eye cancer
7: eye neoplasm eye cancer
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "\\bneuroblastoma\\b|\\bperipheral nervous system cancer\\b",
"neuroblastoma and other peripheral nervous system cancers"
)
)
gwas_study_info |>
filter(grepl("neuroblastoma and other peripheral nervous system cancers", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: neuroblastoma
2: neuroblastoma, cutaneous melanoma
l2_all_disease_terms
<char>
1: neuroblastoma and other peripheral nervous system cancers
2: malignant skin melanoma, neuroblastoma and other peripheral nervous system cancers
gwas_study_info |>
filter(grepl("thyroid cancer", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: differentiated thyroid carcinoma thyroid cancer
2: papillary thyroid carcinoma thyroid cancer
3: follicular thyroid carcinoma thyroid cancer
4: thyroid carcinoma thyroid cancer
5: thyroid cancer thyroid cancer
gwas_study_info |>
filter(grepl("mesothelioma", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: malignant pleural mesothelioma mesothelioma
2: mesothelioma mesothelioma
3: pleural mesothelioma mesothelioma
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "hodgkins lymphoma",
"hodgkin lymphoma"
)
)
gwas_study_info |>
filter(grepl("hodgkin lymphoma", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: diffuse large b-cell lymphoma, multiple sclerosis
2: follicular lymphoma, multiple sclerosis
3: marginal zone b-cell lymphoma, multiple sclerosis
4: diffuse large b-cell lymphoma, rheumatoid arthritis
5: rheumatoid arthritis, follicular lymphoma
6: rheumatoid arthritis, marginal zone b-cell lymphoma
7: diffuse large b-cell lymphoma, systemic lupus erythematosus
8: systemic lupus erythematosus, follicular lymphoma
9: marginal zone b-cell lymphoma, systemic lupus erythematosus
10: nodular sclerosis hodgkin lymphoma
11: hodgkins lymphoma
12: diffuse large b-cell lymphoma
13: neoplasm of mature b-cells
14: marginal zone b-cell lymphoma
15: hodgkins lymphoma, multiple myeloma, chronic lymphocytic leukemia
16: diffuse large b-cell lymphoma, marginal zone b-cell lymphoma, follicular lymphoma, chronic lymphocytic leukemia
17: hodgkins lymphoma, multiple myeloma, non-hodgkins lymphoma
18: acute lymphoblastic leukemia, lymphoblastic lymphoma, venous thromboembolism
19: non-hodgkins lymphoma
20: follicular lymphoma
21: reticulum cell sarcoma
22: extranodal nasal nk/t cell lymphoma
23: hiv infection, non-hodgkins lymphoma
all_disease_terms
l2_all_disease_terms
<char>
1: multiple sclerosis, non-hodgkin lymphoma
2: multiple sclerosis, non-hodgkin lymphoma
3: multiple sclerosis, non-hodgkin lymphoma
4: non-hodgkin lymphoma, rheumatoid arthritis
5: non-hodgkin lymphoma, rheumatoid arthritis
6: non-hodgkin lymphoma, rheumatoid arthritis
7: non-hodgkin lymphoma, systemic lupus erythematosus
8: non-hodgkin lymphoma, systemic lupus erythematosus
9: non-hodgkin lymphoma, systemic lupus erythematosus
10: hodgkin lymphoma
11: hodgkin lymphoma
12: non-hodgkin lymphoma
13: non-hodgkin lymphoma
14: non-hodgkin lymphoma
15: hodgkin lymphoma, leukemia, multiple myeloma
16: leukemia, non-hodgkin lymphoma
17: hodgkin lymphoma, multiple myeloma, non-hodgkin lymphoma
18: leukemia, non-hodgkin lymphoma, venous thromboembolism
19: non-hodgkin lymphoma
20: non-hodgkin lymphoma
21: non-hodgkin lymphoma
22: non-hodgkin lymphoma
23: hiv infection, non-hodgkin lymphoma
l2_all_disease_terms
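The check above also returns rows whose only lymphoma label is `non-hodgkin lymphoma`, because `"hodgkin lymphoma"` is a substring of that label. The sketch below uses made-up strings, not the study data, and assumes a negative lookbehind is acceptable for this kind of check.

```r
library(stringr)

# Made-up example labels (assumption: same comma-separated layout as l2_all_disease_terms)
x <- c("hodgkin lymphoma",
       "non-hodgkin lymphoma",
       "hodgkin lymphoma, multiple myeloma, non-hodgkin lymphoma")

# Plain substring match: every row is returned, including the non-hodgkin-only row
plain_match <- grepl("hodgkin lymphoma", x)

# Lookbehind: "hodgkin lymphoma" only counts when it is not preceded by "non-"
strict_match <- str_detect(x, "(?<!non-)hodgkin lymphoma")
```

In the pipeline this would correspond to `filter(stringr::str_detect(l2_all_disease_terms, "(?<!non-)hodgkin lymphoma"))`.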
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "non-hodgkins lymphoma",
"non-hodgkin lymphoma"
)
)
gwas_study_info |>
filter(grepl("non-hodgkin lymphoma", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: diffuse large b-cell lymphoma, multiple sclerosis
2: follicular lymphoma, multiple sclerosis
3: marginal zone b-cell lymphoma, multiple sclerosis
4: diffuse large b-cell lymphoma, rheumatoid arthritis
5: rheumatoid arthritis, follicular lymphoma
6: rheumatoid arthritis, marginal zone b-cell lymphoma
7: diffuse large b-cell lymphoma, systemic lupus erythematosus
8: systemic lupus erythematosus, follicular lymphoma
9: marginal zone b-cell lymphoma, systemic lupus erythematosus
10: diffuse large b-cell lymphoma
11: neoplasm of mature b-cells
12: marginal zone b-cell lymphoma
13: diffuse large b-cell lymphoma, marginal zone b-cell lymphoma, follicular lymphoma, chronic lymphocytic leukemia
14: hodgkins lymphoma, multiple myeloma, non-hodgkins lymphoma
15: acute lymphoblastic leukemia, lymphoblastic lymphoma, venous thromboembolism
16: non-hodgkins lymphoma
17: follicular lymphoma
18: reticulum cell sarcoma
19: extranodal nasal nk/t cell lymphoma
20: hiv infection, non-hodgkins lymphoma
all_disease_terms
l2_all_disease_terms
<char>
1: multiple sclerosis, non-hodgkin lymphoma
2: multiple sclerosis, non-hodgkin lymphoma
3: multiple sclerosis, non-hodgkin lymphoma
4: non-hodgkin lymphoma, rheumatoid arthritis
5: non-hodgkin lymphoma, rheumatoid arthritis
6: non-hodgkin lymphoma, rheumatoid arthritis
7: non-hodgkin lymphoma, systemic lupus erythematosus
8: non-hodgkin lymphoma, systemic lupus erythematosus
9: non-hodgkin lymphoma, systemic lupus erythematosus
10: non-hodgkin lymphoma
11: non-hodgkin lymphoma
12: non-hodgkin lymphoma
13: leukemia, non-hodgkin lymphoma
14: hodgkin lymphoma, multiple myeloma, non-hodgkin lymphoma
15: leukemia, non-hodgkin lymphoma, venous thromboembolism
16: non-hodgkin lymphoma
17: non-hodgkin lymphoma
18: non-hodgkin lymphoma
19: non-hodgkin lymphoma
20: hiv infection, non-hodgkin lymphoma
l2_all_disease_terms
gwas_study_info |>
filter(grepl("multiple myeloma", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: multiple myeloma
2: multiple myeloma, peripheral neuropathy
3: multiple myeloma, chemotherapy-induced oral mucositis
4: multiple myeloma, monoclonal gammopathy
5: hodgkins lymphoma, multiple myeloma, chronic lymphocytic leukemia
6: hodgkins lymphoma, multiple myeloma, non-hodgkins lymphoma
7: multiple myeloma, clostridium difficile infection
l2_all_disease_terms
<char>
1: multiple myeloma
2: multiple myeloma, peripheral neuropathy
3: chemotherapy-induced oral mucositis, multiple myeloma
4: monoclonal gammopathy, multiple myeloma
5: hodgkin lymphoma, leukemia, multiple myeloma
6: hodgkin lymphoma, multiple myeloma, non-hodgkin lymphoma
7: clostridium difficile infection, multiple myeloma
gwas_study_info |>
filter(grepl("leukemia", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: acute lymphoblastic leukemia
2: multiple sclerosis, chronic lymphocytic leukemia
3: rheumatoid arthritis, chronic lymphocytic leukemia
4: systemic lupus erythematosus, chronic lymphocytic leukemia
5: acute lymphoblastic leukemia, asparaginase-induced acute pancreatitis
6: chronic lymphocytic leukemia
7: b-cell acute lymphoblastic leukemia
8: chronic myelogenous leukemia
9: hodgkins lymphoma, multiple myeloma, chronic lymphocytic leukemia
10: childhood acute lymphoblastic leukemia
11: acute lymphoblastic leukemia, peripheral neuropathy
12: diffuse large b-cell lymphoma, marginal zone b-cell lymphoma, follicular lymphoma, chronic lymphocytic leukemia
13: acute lymphoblastic leukemia, lymphoblastic lymphoma, venous thromboembolism
14: leukemia
15: b-cell acute lymphoblastic leukemia, adult onset asthma
16: b-cell acute lymphoblastic leukemia, childhood onset asthma
17: b-cell acute lymphoblastic leukemia, graves disease
18: b-cell acute lymphoblastic leukemia, hashimotos thyroiditis
19: b-cell acute lymphoblastic leukemia, hypothyroidism
20: b-cell acute lymphoblastic leukemia, primary biliary cirrhosis
21: b-cell acute lymphoblastic leukemia, sclerosing cholangitis
22: b-cell acute lymphoblastic leukemia, inflammatory bowel disease
23: b-cell acute lymphoblastic leukemia, crohns disease
24: b-cell acute lymphoblastic leukemia, ulcerative colitis
25: b-cell acute lymphoblastic leukemia, rheumatoid arthritis
26: b-cell acute lymphoblastic leukemia, multiple sclerosis
27: b-cell acute lymphoblastic leukemia, systemic scleroderma
28: b-cell acute lymphoblastic leukemia, systemic lupus erythematosus
29: b-cell acute lymphoblastic leukemia, type 1 diabetes mellitus
30: b-cell acute lymphoblastic leukemia, vitiligo
31: lymphoid leukemia
32: acute lymphoblastic leukemia, hyperbilirubinemia
33: b-cell acute lymphoblastic leukemia with t(1;19)(q23;p13.3); e2a-pbx1 (tcf3-pbx1)
34: childhood acute lymphoblastic leukemia, b-cell acute lymphoblastic leukemia with t(1;19)(q23;p13.3); e2a-pbx1 (tcf3-pbx1)
35: monocytic leukemia
36: childhood t acute lymphoblastic leukemia
37: acute lymphoblastic leukemia, neurotoxicity
38: acute lymphoblastic leukemia, acute myeloid leukemia, myelodysplastic syndrome
39: myeloid neoplasm
all_disease_terms
l2_all_disease_terms
<char>
1: leukemia
2: leukemia, multiple sclerosis
3: leukemia, rheumatoid arthritis
4: leukemia, systemic lupus erythematosus
5: leukemia, pancreatitis
6: leukemia
7: leukemia
8: leukemia
9: hodgkin lymphoma, leukemia, multiple myeloma
10: leukemia
11: leukemia, peripheral neuropathy
12: leukemia, non-hodgkin lymphoma
13: leukemia, non-hodgkin lymphoma, venous thromboembolism
14: leukemia
15: asthma, leukemia
16: asthma, leukemia
17: graves disease, leukemia
18: hashimotos thyroiditis, leukemia
19: hypothyroidism, leukemia
20: leukemia, primary biliary cirrhosis
21: leukemia, sclerosing cholangitis
22: inflammatory bowel disease, leukemia
23: crohns disease, leukemia
24: leukemia, ulcerative colitis
25: leukemia, rheumatoid arthritis
26: leukemia, multiple sclerosis
27: leukemia, systemic scleroderma
28: leukemia, systemic lupus erythematosus
29: leukemia, type 1 diabetes mellitus
30: leukemia, vitiligo
31: leukemia
32: hyperbilirubinemia, leukemia
33: b-cell acute lymphoblastic leukemia with t(1;19)(q23;p13.3); e2a-pbx1 (tcf3-pbx1)
34: b-cell acute lymphoblastic leukemia with t(1;19)(q23;p13.3); e2a-pbx1 (tcf3-pbx1), leukemia
35: leukemia
36: leukemia
37: leukemia, neurotoxicity
38: malignant neoplasm of bone and articular cartilage, leukemia, myelodysplastic syndrome
39: leukemia
l2_all_disease_terms
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
case_when(
l2_all_disease_terms == "cancer" ~ "other malignant neoplasms",
TRUE ~ l2_all_disease_terms
)
)
other_malignant_terms <- c("digestive system cancer",
"retroperitoneal cancer",
"peritoneal cancer",
"ewing sarcoma",
"female reproductive organ cancer",
"intestinal cancer",
"squamous cell cancer",
"vulvar cancer",
"head and neck cancer",
"testicular germ cell tumor",
"malignant tumor of floor of mouth",
"malignant lymphoid tumor",
"neuroendocrine tumor",
"nasal cavity cancer" #? not sure if should be somewhere else ..
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = paste0("\\b", other_malignant_terms, "(?=,|$)", collapse = "|"),
"other malignant neoplasms"
)
)
gwas_study_info |>
filter(grepl("other malignant neoplasms", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms
<char>
1: squamous cell carcinoma
2: cancer
3: neuroendocrine neoplasm
4: small intestine neuroendocrine tumor
5: pancreatic neuroendocrine tumor
6: pulmonary neuroendocrine tumor
7: testicular germ cell tumor
8: cutaneous squamous cell carcinoma
9: head and neck malignant neoplasia
10: ewing sarcoma
11: carcinoid tumor
12: head and neck squamous cell carcinoma, pain
13: digestive system carcinoma, chronic obstructive pulmonary disease
14: digestive system carcinoma
15: head and neck squamous cell carcinoma
16: head and neck malignant neoplasia, osteoradionecrosis
17: head and neck carcinoma
18: digestive system cancer
19: reproductive system cancer
20: malignant tumor of floor of mouth
21: nasal cavity cancer
22: retroperitoneal cancer, peritoneum cancer
23: cancer
24: female reproductive organ cancer
25: intestinal cancer
26: vulvar neoplasm
27: vulvar carcinoma
28: in situ carcinoma
29: carcinoma
30: lymphoid neoplasm
31: head and neck carcinoma, lung cancer
32: head and neck malignant neoplasia, radiation-induced disorder
33: head and neck malignant neoplasia, neuropathic pain
34: head and neck carcinoma, mucositis
35: head and neck carcinoma, fibrosis
36: head and neck carcinoma, mucositis, dysphagia
37: head and neck carcinoma, fibrosis, dysphagia, xerostomia
38: head and neck carcinoma, fibrosis, mucositis, dysphagia, xerostomia
all_disease_terms
l2_all_disease_terms
<char>
1: other malignant neoplasms
2: other malignant neoplasms
3: other malignant neoplasms
4: other malignant neoplasms
5: other malignant neoplasms
6: other malignant neoplasms
7: other malignant neoplasms
8: other malignant neoplasms
9: other malignant neoplasms
10: other malignant neoplasms
11: other malignant neoplasms
12: other malignant neoplasms, pain
13: chronic obstructive pulmonary disease, other malignant neoplasms
14: other malignant neoplasms
15: other malignant neoplasms
16: other malignant neoplasms, osteoradionecrosis
17: other malignant neoplasms
18: other malignant neoplasms
19: other malignant neoplasms
20: other malignant neoplasms
21: other malignant neoplasms
22: other malignant neoplasms, other malignant neoplasms
23: other malignant neoplasms, other malignant neoplasms
24: other malignant neoplasms
25: other malignant neoplasms
26: other malignant neoplasms
27: other malignant neoplasms
28: other malignant neoplasms
29: other malignant neoplasms
30: other malignant neoplasms
31: other malignant neoplasms, tracheal bronchus and lung cancer
32: other malignant neoplasms, radiation-induced disorder
33: other malignant neoplasms, neuropathic pain
34: other malignant neoplasms, mucositis
35: fibrosis, other malignant neoplasms
36: dysphagia, other malignant neoplasms, mucositis
37: dysphagia, fibrosis, other malignant neoplasms, xerostomia
38: dysphagia, fibrosis, other malignant neoplasms, mucositis, xerostomia
l2_all_disease_terms
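When a vector of terms is collapsed into one alternation, the separator goes only between terms: `paste0(terms, collapse = "(?=,|$)|\\b")` leaves the first term without a leading `\\b` and the last term without a trailing lookahead. The sketch below is one possible alternative, not the analysis code. `build_term_pattern()` is a hypothetical helper that anchors every term on both sides and escapes regex metacharacters, which matters for ontology labels that contain parentheses. The example strings are made up.

```r
library(stringr)

# Hypothetical helper (not part of the original analysis): anchor every term
# on both sides and escape regex metacharacters so labels match literally.
build_term_pattern <- function(terms) {
  escaped <- gsub("([.|()\\\\^{}+$*?\\[\\]])", "\\\\\\1", unique(terms), perl = TRUE)
  paste0("\\b", escaped, "(?=,|$)", collapse = "|")
}

terms <- c("dementia", "alzheimers disease")
input <- "alzheimers disease biomarker measurement"

# Separator-only collapse: the last term has no lookahead, so it matches as a prefix
naive <- paste0(terms, collapse = "(?=,|$)|\\b")
naive_result <- str_replace_all(input, naive, "LABEL")

# Anchored pattern: the longer, different term is left untouched
anchored_result <- str_replace_all(input, build_term_pattern(terms), "LABEL")

# A label with parentheses is matched literally once escaped
literal_result <- str_replace_all(
  "asthma, cirrhosis of liver (disorder)",
  build_term_pattern("cirrhosis of liver (disorder)"),
  "LABEL"
)
```

Anchoring each term individually keeps the behaviour independent of the order in which terms appear in the vector.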
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
case_when(
l2_all_disease_terms == "benign neoplasm" ~ "other neoplasms",
TRUE ~ l2_all_disease_terms
)
)
gwas_study_info |>
filter(grepl("other neoplasms", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: benign prostatic hyperplasia other neoplasms
2: colorectal adenoma other neoplasms
3: colorectal cancer, endometrial neoplasm other neoplasms
4: upper aerodigestive tract neoplasm other neoplasms
5: pituitary gland adenoma other neoplasms
6: nasal cavity polyp other neoplasms
7: nasopharyngeal neoplasm, hearing loss other neoplasms
8: metachronous colorectal adenoma other neoplasms
9: testicular neoplasm, hearing loss other neoplasms
10: hematopoietic and lymphoid system neoplasm other neoplasms
11: benign lipomatous neoplasm other neoplasms
12: benign neoplasm other neoplasms
13: uterine neoplasm other neoplasms
14: benign neoplasm of eye other neoplasms
15: respiratory system neoplasm other neoplasms
16: secondary malignant neoplasm other neoplasms
17: uterine leiomyoma other neoplasms
18: benign colon neoplasm other neoplasms
19: benign digestive system neoplasm other neoplasms
20: benign chondrogenic neoplasm other neoplasms
21: benign soft tissue neoplasm other neoplasms
22: benign neoplasm of skin other neoplasms
23: uterine benign neoplasm other neoplasms
24: benign ovarian neoplasm other neoplasms
25: female reproductive organ cancer other neoplasms
26: benign urinary system neoplasm other neoplasms
27: nervous system benign neoplasm other neoplasms
28: benign neoplasm of spinal cord other neoplasms
29: benign thyroid gland neoplasm other neoplasms
30: benign endocrine neoplasm other neoplasms
31: benign neoplasm of adrenal gland other neoplasms
32: benign neoplasm of parathyroid gland other neoplasms
33: benign neoplasm of pituitary gland other neoplasms
34: lymphangioma, hemangioma other neoplasms
35: malignant renal pelvis neoplasm other neoplasms
36: malignant urinary system neoplasm other neoplasms
37: rectosigmoid junction neoplasm other neoplasms
38: digestive system neoplasm other neoplasms
39: connective tissue neoplasm, bone neoplasm other neoplasms
40: connective tissue neoplasm other neoplasms
41: polyp other neoplasms
42: stomach polyp other neoplasms
43: polyp of large intestine other neoplasms
44: female genital tract polyp other neoplasms
45: uterine polyp other neoplasms
46: cervical polyp other neoplasms
47: breast benign neoplasm other neoplasms
48: pancreatic intraductal papillary-mucinous neoplasm other neoplasms
49: malignant laryngeal neoplasm other neoplasms
50: bone neoplasm other neoplasms
51: melanocytic nevus other neoplasms
52: urogenital neoplasm other neoplasms
53: intraductal breast neoplasm other neoplasms
54: connective and soft tissue neoplasm other neoplasms
55: uterine cervix neoplasm other neoplasms
56: malignant colon neoplasm other neoplasms
57: meningeal neoplasm other neoplasms
58: benign brain neoplasm other neoplasms
59: metastatic malignant neoplasm other neoplasms
60: lobular capilliary hemangioma other neoplasms
61: lymph node neoplasm other neoplasms
62: aldosterone-producing adenoma other neoplasms
63: polyp of colon other neoplasms
64: hemangioma of subcutaneous tissue other neoplasms
65: adenomatous colon polyp other neoplasms
66: anal polyp other neoplasms
67: urinary system neoplasm other neoplasms
68: benign neoplasm of stomach other neoplasms
69: polyp of gallbladder other neoplasms
70: gallbladder neoplasm other neoplasms
71: pancreatic neoplasm other neoplasms
72: hepatic hemangioma other neoplasms
73: hemangioma other neoplasms
74: polyp of rectum other neoplasms
all_disease_terms l2_all_disease_terms
gwas_study_info |>
filter(grepl("rheumatic heart disease", l2_all_disease_terms)) |>
select(all_disease_terms, l2_all_disease_terms) |>
distinct()
all_disease_terms l2_all_disease_terms
<char> <char>
1: rheumatic heart disease rheumatic heart disease
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "acne(?! vulgaris)",
"acne vulgaris"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "adhd",
"attention-deficit/hyperactivity disorder"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "alcohol-related disorders|alcohol and nicotine codependence",
"alcohol use disorders"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "alcohol use disorder",
"alcohol use disorders"
)
)
dementia <- c("alzheimers disease biomarker measurement",
"alzheimers disease neuropathologic change",
"aids dementia",
"dementia",
"frontotemporal dementia",
"lewy body dementia",
"vascular dementia",
"alzheimers disease"
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = paste0("\\b", dementia, "(?=,|$)", collapse = "|"),
"alzheimer's disease and other dementias"
)
)
url <- "http://www.ebi.ac.uk/ols4/api/ontologies/mesh/terms/http%253A%252F%252Fid.nlm.nih.gov%252Fmesh%252FD001008/descendants"
anxiety_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 15
[1] "\n Some example terms"
[1] "obsessive-compulsive disorder" "generalized anxiety disorder"
[3] "neurocirculatory asthenia" "excoriation disorder"
[5] "anxiety, separation"
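The `url` strings passed to `get_descendants()` contain the term IRI percent-encoded twice (`:` appears as `%253A`). As a sketch, assuming base R's `utils::URLencode()`, such a URL can be assembled from the ontology prefix and the plain IRI. `ols_descendants_url()` is a hypothetical helper name, not a function from the analysis.

```r
# Hypothetical helper: build an OLS4 "descendants" URL from an ontology
# prefix and a term IRI, percent-encoding the IRI twice.
ols_descendants_url <- function(ontology, iri) {
  once <- utils::URLencode(iri, reserved = TRUE)
  # repeated = TRUE forces a second pass over the already-encoded string
  twice <- utils::URLencode(once, reserved = TRUE, repeated = TRUE)
  paste0("http://www.ebi.ac.uk/ols4/api/ontologies/", ontology,
         "/terms/", twice, "/descendants")
}

anxiety_url <- ols_descendants_url("mesh", "http://id.nlm.nih.gov/mesh/D001008")
```

Keeping the plain IRI in the source makes the ontology term easier to read and check than the hand-encoded string.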
anxiety_terms <- c(anxiety_terms, "obsessive-compulsive symptom measurement")
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = paste0("\\b", anxiety_terms, "(?=,|$)", collapse = "|"),
"anxiety disorders"
)
) |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
# (?!s) skips labels that are already plural
pattern = "anxiety disorder(?!s)|anxiety measurement",
"anxiety disorders"
)
) |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
# match bare "anxiety" only, not the label assigned above
pattern = "anxiety(?! disorders)",
"anxiety disorders"
)
)
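A replacement whose pattern is a substring of its own output label will re-match labels that were already assigned. The toy example below (made-up strings, not the study data) contrasts an unguarded chain with one that uses negative lookaheads. It is a sketch of the failure mode, not a statement about which rows in `gwas_study_info` are affected.

```r
library(stringr)

# Made-up labels covering the singular, plural, and bare forms
x <- c("anxiety", "anxiety disorder", "anxiety disorders",
       "asthma, anxiety measurement")

# Unguarded chain: each step can re-match text written by the previous one
unguarded <- x |>
  str_replace_all("anxiety disorder|anxiety measurement", "anxiety disorders") |>
  str_replace_all("anxiety", "anxiety disorders")

# Guarded chain: lookaheads skip text that already carries the final label
guarded <- x |>
  str_replace_all("anxiety disorder(?!s)|anxiety measurement", "anxiety disorders") |>
  str_replace_all("anxiety(?! disorders)", "anxiety disorders")
```

With the unguarded chain, the third input ends up as `anxiety disorders disorderss`; with the guarded chain, every input collapses to a single `anxiety disorders` label.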
afib_terms <- c("atrial fibrillation",
"atrial flutter",
"post-operative atrial fibrillation")
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = paste0("\\b", afib_terms, "(?=,|$)", collapse = "|"),
"atrial fibrillation and flutter"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "methamphetamine",
"amphetamine"
)
)
url <- "http://www.ebi.ac.uk/ols4/api/ontologies/cvdo/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FDOID_3627/descendants"
aortic_aneurysm_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 6
[1] "\n Some example terms"
[1] "ruptured thoracoabdominal aortic aneurysm"
[2] "ruptured abdominal aortic aneurysm"
[3] "ruptured thoracic aortic aneurysm"
[4] "abdominal aortic aneurysm"
[5] "ruptured aortic aneurysm"
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = paste0("\\b", aortic_aneurysm_terms, "(?=,|$)", collapse = "|"),
"aortic aneurysm"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = "autism( spectrum disorders?)?",
"autism spectrum disorders"
)
)
vision_loss_terms <- c("blindness",
"color vision disorder",
"vision disorder",
"myopia",
"refractive error",
"hyperopia",
"astigmatism",
"corneal astigmatism",
"presbyopia",
"anisometropia",
"esotropia",
"non-accomodative esotropia",
"accommodative esotropia")
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = paste0("\\b", vision_loss_terms, "(?=,|$)", collapse = "|"),
"blindness and vision loss"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "cannabis dependence",
"cannabis use disorders"
)
)
url <- "http://www.ebi.ac.uk/ols4/api/ontologies/snomed/terms/http%253A%252F%252Fsnomed.info%252Fid%252F328383001/descendants"
chronic_liver_disease_terms <- get_descendants(url)
[1] "Number of terms collected:"
[1] 114
[1] "\n Some example terms"
[1] "hepatic ascites co-occurrent with chronic active hepatitis due to toxic liver disease"
[2] "cirrhosis of liver co-occurrent and due to primary sclerosing cholangitis (disorder)"
[3] "chronic hepatitis c co-occurrent with human immunodeficiency virus infection"
[4] "primary biliary cirrhosis co-occurrent with systemic scleroderma (disorder)"
[5] "pulmonary fibrosis, hepatic hyperplasia, bone marrow hypoplasia syndrome"
chronic_liver_disease_terms <- c("primary biliary cirrhosis",
"alcoholic liver cirrhosis",
"chronic hepatitis b virus infection",
"acute-on-chronic liver failure",
chronic_liver_disease_terms)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = paste0("\\b", chronic_liver_disease_terms, "(?=,|$)", collapse = "|"),
"cirrhosis and other chronic liver diseases"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = "liver disease(?!s)",
"cirrhosis and other chronic liver diseases"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "cocaine-related disorders",
"cocaine use disorders"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "depressive symptom measurement|major depressive disorder",
"depressive disorders"
)
) |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "depressive disorder",
"depressive disorders"
)
)
gal_bile_terms = c("gallbladder disease",
"bile duct disorder",
"biliary tract disease")
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = paste0("\\b", gal_bile_terms, "(?=,|$)", collapse = "|"),
"gallbladder and biliary diseases"
)
)
diseases <- stringr::str_split(pattern = ", ",
gwas_study_info$l2_all_disease_terms) |>
unlist() |>
stringr::str_trim()
pregnancy_terms <- grep("pregnancy", diseases, value = TRUE)
gyno_terms <- c("endometriosis","placenta disease", pregnancy_terms)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = paste0("\\b", gyno_terms, "(?=,|$)", collapse = "|"),
"gynecological diseases"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "bulimia nervosa|anorexia nervosa|binge eating|eating disorder",
"eating disorders"
)
) |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "anorexia",
"eating disorders"
)
)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = "headache disorder|cluster headache|migraine",
"headache disorders"
)
)
Ischemic heart disease is treated as equivalent to coronary artery disease (https://www.ncbi.nlm.nih.gov/books/NBK209964/).
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = "coronary artery disease",
"ischemic heart disease"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "opioid dependence|opioid use disorder",
"opioid use disorders"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "parkinsons disease",
"parkinson's disease"
)
)
url <- "http://www.ebi.ac.uk/ols4/api/ontologies/mondo/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FMONDO_0002028/descendants"
personality_disorders <- get_descendants(url)
[1] "Number of terms collected:"
[1] 10
[1] "\n Some example terms"
[1] "obsessive-compulsive personality disorder"
[2] "narcissistic personality disorder"
[3] "schizotypal personality disorder"
[4] "histrionic personality disorder"
[5] "antisocial personality disorder"
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
pattern = paste0("\\b", personality_disorders, "(?=,|$)", collapse = "|"),
"personality disorders"
)
)
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "heroin dependence|drug dependence|nicotine dependence|substance abuse|drug misuse|alcohol use disorders delirium",
"other drug use disorders"
)
)
url <- "http://www.ebi.ac.uk/ols4/api/ontologies/doid/terms/http%253A%252F%252Fpurl.obolibrary.org%252Fobo%252FDOID_535/descendants"
sleep_disorders <- get_descendants(url)
[1] "Number of terms collected:"
[1] 16
[1] "\n Some example terms"
[1] "periodic limb movement disorder" "advanced sleep phase syndrome 3"
[3] "advanced sleep phase syndrome 2" "advanced sleep phase syndrome 1"
[5] "advanced sleep phase syndrome 4"
url <- "http://www.ebi.ac.uk/ols4/api/ontologies/efo/terms/http%253A%252F%252Fwww.ebi.ac.uk%252Fefo%252FEFO_0008568/descendants"
other_sleep_disorders <- get_descendants(url)
[1] "Number of terms collected:"
[1] 26
[1] "\n Some example terms"
[1] "autosomal dominant cerebellar ataxia, deafness and narcolepsy"
[2] "hereditary sensory neuropathy-deafness-dementia syndrome"
[3] "rapid eye movement sleep disorder"
[4] "substance-induced sleep disorder"
[5] "drug induced central sleep apnea"
sleep_disorders <- c(sleep_disorders,
other_sleep_disorders)
gwas_study_info = gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms,
# anchor every term: word boundary before, comma or end of string after
pattern = paste0("\\b", sleep_disorders, "(?=,|$)", collapse = "|"),
"sleep disorders"
)
)
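Mapping several child terms to one parent can leave the same label repeated within a single study's term string, for example "sleep disorders, sleep disorders". Whether that should be collapsed is a judgment call for the analysis. If it should, a minimal base R sketch could look like this; `dedupe_terms()` is a hypothetical helper, not part of the original code.

```r
# Hypothetical helper: drop repeated labels inside each comma-separated string
dedupe_terms <- function(x, sep = ", ") {
  out <- vapply(strsplit(x, sep, fixed = TRUE), function(parts) {
    paste(unique(trimws(parts)), collapse = sep)
  }, character(1))
  out[is.na(x)] <- NA_character_  # keep missing values missing
  out
}

dedupe_terms(c("sleep disorders, asthma, sleep disorders", "asthma", ""))
```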
Also remove Alzheimer's disease, Parkinson's disease, and dementia.
other_mental_disorders <- c("schizophrenia",
"manic or hypomanic episode",
"mental or behavioural disorder",
"mental disorder"
)
other_neuro <- c("mild neurocognitive disorder",
"hiv-associated neurocognitive disorder")
disturbances of sensation of smell and taste
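# Collapse the duplicated anxiety label first: once "disorderss" has been corrected,
# the pattern "anxiety disorders disorderss" can no longer match
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "anxiety disorders disorderss",
"anxiety disorders"
)
)
# Then fix any remaining doubled plural
gwas_study_info =
gwas_study_info |>
mutate(l2_all_disease_terms =
stringr::str_replace_all(l2_all_disease_terms ,
pattern = "disorderss",
"disorders"
)
)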
gbd_data <- data.table::fread(here::here("data/gbd/ihme_gbd_2019_global_disease_burden_rate_all_ages.csv"))
gbd_data$cause <- stringr::str_remove_all(gbd_data$cause, ",")
diseases <- stringr::str_split(pattern = ", ",
gwas_study_info$l2_all_disease_terms[gwas_study_info$l2_all_disease_terms != ""]) |>
unlist() |>
stringr::str_trim()
gbd_data$cause[!tolower(gbd_data$cause) %in% unique(diseases)] |> sort()
[1] "Acute glomerulonephritis"
[2] "Acute glomerulonephritis"
[3] "Acute glomerulonephritis"
[4] "Bacterial skin diseases"
[5] "Bacterial skin diseases"
[6] "Bacterial skin diseases"
[7] "Cardiomyopathy and myocarditis"
[8] "Cardiomyopathy and myocarditis"
[9] "Cardiomyopathy and myocarditis"
[10] "Chronic kidney disease"
[11] "Chronic kidney disease"
[12] "Chronic kidney disease"
[13] "Congenital birth defects"
[14] "Congenital birth defects"
[15] "Congenital birth defects"
[16] "Diabetes mellitus type 1"
[17] "Diabetes mellitus type 1"
[18] "Diabetes mellitus type 1"
[19] "Diabetes mellitus type 2"
[20] "Diabetes mellitus type 2"
[21] "Diabetes mellitus type 2"
[22] "Drug use disorders"
[23] "Drug use disorders"
[24] "Drug use disorders"
[25] "Endocrine metabolic blood and immune disorders"
[26] "Endocrine metabolic blood and immune disorders"
[27] "Endocrine metabolic blood and immune disorders"
[28] "Fungal skin diseases"
[29] "Fungal skin diseases"
[30] "Fungal skin diseases"
[31] "Hemoglobinopathies and hemolytic anemias"
[32] "Hemoglobinopathies and hemolytic anemias"
[33] "Hemoglobinopathies and hemolytic anemias"
[34] "Idiopathic developmental intellectual disability"
[35] "Idiopathic developmental intellectual disability"
[36] "Idiopathic epilepsy"
[37] "Idiopathic epilepsy"
[38] "Idiopathic epilepsy"
[39] "Inguinal femoral and abdominal hernia"
[40] "Inguinal femoral and abdominal hernia"
[41] "Inguinal femoral and abdominal hernia"
[42] "Interstitial lung disease and pulmonary sarcoidosis"
[43] "Interstitial lung disease and pulmonary sarcoidosis"
[44] "Interstitial lung disease and pulmonary sarcoidosis"
[45] "Lower extremity peripheral arterial disease"
[46] "Lower extremity peripheral arterial disease"
[47] "Lower extremity peripheral arterial disease"
[48] "Neuroblastoma and other peripheral nervous cell tumors"
[49] "Neuroblastoma and other peripheral nervous cell tumors"
[50] "Neuroblastoma and other peripheral nervous cell tumors"
[51] "Non-rheumatic valvular heart disease"
[52] "Non-rheumatic valvular heart disease"
[53] "Non-rheumatic valvular heart disease"
[54] "Oral disorders"
[55] "Oral disorders"
[56] "Oral disorders"
[57] "Other cardiovascular and circulatory diseases"
[58] "Other cardiovascular and circulatory diseases"
[59] "Other chronic respiratory diseases"
[60] "Other digestive diseases"
[61] "Other mental disorders"
[62] "Other mental disorders"
[63] "Other mental disorders"
[64] "Other musculoskeletal disorders"
[65] "Other musculoskeletal disorders"
[66] "Other neurological disorders"
[67] "Other neurological disorders"
[68] "Other neurological disorders"
[69] "Other sense organ diseases"
[70] "Other sense organ diseases"
[71] "Other sense organ diseases"
[72] "Other skin and subcutaneous diseases"
[73] "Other skin and subcutaneous diseases"
[74] "Other skin and subcutaneous diseases"
[75] "Paralytic ileus and intestinal obstruction"
[76] "Paralytic ileus and intestinal obstruction"
[77] "Paralytic ileus and intestinal obstruction"
[78] "Pulmonary Arterial Hypertension"
[79] "Pulmonary Arterial Hypertension"
[80] "Pulmonary Arterial Hypertension"
[81] "Scabies"
[82] "Scabies"
[83] "Scabies"
[84] "Sudden infant death syndrome"
[85] "Total burden related to Non-alcoholic fatty liver disease (NAFLD)"
[86] "Total burden related to Non-alcoholic fatty liver disease (NAFLD)"
[87] "Total burden related to Non-alcoholic fatty liver disease (NAFLD)"
[88] "Upper digestive system diseases"
[89] "Upper digestive system diseases"
[90] "Upper digestive system diseases"
[91] "Urinary diseases and male infertility"
[92] "Urinary diseases and male infertility"
[93] "Urinary diseases and male infertility"
[94] "Vascular intestinal disorders"
[95] "Vascular intestinal disorders"
[96] "Vascular intestinal disorders"
[97] "Viral skin diseases"
[98] "Viral skin diseases"
[99] "Viral skin diseases"
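Most causes in the listing above appear up to three times, presumably once per measure in the GBD table. To see each unmatched cause once, the comparison can be run on unique lower-cased names. This is a sketch with invented toy vectors, not the real data; `gbd_causes` and `gwas_terms` are hypothetical stand-ins for `gbd_data$cause` and `diseases`.

```r
# Toy stand-ins for gbd_data$cause and the GWAS disease terms
gbd_causes <- c("Scabies", "Scabies", "Asthma", "Oral disorders", "Asthma")
gwas_terms <- c("asthma", "hypertension")

# Unique GBD causes with no matching GWAS term, sorted alphabetically
unmatched <- sort(setdiff(unique(tolower(gbd_causes)), gwas_terms))
unmatched
```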
gbd_data =
gbd_data |>
mutate(cause = tolower(cause))
gwas_disease_traits = data.frame(cause = diseases)
# gwas_study_info |>
# filter(DISEASE_STUDY == T) |>
# select(all_disease_terms, l1_all_disease_terms, cause = l2_all_disease_terms) |>
# distinct()
left_join(gwas_disease_traits,
gbd_data) |>
head()
Joining with `by = join_by(cause)`
Warning in left_join(gwas_disease_traits, gbd_data): Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 3 of `x` matches multiple rows in `y`.
ℹ Row 19 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
"many-to-many"` to silence this warning.
cause measure
1 idiopathic cardiomyopathy <NA>
2 cleft lip <NA>
3 tracheal bronchus and lung cancer DALYs (Disability-Adjusted Life Years)
4 tracheal bronchus and lung cancer Prevalence
5 tracheal bronchus and lung cancer Incidence
6 tracheal bronchus and lung cancer DALYs (Disability-Adjusted Life Years)
location sex age metric year val upper lower
1 <NA> <NA> <NA> <NA> NA NA NA NA
2 <NA> <NA> <NA> <NA> NA NA NA NA
3 Global Both All ages Rate 2019 580.36100 627.79984 532.74652
4 Global Both All ages Rate 2019 40.27440 43.51721 37.12978
5 Global Both All ages Rate 2019 28.16826 30.49575 25.77712
6 Global Both All ages Rate 2019 580.36100 627.79984 532.74652
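The many-to-many warning arises because `diseases` repeats a term for every study that reports it, while `gbd_data` holds several rows per cause. A quick way to confirm which keys are duplicated on both sides is sketched below in base R with invented toy data; the helper `dup_keys()` is hypothetical.

```r
# Toy tables: x repeats a term, y has several measures per cause
x <- data.frame(cause = c("cleft lip", "lung cancer", "lung cancer"))
y <- data.frame(cause   = c("lung cancer", "lung cancer", "asthma"),
                measure = c("Prevalence", "Incidence", "Prevalence"))

# Hypothetical helper: key values that occur more than once in a table
dup_keys <- function(df, key) unique(df[[key]][duplicated(df[[key]])])

# Keys duplicated on both sides are what make the join many-to-many
both <- intersect(dup_keys(x, "cause"), dup_keys(y, "cause"))
both

# De-duplicating the left side first gives a one-to-many join
joined <- merge(unique(x), y, by = "cause", all.x = TRUE)
nrow(joined)
```

De-duplicating the GWAS side before joining is the same idea as calling `distinct()` ahead of `left_join()`.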
gwas_study_info |> select(cause = l2_all_disease_terms) |>
distinct() |>
left_join(gbd_data) |>
head()
Joining with `by = join_by(cause)`
cause measure
<char> <char>
1: <NA>
2: idiopathic cardiomyopathy <NA>
3: cleft lip <NA>
4: tracheal bronchus and lung cancer DALYs (Disability-Adjusted Life Years)
5: tracheal bronchus and lung cancer Prevalence
6: tracheal bronchus and lung cancer Incidence
location sex age metric year val upper lower
<char> <char> <char> <char> <int> <num> <num> <num>
1: <NA> <NA> <NA> <NA> NA NA NA NA
2: <NA> <NA> <NA> <NA> NA NA NA NA
3: <NA> <NA> <NA> <NA> NA NA NA NA
4: Global Both All ages Rate 2019 580.36100 627.79984 532.74652
5: Global Both All ages Rate 2019 40.27440 43.51721 37.12978
6: Global Both All ages Rate 2019 28.16826 30.49575 25.77712
diseases <- stringr::str_split(pattern = ", ",
gwas_study_info$l2_all_disease_terms[gwas_study_info$l2_all_disease_terms != ""]) |>
unlist() |>
stringr::str_trim()
length(unique(diseases))
[1] 1617
# make frequency table
freq <- table(as.factor(diseases))
# sort in decreasing order
freq_sorted <- sort(freq, decreasing = TRUE)
# show top N, e.g. top 10
head(freq_sorted, 10)
kidney disease hypertension
10915 7096
type 2 diabetes mellitus other neoplasms
922 521
depressive disorders alzheimer's disease and other dementias
513 509
ischemic heart disease breast cancer
501 379
schizophrenia asthma
368 348
# fwrite() returns NULL invisibly, so its result must not be assigned back to gwas_study_info
data.table::fwrite(gwas_study_info,
here::here("output/gwas_cat/gwas_study_info_trait_group_l2.csv"))
sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 15.6.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] jsonlite_2.0.0 httr_1.4.7 stringr_1.5.1 ggplot2_3.5.2
[5] data.table_1.17.8 dplyr_1.1.4 workflowr_1.7.1
loaded via a namespace (and not attached):
[1] gtable_0.3.6 compiler_4.3.1 renv_1.0.3 promises_1.3.3
[5] tidyselect_1.2.1 Rcpp_1.1.0 git2r_0.36.2 callr_3.7.6
[9] later_1.4.2 jquerylib_0.1.4 scales_1.4.0 yaml_2.3.10
[13] fastmap_1.2.0 here_1.0.1 R6_2.6.1 generics_0.1.4
[17] curl_6.4.0 knitr_1.50 tibble_3.3.0 rprojroot_2.1.0
[21] RColorBrewer_1.1-3 bslib_0.9.0 pillar_1.11.0 rlang_1.1.6
[25] cachem_1.1.0 stringi_1.8.7 httpuv_1.6.16 xfun_0.52
[29] getPass_0.2-4 fs_1.6.6 sass_0.4.10 cli_3.6.5
[33] withr_3.0.2 magrittr_2.0.3 ps_1.9.1 grid_4.3.1
[37] digest_0.6.37 processx_3.8.6 rstudioapi_0.17.1 lifecycle_1.0.4
[41] vctrs_0.6.5 evaluate_1.0.4 glue_1.8.0 farver_2.1.2
[45] whisker_0.4.1 rmarkdown_2.29 tools_4.3.1 pkgconfig_2.0.3
[49] htmltools_0.5.8.1