Last updated: 2020-06-23


Knit directory: PSYMETAB/

This reproducible R Markdown analysis was created with workflowr (version 1.6.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20191126) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    ._docs
    Ignored:    .drake/
    Ignored:    analysis/.Rhistory
    Ignored:    analysis/._GWAS.Rmd
    Ignored:    analysis/._data_processing_in_genomestudio.Rmd
    Ignored:    analysis/._quality_control.Rmd
    Ignored:    analysis/GWAS/
    Ignored:    analysis/PRS/
    Ignored:    analysis/QC/
    Ignored:    analysis/figure/
    Ignored:    analysis_prep_1_clustermq.out
    Ignored:    analysis_prep_2_clustermq.out
    Ignored:    analysis_prep_3_clustermq.out
    Ignored:    analysis_prep_4_clustermq.out
    Ignored:    data/processed/
    Ignored:    data/raw/
    Ignored:    download_impute_1_clustermq.out
    Ignored:    init_analysis_1_clustermq.out
    Ignored:    init_analysis_2_clustermq.out
    Ignored:    init_analysis_3_clustermq.out
    Ignored:    init_analysis_4_clustermq.out
    Ignored:    init_analysis_5_clustermq.out
    Ignored:    init_analysis_6_clustermq.out
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/
    Ignored:    post_impute_1_clustermq.out
    Ignored:    pre_impute_qc_1_clustermq.out
    Ignored:    process_init_10_clustermq.out
    Ignored:    process_init_11_clustermq.out
    Ignored:    process_init_12_clustermq.out
    Ignored:    process_init_13_clustermq.out
    Ignored:    process_init_14_clustermq.out
    Ignored:    process_init_15_clustermq.out
    Ignored:    process_init_16_clustermq.out
    Ignored:    process_init_17_clustermq.out
    Ignored:    process_init_18_clustermq.out
    Ignored:    process_init_19_clustermq.out
    Ignored:    process_init_1_clustermq.out
    Ignored:    process_init_20_clustermq.out
    Ignored:    process_init_21_clustermq.out
    Ignored:    process_init_22_clustermq.out
    Ignored:    process_init_23_clustermq.out
    Ignored:    process_init_24_clustermq.out
    Ignored:    process_init_25_clustermq.out
    Ignored:    process_init_26_clustermq.out
    Ignored:    process_init_27_clustermq.out
    Ignored:    process_init_28_clustermq.out
    Ignored:    process_init_29_clustermq.out
    Ignored:    process_init_2_clustermq.out
    Ignored:    process_init_30_clustermq.out
    Ignored:    process_init_31_clustermq.out
    Ignored:    process_init_3_clustermq.out
    Ignored:    process_init_4_clustermq.out
    Ignored:    process_init_5_clustermq.out
    Ignored:    process_init_6_clustermq.out
    Ignored:    process_init_7_clustermq.out
    Ignored:    process_init_8_clustermq.out
    Ignored:    process_init_9_clustermq.out
    Ignored:    prs_1_clustermq.out
    Ignored:    prs_2_clustermq.out
    Ignored:    prs_3_clustermq.out
    Ignored:    prs_4_clustermq.out

Untracked files:
    Untracked:  analysis/genetic_quality_control.Rmd
    Untracked:  analysis/plans.Rmd
    Untracked:  analysis_prep.log
    Untracked:  download_impute.log
    Untracked:  grs.log
    Untracked:  init_analysis.log
    Untracked:  process_init.log
    Untracked:  prs.log

Unstaged changes:
    Modified:   analysis/GWAS.Rmd
    Modified:   analysis/data_sources.Rmd
    Deleted:    analysis/project.Rmd
    Modified:   analysis/quality_control.Rmd
    Modified:   cache_log.csv
    Modified:   post_impute.log
    Modified:   slurm_clustermq.tmpl

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File Version Author Date Message
Rmd 9c2cda4 Jenny Sjaarda 2020-06-23 wflow_publish(“analysis/pheno_quality_control.Rmd”)

The following document outlines and summarizes the phenotypic quality control and processing procedure that was followed to create a clean dataset.

  • Phenotypic data was extracted and provided by Celine (see Data sources).
  • In January 2020, Celine detected various problems in the phenotype data with no known explanation (possibly manual entry errors).
  • After discussion with Celine, Chin, and Enrique (manager of the database), it was decided to make a few changes to the way data is entered into the database to help avoid manual errors.
  • Summary of the agreed changes is below (email correspondence between Celine and Enrique on 13/02/2020, translated from French):

    A short summary of what we agreed:
    1. Modify the entry for a new month so that it is only possible to choose one of the proposed entries (0-1-2-3-6-…) and not to type in another number
    2. Add a field for the identity of the person entering or modifying an assessment and a field for the date on which the modification takes place (proposed location: below “control”?)
    3. Eliminate the option to carry over the data from a previous month and instead: empty all fields EXCEPT those entered in the “Antipsychotics” and “Co-Medication” windows
    4. Add a validation of the evaluation date when the data of an entered assessment are saved
  • After this update and manual revisions, new data was provided on 16/03/2020: data/raw/phenotype_data/PHENO_GWAS_160320_noaccent.csv

pheno_file <- "data/raw/phenotype_data/PHENO_GWAS_160320_noaccent.csv"

pheno_raw <- readr::read_delim(pheno_file, col_types = cols(.default = col_character()), delim = ",") %>% type_convert(col_types = cols())

process_pheno_raw <- function(pheno_raw) {

  # leeway_time (in days) and the helpers check_sex(), check_height(),
  # ever_drug(), rename_meds() and check_drug() are not defined in this document.
  output <- pheno_raw %>%
    mutate(Date = as.Date(Date, format = '%d.%m.%Y')) %>% arrange(Date) %>%
    ## merge retard/depot with original: keep only the first word of the drug name
    mutate(AP1 = gsub(" ", "_", AP1)) %>% mutate_at("AP1", as.factor) %>%
    mutate(AP1 = gsub("_.*$", "", AP1)) %>%
    mutate(AP1 = na_if(AP1, "")) %>%
    group_by(GEN) %>%
    mutate(sex = check_sex(Sexe)) %>%
    mutate_at("PatientsTaille", as.numeric) %>%
    ### take average of all heights
    mutate(height = check_height(PatientsTaille)) %>%
    ### create ever on any drug
    mutate_at(vars(Quetiapine:Doxepine), list(ever_drug = ever_drug)) %>%
    ungroup() %>%
    mutate(BMI = Poids / (PatientsTaille / 100) ^ 2) %>%
    group_by(GEN, PatientsRecNum) %>% mutate(drug_instance = row_number()) %>%
    ### days from the current visit back to the previous one (negative when dates increase)
    mutate(date_difference = as.numeric(difftime(lag(Date), Date, units = "days"))) %>%
    mutate(
      follow_up = case_when(
        Mois > 12 ~ "greater_12months",
        date_difference > 0 ~ "month_descrepency",
        Mois == 0 ~ "new_regimen",
        ### elapsed days within expected days (30 per month) +/- leeway_time
        abs(date_difference) >= (Mois - lag(Mois)) *
          30 - leeway_time &
          abs(date_difference) <= (Mois - lag(Mois)) * 30 + leeway_time ~ "sensible",
        ### these are follow-ups that didn't start at month 0
        is.na(date_difference) ~ "NA",
        TRUE ~ "leeway_exceeds"
      )
    ) %>%
    mutate(AP1_mod = paste0(AP1, "_round", rename_meds(follow_up))) %>%
    ### Mois goes backwards, or the date does not advance, relative to the previous row
    mutate(problems = case_when(Mois < lag(Mois) |
                                  date_difference >= 0 ~ "problem",
                                TRUE ~ "fine")) %>%
    mutate(drug_match = check_drug(PatientsRecNum, AP1))

  return(output)

}
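
The Date column in the output further down prints years such as 7 and 10. One possible explanation, offered here only as an assumption about the raw file, is that the dates carry two-digit years, which '%Y' reads literally. The sketch below uses made-up date strings (not values from the dataset) to show how '%Y' and '%y' differ in base R; `year_of()` is a hypothetical helper.

```r
# Hypothetical raw values in day.month.year form; not taken from the dataset.
raw_two_digit  <- "14.01.07"
raw_four_digit <- "14.01.2007"

# '%Y' expects a year with century, so "07" is read as the year 7.
d_Y <- as.Date(raw_two_digit, format = "%d.%m.%Y")

# '%y' expects a two-digit year; R maps 00-68 to 2000-2068.
d_y <- as.Date(raw_two_digit, format = "%d.%m.%y")

# A four-digit year parsed with '%Y' for comparison.
d_full <- as.Date(raw_four_digit, format = "%d.%m.%Y")

# Extract the calendar year without relying on platform-specific padding.
year_of <- function(d) as.POSIXlt(d)$year + 1900

year_of(d_Y)     # 7
year_of(d_y)     # 2007
d_y == d_full    # TRUE
```

If the raw file does use two-digit years, switching the format string to '%d.%m.%y' would give four-digit years in the printed tables; whether that applies here depends on the raw data, which is not shown.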

options("tidylog.display" = list())  # turn off

t <- process_pheno_raw(pheno_raw)

options("tidylog.display" = NULL)    # turn on

## missing date
missing_date <- t %>% filter(is.na(Date)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "missing_Date")
filter (grouped): removed 10,000 rows (>99%), 19 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## missing sex
na_sex <- t %>% filter(is.na(sex)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "sex_problem")
filter (grouped): removed 9,780 rows (98%), 239 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## missing AP1
missing_AP1 <- t %>% filter(is.na(AP1)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "missing_AP1")
filter (grouped): removed 9,987 rows (>99%), 32 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## missing PatientsRecNum (none)
missing_patrec <- t %>% filter(is.na(PatientsRecNum)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "missing_PatientsRecNum")
filter (grouped): removed all rows (100%)
mutate (grouped): new variable 'problem_category' with 0 unique values and 100% NA
## month mismatch
leeway_exceeds <- t %>% filter(follow_up == "leeway_exceeds") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "month_mismatch")
filter (grouped): removed 9,813 rows (98%), 206 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
head(leeway_exceeds)
# A tibble: 6 x 4
# Groups:   GEN, PatientsRecNum [6]
  GEN      Date       PatientsRecNum problem_category
  <chr>    <date>              <dbl> <chr>           
1 YSFHMSHX 7-07-18                56 month_mismatch  
2 QFIDUFIG 8-08-24               169 month_mismatch  
3 BSBZPQOV 8-09-03                28 month_mismatch  
4 QCXAFXPK 8-09-16               190 month_mismatch  
5 LHORDBHE 8-09-21                 3 month_mismatch  
6 SHHQLBQL 8-12-08               150 month_mismatch  
t %>% filter(GEN=="YSFHMSHX") %>% dplyr::select(follow_up, Date, date_difference, Mois)
filter (grouped): removed 10,003 rows (>99%), 16 rows remaining
Adding missing grouping variables: `GEN`, `PatientsRecNum`
# A tibble: 16 x 6
# Groups:   GEN, PatientsRecNum [3]
   GEN      PatientsRecNum follow_up        Date       date_difference  Mois
   <chr>             <dbl> <chr>            <date>               <dbl> <dbl>
 1 YSFHMSHX             56 new_regimen      7-01-14                 NA     0
 2 YSFHMSHX             56 leeway_exceeds   7-07-18               -185     3
 3 YSFHMSHX             56 sensible         7-10-28               -102     6
 4 YSFHMSHX             56 sensible         8-01-28                -92     9
 5 YSFHMSHX             56 sensible         8-04-27                -90    12
 6 YSFHMSHX             57 new_regimen      8-10-30                 NA     0
 7 YSFHMSHX             57 sensible         8-11-26                -27     1
 8 YSFHMSHX             57 sensible         8-12-28                -32     3
 9 YSFHMSHX             58 new_regimen      9-02-19                 NA     0
10 YSFHMSHX             58 sensible         9-03-18                -27     1
11 YSFHMSHX             58 sensible         9-04-19                -32     2
12 YSFHMSHX             58 sensible         9-05-25                -36     3
13 YSFHMSHX             58 sensible         9-08-31                -98     6
14 YSFHMSHX             58 sensible         9-12-07                -98     9
15 YSFHMSHX             58 sensible         10-03-15               -98    12
16 YSFHMSHX             58 greater_12months 17-02-22             -2536  1204
## month discrepency
problem_ids <- t %>% filter(problems == "problem") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "month_discrepency")
filter (grouped): removed 9,980 rows (>99%), 39 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
head(problem_ids)
# A tibble: 6 x 4
# Groups:   GEN, PatientsRecNum [6]
  GEN      Date       PatientsRecNum problem_category 
  <chr>    <date>              <dbl> <chr>            
1 JWWJQJGS 10-03-01             2762 month_discrepency
2 YTYDZYJH 10-03-14             6880 month_discrepency
3 CZAFDOTO 10-10-07              476 month_discrepency
4 PBAIFEMQ 13-06-11             1425 month_discrepency
5 UGCKMMCC 14-03-26             1663 month_discrepency
6 TIEQMSVB 15-02-04             2592 month_discrepency
t %>% filter(GEN=="JWWJQJGS") %>% dplyr::select(follow_up, Date, date_difference, Mois, problems)
filter (grouped): removed 10,010 rows (>99%), 9 rows remaining
Adding missing grouping variables: `GEN`, `PatientsRecNum`
# A tibble: 9 x 7
# Groups:   GEN, PatientsRecNum [2]
  GEN      PatientsRecNum follow_up    Date       date_difference  Mois problems
  <chr>             <dbl> <chr>        <date>               <dbl> <dbl> <chr>   
1 JWWJQJGS           2762 NA           10-01-24                NA    12 fine    
2 JWWJQJGS           2762 leeway_exce… 10-03-01               -36     2 problem 
3 JWWJQJGS            493 new_regimen  10-03-02                NA     0 fine    
4 JWWJQJGS            493 sensible     10-03-21               -19     1 fine    
5 JWWJQJGS            493 sensible     10-06-17               -88     3 fine    
6 JWWJQJGS           2762 sensible     10-06-17              -108     6 fine    
7 JWWJQJGS            493 sensible     10-09-21               -96     6 fine    
8 JWWJQJGS           2762 sensible     10-09-21               -96     9 fine    
9 JWWJQJGS            493 sensible     11-01-24              -125    12 fine    
# drug mismatch
problem_drugs <- t %>% filter(drug_match=="non-match") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "drug_mismatch")
filter (grouped): removed 9,979 rows (>99%), 40 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
flagged_rows <- rbind(missing_date, na_sex, missing_AP1, leeway_exceeds, problem_ids, problem_drugs)
write.table(flagged_rows, "data/raw/phenotype_data/PHENO_GWAS_160320_flagged_rows.txt", row.names = F,  col.names = T, quote = T)

table(flagged_rows$problem_category)

    drug_mismatch       missing_AP1      missing_Date month_discrepency 
               40                32                19                39 
   month_mismatch       sex_problem 
              206               239 
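
The table above counts flagged rows per category. Counting distinct individuals per category needs one more step, which the following base R sketch illustrates on a toy data frame. The IDs and the object names `flagged_toy` and `summary_toy` are made up for illustration and are not part of the analysis.

```r
# Toy stand-in for flagged_rows; the IDs below are invented.
flagged_toy <- data.frame(
  GEN = c("AAAA", "AAAA", "BBBB", "CCCC", "CCCC", "CCCC"),
  problem_category = c("sex_problem", "sex_problem", "sex_problem",
                       "missing_AP1", "missing_AP1", "month_mismatch"),
  stringsAsFactors = FALSE
)

# Rows per category (what table() reports).
n_rows <- table(flagged_toy$problem_category)

# Distinct individuals (GEN) per category.
n_individuals <- tapply(flagged_toy$GEN, flagged_toy$problem_category,
                        function(ids) length(unique(ids)))

# Combine both counts, aligned by category name.
summary_toy <- data.frame(
  problem_category = names(n_individuals),
  individuals = as.integer(n_individuals),
  rows = as.integer(n_rows[names(n_individuals)]),
  stringsAsFactors = FALSE
)
summary_toy
```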
  • Summary of problems identified:
    1. Missing date: empty date column, 19 individuals/19 rows, e.g. DMLWTARC.
    2. Sex problems: either sex is missing for all instances of an individual, or both sexes are listed for one individual, 47 individuals/239 rows, e.g. GYYEHMDR (both sexes are listed), IDDAXPMK (empty sex fields).
    3. Missing AP1: follow-up drug is missing, 28 individuals/32 rows, e.g. YSTTKYJE.
    4. Month discrepancy: when a participant’s data is sorted by date, the Mois value is lower than in the previous row for the same PatientsRecNum (e.g. month 3 is recorded on 1 January 2010, but month 0 is recorded on 1 March 2010), 38 individuals/39 rows, e.g. JWWJQJGS, where month 2 is recorded on 10-03-01, after month 12 on 10-01-24, under the same PatientsRecNum of 2762 (dates as printed above, year-month-day).
    5. Month mismatch: the number of days between two consecutive follow-ups differs by more than 90 days from the interval implied by the Mois column (difference in Mois times 30 days). These may not be as big of a problem, but I have still flagged them. For example, say a participant has month 0 on January 1 and month 3 on September 1. The number of days between those two dates is 244 (in a leap year), which is greater than 3 (the follow-up month according to the Mois column) times 30 plus 90 days, i.e. 3*30 + 90 = 180. Since 244 is greater than 180, the month 3 follow-up would be flagged. Note that I chose 90 days as an arbitrary cutoff. There are 162 individuals/206 rows in this category, e.g. YSFHMSHX at the follow-up printed as 7-07-18 above.
    6. Drug mismatch: two drugs listed for the same GEN and PatientsRecNum, 25 individuals; the number of rows is not given because it is unclear which entries are correct, e.g. BPOCXXYD has both aripiprazole and amisulpride listed for PatientsRecNum 18.
  • Sent data/raw/phenotype_data/PHENO_GWAS_160320_flagged_rows.txt to Celine for revision (along with help of Claire, Marianna and Nermine).
  • New data received on 16/04/2020: PHENO_GWAS_160420.xlsx (processed according to description at Data Sources page).
  • Procedure above was repeated to see if all problems were solved.
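
The month mismatch rule from the summary of problems (item 5) can be written as a small standalone check. This is a minimal sketch in base R: the function name `within_leeway()` is hypothetical, `leeway_time = 90` follows the cutoff stated in the text, and the calendar years in the examples are assumptions (a leap year for the January/September example, and 2007 for the YSFHMSHX rows printed as 7-01-14, 7-07-18 and 7-10-28).

```r
# TRUE when the elapsed days between two follow-ups lie within
# (difference in Mois) * 30 days, plus or minus leeway_time.
within_leeway <- function(date_prev, date_curr, mois_prev, mois_curr,
                          leeway_time = 90) {
  days_elapsed  <- abs(as.numeric(difftime(date_prev, date_curr, units = "days")))
  days_expected <- (mois_curr - mois_prev) * 30
  days_elapsed >= days_expected - leeway_time &
    days_elapsed <= days_expected + leeway_time
}

# Example from the text: month 0 on 1 January, month 3 on 1 September.
# In a leap year the gap is 244 days, and 244 > 3 * 30 + 90 = 180.
within_leeway(as.Date("2020-01-01"), as.Date("2020-09-01"), 0, 3)  # FALSE

# YSFHMSHX, month 0 to month 3: 185 days, which exceeds 180.
within_leeway(as.Date("2007-01-14"), as.Date("2007-07-18"), 0, 3)  # FALSE

# YSFHMSHX, month 3 to month 6: 102 days, inside 90 +/- 90.
within_leeway(as.Date("2007-07-18"), as.Date("2007-10-28"), 3, 6)  # TRUE
```

A FALSE result corresponds to the "leeway_exceeds" label in the pipeline, provided none of the earlier case_when() branches applies first.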
pheno_file2 <- "data/raw/phenotype_data/PHENO_GWAS_160420_noaccent.csv"
pheno_raw2 <- readr::read_delim(pheno_file2, col_types = cols(.default = col_character()), delim = ",") %>% type_convert(col_types = cols())

options("tidylog.display" = list())  # turn off

t <- process_pheno_raw(pheno_raw2)

options("tidylog.display" = NULL)    # turn on

## repeat above procedure to see if all problems are fixed

## missing date
missing_date <- t %>% filter(is.na(Date)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "missing_Date")
filter (grouped): removed 9,864 rows (>99%), 10 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## missing sex
na_sex <- t %>% filter(is.na(sex)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "sex_problem")
filter (grouped): removed 9,872 rows (>99%), 2 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## missing AP1
missing_AP1 <- t %>% filter(is.na(AP1)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "missing_AP1")
filter (grouped): removed 9,867 rows (>99%), 7 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## missing PatientsRecNum (none)
missing_patrec <- t %>% filter(is.na(PatientsRecNum)) %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "missing_PatientsRecNum")
filter (grouped): removed all rows (100%)
mutate (grouped): new variable 'problem_category' with 0 unique values and 100% NA
## month mismatch
leeway_exceeds <- t %>% filter(follow_up == "leeway_exceeds") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "month_mismatch")
filter (grouped): removed 9,687 rows (98%), 187 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
## month discrepency
problem_ids <- t %>% filter(problems == "problem") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "month_discrepency")
filter (grouped): removed 9,851 rows (>99%), 23 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
# drug mismatch
problem_drugs <- t %>% filter(drug_match=="non-match") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "drug_mismatch")
filter (grouped): removed 9,857 rows (>99%), 17 rows remaining
mutate (grouped): new variable 'problem_category' with one unique value and 0% NA
flagged_rows2 <- rbind(missing_date, na_sex, missing_AP1, leeway_exceeds, problem_ids, problem_drugs)

table(flagged_rows2$problem_category)

    drug_mismatch       missing_AP1      missing_Date month_discrepency 
               17                 7                10                23 
   month_mismatch       sex_problem 
              187                 2 
t %>% filter(drug_match=="non-match") %>% dplyr::select(AP1, drug_match) %>% unique %>% arrange(GEN)
filter (grouped): removed 9,857 rows (>99%), 17 rows remaining
Adding missing grouping variables: `GEN`, `PatientsRecNum`
# A tibble: 8 x 4
# Groups:   GEN, PatientsRecNum [4]
  GEN      PatientsRecNum AP1          drug_match
  <chr>             <dbl> <chr>        <chr>     
1 BPOCXXYD             18 Aripiprazole non-match 
2 BPOCXXYD             18 Amisulpride  non-match 
3 JTEXKBJN           5374 Mirtazapine  non-match 
4 JTEXKBJN           5374 Lithium      non-match 
5 KOPRATFS           3418 Mirtazapine  non-match 
6 KOPRATFS           3418 Quetiapine   non-match 
7 VGWWZXDK            311 Amisulpride  non-match 
8 VGWWZXDK            311 Risperidone  non-match 
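
The check_drug() helper is not shown in this document. As a rough illustration of what a drug mismatch means here (more than one distinct AP1 under the same GEN and PatientsRecNum), the following base R sketch flags such rows on a toy data frame. It is a stand-in written under that assumption, not the project’s implementation, and the IDs are invented.

```r
# Toy visits; IDs are made up, drug names are only examples.
visits <- data.frame(
  GEN = c("AAAA", "AAAA", "AAAA", "BBBB", "BBBB"),
  PatientsRecNum = c(1, 1, 2, 7, 7),
  AP1 = c("Aripiprazole", "Amisulpride", "Quetiapine",
          "Risperidone", "Risperidone"),
  stringsAsFactors = FALSE
)

# One key per GEN/PatientsRecNum combination.
key <- paste(visits$GEN, visits$PatientsRecNum, sep = "_")

# Number of distinct non-missing drugs recorded under each key.
n_drugs <- tapply(visits$AP1, key,
                  function(x) length(unique(x[!is.na(x)])))

# Map the per-key count back to every row and label it.
visits$drug_match <- ifelse(as.vector(n_drugs[key]) > 1, "non-match", "match")
visits
```

In this toy example only the two rows of AAAA under PatientsRecNum 1 are labelled "non-match".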
  • A few problems still remained:
    1. Missing date: visits with missing date should be removed.
    2. Sex problems: participants with missing sex information or both sexes listed under different visits should be removed.
    3. Missing AP1: visits with missing AP1 information should be removed.
    4. Month discrepancy: visits that occur after a previous visit according to the Date column, but before it according to the Mois column, have been checked, and the Mois column should be ignored. Celine’s explanation for why these were not all corrected: “…we can not correct the month entry without deleting and reentering all data from one visit… thus we only checked that the dates were correct! It means that the mistakes you still have with month mismatchs, you should use dates without considering the mois column.”
    5. Month mismatch: similar to (4), Mois column should be ignored.
    6. Drug mismatch: 4 participants were identified with multiple drugs listed for the same PatientsRecNum; these were sent to Celine and corrected in the subsequent data extraction (see table above for the list of participants corrected).
  • New data received on 16/04/2020: PHENO_GWAS_160420_corr.xlsx (processed according to description at Data Sources page).
  • Procedure above was repeated to confirm that Drug mis-match issues were solved (no other changes were made to the database).
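
The decisions listed under the remaining problems (items 1 to 5) could be applied along the following lines. This is a minimal sketch in base R on a toy data frame, not the cleaning code used in the project. It assumes, as the is.na(sex) filter above suggests, that sex is NA on every row of a participant with a sex problem; the column `days_since_start` and all IDs are hypothetical.

```r
# Toy visits; IDs, dates and drugs are invented for illustration.
visits <- data.frame(
  GEN = c("AAAA", "AAAA", "AAAA", "BBBB", "BBBB", "CCCC", "CCCC"),
  PatientsRecNum = c(1, 1, 1, 5, 5, 9, 9),
  Date = as.Date(c("2010-01-24", "2010-03-01", NA, "2011-05-02",
                   "2011-06-01", "2012-02-10", "2012-03-12")),
  Mois = c(12, 2, 3, 0, 1, 0, 1),
  sex = c("F", "F", "F", NA, NA, "M", "M"),
  AP1 = c("Quetiapine", "Quetiapine", "Quetiapine", "Lithium",
          "Lithium", NA, "Risperidone"),
  stringsAsFactors = FALSE
)

# Items 1-3: drop visits with missing Date or AP1, and rows with unresolved sex.
keep  <- !is.na(visits$Date) & !is.na(visits$AP1) & !is.na(visits$sex)
clean <- visits[keep, ]

# Items 4-5: order by Date and derive follow-up time from the dates,
# ignoring the Mois column.
clean <- clean[order(clean$GEN, clean$PatientsRecNum, clean$Date), ]
key <- paste(clean$GEN, clean$PatientsRecNum, sep = "_")
first_visit <- ave(as.numeric(clean$Date), key, FUN = min)
clean$days_since_start <- as.numeric(clean$Date) - first_visit
clean
```

Here three rows survive: two for AAAA (0 and 36 days after the first retained visit, regardless of the Mois values 12 and 2) and one for CCCC.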
pheno_file3 <- "data/raw/phenotype_data/PHENO_GWAS_160420_corr_noaccent.csv"
pheno_raw3 <- readr::read_delim(pheno_file3, col_types = cols(.default = col_character()), delim = ",") %>% type_convert(col_types = cols())

options("tidylog.display" = list())  # turn off

t <- process_pheno_raw(pheno_raw3)

options("tidylog.display" = NULL)    # turn on

# drug mismatch
problem_drugs <- t %>% filter(drug_match=="non-match") %>% dplyr::select(GEN, Date, PatientsRecNum) %>%
  mutate(problem_category = "drug_mismatch")
filter (grouped): removed all rows (100%)
mutate (grouped): new variable 'problem_category' with 0 unique values and 100% NA
t %>% filter(drug_match=="non-match") %>% dplyr::select(AP1, drug_match) %>% unique %>% arrange(GEN)
filter (grouped): removed all rows (100%)
Adding missing grouping variables: `GEN`, `PatientsRecNum`
# A tibble: 0 x 4
# Groups:   GEN, PatientsRecNum [0]
# … with 4 variables: GEN <chr>, PatientsRecNum <dbl>, AP1 <chr>,
#   drug_match <chr>
# empty table 

sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS: /data/sgg2/jenny/bin/R-3.5.3/lib64/R/lib/libRblas.so
LAPACK: /data/sgg2/jenny/bin/R-3.5.3/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tidylog_1.0.1           OpenImageR_1.1.6        fuzzyjoin_0.1.5        
 [4] kableExtra_1.1.0        R.utils_2.9.2           R.oo_1.23.0            
 [7] R.methodsS3_1.7.1       TwoSampleMR_0.4.25      reader_1.0.6           
[10] NCmisc_1.1.6            optparse_1.6.4          readxl_1.3.1           
[13] ggthemes_4.2.0          tryCatchLog_1.1.6       futile.logger_1.4.3    
[16] DataExplorer_0.8.0      taRifx_1.0.6.1          qqman_0.1.4            
[19] MASS_7.3-51.5           bit64_0.9-7             bit_1.1-14             
[22] rslurm_0.5.0            rmeta_3.0               devtools_2.2.1         
[25] usethis_1.5.1           data.table_1.12.8       clustermq_0.8.8.1      
[28] future.batchtools_0.8.1 future_1.15.1           rlang_0.4.5            
[31] knitr_1.26              drake_7.12.0.9000       forcats_0.4.0          
[34] stringr_1.4.0           dplyr_0.8.3             purrr_0.3.3            
[37] readr_1.3.1             tidyr_1.0.3             tibble_2.1.3           
[40] ggplot2_3.2.1           tidyverse_1.3.0         pacman_0.5.1           
[43] processx_3.4.1          workflowr_1.6.0        

loaded via a namespace (and not attached):
  [1] backports_1.1.6      plyr_1.8.5           igraph_1.2.5        
  [4] lazyeval_0.2.2       storr_1.2.1          listenv_0.8.0       
  [7] digest_0.6.25        htmltools_0.4.0      tiff_0.1-5          
 [10] fansi_0.4.1          magrittr_1.5         checkmate_1.9.4     
 [13] memoise_1.1.0        base64url_1.4        remotes_2.1.0       
 [16] globals_0.12.5       modelr_0.1.5         prettyunits_1.1.0   
 [19] jpeg_0.1-8.1         colorspace_1.4-1     rvest_0.3.5         
 [22] rappdirs_0.3.1       haven_2.2.0          xfun_0.11           
 [25] callr_3.4.0          crayon_1.3.4         jsonlite_1.6        
 [28] brew_1.0-6           glue_1.4.0           gtable_0.3.0        
 [31] webshot_0.5.2        pkgbuild_1.0.6       scales_1.1.0        
 [34] futile.options_1.0.1 DBI_1.1.0            Rcpp_1.0.3          
 [37] xtable_1.8-4         viridisLite_0.3.0    progress_1.2.2      
 [40] txtq_0.2.0           clisymbols_1.2.0     htmlwidgets_1.5.1   
 [43] httr_1.4.1           getopt_1.20.3        calibrate_1.7.5     
 [46] ellipsis_0.3.0       pkgconfig_2.0.3      dbplyr_1.4.2        
 [49] utf8_1.1.4           tidyselect_0.2.5     reshape2_1.4.3      
 [52] later_1.0.0          munsell_0.5.0        cellranger_1.1.0    
 [55] tools_3.5.3          cli_2.0.1            generics_0.0.2      
 [58] broom_0.5.3          fastmap_1.0.1        evaluate_0.14       
 [61] yaml_2.2.0           fs_1.3.1             packrat_0.5.0       
 [64] nlme_3.1-143         mime_0.8             whisker_0.4         
 [67] formatR_1.7          proftools_0.99-2     xml2_1.2.2          
 [70] compiler_3.5.3       rstudioapi_0.10      png_0.1-7           
 [73] filelock_1.0.2       testthat_2.3.1       reprex_0.3.0        
 [76] stringi_1.4.5        ps_1.3.0             desc_1.2.0          
 [79] lattice_0.20-38      vctrs_0.2.4          pillar_1.4.3        
 [82] lifecycle_0.1.0      networkD3_0.4        httpuv_1.5.2        
 [85] R6_2.4.1             promises_1.1.0       gridExtra_2.3       
 [88] sessioninfo_1.1.1    codetools_0.2-16     lambda.r_1.2.4      
 [91] assertthat_0.2.1     pkgload_1.0.2        rprojroot_1.3-2     
 [94] withr_2.1.2          batchtools_0.9.12    parallel_3.5.3      
 [97] hms_0.5.3            grid_3.5.3           rmarkdown_1.18      
[100] git2r_0.26.1         shiny_1.4.0          lubridate_1.7.4