Last updated: 2023-06-16

Checks: 7 0

Knit directory: Cardiotoxicity/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20230109) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version fd4fe27. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    data/41588_2018_171_MOESM3_ESMeQTL_ST2_for paper.csv
    Ignored:    data/Arr_GWAS.txt
    Ignored:    data/Arr_geneset.RDS
    Ignored:    data/BC_cell_lines.csv
    Ignored:    data/CADGWASgene_table.csv
    Ignored:    data/CAD_geneset.RDS
    Ignored:    data/Clamp_Summary.csv
    Ignored:    data/Cormotif_24_k1-5_raw.RDS
    Ignored:    data/DAgostres24.RDS
    Ignored:    data/DAtable1.csv
    Ignored:    data/DDEMresp_list.csv
    Ignored:    data/DDE_reQTL.txt
    Ignored:    data/DDEresp_list.csv
    Ignored:    data/DEG-GO/
    Ignored:    data/DEG_cormotif.RDS
    Ignored:    data/DF_Plate_Peak.csv
    Ignored:    data/DRC48hoursdata.csv
    Ignored:    data/Da24counts.txt
    Ignored:    data/Dx24counts.txt
    Ignored:    data/Dx_reQTL_specific.txt
    Ignored:    data/Ep24counts.txt
    Ignored:    data/GOIsig.csv
    Ignored:    data/GOplots.R
    Ignored:    data/GTEX_setsimple.csv
    Ignored:    data/GTEx_gene_list.csv
    Ignored:    data/HFGWASgene_table.csv
    Ignored:    data/HF_geneset.RDS
    Ignored:    data/Heart_Left_Ventricle.v8.egenes.txt
    Ignored:    data/Hf_GWAS.txt
    Ignored:    data/K_cluster
    Ignored:    data/K_cluster_kisthree.csv
    Ignored:    data/K_cluster_kistwo.csv
    Ignored:    data/LDH48hoursdata.csv
    Ignored:    data/Mt24counts.txt
    Ignored:    data/RINsamplelist.txt
    Ignored:    data/Seonane2019supp1.txt
    Ignored:    data/TOP2Bi-24hoursGO_analysis.csv
    Ignored:    data/TR24counts.txt
    Ignored:    data/Top2biresp_cluster24h.csv
    Ignored:    data/Viabilitylistfull.csv
    Ignored:    data/allexpressedgenes.txt
    Ignored:    data/allgenes.txt
    Ignored:    data/allmatrix.RDS
    Ignored:    data/avgLD50.RDS
    Ignored:    data/backGL.txt
    Ignored:    data/cormotif_3hk1-8.RDS
    Ignored:    data/cormotif_initalK5.RDS
    Ignored:    data/cormotif_initialK5.RDS
    Ignored:    data/cormotif_initialall.RDS
    Ignored:    data/counts24hours.RDS
    Ignored:    data/cpmcount.RDS
    Ignored:    data/cpmnorm_counts.csv
    Ignored:    data/crispr_genes.csv
    Ignored:    data/cvd_GWAS.txt
    Ignored:    data/dat_cpm.RDS
    Ignored:    data/data_outline.txt
    Ignored:    data/efit2.RDS
    Ignored:    data/efit2results.RDS
    Ignored:    data/ensembl_backup.RDS
    Ignored:    data/ensgtotal.txt
    Ignored:    data/filenameonly.txt
    Ignored:    data/filtered_cpm_counts.csv
    Ignored:    data/filtered_raw_counts.csv
    Ignored:    data/filtermatrix_x.RDS
    Ignored:    data/folder_05top/
    Ignored:    data/geneDoxonlyQTL.csv
    Ignored:    data/gene_corr_frame.RDS
    Ignored:    data/gene_prob_tran3h.RDS
    Ignored:    data/gene_probabilityk5.RDS
    Ignored:    data/gostresTop2bi_ER.RDS
    Ignored:    data/gostresTop2bi_LR
    Ignored:    data/gostresTop2bi_LR.RDS
    Ignored:    data/gostresTop2bi_TI.RDS
    Ignored:    data/gostrescoNR
    Ignored:    data/gtex/
    Ignored:    data/heartgenes.csv
    Ignored:    data/individualDRCfile.RDS
    Ignored:    data/individual_DRC48.RDS
    Ignored:    data/individual_LDH48.RDS
    Ignored:    data/knowfig4.csv
    Ignored:    data/knowfig5.csv
    Ignored:    data/mymatrix.RDS
    Ignored:    data/nonresponse_cluster24h.csv
    Ignored:    data/norm_LDH.csv
    Ignored:    data/norm_counts.csv
    Ignored:    data/old_sets/
    Ignored:    data/plan2plot.png
    Ignored:    data/raw_counts.csv
    Ignored:    data/response_cluster24h.csv
    Ignored:    data/sigVDA24.txt
    Ignored:    data/sigVDA3.txt
    Ignored:    data/sigVDX24.txt
    Ignored:    data/sigVDX3.txt
    Ignored:    data/sigVEP24.txt
    Ignored:    data/sigVEP3.txt
    Ignored:    data/sigVMT24.txt
    Ignored:    data/sigVMT3.txt
    Ignored:    data/sigVTR24.txt
    Ignored:    data/sigVTR3.txt
    Ignored:    data/siglist.RDS
    Ignored:    data/table3a.omar
    Ignored:    data/toplistall.RDS
    Ignored:    data/tvl24hour.txt
    Ignored:    data/tvl24hourw.txt
    Ignored:    data/venn_code.R

Untracked files:
    Untracked:  .RDataTmp
    Untracked:  .RDataTmp1
    Untracked:  .RDataTmp2
    Untracked:  OmicNavigator_learn.R
    Untracked:  analysis/GOI_plots.Rmd
    Untracked:  analysis/Seoane_chrom.Rmd
    Untracked:  code/cpm_boxplot.R
    Untracked:  cormotif_probability_genelist.csv
    Untracked:  individual-legenddark2.png
    Untracked:  installed_old.rda
    Untracked:  motif_ER.txt
    Untracked:  motif_LR.txt
    Untracked:  motif_NR.txt
    Untracked:  motif_TI.txt
    Untracked:  output/output-old/
    Untracked:  reneebasecode.R

Unstaged changes:
    Modified:   analysis/DEG-GO_analysis.Rmd
    Modified:   analysis/index.Rmd
    Modified:   analysis/other_analysis.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Knowles2019.Rmd) and HTML (docs/Knowles2019.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd fd4fe27 reneeisnowhere 2023-06-16 fix spelling
html d6ad3fe reneeisnowhere 2023-06-16 Build site.
Rmd 4191e24 reneeisnowhere 2023-06-16 adding lcpm plots for dox spec
html bd0e45f reneeisnowhere 2023-06-15 Build site.
Rmd af93421 reneeisnowhere 2023-06-15 adding in seperate files
Rmd 1a1a034 reneeisnowhere 2023-06-15 updates galore
Rmd f8f511a reneeisnowhere 2023-06-15 updates and simplifications of code
Rmd 7fc7ec7 reneeisnowhere 2023-06-14 updating code

package loading

Knowles data

The code below is how I wrangled the knowles supplemental lists into entrezgene_ids to overlap with my expressed gene lists

Enrichment of Pairwise

backGL <- read.csv("data/backGL.txt",     row.names =1)
drug_palNoVeh <- c("#8B006D","#DF707E","#F1B72B", "#3386DD","#707031")
drug_palc <- c("#8B006D","#DF707E","#F1B72B", "#3386DD","#707031","#41B333")
time <- rep((rep(c("3h", "24h"), c(6,6))), 6)
time <- ordered(time, levels =c("3h", "24h"))
drug <- rep(c("Daunorubicin","Doxorubicin","Epirubicin","Mitoxantrone","Trastuzumab", "Vehicle"),12)
mat_drug <- c("Daunorubicin","Doxorubicin","Epirubicin","Mitoxantrone")
indv <- as.factor(rep(c(1,2,3,4,5,6), c(12,12,12,12,12,12)))
labeltop <- (interaction(substring(drug, 0, 2), indv, time))
level_order2 <- c('75','87','77','79','78','71')


toplistall <- readRDS("data/toplistall.RDS")
knowles4 <-readRDS("output/knowles4.RDS")
knowles5 <-readRDS("output/knowles5.RDS")
DOXmeSNPs <- readRDS("output/DOXmeSNPs.RDS")
DOXreQTLs <- readRDS("output/DOXreQTLs.RDS")
DNRmeSNPs <- readRDS("output/DNRmeSNPs.RDS")
DNRreQTLs <- readRDS("output/DNRreQTLs.RDS")
EPImeSNPs <- readRDS("output/EPImeSNPs.RDS")
EPIreQTLs <- readRDS("output/EPIreQTLs.RDS")
MTXmeSNPs <- readRDS("output/MTXmeSNPs.RDS")
MTXreQTLs <- readRDS("output/MTXreQTLs.RDS")

toplist24hr <-   toplistall %>%
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  filter(time=="24_hours")

toplist3hr <-   toplistall %>%
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  filter(time=="3_hours")

Determining the genetic basis of anthracycline-cardiotoxicity by molecular response QTL mapping in induced cardiomyocytes
David A Knowles, Courtney K Burrows†, John D Blischak, Kristen M Patterson, Daniel J Serie, Nadine Norton, Carole Ober, Jonathan K Pritchard, Yoav Gilad

Knowles \(~~et~ al.~\) eLife 2018;7:e33480. DOI: https://doi.org/10.7554/eLife.33480

My first question was about transcriptional response at the 24 hour mark with my treatments. 3 hour RNA-seq had low numbers of DEGs,so my initial focus is at 24 hours. This time also happens to be when the Knowles paper collected their RNA-seq data.

Supplementary 4 contains a list of 518 SNPs within 1 Mb of TSS, which had a detectable marginal effect on expression (5% FDR). When converted from ensembl gene id to entrez gene id, my list of genes for this supplement = 521. I will call these meSNPS for marginal effect QTLs.

Supplementary 5 contains a list of 376 response eQTLs (reQTLs). These are variants that were associated with modulation of transcriptomic response to doxorubicin treatment. After database name conversion, I have 377 genes.

meSNPdf <- toplistall %>% 
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  mutate(sigcount = if_else(adj.P.Val < 0.05,'sig','notsig'))%>%
  mutate(meSNP = if_else( ENTREZID %in% knowles4$entrezgene_id, "y" , "no")) %>% 
  dplyr::select(ENTREZID,id,time,meSNP,sigcount) %>%
  group_by(id,time,meSNP,sigcount) %>% 
  tally() %>% 
  pivot_wider(id_cols = c(id,time,sigcount), names_from = meSNP,names_glue = "meSNP_{meSNP}", values_from = n)

meSNPdf %>% 
  kable(., caption= "Number of genes from minimally expressed SNPs found in DEGs (sig and nonsig) broken down by drug,time, and significance") %>% 
  kable_paper("striped", full_width = FALSE) %>% 
  kable_styling(full_width = FALSE, position = "left",bootstrap_options = c("striped"),font_size = 18) %>% 
  scroll_box(width = "80%", height = "400px")
Number of genes from minimally expressed SNPs found in DEGs (sig and nonsig) broken down by drug,time, and significance
id time sigcount meSNP_no meSNP_y
Daunorubicin 3_hours notsig 13032 497
Daunorubicin 3_hours sig 549 6
Daunorubicin 24_hours notsig 6916 304
Daunorubicin 24_hours sig 6665 199
Doxorubicin 3_hours notsig 13565 503
Doxorubicin 3_hours sig 16 NA
Doxorubicin 24_hours notsig 7249 319
Doxorubicin 24_hours sig 6332 184
Epirubicin 3_hours notsig 13365 499
Epirubicin 3_hours sig 216 4
Epirubicin 24_hours notsig 7551 331
Epirubicin 24_hours sig 6030 172
Mitoxantrone 3_hours notsig 13524 502
Mitoxantrone 3_hours sig 57 1
Mitoxantrone 24_hours notsig 12284 473
Mitoxantrone 24_hours sig 1297 30
Trastuzumab 3_hours notsig 13581 503
Trastuzumab 24_hours notsig 13581 503
reQTLdf <- toplistall %>% 
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  mutate(sigcount = if_else(adj.P.Val < 0.05,'sig','notsig'))%>%
  mutate(reQTL = if_else( ENTREZID %in% knowles5$entrezgene_id, "y" , "no")) %>% 
  dplyr::select(ENTREZID,id,time,reQTL,sigcount) %>%
  group_by(id,time,reQTL,sigcount) %>% 
  tally() %>% 
  pivot_wider(id_cols = c(id,time,sigcount), names_from = reQTL,names_glue = "reQTL_{reQTL}", values_from = n)

reQTLdf %>% 
  kable(., caption= "Number of genes from response eQTLS found in DEGs (sig and nonsig) broken down by drug,time, and significance") %>% 
  kable_paper("striped", full_width = FALSE) %>% 
  kable_styling(full_width = FALSE, position = "left",bootstrap_options = c("striped"),font_size = 18) %>% 
  scroll_box(width = "80%", height = "400px")
Number of genes from response eQTLS found in DEGs (sig and nonsig) broken down by drug,time, and significance
id time sigcount reQTL_no reQTL_y
Daunorubicin 3_hours notsig 13168 361
Daunorubicin 3_hours sig 542 13
Daunorubicin 24_hours notsig 7033 187
Daunorubicin 24_hours sig 6677 187
Doxorubicin 3_hours notsig 13694 374
Doxorubicin 3_hours sig 16 NA
Doxorubicin 24_hours notsig 7374 194
Doxorubicin 24_hours sig 6336 180
Epirubicin 3_hours notsig 13494 370
Epirubicin 3_hours sig 216 4
Epirubicin 24_hours notsig 7684 198
Epirubicin 24_hours sig 6026 176
Mitoxantrone 3_hours notsig 13652 374
Mitoxantrone 3_hours sig 58 NA
Mitoxantrone 24_hours notsig 12423 334
Mitoxantrone 24_hours sig 1287 40
Trastuzumab 3_hours notsig 13710 374
Trastuzumab 24_hours notsig 13710 374
dataframK_45 <- toplistall %>%
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  mutate(K4 = if_else(ENTREZID %in% knowles4$entrezgene_id,'y','no'))%>%
  mutate(K5 = if_else(ENTREZID %in% knowles5$entrezgene_id,'y','no'))%>%
  filter(adj.P.Val<0.05) %>%
   group_by(time, id) %>%
 dplyr::summarize(n=n(), meSNP=sum(if_else(K4=="y",1,0)), reQTL=sum(if_else(K5=="y",1,0))) %>% 
  as.data.frame()


dataframK_45 %>% 
kable(., caption= "Summary of meSNPs and reQTLs found in my DEGs") %>% 
  kable_paper("striped", full_width = FALSE) %>% 
  kable_styling(font_size = 18) %>% 
  scroll_box(width = "80%", height = "400px")
Summary of meSNPs and reQTLs found in my DEGs
time id n meSNP reQTL
3_hours Daunorubicin 555 6 13
3_hours Doxorubicin 16 0 0
3_hours Epirubicin 220 4 4
3_hours Mitoxantrone 58 1 0
24_hours Daunorubicin 6864 199 187
24_hours Doxorubicin 6516 184 180
24_hours Epirubicin 6202 172 176
24_hours Mitoxantrone 1327 30 40

Visualization of proportions

toplistall %>%
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  mutate(sigcount = if_else(adj.P.Val < 0.05,'sig','notsig'))%>%
  mutate(meSNP=if_else(ENTREZID %in%knowles4$entrezgene_id,"y","no")) %>% 
  mutate(reQTL=if_else(ENTREZID %in%knowles5$entrezgene_id,"a","b")) %>% 
  filter(time =="24_hours") %>% 
  group_by(id,sigcount,meSNP) %>% 
  summarize(K4count=n())%>% 
    pivot_wider(id_cols = c(id,sigcount), names_from=c(meSNP), values_from=K4count) %>% 
   mutate(SNPprop=(y/(y+no)*100)) %>% 
       ggplot(., aes(x=id, y=SNPprop)) +
       geom_col()+
       geom_text(aes(x=id, label = sprintf("%.2f",SNPprop), vjust=-.2))+
       #geom_text(aes(label = expression(paste0("number"~a,"out of",~b))))+
       facet_wrap(~sigcount)+
       ggtitle("non-significant and significant enrichment proporitions of Knowles meSNPs ")

toplistall %>%
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  mutate(sigcount = if_else(adj.P.Val < 0.05,'sig','notsig'))%>%
  mutate(meSNP=if_else(ENTREZID %in%knowles4$entrezgene_id,"y","no")) %>% 
  mutate(reQTL=if_else(ENTREZID %in%knowles5$entrezgene_id,"a","b")) %>% 
  filter(time =="24_hours") %>% 
  group_by(id,sigcount,reQTL) %>% 
  summarize(K5count=n())%>% 
    pivot_wider(id_cols = c(id,sigcount), names_from=c(reQTL), values_from=K5count) %>% 
   mutate(QTLprop=(a/(a+b)*100)) %>% 
       ggplot(., aes(x=id, y=QTLprop)) +
       geom_col()+
       geom_text(aes(x=id, label = sprintf("%.2f",QTLprop), vjust=-.2))+
       #geom_text(aes(label = expression(paste0("number"~a,"out of",~b))))+
       facet_wrap(~sigcount)+
       ggtitle("non-significant and significant enrichment proporitions of Knowles reQTLs ")

chi_squarek4 <- toplistall %>% 
  mutate(id = as.factor(id)) %>%
  dplyr::filter(id!="Trastuzumab") %>% 
  mutate(sigcount = if_else(adj.P.Val <0.05,'sig','notsig'))%>%
  mutate(meSNP=if_else(ENTREZID %in%knowles4$entrezgene_id,"y","no")) %>% 
  group_by(id,time) %>% 
  summarise(pvalue= chisq.test(meSNP, sigcount)$p.value) 

chi_squarek5 <-   toplistall %>% 
  mutate(id = as.factor(id)) %>%
  dplyr::filter(id!="Trastuzumab") %>% 
  mutate(sigcount = if_else(adj.P.Val <0.05,'sig','notsig'))%>%
  mutate(reQTL=if_else(ENTREZID %in%knowles5$entrezgene_id,"a","b")) %>% 
  group_by(id,time) %>% 
  summarise(pvalue= chisq.test(reQTL, sigcount)$p.value) 

chi_squarek4 %>% 
  kable(., caption= "meSNPs (mininmally expressed QTLS) chi-squared pvalues") %>% 
  kable_paper("striped", full_width = FALSE) %>%  
  kable_styling(full_width = FALSE, position = "right",bootstrap_options = c("striped"),font_size = 18) %>%   scroll_box(width = "50%", height = "400px")
meSNPs (mininmally expressed QTLS) chi-squared pvalues
id time pvalue
Daunorubicin 24_hours 0.0000338
Daunorubicin 3_hours 0.0018777
Doxorubicin 24_hours 0.0000113
Doxorubicin 3_hours 0.9232984
Epirubicin 24_hours 0.0000074
Epirubicin 3_hours 0.2189641
Mitoxantrone 24_hours 0.0086490
Mitoxantrone 3_hours 0.6853643
chi_squarek5 %>% 
  kable(., caption= "reQTLS (response eQTLS) chi-squared pvalues") %>% 
  kable_paper("striped", full_width = FALSE) %>%  
  kable_styling(full_width = FALSE, position = "right",bootstrap_options = c("striped"),font_size = 18) %>%   scroll_box(width = "50%", height = "400px")
reQTLS (response eQTLS) chi-squared pvalues
id time pvalue
Daunorubicin 24_hours 0.6576304
Daunorubicin 3_hours 0.7387684
Doxorubicin 24_hours 0.4965954
Doxorubicin 3_hours 1.0000000
Epirubicin 24_hours 0.2539396
Epirubicin 3_hours 0.5705567
Mitoxantrone 24_hours 0.4445513
Mitoxantrone 3_hours 0.3946217

We can “see” the proportions of SNPs are not enriched significantly in the DE genes compared to the non-DE genes (chi square test).
We do not see significant enrichment of reQTLS in our DE gene sets over the non-DE gene sets.

So what about enrichment of meSNPs in DE compared to reQTLs in significantly DE gene sets?

Knowles_count <- 
  toplistall %>%
  filter(id!='Trastuzumab') %>% 
  mutate(id = as.factor(id)) %>%
  mutate(time=factor(time, levels=c("3_hours","24_hours"))) %>%
  group_by(time, id) %>%
  mutate(K4 = if_else(ENTREZID %in% knowles4$entrezgene_id,1,0))%>%
  mutate(K5 = if_else(ENTREZID %in% knowles5$entrezgene_id,1,0))%>%
  filter(adj.P.Val<0.05) %>%
  dplyr::summarize(n=n(), K4=sum(K4), K5=sum(K5)) %>% 
  as.tibble() %>% 
  dplyr::select(time,id,K4,K5) %>% rename("K4_y"='K4',"K5_y"='K5') %>% 
  mutate(time = case_match(time, '3_hours'~'3 hrs',
                           '24_hours'~'24 hrs',.default = time)) %>% 
  mutate(K4_n= 503-K4_y, K5_n=371-K5_y) %>% 
  pivot_longer(!c(time,id), names_to='QTL',values_to="gene_count") %>%
  separate(QTL,into=c("QTL_type",'group'),sep = '_') %>%  
  mutate(QTL_type =case_match(QTL_type, 'K4'~'meQTLs',
                         'K5'~'reQTLs',.default = QTL_type)) %>% 
  mutate(time=factor(time, levels=c("3 hrs","24 hrs"))) %>% 
  group_by(id,time,QTL_type) %>% 
  mutate(percent=gene_count/sum(gene_count)*100)
  
  ggplot(Knowles_count, aes(x=QTL_type,y=gene_count, group=group,  fill=group))+
    geom_col(position='fill')+
    # guides(fill="none")+
    facet_wrap(time~id,nrow=2,ncol=4)+
    theme_classic()+
   scale_color_manual(values=drug_palNoVeh)+
    scale_fill_manual(values=drug_palNoVeh)+
       scale_y_continuous(label = scales::percent)

  ###This works
  chi_frame <- Knowles_count %>%
  pivot_wider(id_cols = c(time, id,QTL_type), names_from = group, values_from = gene_count) 

testDNR3chi <- matrix(c(6,13,497, 358), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n"))) 
    DNR_3chi <- chisq.test(testDNR3chi,correct=FALSE)$p.value
testDNR24chi <- matrix(c(199,187,304, 184), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n")))  
    DNR_24chi <- chisq.test(testDNR24chi,correct=FALSE)$p.value

testDOX3chi <- matrix(c(0,0,503, 371), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n")))  
    DOX_3chi <- chisq.test(testDOX3chi,correct=FALSE)$p.value
testDOX24chi <- matrix(c(184,180,318, 191), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n"))) 
    DOX_24chi <- chisq.test(testDOX24chi,correct=FALSE)$p.value

testEPI3chi <- matrix(c(4,4,499, 367), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n"))) 
    EPI_3chi <- chisq.test(testEPI3chi,correct=FALSE)$p.value
testEPI24chi <- matrix(c(172,176,331, 195), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n"))) 
    EPI_24chi <- chisq.test(testEPI24chi,correct=FALSE)$p.value
  
testMTX3chi <- matrix(c(1,0,502, 370), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n"))) 
    MTX_3chi <- chisq.test(testMTX3chi,correct=FALSE)$p.value
testMTX24chi <- matrix(c(30,40,473, 331), nrow=2, ncol=2, byrow=FALSE,
dimnames=list(c("k4", "k5"),c( "y", "n"))) 
    MTX_24chi <- chisq.test(testMTX24chi,correct=FALSE)$p.value
K4nK5chi <- data.frame(treatment=c('DNR_3','DNR_24','DOX_3','DOX_24','EPI_3','EPI_24','MTX_3','MTX_24'), chi_p.value=c(DNR_3chi,DNR_24chi,DOX_3chi,DOX_24chi,EPI_3chi,EPI_24chi,MTX_3chi,MTX_24chi))

K4nK5chi %>% 
  separate(treatment, into= c('Drug','time')) %>% 
  pivot_wider(id_cols = Drug, names_from = time, values_from = chi_p.value) %>% 
  kable(., caption= "Chi Square p. values from chi-square test between proportions of sig-DE meQTLs and reQTLS by time and treatment") %>% 
  kable_paper("striped", full_width = TRUE) %>%  
  kable_styling(full_width = FALSE, font_size = 16) %>% 
  scroll_box( height = "500px")
Chi Square p. values from chi-square test between proportions of sig-DE meQTLs and reQTLS by time and treatment
Drug 3 24
DNR 0.0205682 0.0014217
DOX NaN 0.0004405
EPI 0.6641977 0.0000770
MTX 0.3908069 0.0095036

Here we found that 24 hour reQTLs are significantly enriched in sig-DEgens compared to meQTLS, with Daunorubicin 3 hour treatment also showing significant enrichment.

toplistall %>% 
  filter(time=="24_hours") %>% 
  group_by(time, id) %>% 
  mutate(sigcount = if_else(adj.P.Val < 0.05,'sig','notsig'))%>%
  ggplot(., aes(x=id, y=abs(logFC)))+
  geom_boxplot(aes(fill=id))+
  ggpubr::fill_palette(palette =drug_palNoVeh)+
  guides(fill=guide_legend(title = "Treatment"))+
  facet_wrap(sigcount~time)+
  theme_bw()+
  xlab("")+
  ylab("abs |Log Fold Change|")+
  theme_bw()+
  facet_wrap(~time)+
  #ggtitle("All QTLs from all expressed genes (n=507)")+
  theme(plot.title = element_text(size = rel(1.5), hjust = 0.5),
        axis.title = element_text(size = 15, color = "black"),
        # axis.ticks = element_line(linewidth = 1.5),
        axis.line = element_line(linewidth = 1.5),
        strip.background = element_rect(fill = "transparent"),
        axis.text = element_text(size = 8, color = "black", angle = 0),
        strip.text.x = element_text(size = 12, color = "black", face = "bold"))

Dox-reQTLs

When I look ot the reQTLs across treatments at 24 hours, I see that dox has a total of 180 reQTLS. Because this gene set was specifically about doxorubicin response eQTLs, I wanted to see if these reQTLs were only Dox specific, AC specific, Top2i specific, or show any type of pattern.

reQTLcombine=list(DNRreQTLs$ENTREZID,DOXreQTLs$ENTREZID,EPIreQTLs$ENTREZID,MTXreQTLs$ENTREZID)

length(unique(c(DNRreQTLs$ENTREZID,DOXreQTLs$ENTREZID,EPIreQTLs$ENTREZID,MTXreQTLs$ENTREZID)))
[1] 218
print("number of unique genes expressed in pairwise DE set")
[1] "number of unique genes expressed in pairwise DE set"
QTLoverlaps <- VennDiagram::get.venn.partitions(reQTLcombine)
DoxonlyQTL <-  QTLoverlaps$..values..[[14]]
     ggVennDiagram::ggVennDiagram(reQTLcombine, category.names = c("DNR-187\nsigDEG","DOX-180 \nsigDEG\n","EPI-176\nsigDEG\n","MTX-40\nsigDEG"),label = "count") +scale_x_continuous(expand = expansion(mult = .2))+scale_y_continuous(expand = expansion(mult = .1))

Here, I found 119 reQTLs that were related to all anthracyclines, with 31 reQTLS related to all Top2i drugs at 24 hours.

 DOXeQTL_table <- toplistall %>% 
  mutate(sigcount = if_else(adj.P.Val < 0.05,'sig','notsig'))%>%
  mutate(DOXreQTLs=if_else(ENTREZID %in%DOXreQTLs$ENTREZID,"y","no")) %>% 
  dplyr::filter(time =="24_hours") %>% 
  dplyr::select(ENTREZID,id,DOXreQTLs,sigcount) %>%
  group_by(id,DOXreQTLs,sigcount) %>% 
  tally() %>% as.data.frame() %>% 
  pivot_wider(.,id_cols = c(id,DOXreQTLs),names_from = sigcount,values_from = n) %>% 
  dplyr::select(id,DOXreQTLs,sig) 
DOXeQTL_table%>% 
  kable(., caption= "All 24 hour sidDEGs broken down by drug and DOXreQTL status") %>% 
  kable_paper("striped", full_width = FALSE) %>% 
  kable_styling(full_width = FALSE, position = "left",bootstrap_options = c("striped"),font_size = 18) %>% 
  scroll_box(width = "80%", height = "400px")
All 24 hour sidDEGs broken down by drug and DOXreQTL status
id DOXreQTLs sig
Daunorubicin no 6695
Daunorubicin y 169
Doxorubicin no 6336
Doxorubicin y 180
Epirubicin no 6048
Epirubicin y 154
Mitoxantrone no 1291
Mitoxantrone y 36
Trastuzumab no NA
Trastuzumab y NA
DOXeQTL_table %>%
  dplyr::filter(DOXreQTLs=="y") %>% 
  mutate(opp= 180-sig) %>% 
  filter(id !=c('Trastuzumab','Doxorubicin')) %>% 
  rename("yes"=sig, "no"=opp) %>% 
  pivot_longer(!c(id,DOXreQTLs), names_to="group", values_to = "count") %>% 
  mutate(id=factor(id, levels=c("Daunorubicin","Epirubicin", "Mitoxantrone"))) %>% 
ggplot(., aes(y=id,x=count,fill=group))+
     geom_col(position='fill')+
    theme_classic()+
   scale_color_manual(values=c("red4","navy"))+
    scale_fill_manual(values=c("#2297E6","red2"))+
  ggtitle("DOXreQTLs overlaps")

cpm_boxplot <- function(cpmcounts, GOI,brewer_palette, fill_colors, ylab) {
  ##GOI needs to be ENTREZID
  df <- cpmcounts
  ##order of dataframe should have time,id, ENTREZID,SYMBOL logFC,AveExpr,and adj.P.Val (8 columns)
    df_plot <- df %>% 
      dplyr::filter(rownames(.)==GOI) %>%
      pivot_longer(everything(),
                   names_to = "treatment",values_to = "counts") %>%
      separate(treatment, c("drug","indv","time")) %>%
      mutate(time=factor(time, levels =c("3h", "24h"))) %>%
      mutate(indv=factor(indv, levels = c(1,2,3,4,5,6))) %>%
      mutate(drug =case_match(drug, "Da"~"Daunorubicin",
                            "Do"~"Doxorubicin",
                            "Ep"~"Epirubicin",
                            "Mi"~"Mitoxantrone",
                            "Tr"~"Trastuzumab",
                            "Ve"~"Vehicle", .default = drug))
    plot <- ggplot2::ggplot(df_plot, aes(x=drug, y=counts))+
      geom_boxplot(position="identity",aes(fill=drug))+
      geom_point(aes(col=indv, size=2, alpha=0.5))+
      guides(alpha= "none", size= "none")+
      scale_color_brewer(palette = brewer_palette, guide = "none")+
      scale_fill_manual(values=fill_colors)+
      facet_wrap("time", nrow=1, ncol=2)+
      theme_bw()+
      ylab(ylab)+
      xlab("")+
      theme(strip.background = element_rect(fill = "white"),
          plot.title = element_text(size=18,hjust = 0.5),
          axis.title = element_text(size = 15, color = "black"),
          axis.ticks = element_line(linewidth = 1.5),
          axis.line = element_line(linewidth = 1.5),
          axis.text.x = element_text(size = 12, color = "white", angle = 0),
          strip.text.x = element_text(size = 15, color = "black", face = "bold"))
    print(plot)
}

DOX specific-Dox dE-reqtl

DoxonlyQTL <-  as.numeric(QTLoverlaps$..values..[[14]])
# geneDoxonlyQTL <- getBM(attributes=my_attributes,filters ='entrezgene_id',
#                   values =DoxonlyQTL, mart = ensembl)
# write.csv(geneDoxonlyQTL,"data/geneDoxonlyQTL.csv")
# saveRDS(cpmcounts,"data/cpmcount.RDS")
geneDoxonlyQTL <- read.csv("data/geneDoxonlyQTL.csv",row.names = 1)
cpmcounts <- readRDS("data/cpmcount.RDS")

for (g in seq(from=1, to=length(geneDoxonlyQTL$entrezgene_id))){
  a <- geneDoxonlyQTL$hgnc_symbol[g]
  cpm_boxplot(cpmcounts,GOI=geneDoxonlyQTL[g,1],"Dark2",drug_palc,
                ylab=bquote(~italic(.(a))~log[2]~"cpm "))

}


sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] ComplexHeatmap_2.12.1 broom_1.0.5           kableExtra_1.3.4     
 [4] sjmisc_2.8.9          scales_1.2.1          ggpubr_0.6.0         
 [7] cowplot_1.1.1         RColorBrewer_1.1-3    biomaRt_2.52.0       
[10] ggsignif_0.6.4        lubridate_1.9.2       forcats_1.0.0        
[13] stringr_1.5.0         dplyr_1.1.2           purrr_1.0.1          
[16] readr_2.1.4           tidyr_1.3.0           tibble_3.2.1         
[19] ggplot2_3.4.2         tidyverse_2.0.0       limma_3.52.4         
[22] workflowr_1.7.0      

loaded via a namespace (and not attached):
  [1] backports_1.4.1        circlize_0.4.15        BiocFileCache_2.4.0   
  [4] systemfonts_1.0.4      GenomeInfoDb_1.32.4    digest_0.6.31         
  [7] foreach_1.5.2          htmltools_0.5.5        fansi_1.0.4           
 [10] magrittr_2.0.3         memoise_2.0.1          cluster_2.1.4         
 [13] doParallel_1.0.17      tzdb_0.4.0             Biostrings_2.64.1     
 [16] matrixStats_1.0.0      svglite_2.1.1          timechange_0.2.0      
 [19] prettyunits_1.1.1      RVenn_1.1.0            colorspace_2.1-0      
 [22] blob_1.2.4             rvest_1.0.3            rappdirs_0.3.3        
 [25] xfun_0.39              callr_3.7.3            crayon_1.5.2          
 [28] RCurl_1.98-1.12        jsonlite_1.8.5         iterators_1.0.14      
 [31] glue_1.6.2             gtable_0.3.3           zlibbioc_1.42.0       
 [34] XVector_0.36.0         webshot_0.5.4          GetoptLong_1.0.5      
 [37] car_3.1-2              shape_1.4.6            BiocGenerics_0.42.0   
 [40] abind_1.4-5            futile.options_1.0.1   DBI_1.1.3             
 [43] rstatix_0.7.2          Rcpp_1.0.10            viridisLite_0.4.2     
 [46] progress_1.2.2         units_0.8-2            clue_0.3-64           
 [49] proxy_0.4-27           bit_4.0.5              stats4_4.2.2          
 [52] httr_1.4.6             pkgconfig_2.0.3        XML_3.99-0.14         
 [55] farver_2.1.1           sass_0.4.6             dbplyr_2.3.2          
 [58] utf8_1.2.3             tidyselect_1.2.0       labeling_0.4.2        
 [61] rlang_1.1.1            later_1.3.1            AnnotationDbi_1.58.0  
 [64] munsell_0.5.0          tools_4.2.2            cachem_1.0.8          
 [67] cli_3.6.1              generics_0.1.3         RSQLite_2.3.1         
 [70] ggVennDiagram_1.2.2    sjlabelled_1.2.0       evaluate_0.21         
 [73] fastmap_1.1.1          yaml_2.3.7             processx_3.8.1        
 [76] knitr_1.43             bit64_4.0.5            fs_1.6.2              
 [79] KEGGREST_1.36.3        whisker_0.4.1          formatR_1.14          
 [82] xml2_1.3.4             compiler_4.2.2         rstudioapi_0.14       
 [85] filelock_1.0.2         curl_5.0.1             png_0.1-8             
 [88] e1071_1.7-13           bslib_0.5.0            stringi_1.7.12        
 [91] highr_0.10             ps_1.7.5               futile.logger_1.4.3   
 [94] classInt_0.4-9         vctrs_0.6.2            pillar_1.9.0          
 [97] lifecycle_1.0.3        jquerylib_0.1.4        GlobalOptions_0.1.2   
[100] bitops_1.0-7           insight_0.19.2         httpuv_1.6.11         
[103] R6_2.5.1               promises_1.2.0.1       KernSmooth_2.23-21    
[106] IRanges_2.30.1         codetools_0.2-19       lambda.r_1.2.4        
[109] rprojroot_2.0.3        rjson_0.2.21           withr_2.5.0           
[112] S4Vectors_0.34.0       GenomeInfoDbData_1.2.8 parallel_4.2.2        
[115] hms_1.1.3              VennDiagram_1.7.3      class_7.3-22          
[118] rmarkdown_2.22         carData_3.0-5          git2r_0.32.0          
[121] sf_1.0-13              getPass_0.2-2          Biobase_2.56.0