Last updated: 2023-11-27

Checks: 7 passed, 0 failed

Knit directory: Cardiotoxicity/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20230109) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
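As a minimal illustration of why the seed matters (toy draws only, not part of the analysis), resetting the same seed before a random draw returns identical values:

```r
# Toy illustration: the same seed reproduces the same random draw.
set.seed(20230109)
first_draw <- sample(1:100, 5)

set.seed(20230109)
second_draw <- sample(1:100, 5)

# Both draws contain the same five numbers in the same order.
identical(first_draw, second_draw)
```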

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 522ca8c. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .RData
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/variance_values by gene.png
    Ignored:    data/41588_2018_171_MOESM3_ESMeQTL_ST2_for paper.csv
    Ignored:    data/Arr_GWAS.txt
    Ignored:    data/Arr_geneset.RDS
    Ignored:    data/BC_cell_lines.csv
    Ignored:    data/BurridgeDOXTOX.RDS
    Ignored:    data/CADGWASgene_table.csv
    Ignored:    data/CAD_geneset.RDS
    Ignored:    data/CALIMA_Data/
    Ignored:    data/Clamp_Summary.csv
    Ignored:    data/Cormotif_24_k1-5_raw.RDS
    Ignored:    data/Counts_RNA_ERMatthews.txt
    Ignored:    data/DAgostres24.RDS
    Ignored:    data/DAtable1.csv
    Ignored:    data/DDEMresp_list.csv
    Ignored:    data/DDE_reQTL.txt
    Ignored:    data/DDEresp_list.csv
    Ignored:    data/DEG-GO/
    Ignored:    data/DEG_cormotif.RDS
    Ignored:    data/DF_Plate_Peak.csv
    Ignored:    data/DRC48hoursdata.csv
    Ignored:    data/Da24counts.txt
    Ignored:    data/Dx24counts.txt
    Ignored:    data/Dx_reQTL_specific.txt
    Ignored:    data/EPIstorelist24.RDS
    Ignored:    data/Ep24counts.txt
    Ignored:    data/FC_necela.RDS
    Ignored:    data/FC_necela_names.RDS
    Ignored:    data/Full_LD_rep.csv
    Ignored:    data/GOIsig.csv
    Ignored:    data/GOplots.R
    Ignored:    data/GTEX_setsimple.csv
    Ignored:    data/GTEX_sig24.RDS
    Ignored:    data/GTEx_gene_list.csv
    Ignored:    data/HFGWASgene_table.csv
    Ignored:    data/HF_geneset.RDS
    Ignored:    data/Heart_Left_Ventricle.v8.egenes.txt
    Ignored:    data/Heatmap_mat.RDS
    Ignored:    data/Heatmap_sig.RDS
    Ignored:    data/Hf_GWAS.txt
    Ignored:    data/K_cluster
    Ignored:    data/K_cluster_kisthree.csv
    Ignored:    data/K_cluster_kistwo.csv
    Ignored:    data/Knowles_log2cpm_real.RDS
    Ignored:    data/Knowles_variation_data.RDS
    Ignored:    data/Knowlesvarlist.RDS
    Ignored:    data/LD50_05via.csv
    Ignored:    data/LDH48hoursdata.csv
    Ignored:    data/Mt24counts.txt
    Ignored:    data/NoRespDEG_final.csv
    Ignored:    data/RINsamplelist.txt
    Ignored:    data/RNA_seq_trial.RDS
    Ignored:    data/Seonane2019supp1.txt
    Ignored:    data/TMMnormed_x.RDS
    Ignored:    data/TOP2Bi-24hoursGO_analysis.csv
    Ignored:    data/TR24counts.txt
    Ignored:    data/TableS10.csv
    Ignored:    data/TableS11.csv
    Ignored:    data/TableS9.csv
    Ignored:    data/Top2_expression.RDS
    Ignored:    data/Top2biresp_cluster24h.csv
    Ignored:    data/Var_test_list.RDS
    Ignored:    data/Var_test_list24.RDS
    Ignored:    data/Var_test_list24alt.RDS
    Ignored:    data/Var_test_list3.RDS
    Ignored:    data/Vargenes.RDS
    Ignored:    data/Viabilitylistfull.csv
    Ignored:    data/allexpressedgenes.txt
    Ignored:    data/allfinal3hour.RDS
    Ignored:    data/allgenes.txt
    Ignored:    data/allmatrix.RDS
    Ignored:    data/allmymatrix.RDS
    Ignored:    data/annotation_data_frame.RDS
    Ignored:    data/averageviabilitytable.RDS
    Ignored:    data/avgLD50.RDS
    Ignored:    data/avg_LD50.RDS
    Ignored:    data/backGL.txt
    Ignored:    data/burr_genes.RDS
    Ignored:    data/calcium_data.RDS
    Ignored:    data/clamp_summary.RDS
    Ignored:    data/cormotif_3hk1-8.RDS
    Ignored:    data/cormotif_initalK5.RDS
    Ignored:    data/cormotif_initialK5.RDS
    Ignored:    data/cormotif_initialall.RDS
    Ignored:    data/cormotifprobs.csv
    Ignored:    data/counts24hours.RDS
    Ignored:    data/cpmcount.RDS
    Ignored:    data/cpmnorm_counts.csv
    Ignored:    data/crispr_genes.csv
    Ignored:    data/ctnnt_results.txt
    Ignored:    data/cvd_GWAS.txt
    Ignored:    data/dat_cpm.RDS
    Ignored:    data/data_outline.txt
    Ignored:    data/drug_noveh1.csv
    Ignored:    data/efit2.RDS
    Ignored:    data/efit2_final.RDS
    Ignored:    data/efit2results.RDS
    Ignored:    data/ensembl_backup.RDS
    Ignored:    data/ensgtotal.txt
    Ignored:    data/filcpm_counts.RDS
    Ignored:    data/filenameonly.txt
    Ignored:    data/filtered_cpm_counts.csv
    Ignored:    data/filtered_raw_counts.csv
    Ignored:    data/filtermatrix_x.RDS
    Ignored:    data/folder_05top/
    Ignored:    data/geneDoxonlyQTL.csv
    Ignored:    data/gene_corr_df.RDS
    Ignored:    data/gene_corr_frame.RDS
    Ignored:    data/gene_prob_tran3h.RDS
    Ignored:    data/gene_probabilityk5.RDS
    Ignored:    data/geneset_24.RDS
    Ignored:    data/gostresTop2bi_ER.RDS
    Ignored:    data/gostresTop2bi_LR
    Ignored:    data/gostresTop2bi_LR.RDS
    Ignored:    data/gostresTop2bi_TI.RDS
    Ignored:    data/gostrescoNR
    Ignored:    data/gtex/
    Ignored:    data/heartgenes.csv
    Ignored:    data/hsa_kegg_anno.RDS
    Ignored:    data/individualDRCfile.RDS
    Ignored:    data/individual_DRC48.RDS
    Ignored:    data/individual_LDH48.RDS
    Ignored:    data/indv_noveh1.csv
    Ignored:    data/kegglistDEG.RDS
    Ignored:    data/kegglistDEG24.RDS
    Ignored:    data/kegglistDEG3.RDS
    Ignored:    data/knowfig4.csv
    Ignored:    data/knowfig5.csv
    Ignored:    data/label_list.RDS
    Ignored:    data/ld50_table.csv
    Ignored:    data/mean_vardrug1.csv
    Ignored:    data/mean_varframe.csv
    Ignored:    data/mymatrix.RDS
    Ignored:    data/new_ld50avg.RDS
    Ignored:    data/nonresponse_cluster24h.csv
    Ignored:    data/norm_LDH.csv
    Ignored:    data/norm_counts.csv
    Ignored:    data/old_sets/
    Ignored:    data/organized_drugframe.csv
    Ignored:    data/plan2plot.png
    Ignored:    data/plot_intv_list.RDS
    Ignored:    data/plot_list_DRC.RDS
    Ignored:    data/qval24hr.RDS
    Ignored:    data/qval3hr.RDS
    Ignored:    data/qvalueEPItemp.RDS
    Ignored:    data/raw_counts.csv
    Ignored:    data/response_cluster24h.csv
    Ignored:    data/sampsettrz.RDS
    Ignored:    data/sigVDA24.txt
    Ignored:    data/sigVDA3.txt
    Ignored:    data/sigVDX24.txt
    Ignored:    data/sigVDX3.txt
    Ignored:    data/sigVEP24.txt
    Ignored:    data/sigVEP3.txt
    Ignored:    data/sigVMT24.txt
    Ignored:    data/sigVMT3.txt
    Ignored:    data/sigVTR24.txt
    Ignored:    data/sigVTR3.txt
    Ignored:    data/siglist.RDS
    Ignored:    data/siglist_final.RDS
    Ignored:    data/siglist_old.RDS
    Ignored:    data/slope_table.csv
    Ignored:    data/supp10_24hlist.RDS
    Ignored:    data/supp10_3hlist.RDS
    Ignored:    data/supp_normLDH48.RDS
    Ignored:    data/supp_pca_all_anno.RDS
    Ignored:    data/table3a.omar
    Ignored:    data/test_run_sample_list.txt
    Ignored:    data/testlist.txt
    Ignored:    data/toplistall.RDS
    Ignored:    data/trtonly_24h_genes.RDS
    Ignored:    data/trtonly_3h_genes.RDS
    Ignored:    data/tvl24hour.txt
    Ignored:    data/tvl24hourw.txt
    Ignored:    data/venn_code.R
    Ignored:    data/viability.RDS

Untracked files:
    Untracked:  .RDataTmp
    Untracked:  .RDataTmp1
    Untracked:  .RDataTmp2
    Untracked:  .RDataTmp3
    Untracked:  3hr all.pdf
    Untracked:  Code_files_list.csv
    Untracked:  Data_files_list.csv
    Untracked:  Doxorubicin_vehicle_3_24.csv
    Untracked:  Doxtoplist.csv
    Untracked:  EPIqvalue_analysis.Rmd
    Untracked:  GWAS_list_of_interest.xlsx
    Untracked:  KEGGpathwaylist.R
    Untracked:  OmicNavigator_learn.R
    Untracked:  SigDoxtoplist.csv
    Untracked:  analysis/DRC_viability_check.Rmd
    Untracked:  analysis/cellcycle_kegg_genes.R
    Untracked:  analysis/ciFIT.R
    Untracked:  analysis/export_to_excel.R
    Untracked:  analysis/featureCountsPLAY.R
    Untracked:  cleanupfiles_script.R
    Untracked:  code/biomart_gene_names.R
    Untracked:  code/constantcode.R
    Untracked:  code/corMotifcustom.R
    Untracked:  code/cpm_boxplot.R
    Untracked:  code/extracting_ggplot_data.R
    Untracked:  code/movingfilesto_ppl.R
    Untracked:  code/pearson_extract_func.R
    Untracked:  code/pearson_tox_extract.R
    Untracked:  code/plot1C.fun.R
    Untracked:  code/spearman_extract_func.R
    Untracked:  code/venndiagramcolor_control.R
    Untracked:  cormotif_p.post.list_4.csv
    Untracked:  figS1024h.pdf
    Untracked:  individual-legenddark2.png
    Untracked:  installed_old.rda
    Untracked:  motif_ER.txt
    Untracked:  motif_LR.txt
    Untracked:  motif_NR.txt
    Untracked:  motif_TI.txt
    Untracked:  output/DNR_DEGlist.csv
    Untracked:  output/DNRvenn.RDS
    Untracked:  output/DOX_DEGlist.csv
    Untracked:  output/DOXvenn.RDS
    Untracked:  output/EPI_DEGlist.csv
    Untracked:  output/EPIvenn.RDS
    Untracked:  output/FC_necela.RDS
    Untracked:  output/Figures/
    Untracked:  output/GTEXv8_gene_median_tpm.RDS
    Untracked:  output/GTEXv8_gene_tpm_heart_left_ventricle.RDS
    Untracked:  output/KEGGcellcyclegenes.RDS
    Untracked:  output/Knowles_log2cpm.csv
    Untracked:  output/MTX_DEGlist.csv
    Untracked:  output/MTXvenn.RDS
    Untracked:  output/SETA_analysis_reyes.RDS
    Untracked:  output/TRZ_DEGlist.csv
    Untracked:  output/TableS8.csv
    Untracked:  output/Volcanoplot_10
    Untracked:  output/Volcanoplot_10.RDS
    Untracked:  output/allfinal_sup10.RDS
    Untracked:  output/counts_v8_heart_left_ventricle_gct.RDS
    Untracked:  output/crisprfoldchange.RDS
    Untracked:  output/endocytosisgenes.csv
    Untracked:  output/gene_corr_fig9.RDS
    Untracked:  output/legend_b.RDS
    Untracked:  output/motif_ERrep.RDS
    Untracked:  output/motif_LRrep.RDS
    Untracked:  output/motif_NRrep.RDS
    Untracked:  output/motif_TI_rep.RDS
    Untracked:  output/output-old/
    Untracked:  output/rank24genes.csv
    Untracked:  output/rank3genes.csv
    Untracked:  output/reneem@ls6.tacc.utexas.edu
    Untracked:  output/sequencinginformationforsupp.csv
    Untracked:  output/sequencinginformationforsupp.prn
    Untracked:  output/sigVDA24.txt
    Untracked:  output/sigVDA3.txt
    Untracked:  output/sigVDX24.txt
    Untracked:  output/sigVDX3.txt
    Untracked:  output/sigVEP24.txt
    Untracked:  output/sigVEP3.txt
    Untracked:  output/sigVMT24.txt
    Untracked:  output/sigVMT3.txt
    Untracked:  output/sigVTR24.txt
    Untracked:  output/sigVTR3.txt
    Untracked:  output/supplementary_motif_list_GO.RDS
    Untracked:  output/toptablebydrug.RDS
    Untracked:  output/trop_knowles_fun.csv
    Untracked:  output/tvl24hour.txt
    Untracked:  output/x_counts.RDS
    Untracked:  reneebasecode.R

Unstaged changes:
    Modified:   analysis/GTEx_genes.Rmd
    Modified:   analysis/Var_genes.Rmd
    Modified:   analysis/after_comments.Rmd
    Modified:   analysis/variance_scrip.Rmd
    Modified:   output/daplot.RDS
    Modified:   output/dxplot.RDS
    Modified:   output/epplot.RDS
    Modified:   output/mtplot.RDS
    Modified:   output/plan2plot.png
    Modified:   output/trplot.RDS
    Modified:   output/veplot.RDS

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Knowels_trop_analysis.Rmd) and HTML (docs/Knowels_trop_analysis.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 522ca8c reneeisnowhere 2023-11-27 adding more analysis
html b474072 reneeisnowhere 2023-11-21 Build site.
Rmd 5c45b92 reneeisnowhere 2023-11-21 adding knowles data again
html b4d55c4 reneeisnowhere 2023-11-21 Build site.
Rmd b1b9563 reneeisnowhere 2023-11-21 adding logcpm with knowles data
html b442590 reneeisnowhere 2023-11-21 Build site.
Rmd b464f89 reneeisnowhere 2023-11-21 updates to links again
html 19a3502 reneeisnowhere 2023-11-21 Build site.
Rmd d17d270 reneeisnowhere 2023-11-21 adding links
html c72467c reneeisnowhere 2023-11-09 Build site.
Rmd 8a6ebc1 reneeisnowhere 2023-11-09 updates on plots
html 9af9df6 reneeisnowhere 2023-11-09 Build site.
Rmd 60c64d1 reneeisnowhere 2023-11-09 adding boxplots
html d12232a reneeisnowhere 2023-11-07 Build site.
Rmd 441f82b reneeisnowhere 2023-11-07 adding more code
html cae46a9 reneeisnowhere 2023-11-07 Build site.
html bd0342c reneeisnowhere 2023-11-07 Build site.
Rmd 453ebe6 reneeisnowhere 2023-11-07 adding code
Rmd d6ecce9 reneeisnowhere 2023-11-07 adding code
html ae9124e reneeisnowhere 2023-10-30 Build site.
Rmd 74c2dc1 reneeisnowhere 2023-10-30 updated
Rmd d970e84 reneeisnowhere 2023-10-30 adding more analysis

library(tidyverse)
library(ggsignif)
library(cowplot)
library(ggpubr)
library(scales)
library(sjmisc)
library(kableExtra)
library(broom)
library(ComplexHeatmap)
library(ggVennDiagram)
library(biomaRt)
library(limma)
library(edgeR)
library(RColorBrewer)
palette_colors_mine <- colorRampPalette(colors = c("green","white","purple","red" ))(60)
scales::show_col(palette_colors_mine)

Version Author Date
9af9df6 reneeisnowhere 2023-11-09
bd0342c reneeisnowhere 2023-11-07

Here I attempt to recreate my correlation analysis on the Knowles data, using their troponin measurements and RNA-seq log2 cpm values.

### genes I want to know about
interest_genes <- read.csv("output/GOI_genelist.txt", row.names = 1)
trop_knowles <- read.csv("output/trop_knowles_fun.csv", row.names = 1)
Knowles_log2cpm <- readRDS("data/Knowles_log2cpm_real.RDS")
trop0.625 <- trop_knowles %>% 
  filter(dosage <1) 
store <- Knowles_log2cpm %>% 
  dplyr::select( 'ESGN',ends_with(c('0.625', '0'))) %>% 
  dplyr::filter(ESGN %in% interest_genes$ensembl_gene_id) %>% 
  pivot_longer(cols=!ESGN, names_to = "ind", values_to = "counts") %>% 
  separate(ind,into=c("cell_line","dosage"), sep = ":") %>%
  mutate(dosage = as.numeric(dosage)) %>% 
  full_join(., trop0.625, by=c("cell_line", "dosage")) %>% 
  group_by(cell_line) %>% 
  full_join(., interest_genes, by = c("ESGN" = "ensembl_gene_id"))
  
  
  ###new graph stuff
  for (gene in interest_genes$ensembl_gene_id){
    gene_plot <- store %>% 
      dplyr::filter(ESGN == gene) %>%
      ggplot(., aes(x=troponin, y=counts))+
      geom_point(aes(col=cell_line))+
      geom_smooth(method="lm")+
      
      facet_wrap(hgnc_symbol~dosage, scales="free")+
      theme_classic()+
      xlab("troponin I expression") +
      ylab("Gene counts in log2 cpm") +
      ggtitle(expression(paste("Correlation between counts and troponin I Knowles")))+
      scale_color_manual(values = palette_colors_mine, aesthetics = c("color", "fill"), guide = "none")+
      # guides(fill="none")+
      ggpubr::stat_cor(method="spearman",
                       # cor.coef.name="rho",
               aes(label = paste(after_stat(r.label), after_stat(p.label), sep = "~`,\n`~")),
               color = "black",
               label.x.npc = 0.01,
               label.y.npc=0.01, 
               size = 3)+
      theme(plot.title = element_text(size = rel(1.5), hjust = 0.5,face = "bold"),
            axis.title = element_text(size = 15, color = "black"),
            axis.ticks = element_line(linewidth = 1.5),
            axis.text = element_text(size = 8, color = "black", angle = 20),
            strip.text.x = element_text(size = 12, color = "black", face = "italic"))
   print(gene_plot)
   
  }
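
The coefficient that `stat_cor(method = "spearman")` prints on each panel can also be computed directly. A minimal sketch with made-up numbers (the `toy` data frame and its values are hypothetical, not the Knowles measurements), using base R `cor.test()`:

```r
# Hypothetical toy data: one gene, six cell lines at a single dosage.
toy <- data.frame(
  cell_line = paste0("line", 1:6),
  troponin  = c(0.8, 1.1, 1.9, 2.4, 3.0, 3.7),
  counts    = c(5.2, 5.0, 5.9, 6.1, 6.0, 6.8)
)

# Spearman rank correlation between troponin and log2 cpm.
res <- cor.test(toy$troponin, toy$counts, method = "spearman")
res$estimate  # rho
res$p.value
```

Because Spearman's rho works on ranks, it is insensitive to the scale of either axis, which may be convenient when the two measurements come from different assays.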


Knowles Boxplots of Fig9 genes

Knowles_log2cpm_box <- readRDS("data/Knowles_log2cpm_real.RDS")

store_box <- Knowles_log2cpm_box %>% 
  # dplyr::select( 'ESGN',ends_with(c('0.625', '0'))) %>% 
  dplyr::filter(ESGN %in% interest_genes$ensembl_gene_id) %>% 
  pivot_longer(cols=!ESGN, names_to = "ind", values_to = "counts") %>% 
  separate(ind,into=c("cell_line","dosage"), sep = ":") %>%
  mutate(dosage = as.numeric(dosage)) %>% 
  # full_join(., trop0.625, by=c("cell_line", "dosage")) %>% 
  group_by(cell_line) %>% 
  full_join(., interest_genes, by = c("ESGN" = "ensembl_gene_id"))
store_box %>% 
  mutate(dosage=factor(dosage, levels=c('0','0.000', '0.625','1.25', '2.5','5'))) %>% 
  ggplot(., aes(x=dosage, y=counts, group=dosage))+
  geom_boxplot()+
  facet_wrap(~hgnc_symbol)


RNA-seq trial analysis

Analysis of expressed genes

RNA_seq_trial<- readRDS("data/RNA_seq_trial.RDS")

all_cpmcount <-  read_table("data/Counts_RNA_ERMatthews.txt")
cpm_count_main <- readRDS("data/cpmcount.RDS") %>% rownames_to_column(var = "ENTREZID")
colnames(cpm_count_main) <- colnames(all_cpmcount)


test_run_sample_list <- read.csv("data/test_run_sample_list.txt", row.names = 1)

colnames(RNA_seq_trial) <- c("ENTREZID",test_run_sample_list$Sample_ID)

lcpm_trial <- RNA_seq_trial %>% 
  column_to_rownames("ENTREZID") %>% 
  cpm(., log=TRUE) %>% 
  as.data.frame()
 

row_means <- rowMeans(lcpm_trial)
x_trial <- lcpm_trial[row_means > 0,]
dim(x_trial)
[1] 13277     4
list_genes_trial <- rownames(x_trial)
ggVennDiagram::ggVennDiagram(list(list_genes_trial, cpm_count_main$ENTREZID),
                             category.names = c("Trialgenes","Maingenes"),
              show_intersect = TRUE,
              set_color = "black",
              label = "count",
              label_percent_digit = 1,
              label_size = 4,
              label_alpha = 0,
              label_color = "black",
              edge_lty = "solid", set_size = 4.5)#+
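
The filter above keeps genes whose mean log2 cpm across samples is greater than 0. A small sketch of the same idea on a made-up count matrix follows. It assumes a plain log2 cpm with a pseudocount of 0.5 in place of `edgeR::cpm(log = TRUE)`, which applies its own prior count, so exact values will differ; `toy_counts` and the gene names are hypothetical.

```r
# Hypothetical raw counts: 4 genes x 3 samples.
toy_counts <- matrix(
  c(4000000, 5000000, 4500000,
          0,       1,       0,
        120,     100,     140,
          2,       0,       1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("s", 1:3))
)

# Simplified log2 counts per million (not edgeR's exact formula).
lib_size <- colSums(toy_counts)
toy_lcpm <- log2(t(t(toy_counts + 0.5) / lib_size) * 1e6)

# Keep genes with mean log2 cpm > 0, as in the filter above.
kept <- toy_lcpm[rowMeans(toy_lcpm) > 0, , drop = FALSE]
rownames(kept)
```

A mean log2 cpm of 0 corresponds to roughly one read per million on average, so the threshold drops genes that are barely detected in most samples.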

Correlation of counts files

lcpm_trial_full <- RNA_seq_trial %>% 
  column_to_rownames("ENTREZID") %>% 
  cpm(., log=TRUE) %>% 
  as.data.frame() %>% 
  rownames_to_column(var = "ENTREZID")

lcpm_trial_full %>%
  column_to_rownames(var="ENTREZID") %>%
  cor(.) %>% 
  Heatmap(.,layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 10))})

lcpm_main <- all_cpmcount %>% 
  column_to_rownames("ENTREZID") %>% 
  cpm(., log=TRUE) %>% 
  as.data.frame() %>% 
  rownames_to_column(var = "ENTREZID") %>% 
  dplyr::select(ENTREZID, starts_with("DOX") & ends_with("3h"))
  
combined_data <- lcpm_main %>%
  full_join(., lcpm_trial_full, by= "ENTREZID") %>%
  column_to_rownames("ENTREZID") %>% 
  dplyr::select(starts_with("DOX"),`3hr_0.5`)%>% 
  cor(.,) 


  
  Heatmap(combined_data,column_title = "Full gene list",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(combined_data, i, j)), x, y, 
            gp = gpar(fontsize = 10))})

  only79_ind <- lcpm_main %>%
  full_join(., lcpm_trial_full, by= "ENTREZID") %>% 
    dplyr::select(ENTREZID,'3hr_0.5',"DOX.4.3h") %>% 
    column_to_rownames("ENTREZID") %>% 
  cor(.,) 

  
  Heatmap(only79_ind,column_title = "Full gene list_79",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(only79_ind, i, j)), x, y, 
            gp = gpar(fontsize = 10))})

lcpm_main_veh <- all_cpmcount %>% 
  column_to_rownames("ENTREZID") %>% 
  cpm(., log=TRUE) %>% 
  as.data.frame() %>% 
  rownames_to_column(var = "ENTREZID") %>% 
  dplyr::select(ENTREZID, starts_with(c("DOX", "VEH")) & ends_with("3h"))
  

combined_data_veh<- lcpm_main_veh %>%
  full_join(., lcpm_trial_full, by= "ENTREZID") %>% 
  column_to_rownames("ENTREZID") %>% 
  cor(.,) 
  
  
  
  Heatmap(combined_data_veh, column_title = "all genes in list, no filtering",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(combined_data_veh, i, j)), x, y, 
            gp = gpar(fontsize = 8))})

lcpm_trial_filter_main <- lcpm_trial_full %>% 
  filter(ENTREZID %in% cpm_count_main$ENTREZID)
 


lcpm_trial_filter_main %>% 
column_to_rownames(var="ENTREZID") %>%
  cor(.) %>% 
  Heatmap(.,column_title = "Using 14,084 expressed genes from Main data",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})

lcpm_trial_filter <- lcpm_trial_full %>% 
  filter(ENTREZID %in% list_genes_trial)
 

lcpm_trial_filter %>% 
column_to_rownames(var="ENTREZID") %>%
  cor(.) %>% 
  Heatmap(.,column_title = "Using 13277 expressed genes",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})

lcpm_main_filter_trial <- lcpm_main_veh %>% 
  filter(ENTREZID %in% list_genes_trial)

lcpm_trial_filter %>% 
  full_join(., lcpm_main_filter_trial, by = "ENTREZID") %>% 
  column_to_rownames(var="ENTREZID") %>%
  cor(.) %>% 
  Heatmap(.,column_title = "Using 13277 expressed genes",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})

lcpm_trial_filter_main %>% 
  left_join(., lcpm_main, by = "ENTREZID") %>% 
  column_to_rownames(var="ENTREZID") %>%
  dplyr::select(DOX.4.3h,starts_with(("3hr")))%>% 
  cor(.) %>% 
  Heatmap(.,column_title = "Using 14084 expressed genes, just 79-1",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})

hr3_indv4 <- lcpm_trial_filter_main %>% 
  left_join(., lcpm_main, by = "ENTREZID") %>% 
  column_to_rownames(var="ENTREZID") %>%
  dplyr::select(DOX.4.3h,`3hr_0.5`,`3hr_0.0`)%>% 
  cor(.) %>% 
  Heatmap(.,column_title = "Using 14084 expressed genes, just 79-1",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})
  
plot(hr3_indv4)


correlation heatmap of 3hr Dox 1-6 individuals and trial data

lcpm_trial_filter_main %>% 
  left_join(., lcpm_main, by = "ENTREZID") %>% 
  column_to_rownames(var="ENTREZID") %>%
  dplyr::select(starts_with("DOX"),`3hr_0.5`)%>% 
  cor(.) %>% 
  Heatmap(.,column_title = "Using 14084 expressed genes, just 79-1 with all 3 hour samples",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})

lcpm_main %>% 
  left_join(., lcpm_trial_full, by = "ENTREZID") %>% 
  column_to_rownames(var="ENTREZID") %>%
  dplyr::select(starts_with("DOX"),`3hr_0.5`)%>% 
  cor(.) %>% 
  Heatmap(.,column_title = "all 29395 genes experiment with trial 0.5 uM",
          layer_fun = function(j, i, x, y, width, height, fill) {
              grid.text(sprintf("%.3f", pindex(., i, j)), x, y, 
            gp = gpar(fontsize = 8))})


barplots

backGL <-read_csv("data/backGL.txt", 
    col_types = cols(...1 = col_skip()))

GOI_genelist <- read.csv("output/GOI_genelist.txt", row.names = 1)

cpm_boxplot_trial <-function(lcpm_trial, GOI, ylab) {
    ##GOI needs to be ENTREZID
  df_plot <- lcpm_trial %>% 
    dplyr::filter(rownames(.)== GOI) %>%
    pivot_longer(everything(),
                 names_to = "treatment",
                 values_to = "counts") %>%
    separate(treatment, c("time","conc"), sep= "_") %>%
    mutate(conc = factor(conc,levels=c('0.0','0.1','0.5','1.0'), labels = c ("NT", "0.1 uM", "0.5 uM", "1.0 uM")))
  
 plota <-  ggplot2::ggplot(df_plot, aes(x=conc, y= counts))+
    geom_col(position="identity")+
    theme_bw()+
    ylab(ylab)+
    xlab("")+
     ggtitle(paste(GOI))+
      theme(
        # strip.background = element_rect(fill = "white",linetype=1, linewidth = 0.5),
          plot.title = element_text(size=12,hjust = 0.5,face="bold"),
          axis.title = element_text(size = 10, color = "black"),
          axis.ticks = element_line(linewidth = 1.0),
          panel.background = element_rect(colour = "black", linewidth = 1),
          # axis.text.x = element_blank(),
          strip.text.x = element_text(margin = margin(2,0,2,0, "pt"),face = "bold"))
    print(plota)
}
  



# loop over the first 11 rows of GOI_genelist
for (g in 1:11){
  a <- GOI_genelist[g,3]
  cpm_boxplot_trial(lcpm_trial, GOI = GOI_genelist[g,1],
                    ylab = bquote(~italic(.(a))~log[2]~"cpm "))
}


Expression of trial RNA-seq data

Knowles log2cpm 24hr and my log2cpm 24hr

library(ggsignif)
kcpm <- store_box %>%  
  mutate(dosage=factor(dosage, levels=c('0','0.000', '0.625','1.25', '2.5','5'))) %>% 
  dplyr::filter(dosage==("0")|dosage == "0.625") %>% 
  mutate(expr="K")
  
lcpm_24h <- all_cpmcount %>% 
  column_to_rownames("ENTREZID") %>% 
  cpm(., log=TRUE) %>% 
  as.data.frame() %>% 
  rownames_to_column(var = "ENTREZID") %>% 
  dplyr::select(ENTREZID, starts_with(c("DOX", "VEH")) & ends_with("24h")) %>% 
  dplyr::filter(ENTREZID %in% interest_genes$entrezgene_id) %>% 
  pivot_longer(cols=!ENTREZID, names_to = "ind", values_to = "counts") %>% 
  mutate(ENTREZID = as.numeric(ENTREZID)) %>% 
  full_join(., interest_genes, by = c("ENTREZID"="entrezgene_id")) %>% 
  mutate(expr="ME") %>% 
  rename("ESGN"="ensembl_gene_id","entrezgene_id"="ENTREZID") %>% 
  separate(ind, into = c("dosage","cell_line",NA)) %>% 
  mutate(dosage=case_match(dosage,"DOX"~"0.5", .default = dosage)) 
  
lcpm_24h %>% 
  rbind(.,kcpm) %>% 
  mutate(dosage=factor(dosage, levels=c('0','0.625',"VEH","0.5"))) %>% 
  ggplot(., aes(x=dosage,y=counts), group=expr)+
  geom_boxplot()+
  facet_wrap(~hgnc_symbol, scales="free_y" )#+

  # geom_signif(
  #   comparisons = list(
  #     c('0', '0.5'),
  #     c('0', '0.625'),
  #     c('0.5','0.625')
  #   ),
  #   test = "t.test",
  #   tip_length = 0.01,
  #   map_signif_level = FALSE,
  #   textsize = 4,
  #   step_increase = 0.3
  # ) 

Replication of variance figures using Knowles data

store_var <- Knowles_log2cpm %>% 
  dplyr::select( 'ESGN',ends_with(c('0.625', '0'))) %>% 
  rowwise() %>% 
  mutate(mean_DOX=mean(c_across(ends_with('0.625'))),
         var_DOX=var(c_across(ends_with('0.625'))),
        mean_NT=mean(c_across(ends_with('0'))),
         var_NT=var(c_across(ends_with('0')))) %>% 
  mutate(data=list(var.test(c_across(ends_with("0.625")),c_across(ends_with("0"))))) %>% 
  dplyr::select("ESGN","mean_DOX","var_DOX","mean_NT", "var_NT","data")
saveRDS(store_var, "data/Knowles_variation_data.RDS")
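
Each row of `store_var` holds the result of an F test comparing the variance of one gene between treated and untreated samples. A minimal sketch for a single gene, with made-up log2 cpm values (the vectors `dox` and `nt` are hypothetical):

```r
# Hypothetical log2 cpm values for one gene across six individuals.
dox <- c(4.1, 6.3, 3.2, 7.0, 5.5, 2.9)  # treated: widely spread
nt  <- c(5.0, 5.2, 4.9, 5.1, 5.3, 5.0)  # untreated: tightly clustered

# F test of equal variances; the statistic is var(dox) / var(nt).
ft <- var.test(dox, nt)
ft$statistic
ft$p.value
```

The F test assumes approximately normal values within each group, so a per-gene p-value from it is best read as a screening statistic rather than a definitive call.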

knowlesdrug<- store_var %>% 
  dplyr::select("ESGN","mean_DOX","var_DOX","mean_NT", "var_NT") %>% 
  pivot_longer(cols = !"ESGN", names_to = "short", values_to = "values") %>% 
  separate(short, into=c("calc","treatment")) #%>% 
knowlesdrug %>% 
  as.data.frame() %>% 
  dplyr::filter(calc == "mean") %>% 
  ggplot(., aes(x= treatment, y=values))+
  geom_boxplot()+
  ggtitle("Knowles Means across all genes")+
  geom_signif(
    comparisons = list(
      c("DOX", "NT")),
    test = "t.test",
    tip_length = 0.01,
    map_signif_level = FALSE
    # textsize = 4,
    # # y_position = 11,
    # step_increase = 0.05
  )

knowlesdrug %>% 
  as.data.frame() %>% 
  dplyr::filter(calc == "var") %>% 
  ggplot(., aes(x= treatment, y=values))+
  geom_boxplot(outlier.shape= NA)+
  ggtitle("Knowles Variance across all genes")+
  geom_signif(
    comparisons = list(
      c("DOX", "NT")),
    test = "t.test",
    tip_length = 0.01,
    y_position = 0.5,
    # vjust=1,
    map_signif_level = FALSE)+
  ylim(NA,1.25)

library(qvalue)


# Collect the F-test p-value for each gene, then estimate q-values and local FDR
p_list <- map_df(store_var$data, ~as.data.frame(.x$p.value))
rownames(p_list) <- store_var$ESGN
estDOXk <- qvalue(p_list)
hist(estDOXk)

plot(estDOXk)

summary(estDOXk) 

Call:
qvalue(p = p_list)

pi0:    0.3993136   

Cumulative number of significant calls:

          <1e-04 <0.001 <0.01 <0.025 <0.05 <0.1    <1
p-value     1919   2762  4065   4832  5533 6453 12317
q-value     1616   2447  3895   4810  5723 6956 12317
local FDR   1143   1759  2724   3313  3800 4507 12317
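
The table above counts how many tests fall below each cutoff on the p-value, q-value, and local FDR scales. The following sketch reproduces that kind of cumulative count in base R on simulated p-values. It uses Benjamini-Hochberg adjusted p-values from `p.adjust()` as a stand-in, which is an assumption of this example and not what `qvalue()` computes: `qvalue()` additionally estimates pi0 (0.399 in the summary above), which is presumably why the q-value row can exceed the p-value row at some cutoffs there. All object names and the simulated data are hypothetical.

```r
set.seed(1)
# Simulated p-values: 300 "signal" tests mixed with 700 uniform nulls (toy data)
p <- c(rbeta(300, 0.2, 5), runif(700))

# Benjamini-Hochberg adjustment as a stand-in for q-values (no pi0 estimate)
bh <- p.adjust(p, method = "BH")

# Same cutoffs as the columns of the summary table
cuts <- c(1e-04, 0.001, 0.01, 0.025, 0.05, 0.1, 1)
counts <- rbind(
  "p-value" = sapply(cuts, function(k) sum(p < k)),
  "BH"      = sapply(cuts, function(k) sum(bh < k))
)
colnames(counts) <- paste0("<", cuts)
counts
```

Without the pi0 scaling, each BH-adjusted value is at least as large as its raw p-value, so the BH row can never exceed the p-value row in this sketch.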
knowlesvar <- data.frame("pvalues"=estDOXk$pvalues, "qvalues"=estDOXk$qvalues, "lfdr"=estDOXk$lfdr)
colnames(knowlesvar) <- c("pvalues", "qvalues", "lfdr")
intersecting_K <- knowlesvar %>% 
  filter(lfdr<0.1)
my_qval_list24 <- readRDS("data/qval24hr.RDS") 

EPI508_list <- my_qval_list24 %>% 
  dplyr::select(ENTREZID,EPIqvalues) %>% 
  filter(EPIqvalues<0.1) %>% 
  dplyr::select(ENTREZID) %>%
  mutate(ENTREZID=as.numeric(ENTREZID)) %>% 
  left_join(.,backGL, by="ENTREZID")
Knowlesvarlist <- readRDS("data/Knowlesvarlist.RDS")
  
# Knowlesvarlist<- getBM(attributes=my_attributes,filters ='ensembl_gene_id',values = rownames(intersecting_K), mart = ensembl)

length(intersect(EPI508_list$ENTREZID,Knowlesvarlist$entrezgene_id))
[1] 299
intersect_genes <- EPI508_list %>% 
  dplyr::filter(ENTREZID %in% Knowlesvarlist$entrezgene_id)
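
The overlap above is computed twice: `intersect()` gives the number of shared Entrez IDs, and `%in%` keeps the matching rows. A small sketch on toy data shows how the two relate. The data frames `epi` and `knowles` are hypothetical stand-ins for `EPI508_list` and `Knowlesvarlist`; the four ID and symbol pairs are taken from the table on this page, while 7157 is an arbitrary extra ID.

```r
# Hypothetical stand-ins for EPI508_list and Knowlesvarlist
epi <- data.frame(ENTREZID = c(49856, 23463, 5195, 1019),
                  SYMBOL   = c("WRAP73", "ICMT", "PEX14", "CDK4"))
knowles <- data.frame(entrezgene_id = c(5195, 1019, 1019, 7157))

# intersect() returns the unique IDs present in both vectors
shared <- intersect(epi$ENTREZID, knowles$entrezgene_id)
length(shared)

# %in% keeps the rows of `epi` whose ID occurs anywhere in `knowles`
overlap <- epi[epi$ENTREZID %in% knowles$entrezgene_id, ]
nrow(overlap)
```

The two counts agree here because `epi` has one row per ID. If a join had duplicated IDs on the left-hand side, the filtered table would have more rows than `length(intersect(...))`.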

intersect_genes %>% 
kable(.,caption = "EPI Highly variable genes") %>% 
  kable_paper("striped", full_width = FALSE) %>%  
  kable_styling(full_width = TRUE, bootstrap_options = c("striped","hover")) %>% 
  scroll_box(width = "100%", height = "400px")
EPI Highly variable genes
ENTREZID SYMBOL
49856 WRAP73
23463 ICMT
5195 PEX14
23207 PLEKHM2
79363 CPLANE2
55160 ARHGEF10L
55616 ASAP3
57095 PITHD1
84065 TMEM222
23673 STX12
56063 TMEM234
127544 RNF19B
113444 SMIM12
27095 TRAPPC3
112950 MED8
9670 IPO13
387338 NSUN4
8543 LMO4
64858 DCLRE1B
128077 LIX1L
6944 VPS72
65005 MRPL9
11266 DUSP12
5279 PIGC
83479 DDX59
134 ADORA1
163859 SDE2
5664 PSEN2
65094 JMJD4
126731 CCSAP
22796 COG2
79723 SUV39H2
253430 IPMK
26091 HERC4
219738 FAM241B
11319 ECD
5532 PPP3CB
79933 SYNPO2L
118924 FRA10AC1
10360 NPM3
9937 DCLRE1A
9184 BUB3
161 AP2A2
10612 TRIM3
65975 STK33
113174 SAAL1
627 BDNF
79797 ZNF408
10978 CLP1
55048 VPS37C
144097 SPINDOC
57410 SCYL1
10432 RBM14
338692 ANKRD13D
5883 RAD9A
116985 ARAP1
282679 AQP11
51585 PCF11
60492 CCDC90B
54851 ANKRD49
10929 SRSF8
5049 PAFAH1B2
219902 TLCD5
219899 TBCEL
9538 EI24
219833 KCNJ5-AS1
57102 C12orf4
8079 MLF2
25977 NECAP1
79657 RPAP3
55652 SLC48A1
1019 CDK4
23041 MON2
29110 TBK1
253827 MSRB3
117177 RAB3IP
22822 PHLDA1
89795 NAV3
1407 CRY1
51228 GLTP
400073 C12orf76
5564 PRKAB1
51499 TRIAP1
387893 KMT5A
11066 SNRNP35
5901 RAN
283537 SLC46A3
2963 GTF2F2
337867 UBAC2
3621 ING1
253959 RALGAPA1
123016 TTC8
51527 GSKIP
2972 BRF1
22893 BAHD1
9325 TRIP4
5371 PML
60490 PPCDC
84219 WDR24
23059 CLUAP1
2072 ERCC4
91949 COG7
29070 CCDC113
8824 CES2
1874 E2F4
6560 SLC12A4
146198 ZFP90
5119 CHMP1A
64359 NXN
5048 PAFAH1B1
9135 RABEP1
57336 ZNF287
79736 TEFM
55813 UTP6
54475 NLE1
5193 PEX12
22794 CASC3
3292 HSD17B1
10614 HEXIM1
114881 OSBPL7
8405 SPOP
10237 SLC35B1
81558 FAM117A
55316 RSAD1
8161 COIL
6426 SRSF1
54903 MKS1
284161 GDPD1
57508 INTS2
84923 FAM104A
55028 C17orf80
6730 SRP68
9489 PGS1
9775 EIF4A3
57521 RPTOR
79643 CHMP6
5881 RAC3
55364 IMPACT
54531 MIER2
126308 MOB3A
29985 SLC39A3
51343 FZR1
6455 SH3GL1
5609 MAP2K7
79603 CERS4
93134 ZNF561
2193 FARSA
85360 SYDE1
8907 AP1M1
54858 PGPEP1
93436 ARMC6
79414 LRFN3
163087 ZNF383
84503 ZNF527
22835 ZFP30
284323 ZNF780A
29950 SERTAD1
90324 CCDC97
56006 SMG9
7773 ZNF230
9668 ZNF432
147657 ZNF480
112724 RDH13
163033 ZNF579
147694 ZNF548
100293516 ZNF587B
25799 ZNF324
55006 TRMT61B
253635 GPATCH11
92906 HNRNPLL
8491 MAP4K3
57504 MTA3
53335 BCL11A
5861 RAB1A
27332 ZNF638
129303 TMEM150A
5903 RANBP2
10254 STAM2
79828 METTL8
80067 DCAF17
129831 RBM45
3628 INPP1
6775 STAT4
9330 GTF3C3
57404 CYP20A1
377007 KLHL30
4735 SEPTIN2
80023 NRSN2
55317 AP5S1
64412 GZF1
51230 PHF20
25980 AAR2
10904 BLCAP
51006 SLC35C2
10564 ARFGEF2
11054 OGFR
80331 DNAJC5
29104 N6AMT1
84221 SPATC1L
51586 MED15
6598 SMARCB1
84700 MYO18B
402055 SRRD
84164 ASCC2
23780 APOL2
129138 ANKRD54
9463 PICK1
84247 RTL6
132001 TAMM41
23609 MKRN2
22908 SACM1L
51385 ZNF589
64925 CCDC71
11070 TMEM115
28972 SPCS1
25871 NEPRO
131601 TPRA1
7879 RAB7A
51122 COMMD2
86 ACTL6A
90407 TMEM41A
1487 CTBP1
7469 NELFA
10606 PAICS
92597 MOB1B
266812 NAP1L5
56916 SMARCAD1
55212 BBS7
90826 PRMT9
4750 NEK1
55100 WDR70
55814 BDP1
167153 TENT2
9765 ZFYVE16
55781 RIOK2
90355 MACIR
153443 SRFBP1
8572 PDLIM4
202052 DNAJC18
23438 HARS2
10826 FAXDC2
9443 MED7
5917 RARS1
8899 PRPF4B
10473 HMGN4
7746 ZSCAN9
8449 DHX16
57827 C6orf47
578 BAK1
5467 PPARD
6428 SRSF3
9477 MED20
25821 MTO1
57226 LYRM2
26235 FBXL4
91749 MFSD4B
10758 TRAF3IP2
5689 PSMB1
5575 PRKAR1B
90639 COX19
84262 PSMG3
8379 MAD1L1
54476 RNF216
221830 POLR1F
3364 HUS1
55695 NSUN5
9569 GTF2IRD1
113878 DTX2
6717 SRI
9069 CLDN12
10898 CPSF4
3268 AGFG2
5001 ORC5
60561 RINT1
64418 TMEM168
27153 ZNF777
80346 REEP4
5533 PPP3CC
23087 TRIM35
90362 FAM110B
55824 PAG1
55656 INTS8
51123 ZNF706
51105 PHF20L1
203062 TSNARE1
55958 KLHL9
54840 APTX
55035 NOL8
5998 RGS3
399665 FAM102A
84885 ZDHHC12
5900 RALGDS
6837 MED22
57109 REXO4
92715 DPH7
23708 GSPT2
29934 SNX12
139596 UPRT
64860 ARMCX5
# saveRDS(Knowlesvarlist,"data/Knowlesvarlist.RDS")