Last updated: 2026-01-05
Checks: 7 0
Knit directory: DXR_website/
This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20251201) was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 49fa865. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish or
wflow_git_commit). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: DXR_website/data/
Ignored: code/Flow_Purity_Stressor_Boxplots_EMP_250226.R
Ignored: data/CDKN1A_geneplot_Dox.RDS
Ignored: data/Cormotif_dox.RDS
Ignored: data/Cormotif_prob_gene_list.RDS
Ignored: data/Cormotif_prob_gene_list_doxonly.RDS
Ignored: data/Counts_RNA_EMPfortmiller.txt
Ignored: data/DMSO_TNN13_plot.RDS
Ignored: data/DOX_TNN13_plot.RDS
Ignored: data/DOXgeneplots.RDS
Ignored: data/ExpressionMatrix_EMP.csv
Ignored: data/Ind1_DOX_Spearman_plot.RDS
Ignored: data/Ind6REP_Spearman_list.csv
Ignored: data/Ind6REP_Spearman_plot.RDS
Ignored: data/Ind6REP_Spearman_set.csv
Ignored: data/Ind6_Spearman_plot.RDS
Ignored: data/MDM2_geneplot_Dox.RDS
Ignored: data/Master_summary_75-1_EMP_250811_final.csv
Ignored: data/Master_summary_87-1_EMP_250811_final.csv
Ignored: data/Metadata.csv
Ignored: data/Metadata_2_norep_update.RDS
Ignored: data/Metadata_update.csv
Ignored: data/QC/
Ignored: data/Recovery_flow_purity.csv
Ignored: data/SIRT1_geneplot_Dox.RDS
Ignored: data/Sample_annotated.csv
Ignored: data/SpearmanHeatmapMatrix_EMP
Ignored: data/VolcanoPlot_D144R_original_EMP_250623.png
Ignored: data/VolcanoPlot_D24R_original_EMP_250623.png
Ignored: data/VolcanoPlot_D24T_original_EMP_250623.png
Ignored: data/annot_dox.RDS
Ignored: data/annot_list_hm.csv
Ignored: data/cormotif_dxr_1.RDS
Ignored: data/cormotif_dxr_2.RDS
Ignored: data/counts/
Ignored: data/counts_DE_df_dox.RDS
Ignored: data/counts_DE_raw_data.RDS
Ignored: data/counts_raw_filt.RDS
Ignored: data/counts_raw_matrix.RDS
Ignored: data/counts_raw_matrix_EMP_250514.csv
Ignored: data/d24_Spearman_plot.RDS
Ignored: data/dge_calc_dxr.RDS
Ignored: data/dge_calc_matrix.RDS
Ignored: data/ensembl_backup_dox.RDS
Ignored: data/fC_AllCounts.RDS
Ignored: data/fC_DOXCounts.RDS
Ignored: data/featureCounts_Concat_Matrix_DOXSamples_EMP_250430.csv
Ignored: data/featureCounts_DOXdata_updatedind.RDS
Ignored: data/filcpm_colnames_matrix.csv
Ignored: data/filcpm_matrix.csv
Ignored: data/filt_gene_list_dox.RDS
Ignored: data/filter_gene_list_final.RDS
Ignored: data/final_data/
Ignored: data/gene_clustlike_motif.RDS
Ignored: data/gene_postprob_motif.RDS
Ignored: data/genedf_dxr.RDS
Ignored: data/genematrix_dox.RDS
Ignored: data/genematrix_dxr.RDS
Ignored: data/heartgenes.csv
Ignored: data/heartgenes_dox.csv
Ignored: data/image_intensity_stats_all_conditions.csv
Ignored: data/image_level_stats_all_conditions.csv
Ignored: data/image_level_summary_yH2AX_quant.csv
Ignored: data/ind_num_dox.RDS
Ignored: data/ind_num_dxr.RDS
Ignored: data/initial_cormotif.RDS
Ignored: data/initial_cormotif_dox.RDS
Ignored: data/low_nuclei_samples.csv
Ignored: data/low_veh_samples.csv
Ignored: data/new/
Ignored: data/new_cormotif_dox.RDS
Ignored: data/perc_mean_all_10X.png
Ignored: data/perc_mean_all_20X.png
Ignored: data/perc_mean_all_40X.png
Ignored: data/perc_mean_consistent_10X.png
Ignored: data/perc_mean_consistent_20X.png
Ignored: data/perc_mean_consistent_40X.png
Ignored: data/perc_median_all_10X.png
Ignored: data/perc_median_all_20X.png
Ignored: data/perc_median_all_40X.png
Ignored: data/perc_median_consistent_10X.png
Ignored: data/perc_median_consistent_20X.png
Ignored: data/perc_median_consistent_40X.png
Ignored: data/plot_leg_d.RDS
Ignored: data/plot_leg_d_horizontal.RDS
Ignored: data/plot_leg_d_vertical.RDS
Ignored: data/plot_theme_custom.RDS
Ignored: data/process_gene_data_funct.RDS
Ignored: data/raw_mean_all_10X.png
Ignored: data/raw_mean_all_20X.png
Ignored: data/raw_mean_all_40X.png
Ignored: data/raw_mean_consistent_10X.png
Ignored: data/raw_mean_consistent_20X.png
Ignored: data/raw_mean_consistent_40X.png
Ignored: data/raw_median_all_10X.png
Ignored: data/raw_median_all_20X.png
Ignored: data/raw_median_all_40X.png
Ignored: data/raw_median_consistent_10X.png
Ignored: data/raw_median_consistent_20X.png
Ignored: data/raw_median_consistent_40X.png
Ignored: data/ruv/
Ignored: data/summary_all_IF_ROI.RDS
Ignored: data/tableED_GOBP.RDS
Ignored: data/tableESR_GOBP_postprob.RDS
Ignored: data/tableLD_GOBP.RDS
Ignored: data/tableLR_GOBP_postprob.RDS
Ignored: data/tableNR_GOBP.RDS
Ignored: data/tableNR_GOBP_postprob.RDS
Ignored: data/table_incsus_genes_GOBP_RUV.RDS
Ignored: data/table_motif1_GOBP_d.RDS
Ignored: data/table_motif1_genes_GOBP.RDS
Ignored: data/table_motif2_GOBP_d.RDS
Ignored: data/table_motif3_genes_GOBP.RDS
Ignored: data/thing.R
Ignored: data/top.table_V.D144r_dox.RDS
Ignored: data/top.table_V.D24_dox.RDS
Ignored: data/top.table_V.D24r_dox.RDS
Ignored: data/yH2AX_75-1_EMP_250811.csv
Ignored: data/yH2AX_87-1_EMP_250811.csv
Ignored: data/yH2AX_Boxplots_MeanIntensity_10X_EMP_250811.png
Ignored: data/yH2AX_Boxplots_MeanIntensity_10X_NormtoMax_EMP_250811.png
Ignored: data/yH2AX_Boxplots_MeanIntensity_20X_EMP_250811.png
Ignored: data/yH2AX_Boxplots_MeanIntensity_20X_NormtoMax_EMP_250811.png
Ignored: data/yH2AX_Boxplots_MedianIntensity_10X_NormtoMax_EMP_250811.png
Ignored: data/yH2AX_Boxplots_MedianIntensity_20X_NormtoMax_EMP_250811.png
Ignored: data/yH2AX_Boxplots_MedianIntensity_NormtoMax_EMP_250811.png
Ignored: data/yH2AX_Boxplots_Quant_10X_EMP_250817.png
Ignored: data/yH2AX_Boxplots_Quant_20X_EMP_250817.png
Ignored: data/yh2ax_all_df_EMP_250812.csv
Unstaged changes:
Modified: analysis/DXR_Project_Analysis.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (DXR_website/analysis/Figure2.Rmd)
and HTML (DXR_website/docs/Figure2.html) files. If you’ve
configured a remote Git repository (see ?wflow_git_remote),
click on the hyperlinks in the table below to view the files as they
were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| Rmd | 49fa865 | emmapfort | 2026-01-05 | Final Updates to Website EMP 01/05/26 |
| html | f088a00 | emmapfort | 2025-12-09 | website updates |
| Rmd | e05f0cc | emmapfort | 2025-12-09 | workflowr::wflow_publish(files = c("analysis/Figure2.Rmd", "analysis/about.Rmd", |
| Rmd | a967d53 | emmapfort | 2025-12-09 | Update website and analysis chunks |
| html | a967d53 | emmapfort | 2025-12-09 | Update website and analysis chunks |
| Rmd | da1f69c | emmapfort | 2025-12-02 | updated analysis 12/02/25 |
Load necessary libraries for analysis.
library(tidyverse)
library(readxl)
library(readr)
library(ggrepel)
library(ggfortify)
library(RUVSeq)
library(patchwork)
library(eulerr)
library(org.Hs.eg.db)
library(AnnotationDbi)
library(ggVennDiagram)
library(ggrastr)
Define my custom plot theme
# Define the custom theme
# plot_theme_custom <- function() {
# theme_minimal() +
# theme(
# #line for x and y axis
# axis.line = element_line(linewidth = 1,
# color = "black"),
#
# #axis ticks only on x and y, length standard
# axis.ticks.x = element_line(color = "black",
# linewidth = 1),
# axis.ticks.y = element_line(color = "black",
# linewidth = 1),
# axis.ticks.length = unit(0.05, "in"),
#
# #text and font
# axis.text = element_text(color = "black",
# family = "Arial",
# size = 7),
# axis.title = element_text(color = "black",
# family = "Arial",
# size = 9),
# legend.text = element_text(color = "black",
# family = "Arial",
# size = 7),
# legend.title = element_text(color = "black",
# family = "Arial",
# size = 9),
# plot.title = element_text(color = "black",
# family = "Arial",
# size = 10),
#
# #blank background and border
# panel.background = element_blank(),
# panel.border = element_blank(),
#
# #gridlines for alignment
# panel.grid.major = element_line(color = "grey80", linewidth = 0.5), #grey major grid for align in illus
# panel.grid.minor = element_line(color = "grey90", linewidth = 0.5) #grey minor grid for align in illus
# )
# }
# saveRDS(plot_theme_custom, "data/plot_theme_custom.RDS")
theme_custom <- readRDS("data/plot_theme_custom.RDS")
####Colors####
# txtime_col_old <- list(
# "DOX_24T" = "#263CA5",
# "VEH_24T" = "#444343",
# "DOX_24R" = "#5B8FD1",
# "VEH_24R" = "#757575",
# "DOX_144R" = "#89C5E5",
# "VEH_144R" = "#AAAAAA"
# )
#updated colors and factors:
ind_col <- list(
"1" = "#FF9F64",
"2" = "#78EBDA",
"3" = "#ADCD77",
"4" = "#E6CC50",
"5" = "#A68AC0",
"6" = "#F16F90",
"6R" = "#C61E4E"
)
ind_col_nr <- list(
"1" = "#FF9F64",
"2" = "#78EBDA",
"3" = "#ADCD77",
"4" = "#E6CC50",
"5" = "#A68AC0",
"6" = "#F16F90"
)
time_col <- list(
"t0" = "#005A4C",
"t24" = "#328477",
"t144" = "#8FB9B1")
tx_col <- list(
"DOX" = "#499FBD",
"VEH" = "#BBBBBC")
txtime_col <- list(
"DOX_t0" = "#263CA5",
"VEH_t0" = "#444343",
"DOX_t24" = "#5B8FD1",
"VEH_t24" = "#757575",
"DOX_t144" = "#89C5E5",
"VEH_t144" = "#AAAAAA"
)
Define a function to save plots as .pdf and .png
save_plot <- function(plot, filename,
folder = ".",
width = 8,
height = 6,
units = "in",
dpi = 300,
add_date = TRUE) {
if (missing(filename)) stop("Please provide a filename (without extension) for the plot.")
date_str <- if (add_date) paste0("_", format(Sys.Date(), "%y%m%d")) else ""
pdf_file <- file.path(folder, paste0(filename, date_str, ".pdf"))
png_file <- file.path(folder, paste0(filename, date_str, ".png"))
ggsave(filename = pdf_file, plot = plot, device = cairo_pdf, width = width, height = height, units = units, bg = "transparent")
ggsave(filename = png_file, plot = plot, device = "png", width = width, height = height, units = units, dpi = dpi, bg = "transparent")
message("Saved plot as Cairo PDF: ", pdf_file)
message("Saved plot as PNG: ", png_file)
}
output_folder <- "C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs"
#save plot function created
#to use: just define the plot name, filename_base, width, height
Principal component analysis of RNAseq data following RUVs correction.
#read in dataframe after cpm transformation and k=1 RUVs correction
#add in the annotation files
ind_num <- readRDS("data/QC/ind_num.RDS")
annot <- read.csv("data/QC/Metadata_update.csv")
RUV_df_rm1_cpm <- readRDS("data/Fig2/RUV_df_rm1_cpm.RDS")
prcomp_res_1_cpm <- prcomp(t(RUV_df_rm1_cpm), scale. = FALSE, center = TRUE)
annot_prcomp_res_1_cpm <- readRDS("data/Fig2/annot_pca_cpm_fig2a.RDS")
fig2a <- autoplot(prcomp_res_1_cpm,
data = annot,
colour = "Cond",
shape = "Time",
size = 5,
x = 1,
y = 2) +
theme_custom() +
scale_color_manual(values = txtime_col) +
geom_text_repel(aes(label = ind_num),
nudge_y = 0.05,
max.overlaps = 100,
segment.color = "grey40") +
ggtitle("RUVs Correction k=1 log2cpm")
print(fig2a)

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#save for figure generation
# save_plot(plot = fig2a,
# filename = "Fig2A_PCA_6x8_EMP",
# folder = output_folder,
# width = 6,
# height = 8)
#Perform Differential Expression Analysis with k = 1 RUVs Correction
#same DGEList object as before
dge <- readRDS("data/DE/dge_matrix.RDS")
Metadata_2 <- readRDS("data/QC/Metadata_2_norep_update.RDS")
x_norep <- readRDS("data/DE/x_norep.RDS")
#check normalization factors from TMM normalization of LIBRARIES
dge$samples
group lib.size norm.factors
87-1_DOX_24 DOX_t0 19935097 1.0526605
87-1_DMSO_24 VEH_t0 21302879 0.9773889
87-1_DOX_24+24 DOX_t24 25636959 1.0751043
87-1_DMSO_24+24 VEH_t24 26319662 0.9940323
87-1_DOX_24+144 DOX_t144 23463426 0.9003102
87-1_DMSO_24+144 VEH_t144 25840938 0.9888449
78-1_DOX_24 DOX_t0 23085807 0.7676077
78-1_DMSO_24 VEH_t0 25610495 1.0077383
78-1_DOX_24+24 DOX_t24 18083930 1.1682704
78-1_DMSO_24+24 VEH_t24 24331177 0.9906872
78-1_DOX_24+144 DOX_t144 19754391 0.9941834
78-1_DMSO_24+144 VEH_t144 22641509 1.0010734
75-1_DOX_24 DOX_t0 20583626 1.0676786
75-1_DMSO_24 VEH_t0 28166198 1.0031906
75-1_DOX_24+24 DOX_t24 25831427 1.1530208
75-1_DMSO_24+24 VEH_t24 26081158 1.0058953
75-1_DOX_24+144 DOX_t144 24659898 0.9261599
75-1_DMSO_24+144 VEH_t144 25412931 0.9703454
84-1_DOX_24 DOX_t0 23393931 0.9745263
84-1_DMSO_24 VEH_t0 22853195 0.9565797
84-1_DOX_24+24 DOX_t24 23846995 1.1659432
84-1_DMSO_24+24 VEH_t24 21299355 0.9649641
84-1_DOX_24+144 DOX_t144 18222568 0.9913625
84-1_DMSO_24+144 VEH_t144 28115884 0.9653464
17-3_DOX_24 DOX_t0 22518848 0.9766893
17-3_DMSO_24 VEH_t0 24589534 0.9612345
17-3_DOX_24+24 DOX_t24 24797547 1.1703079
17-3_DMSO_24+24 VEH_t24 25977536 0.9509690
17-3_DOX_24+144 DOX_t144 27447106 0.9422729
17-3_DMSO_24+144 VEH_t144 24893583 0.9356377
90-1_DOX_24 DOX_t0 25187428 1.0311957
90-1_DMSO_24 VEH_t0 25630519 1.0283437
90-1_DOX_24+24 DOX_t24 26138399 1.1183471
90-1_DMSO_24+24 VEH_t24 24430396 0.9988688
90-1_DOX_24+144 DOX_t144 23323463 0.9496884
90-1_DMSO_24+144 VEH_t144 25424152 0.9872926
#read in my covariate information RUV_1 dataframe + remove 6R
RUV_df_rm1 <- readRDS("data/DE/RUV_df_rm1.RDS")
#filter out 6R for DE analysis
RUV_1_df_filt <- RUV_df_rm1[!grepl("6R$", rownames(RUV_df_rm1)), , drop = FALSE]
Metadata_2$W_1 <- RUV_1_df_filt$W_1
#now ensure that RUV_1 has the right number of cols after removing rep
length(RUV_1_df_filt$W_1)
[1] 36
#36
#now make this into a list
RUV_1_DE <- RUV_1_df_filt$W_1
# saveRDS(RUV_1_DE, "data/DE/RUV_1_DE.RDS")
#create my design matrix for DE + RUVs covariate k=1
design1 <- model.matrix(~0 + RUV_1_DE + Metadata_2$Condition)
colnames(design1) <- gsub("Metadata_2\\$Condition", "", colnames(design1))
#take care that the matrix automatically sorts cols alphabetically
#run duplicate correlation for individual effect
corfit1 <- duplicateCorrelation(object = dge$counts, design = design1, block = Metadata_2$Ind)
#voom transformation and plot
v1 <- voom(dge, design1, block = Metadata_2$Ind, correlation = corfit1$consensus.correlation, plot = TRUE)

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#fit my linear model
fit1 <- lmFit(v1, design1, block = Metadata_2$Ind, correlation = corfit1$consensus.correlation)
#make my contrast matrix to compare across tx and veh
contrast_matrix_RUV <- makeContrasts(
V.D24T = DOX_t0 - VEH_t0,
V.D24R = DOX_t24 - VEH_t24,
V.D144R = DOX_t144 - VEH_t144,
RUV_1_24T = RUV_1_DE - VEH_t0,
RUV_1_24R = RUV_1_DE - VEH_t24,
RUV_1_144R = RUV_1_DE - VEH_t144,
levels = design1
)
#apply these contrasts to compare DOX to DMSO VEH
fit_RUV <- contrasts.fit(fit1, contrast_matrix_RUV)
fit_RUV <- eBayes(fit_RUV)
#plot the mean-variance trend
plotSA(fit_RUV, main = "Mean-Variance Trend, RUVs k=1")

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#look at the summary of your results
##this tells you the number of DEGs in each condition
results_summary1 <- decideTests(fit_RUV, adjust.method = "BH", p.value = 0.05)
summary(results_summary1)
V.D24T V.D24R V.D144R RUV_1_24T RUV_1_24R RUV_1_144R
Down 4866 3727 466 10518 10519 10513
NotSig 4899 7107 13659 2905 2899 2889
Up 4554 3485 194 896 901 917
# V.D24T V.D24R V.D144R
# Down 4866 3727 466
# NotSig 4899 7107 13659
# Up 4554 3485 194
#compare this to my previous DEGs I found
# summary(results_summary)
# V.D24T V.D24R V.D144R
# Down 4723 3593 359
# NotSig 5076 7151 13810
# Up 4520 3575 150
#overall, there are more DEGs found after RUVs k=1 correction
#this was an expected result as it increases tx effect w/ correction
##Generate Toptables RUVs
# Generate Top Table for Specific Comparisons
x <- readRDS("data/DE/x_dge_counts_allind_updated.RDS")
Toptable_V.D24T_k1 <- topTable(fit = fit_RUV, coef = "V.D24T", number = nrow(x), adjust.method = "BH", p.value = 1, sort.by = "none")
# write.csv(Toptable_V.D24T_k1, "data/DE/Toptable_V.D24T_k1.csv")
Toptable_V.D24R_k1 <- topTable(fit = fit_RUV, coef = "V.D24R", number = nrow(x), adjust.method = "BH", p.value = 1, sort.by = "none")
# write.csv(Toptable_V.D24R_k1, "data/DE/Toptable_V.D24R_k1.csv")
Toptable_V.D144R_k1 <- topTable(fit = fit_RUV, coef = "V.D144R", number = nrow(x), adjust.method = "BH", p.value = 1, sort.by = "none")
# write.csv(Toptable_V.D144R_k1, "data/DE/Toptable_V.D144R_k1.csv")
# save all of these toptables as R objects
# saveRDS(list(
# V.D24T_k1 = Toptable_V.D24T_k1,
# V.D24R_k1 = Toptable_V.D24R_k1,
# V.D144R_k1 = Toptable_V.D144R_k1
# ), file = "data/DE/Toptable_list_RUVk1.RDS")
# Toptable_list_RUVk1 <- readRDS("data/DE/Toptable_list_RUVk1.RDS")
#################################################################
#this section is commented out since it only needs to run once
#kept the code for posterity
# #I want to add the hgnc symbols to my toptables as well
# ####D24T####
# #load in data
# sample_toptab_24T <- read_csv("C:/Users/emmap/RDirectory/Recovery_RNAseq/Recovery_5FU/data/new/DEGs/Toptable_V.D24T_k1.csv", show_col_types = FALSE)
# # ----------------- Ensure Entrez_ID is Present and in Character Format -----------------
# # Check column names
# print(colnames(sample_toptab_24T))
# # Rename if needed (adjust if the column name is not exactly 'Entrez_ID')
# colnames(sample_toptab_24T)[1] <- "Entrez_ID"
# # Convert Entrez_ID to character
# sample_toptab_24T <- sample_toptab_24T %>%
# mutate(Entrez_ID = as.character(Entrez_ID))
# # ----------------- Map Entrez_ID to Gene Symbol -----------------
# gene_symbols1 <- AnnotationDbi::select(
# org.Hs.eg.db,
# keys = sample_toptab_24T$Entrez_ID,
# columns = c("SYMBOL"),
# keytype = "ENTREZID"
# )
# # ----------------- Join Back to Main Data -----------------
# Toptable_RUV_24T <- left_join(sample_toptab_24T, gene_symbols1, by = c("Entrez_ID" = "ENTREZID"))
# Toptable_RUV_24T %>% column_to_rownames(., var = "Entrez_ID")
# # ----------------- Save Annotated Output -----------------
# write_csv(Toptable_RUV_24T, "data/DE/Toptable_RUV_24T.csv")
#
# Toptable_RUV_24T <- read.csv("data/DE/Toptable_RUV_24T.csv")
# #now do this again for my other two toptables
#
# ####24R####
# #load in data
# sample_toptab_24R <- read_csv("C:/Users/emmap/RDirectory/Recovery_RNAseq/Recovery_5FU/data/new/DEGs/Toptable_V.D24R_k1.csv", show_col_types = FALSE)
# # ----------------- Ensure Entrez_ID is Present and in Character Format -----------------
# # Check column names
# print(colnames(sample_toptab_24R))
# # Rename if needed (adjust if the column name is not exactly 'Entrez_ID')
# colnames(sample_toptab_24R)[1] <- "Entrez_ID"
# # Convert Entrez_ID to character
# sample_toptab_24R <- sample_toptab_24R %>%
# mutate(Entrez_ID = as.character(Entrez_ID))
# ----------------- Map Entrez_ID to Gene Symbol -----------------
# gene_symbols2 <- AnnotationDbi::select(
# org.Hs.eg.db,
# keys = sample_toptab_24R$Entrez_ID,
# columns = c("SYMBOL"),
# keytype = "ENTREZID"
# )
# # ----------------- Join Back to Main Data -----------------
# Toptable_RUV_24R <- left_join(sample_toptab_24R, gene_symbols2, by = c("Entrez_ID" = "ENTREZID"))
# Toptable_RUV_24R %>% column_to_rownames(., var = "Entrez_ID")
# # ----------------- Save Annotated Output -----------------
# #I'll make these symbols into rownames later for my volcano plots
# write_csv(Toptable_RUV_24R, "data/DE/Toptable_RUV_24R.csv")
#
# Toptable_RUV_24R <- read.csv("data/DE/Toptable_RUV_24R.csv")
#
# ####D144R####
# #load in data
# sample_toptab_144R <- read_csv("C:/Users/emmap/RDirectory/Recovery_RNAseq/Recovery_5FU/data/new/DEGs/Toptable_V.D144R_k1.csv", show_col_types = FALSE)
# # ----------------- Ensure Entrez_ID is Present and in Character Format -----------------
# # Check column names
# print(colnames(sample_toptab_144R))
# # Rename if needed (adjust if the column name is not exactly 'Entrez_ID')
# colnames(sample_toptab_144R)[1] <- "Entrez_ID"
# # Convert Entrez_ID to character
# sample_toptab_144R <- sample_toptab_144R %>%
# mutate(Entrez_ID = as.character(Entrez_ID))
# # ----------------- Map Entrez_ID to Gene Symbol -----------------
# gene_symbols3 <- AnnotationDbi::select(
# org.Hs.eg.db,
# keys = sample_toptab_144R$Entrez_ID,
# columns = c("SYMBOL"),
# keytype = "ENTREZID"
# )
# # ----------------- Join Back to Main Data -----------------
# Toptable_RUV_144R <- left_join(sample_toptab_144R, gene_symbols3, by = c("Entrez_ID" = "ENTREZID"))
# Toptable_RUV_144R %>% column_to_rownames(., var = "Entrez_ID")
# # ----------------- Save Annotated Output -----------------
# #I'll make these symbols into rownames later for my volcano plots
# write_csv(Toptable_RUV_144R, "data/DE/Toptable_RUV_144R.csv")
#
# write.csv(Toptable_RUV_24T, "data/DE/Toptable_RUV_24T_final.csv")
# write.csv(Toptable_RUV_24R, "data/DE/Toptable_RUV_24R_final.csv")
# write.csv(Toptable_RUV_144R, "data/DE/Toptable_RUV_144R_final.csv")
Toptable_RUV_24T <- read_csv("data/DE/Toptable_RUV_24T_final.csv",
col_select = -1)
New names:
Rows: 14319 Columns: 8
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
Toptable_RUV_24R <- read_csv("data/DE/Toptable_RUV_24R_final.csv",
col_select = -1)
New names:
Rows: 14319 Columns: 8
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
Toptable_RUV_144R <- read_csv("data/DE/Toptable_RUV_144R_final.csv",
col_select = -1)
New names:
Rows: 14319 Columns: 8
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
# save all of these toptables as R objects
# saveRDS(list(
# V.D24T_RUV = Toptable_RUV_24T,
# V.D24R_RUV = Toptable_RUV_24R,
# V.D144R_RUV = Toptable_RUV_144R
# ), file = "data/DE/Toptable_list_RUVk1_Symbols.RDS")
Toptable_list_RUVk1_symbols <- readRDS("data/DE/Toptable_list_RUVk1_Symbols.RDS")
#Figure 2B - Barplot of DEG Percentages
###Proportion of DEGs in Gene Set RUVs corrected
#check the proportion of my genes in my gene set that are DEGs by timepoint
#make this into a pie chart or a bar graph
#read in my full gene set
filcpm_matrix <- readRDS("data/QC/filcpm_final_matrix_newind.RDS")
all_genes <- rownames(filcpm_matrix)
#convert Entrez_ID to Gene Symbols
all_genes_set <- mapIds(org.Hs.eg.db,
keys = all_genes,
column = "SYMBOL",
keytype = "ENTREZID",
multiVals = "first")
'select()' returned 1:1 mapping between keys and columns
# Load DEGs Data (replaced with new set)
DOX_t0 <- read_csv("data/DE/Toptable_RUV_24T_final.csv",
col_select = -1)
New names:
• `` -> `...1`
Rows: 14319 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): SYMBOL
dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
DOX_t24 <- read_csv("data/DE/Toptable_RUV_24R_final.csv",
col_select = -1)
New names:
Rows: 14319 Columns: 8
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
DOX_t144 <- read_csv("data/DE/Toptable_RUV_144R_final.csv",
col_select = -1)
New names:
Rows: 14319 Columns: 8
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
# Extract Significant DEGs (change names to timepoint)
DEG_t0_RUV <- DOX_t0$Entrez_ID[DOX_t0$adj.P.Val < 0.05]
DEG_t24_RUV <- DOX_t24$Entrez_ID[DOX_t24$adj.P.Val < 0.05]
DEG_t144_RUV <- DOX_t144$Entrez_ID[DOX_t144$adj.P.Val < 0.05]
# saveRDS(DEG_t0_RUV, "data/DE/DEGs_sig_list_24T_RUVs_set.RDS")
# saveRDS(DEG_t24_RUV, "data/DE/DEGs_sig_list_24R_RUVs_set.RDS")
# saveRDS(DEG_t144_RUV, "data/DE/DEGs_sig_list_144R_RUVs_set.RDS")
#make a list of all of my significant DEGs
timepoint_DEGs_RUV <- list(
"24T" = DEG_t0_RUV,
"24R" = DEG_t24_RUV,
"144R" = DEG_t144_RUV
)
# Function to compute proportions
get_proportions_degs <- function(degs, total_genes) {
deg_count <- length(degs)
non_deg_count <- total_genes - deg_count
data.frame(
Category = c("DEGs", "Non-DEGs"),
Count = c(deg_count, non_deg_count),
Proportion = c(deg_count / total_genes, non_deg_count / total_genes)
)
}
# Compute proportions for each timepoint
prop_t0 <- get_proportions_degs(DEG_t0_RUV, length(all_genes_set))
prop_t24 <- get_proportions_degs(DEG_t24_RUV, length(all_genes_set))
prop_t144 <- get_proportions_degs(DEG_t144_RUV, length(all_genes_set))
# Combine for bar plot
prop_bar <- rbind(
cbind(Timepoint = "t0", prop_t0),
cbind(Timepoint = "t24", prop_t24),
cbind(Timepoint = "t144", prop_t144)
)
#pie chart for t0
ggplot(prop_t0, aes(x = "", y = Count, fill = Category)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y") +
labs(title = "DEGs vs Non-DEGs (t0)") +
theme_void()

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#pie chart for t24
ggplot(prop_t24, aes(x = "", y = Count, fill = Category)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y") +
labs(title = "DEGs vs Non-DEGs (t24)") +
theme_void()

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#pie chart for t144
ggplot(prop_t144, aes(x = "", y = Count, fill = Category)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y") +
labs(title = "DEGs vs Non-DEGs (t144)") +
theme_void()

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#put the desired colors here for your pie charts
degs_prop_colors <- c("DEGs" = "red", "Non-DEGs" = "blue")
#labels for percentage based on proportion
prop_t0$Label <- paste0(round(prop_t0$Proportion * 100, 1), "")
prop_t24$Label <- paste0(round(prop_t24$Proportion * 100, 1), "")
prop_t144$Label <- paste0(round(prop_t144$Proportion * 100, 1), "")
#create each plot and assign to variables
plot_t0 <- ggplot(prop_t0, aes(x = "", y = Count, fill = Category)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y") +
geom_text(aes(label = Label),
position = position_stack(vjust = 0.6),
color = "white",
size = 4) +
scale_fill_manual(values = degs_prop_colors) +
labs(title = "t0") +
theme_void()
plot_t24 <- ggplot(prop_t24, aes(x = "", y = Count, fill = Category)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y") +
geom_text(aes(label = Label),
position = position_stack(vjust = 0.6),
color = "white",
size = 4) +
scale_fill_manual(values = degs_prop_colors) +
labs(title = "t24") +
theme_void()
plot_t144 <- ggplot(prop_t144, aes(x = "", y = Count, fill = Category)) +
geom_bar(stat = "identity", width = 1) +
coord_polar("y") +
geom_text(aes(label = Label),
position = position_stack(vjust = 0.6),
color = "white",
size = 4) +
scale_fill_manual(values = degs_prop_colors) +
labs(title = "t144") +
theme_void()
#combine all three pie charts in a row
combined_prop_degs_plots <-
plot_t0 +
plot_t24 +
plot_t144 +
plot_layout(ncol = 3, nrow = 1, guides = "collect") +
plot_annotation(title = "Proportion of DEGs vs Non-DEGs")
#show the combined pie chart plot
combined_prop_degs_plots

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#combine all three pie charts in a column
combined_prop_degs_plots_col <-
plot_t0 +
plot_t24 +
plot_t144 +
plot_layout(ncol = 1, nrow = 3, guides = "collect") +
plot_annotation(title = "Proportion of DEGs vs Non-DEGs")
#show the combined pie chart plot in 1 column
combined_prop_degs_plots_col

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
####proportion bar plots####
#first factor the timepoints so that they are in the right order
prop_bar$Timepoint <- factor(prop_bar$Timepoint,
levels = c("t0", "t24", "t144"))
#percentage labels
prop_bar$Label <- paste0(round(prop_bar$Proportion * 100, 1), "")
#plot the bar plot
barplot_deg_perc <- ggplot(prop_bar, aes(x = Timepoint,
y = Proportion,
fill = Category)) +
geom_bar(stat = "identity",
position = "fill") +
geom_text(aes(label = Label),
position = position_fill(vjust = 0.5),
color = "white",
size = 4) +
scale_fill_manual(values = degs_prop_colors) +
labs(title = "Percentage of DEGs vs nonDEGs by Timepoint",
y = "Percentage of DEGs (%)",
x = "") +
scale_y_continuous(labels = c(0, 25, 50, 75, 100)) +
theme_custom()
print(barplot_deg_perc)

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#save this barplot
# save_plot(
# plot = barplot_deg_perc,
# filename = "DEGvsnonDEG_Barplot_update_EMP",
# folder = output_folder
# )
##Venn Diagrams of DEG Overlap after RUVs Correction k=1
#plot a venn diagram with all of your conditions from your toptables
# Load DEGs Data (already read in above)
# DOX_t0 <- read_csv("data/DE/Toptable_RUV_24T_final.csv",
# col_select = -1)
# DOX_t24 <- read_csv("data/DE/Toptable_RUV_24R_final.csv",
# col_select = -1)
# DOX_t144 <- read_csv("data/DE/Toptable_RUV_144R_final.csv",
# col_select = -1)
# Extract Significant DEGs (change names to timepoint)
# DEG_t0_RUV <- DOX_t0$Entrez_ID[DOX_t0$adj.P.Val < 0.05]
# DEG_t24_RUV <- DOX_t24$Entrez_ID[DOX_t24$adj.P.Val < 0.05]
# DEG_t144_RUV <- DOX_t144$Entrez_ID[DOX_t144$adj.P.Val < 0.05]
venn_test_RUV <- list(DEG_t0_RUV, DEG_t24_RUV, DEG_t144_RUV)
ggVennDiagram(
venn_test_RUV,
category.names = c("DOX_t0", "DOX_t24", "DOX_t144")
) + ggtitle("DXR Specific and Shared DEGs RUVs")+
theme_custom() +
theme(
plot.title = element_text(size = 16, face = "bold"), # Increase title size
text = element_text(size = 16) # Increase text size globally
)

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
####I want to make this proportional
#use the eulerr package to do so
venn_prop_RUV <- euler(list(
"t0" = DEG_t0_RUV,
"t24" = DEG_t24_RUV,
"t144" = DEG_t144_RUV
))
plot(venn_prop_RUV,
fills = list(fill = c("#263CA5", "#5B8FD1", "#89C5E5"),
alpha = 1),
labels = list(font = 4),
quantities = list(cex = 0.75),
main = "")

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#Load DEGs Data
#make a list of all of the genes in this set so I can plot the logFC in other sets
# Load DEGs Data (already read in above)
# DOX_t0 <- read_csv("data/DE/Toptable_RUV_24T_final.csv",
# col_select = -1)
# DOX_t24 <- read_csv("data/DE/Toptable_RUV_24R_final.csv",
# col_select = -1)
# DOX_t144 <- read_csv("data/DE/Toptable_RUV_144R_final.csv",
# col_select = -1)
DOX_24T_R <- read_csv("data/DE/Toptable_RUV_24T.csv")
Rows: 14319 Columns: 8
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (1): SYMBOL
dbl (7): Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
DOX_24R_R <- read_csv("data/DE/Toptable_RUV_24R_final.csv")
New names:
Rows: 14319 Columns: 9
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (8): ...1, Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
DOX_144R_R <- read_csv("data/DE/Toptable_RUV_144R_final.csv")
New names:
Rows: 14319 Columns: 9
── Column specification
──────────────────────────────────────────────────────── Delimiter: "," chr
(1): SYMBOL dbl (8): ...1, Entrez_ID, logFC, AveExpr, t, P.Value, adj.P.Val, B
ℹ Use `spec()` to retrieve the full column specification for this data. ℹ
Specify the column types or set `show_col_types = FALSE` to quiet this message.
• `` -> `...1`
#plot those full gene sets in logFC
D24T_DEGs_R <- DOX_t0$Entrez_ID[DOX_t0$adj.P.Val < 0.05]
length(D24T_DEGs_R)
[1] 9420
#9420 DEGs
D24R_DEGs_R <- DOX_t24$Entrez_ID[DOX_t24$adj.P.Val < 0.05]
length(D24R_DEGs_R)
[1] 7212
#7168 DEGs
D144R_DEGs_R <- DOX_t144$Entrez_ID[DOX_t144$adj.P.Val < 0.05]
length(D144R_DEGs_R)
[1] 660
#660 DEGs
#now that I have the full list of genes, I want to plot the logFC across conditions
#Combine the toptables I have from pairwise analysis into a single dataframe
d24_toptable_dxr_1 <- Toptable_RUV_24T %>%
mutate(Time = "t0")
d24r_toptable_dxr_1 <- Toptable_RUV_24R %>%
mutate(Time = "t24")
d144r_toptable_dxr_1 <- Toptable_RUV_144R %>%
mutate(Time = "t144")
combined_toptables_dxr_RUV <- bind_rows(
d24_toptable_dxr_1,
d24r_toptable_dxr_1,
d144r_toptable_dxr_1)
# saveRDS(combined_toptables_dxr_RUV, "data/DE/combined_toptables_dxr_RUV.RDS")
# Define the timepoints from your data
timepoints <- c("t0", "t24", "t144")
comparison_order <- c("t0 vs t24", "t24 vs t144", "t0 vs t144")
# Make all pairwise combinations
pairs_df <- t(combn(timepoints, 2)) %>%
as.data.frame(stringsAsFactors = FALSE) %>%
setNames(c("time_x", "time_y"))
logFC_wide_RUV <- combined_toptables_dxr_RUV %>%
mutate(absFC = abs(logFC)) %>%
dplyr::select(Entrez_ID, Time, logFC) %>%
pivot_wider(names_from = Time, values_from = logFC) %>%
drop_na() %>%
column_to_rownames("Entrez_ID")
# Function to prepare scatter data for each pair
prepare_pair_data <- function(time_x, time_y, df) {
rho <- cor(df[[time_x]], df[[time_y]], method = "spearman")
df %>%
mutate(
x_val = .data[[time_x]],
y_val = .data[[time_y]],
comparison = paste0(time_x, " vs ", time_y),
rho_label = paste0("rho = ", round(rho, 2)),
x_lab = paste0("log2FC", time_x),
y_lab = paste0("log2FC", time_y)
)
}
comparison_colors <- c(
"t0 vs t24" = "#4167BD",
"t24 vs t144" = "#72AEE0",
"t0 vs t144" = "#5780C5"
)
# Build combined plotting dataset
scatter_df <- map2_df(pairs_df$time_x, pairs_df$time_y,
~prepare_pair_data(.x, .y, logFC_wide_RUV)) %>%
mutate(comparison = factor(comparison, levels = comparison_order))
facet_labels <- scatter_df %>%
distinct(comparison, x_lab, y_lab) %>%
mutate(label = paste0(comparison, "\n", x_lab, " | ", y_lab)) %>%
dplyr::select(comparison, label) %>%
deframe()
#split data by comparison
split_data <- split(scatter_df, scatter_df$comparison)
make_scatter <- function(df_sub) {
comp <- df_sub$comparison[1]
ggplot(df_sub, aes(x = x_val, y = y_val)) +
geom_point_rast(alpha = 0.5, raster.dpi = 300, color = comparison_colors[comp]) +
geom_smooth(method = "lm", se = FALSE, linetype = "dashed", color = "black") +
geom_text(
data = df_sub %>% distinct(comparison, rho_label),
aes(x = -Inf, y = Inf, label = rho_label),
inherit.aes = FALSE,
hjust = -0.1, vjust = 1.5,
size = 4
) +
coord_fixed(xlim = c(-9,9), ylim = c(-9,9), ratio = 1) +
labs(
x = unique(df_sub$x_lab),
y = unique(df_sub$y_lab)
) +
theme_custom() +
theme(
axis.line = element_line(linewidth = 1,
color = "black"),
axis.ticks.length = unit(0.05, "in"),
axis.text = element_text(color = "black"),
panel.background = element_blank(),
panel.border = element_blank(),
legend.position = "none"
)
}
#make plot
plot_list <- lapply(split_data, make_scatter)
#wrap all three together to get the final plot
scatterplot_fig2d_wide <- wrap_plots(plot_list, ncol = length(plot_list))
scatterplot_fig2d_tall <- wrap_plots(plot_list, nrow = length(plot_list))
scatterplot_fig2d_wide
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
scatterplot_fig2d_tall
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'
`geom_smooth()` using formula = 'y ~ x'

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#save plot to a pdf and to a png
# ggsave("Fig2D_Scatterplot_fixedlims_EMP_251121.pdf", scatterplot_fig2d_wide, width = 8, height = 6, device = cairo_pdf, path = "C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs")
# ggsave("Fig2D_Scatterplot_fixedlims_EMP_251121.png", scatterplot_fig2d_wide, width = 12, height = 4, dpi = 300, type = "cairo", path = "C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs")
#make a function to generate volcano plots + add gene numbers
#ensure the gene symbols are in a col so I can plot names of top 15
generate_volcano_plot_RUV <- function(toptable, title) {
#make significance labels
toptable$Significance <- "Not Significant"
toptable$Significance[toptable$logFC > 0 & toptable$adj.P.Val < 0.05] <- "Upregulated"
toptable$Significance[toptable$logFC < 0 & toptable$adj.P.Val < 0.05] <- "Downregulated"
#enforce correct factor order
toptable$Significance <- factor(
toptable$Significance,
levels = c("Downregulated", "Not Significant", "Upregulated")
)
#add number of genes for each significance label
upgenes <- toptable %>%
filter(Significance == "Upregulated") %>%
nrow()
nsgenes <- toptable %>%
filter(Significance == "Not Significant") %>%
nrow()
downgenes <- toptable %>%
filter(Significance == "Downregulated") %>%
nrow()
#FIX: correct legend labels
legend_lab <- c(
"Upregulated" = paste0("up = ", upgenes),
"Not Significant" = paste0("ns = ", nsgenes),
"Downregulated" = paste0("down = ", downgenes)
)
#FIX: correct legend colors
legend_col <- c(
"Upregulated" = "blue",
"Not Significant" = "gray",
"Downregulated" = "red"
)
#generate volcano plot w/ legend
p <- ggplot(toptable, aes(x = logFC,
y = -log10(P.Value),
color = Significance)) +
geom_point_rast(alpha = 0.6, size = 2) + #rasterize
scale_color_manual(values = legend_col,
labels = legend_lab) +
xlim(-10, 10) +
ylim(0, 25) +
labs(title = title,
x = expression("log"[2]*"FC"),
y = expression("-log"[10]*"P-value")) +
theme_custom() +
theme(legend.position = "bottom")
return(p)
}
#generate volcano plots across each comparison
volcano_plots_RUV <- list(
"Volcano_Supp4_1_EMP" = generate_volcano_plot_RUV(Toptable_RUV_24T, "t0"),
"Volcano_Supp4_2_EMP" = generate_volcano_plot_RUV(Toptable_RUV_24R, "t24"),
"Volcano_Supp4_3_EMP" = generate_volcano_plot_RUV(Toptable_RUV_144R, "t144")
)
# Display each volcano plot
for (plot_name in names(volcano_plots_RUV)) {
print(volcano_plots_RUV[[plot_name]])
}

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |

| Version | Author | Date |
|---|---|---|
| a967d53 | emmapfort | 2025-12-09 |
#save all volcano plots
for (plot_name in names(volcano_plots_RUV)) {
save_plot(
filename = paste0(plot_name, "_EMP.pdf"),
plot = volcano_plots_RUV[[plot_name]],
height = 4,
width = 4,
folder = output_folder
)
}
Saved plot as Cairo PDF: C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs/Volcano_Supp4_1_EMP_EMP.pdf_260105.pdf
Saved plot as PNG: C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs/Volcano_Supp4_1_EMP_EMP.pdf_260105.png
Saved plot as Cairo PDF: C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs/Volcano_Supp4_2_EMP_EMP.pdf_260105.pdf
Saved plot as PNG: C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs/Volcano_Supp4_2_EMP_EMP.pdf_260105.png
Saved plot as Cairo PDF: C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs/Volcano_Supp4_3_EMP_EMP.pdf_260105.pdf
Saved plot as PNG: C:/Users/emmap/OneDrive/Desktop/DXR Manuscript Materials/Fig pdfs/Volcano_Supp4_3_EMP_EMP.pdf_260105.png
sessionInfo()
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22000)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8
time zone: America/Chicago
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] ggrastr_1.0.2 ggVennDiagram_1.5.2
[3] org.Hs.eg.db_3.20.0 AnnotationDbi_1.68.0
[5] eulerr_7.0.2 patchwork_1.3.0
[7] RUVSeq_1.40.0 edgeR_4.4.0
[9] limma_3.62.1 EDASeq_2.40.0
[11] ShortRead_1.64.0 GenomicAlignments_1.42.0
[13] SummarizedExperiment_1.36.0 MatrixGenerics_1.18.1
[15] matrixStats_1.5.0 Rsamtools_2.22.0
[17] GenomicRanges_1.58.0 Biostrings_2.74.0
[19] GenomeInfoDb_1.42.3 XVector_0.46.0
[21] IRanges_2.40.0 S4Vectors_0.44.0
[23] BiocParallel_1.40.0 Biobase_2.66.0
[25] BiocGenerics_0.52.0 ggfortify_0.4.17
[27] ggrepel_0.9.6 readxl_1.4.5
[29] lubridate_1.9.4 forcats_1.0.0
[31] stringr_1.5.1 dplyr_1.1.4
[33] purrr_1.0.4 readr_2.1.5
[35] tidyr_1.3.1 tibble_3.2.1
[37] ggplot2_3.5.2 tidyverse_2.0.0
[39] workflowr_1.7.1
loaded via a namespace (and not attached):
[1] splines_4.4.2 later_1.4.2 BiocIO_1.16.0
[4] bitops_1.0-9 filelock_1.0.3 R.oo_1.27.1
[7] cellranger_1.1.0 polyclip_1.10-7 XML_3.99-0.18
[10] lifecycle_1.0.4 httr2_1.1.2 pwalign_1.2.0
[13] rprojroot_2.0.4 processx_3.8.6 lattice_0.22-6
[16] vroom_1.6.5 MASS_7.3-61 magrittr_2.0.3
[19] sass_0.4.10 rmarkdown_2.29 jquerylib_0.1.4
[22] yaml_2.3.10 httpuv_1.6.16 DBI_1.2.3
[25] RColorBrewer_1.1-3 abind_1.4-8 zlibbioc_1.52.0
[28] R.utils_2.13.0 RCurl_1.98-1.17 rappdirs_0.3.3
[31] git2r_0.36.2 GenomeInfoDbData_1.2.13 codetools_0.2-20
[34] DelayedArray_0.32.0 xml2_1.3.8 tidyselect_1.2.1
[37] UCSC.utils_1.2.0 farver_2.1.2 BiocFileCache_2.14.0
[40] jsonlite_2.0.0 systemfonts_1.2.2 polylabelr_0.3.0
[43] tools_4.4.2 progress_1.2.3 ragg_1.4.0
[46] Rcpp_1.0.14 glue_1.8.0 gridExtra_2.3
[49] SparseArray_1.6.0 xfun_0.52 mgcv_1.9-1
[52] withr_3.0.2 fastmap_1.2.0 latticeExtra_0.6-30
[55] callr_3.7.6 digest_0.6.37 timechange_0.3.0
[58] R6_2.6.1 textshaping_1.0.1 colorspace_2.1-1
[61] Cairo_1.6-5 jpeg_0.1-11 biomaRt_2.62.1
[64] RSQLite_2.3.9 R.methodsS3_1.8.2 generics_0.1.4
[67] rtracklayer_1.66.0 prettyunits_1.2.0 httr_1.4.7
[70] S4Arrays_1.6.0 whisker_0.4.1 pkgconfig_2.0.3
[73] gtable_0.3.6 blob_1.2.4 hwriter_1.3.2.1
[76] htmltools_0.5.8.1 scales_1.4.0 png_0.1-8
[79] knitr_1.50 rstudioapi_0.17.1 tzdb_0.5.0
[82] rjson_0.2.23 nlme_3.1-166 curl_6.0.1
[85] cachem_1.1.0 parallel_4.4.2 vipor_0.4.7
[88] restfulr_0.0.15 pillar_1.10.2 grid_4.4.2
[91] vctrs_0.6.5 promises_1.3.2 dbplyr_2.5.0
[94] beeswarm_0.4.0 evaluate_1.0.3 GenomicFeatures_1.58.0
[97] cli_3.6.3 locfit_1.5-9.12 compiler_4.4.2
[100] rlang_1.1.6 crayon_1.5.3 labeling_0.4.3
[103] interp_1.1-6 aroma.light_3.36.0 ps_1.9.1
[106] getPass_0.2-4 fs_1.6.6 ggbeeswarm_0.7.2
[109] stringi_1.8.7 deldir_2.0-4 Matrix_1.7-1
[112] hms_1.1.3 bit64_4.5.2 KEGGREST_1.46.0
[115] statmod_1.5.0 memoise_2.0.1 bslib_0.9.0
[118] bit_4.5.0