Last updated: 2026-01-19
Checks: 7 0
Knit directory: fashr-paper-stephenslab/
This reproducible R Markdown analysis was created with workflowr (version 1.7.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20251109) was run prior to running
the code in the R Markdown file. Setting a seed ensures that any results
that rely on randomness, e.g. subsampling or permutations, are
reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 0fc1046. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for
the analysis have been committed to Git prior to generating the results
(you can use wflow_publish or
wflow_git_commit). workflowr only checks the R Markdown
file, but you know if there are other scripts or data files that it
depends on. Below is the status of the Git repository when the results
were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.DS_Store
Ignored: code/.DS_Store
Ignored: code/.Rhistory
Ignored: data/.DS_Store
Ignored: data/toy_example/
Ignored: output/.DS_Store
Untracked files:
Untracked: analysis/dynamic_eQTL_real_full.rmd
Untracked: analysis/nonlinear_dynamic_eQTL_real_full.rmd
Untracked: code/enrich_analysis.R
Untracked: data/appendixB/
Untracked: data/dynamic_eQTL_real/
Untracked: fash_but_not_strober_most.png
Untracked: output/appendixB/
Untracked: output/dynamic_eQTL_real/
Untracked: output/toy_example/
Unstaged changes:
Modified: analysis/dynamic_eQTL_real.rmd
Deleted: analysis/nonlinear_dynamic_eQTL_real.rmd
Modified: code/00_eQTLs.R
Modified: code/01_fash.R
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were
made to the R Markdown (analysis/dynamic_eQTL_real.Rmd) and
HTML (docs/dynamic_eQTL_real.html) files. If you’ve
configured a remote Git repository (see ?wflow_git_remote),
click on the hyperlinks in the table below to view the files as they
were in that past version.
| File | Version | Author | Date | Message |
|---|---|---|---|---|
| html | 0fc1046 | Ziang Zhang | 2025-12-18 | Build site. |
| html | c75cd5f | Ziang Zhang | 2025-12-18 | Build site. |
| html | c3e274e | Ziang Zhang | 2025-12-17 | Build site. |
| html | 2010ecb | Ziang Zhang | 2025-12-16 | Build site. |
| html | 5be62b6 | Ziang Zhang | 2025-12-12 | Build site. |
| html | 83145cc | Ziang Zhang | 2025-12-12 | Build site. |
| html | 36a81a3 | Ziang Zhang | 2025-11-09 | Build site. |
| html | d73f3f1 | Ziang Zhang | 2025-11-09 | Build site. |
knitr::opts_chunk$set(fig.width = 8, fig.height = 6, message = FALSE, warning = FALSE)
library(fashr)
library(dplyr)
library(tidyr)
library(stringr)
library(purrr)
# plotting / viz
library(ggplot2)
library(ggrepel)
# paths
result_dir <- file.path(getwd(), "output", "dynamic_eQTL_real")
data_dir <- file.path(getwd(), "data", "dynamic_eQTL_real")
code_dir <- file.path(getwd(), "code", "dynamic_eQTL_real")
# grids
log_prec <- seq(0, 10, by = 0.2)
fine_grid <- sort(c(0, exp(-0.5 * log_prec)))
# ----------------------------
# small utilities
# ----------------------------
cache_read <- function(path) if (file.exists(path)) readRDS(path) else NULL
cache_write <- function(x, path) {
dir.create(dirname(path), showWarnings = FALSE, recursive = TRUE)
saveRDS(x, path)
x
}
sample_int <- function(n, size, seed = 1L) {
set.seed(seed)
sample.int(n, size = min(size, n))
}
# Parse "ENSG..._rs..." keys once
parse_pair_keys <- function(keys) {
m <- str_match(keys, "^([^_]+)_(.+)$")
tibble(key = keys, ens_id = m[,2], rs_id = m[,3])
}
# ----------------------------
# biomaRt mapping with cache
# ----------------------------
get_gene_map <- function(ens_ids, cache_path) {
ens_ids <- unique(as.character(ens_ids))
cached <- cache_read(cache_path)
if (!is.null(cached)) return(cached)
suppressMessages({
library(biomaRt)
mart <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")
})
mp <- getBM(
attributes = c("ensembl_gene_id", "hgnc_symbol"),
filters = "ensembl_gene_id",
values = ens_ids,
mart = mart
) %>%
as_tibble() %>%
distinct()
cache_write(mp, cache_path)
}
symbol_of <- function(ens_id, map_tbl) {
s <- map_tbl$hgnc_symbol[match(ens_id, map_tbl$ensembl_gene_id)]
ifelse(is.na(s) | s == "", ens_id, s)
}
# (optional) symbol -> ensembl (small list); cache too
symbol_to_ens <- function(symbols, cache_path) {
symbols <- unique(as.character(symbols))
cached <- cache_read(cache_path)
if (!is.null(cached)) return(cached)
suppressMessages({
library(biomaRt)
mart <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl")
})
mp <- getBM(
attributes = c("hgnc_symbol", "ensembl_gene_id"),
filters = "hgnc_symbol",
values = symbols,
mart = mart
) %>% as_tibble() %>% distinct()
cache_write(mp, cache_path)
}
# ----------------------------
# index selection helpers
# ----------------------------
pick_best_idx_by_gene <- function(ens_id, pair_tbl, lfdr_vec, which = c("min", "max")) {
which <- match.arg(which)
cand <- pair_tbl %>% filter(ens_id == !!ens_id) %>% pull(idx)
if (length(cand) == 0) return(NA_integer_)
if (which == "min") cand[which.min(lfdr_vec[cand])] else cand[which.max(lfdr_vec[cand])]
}
pick_idxs_for_genes <- function(ens_ids, pair_tbl, lfdr_vec, which = "min") {
purrr::map_int(ens_ids, ~pick_best_idx_by_gene(.x, pair_tbl, lfdr_vec, which = which))
}
# ----------------------------
# unified plotting
# ----------------------------
fmt_sci <- function(x, digits = 2) {
if (length(x) == 0L || is.null(x) || is.na(x)) return("NA")
formatC(as.numeric(x), format = "e", digits = digits)
}
plot_pair_base <- function(idx,
datasets,
fash_raw,
fash_adj,
gene_map,
p_lin = NULL,
p_non = NULL,
smooth_var = seq(0, 15, by = 0.1),
add_lm = TRUE,
add_quad = TRUE,
main_prefix = NULL) {
dat <- datasets[[idx]]
x <- dat$x; y <- dat$y; se <- dat$SE
w <- 1 / (se^2)
fitted <- predict(fash_adj, index = idx, smooth_var = smooth_var)
y_min <- min(y - 2 * se, fitted$lower, na.rm = TRUE)
y_max <- max(y + 2 * se, fitted$upper, na.rm = TRUE)
# titles
key_parts <- strsplit(names(datasets)[idx], "_", fixed = TRUE)[[1]]
ens_id <- key_parts[1]
rs_id <- key_parts[2]
key <- names(datasets)[idx]
gsym <- symbol_of(ens_id, gene_map)
main_txt <- paste0(gsym, ": ", rs_id)
if (!is.null(main_prefix)) main_txt <- paste0(main_prefix, " ", main_txt)
# fits
lin_fit <- if (add_lm) lm(y ~ x, weights = w) else NULL
quad_fit <- if (add_quad) lm(y ~ poly(x, 2, raw = TRUE), weights = w) else NULL
x_grid <- if (add_quad) seq(min(x), max(x), length.out = 200) else NULL
quad_pred <- if (add_quad) predict(quad_fit, newdata = data.frame(x = x_grid)) else NULL
plot(x, y, pch = 20, col = "black", xlab = "Time", ylab = "Effect Est", ylim = c(y_min, y_max))
title(main = main_txt)
arrows(x0 = x, y0 = y - 2 * se, x1 = x, y1 = y + 2 * se,
length = 0.05, angle = 90, code = 3, col = "black")
polygon(c(fitted$x, rev(fitted$x)), c(fitted$lower, rev(fitted$upper)),
col = rgb(1, 0, 0, 0.3), border = NA)
lines(fitted$x, fitted$mean, col = "red", lwd = 2)
if (add_lm) abline(lin_fit, col = "green", lty = 2, lwd = 1)
if (add_quad) lines(x_grid, quad_pred, col = "purple", lty = 2, lwd = 1)
lfdr_before <- fash_raw$lfdr[idx]
lfdr_after <- fash_adj$lfdr[idx]
p_lin_val <- if (!is.null(p_lin) && !is.null(names(p_lin)) && key %in% names(p_lin)) unname(p_lin[[key]]) else NA_real_
p_non_val <- if (!is.null(p_non) && !is.null(names(p_non)) && key %in% names(p_non)) unname(p_non[[key]]) else NA_real_
if (!is.na(p_lin_val) || !is.na(p_non_val)) {
cap <- sprintf("lfdr = %s (%s), p-value = %s (%s)",
fmt_sci(lfdr_after), fmt_sci(lfdr_before),
fmt_sci(p_lin_val), fmt_sci(p_non_val))
} else {
cap <- sprintf("lfdr (raw) = %s, lfdr (adj) = %s", fmt_sci(lfdr_before), fmt_sci(lfdr_after))
}
title(sub = cap, cex.sub = 0.8)
invisible(NULL)
}
plot_many_pairs <- function(idxs, nrow = 2, ncol = 2, ...) {
dots <- list(...)
datasets <- dots$datasets
idxs <- as.integer(idxs)
idxs <- idxs[!is.na(idxs)]
idxs <- idxs[idxs >= 1]
if (!is.null(datasets)) idxs <- idxs[idxs <= length(datasets)]
if (length(idxs) == 0L) {
message("plot_many_pairs: no valid indices to plot.")
return(invisible(NULL))
}
oldpar <- par(no.readonly = TRUE)
on.exit(par(oldpar), add = TRUE)
par(mfrow = c(nrow, ncol))
for (idx in idxs) {
try(plot_pair_base(idx = idx, ...), silent = TRUE)
}
invisible(NULL)
}
# ----------------------------
# Strober loading helpers
# ----------------------------
read_strober <- function(path) {
read.delim(path) %>%
as_tibble() %>%
mutate(key = paste0(ensamble_id, "_", rs_id))
}
# ----------------------------
# lfdr vs p-value scatter helper
# ----------------------------
plot_lfdr_vs_p <- function(fash_tbl, strober_df, pval_cutoff, lfdr_cutoff,
gene_map, title_text = "",
manual_symbol = NULL) {
eps <- .Machine$double.eps
if (is.null(manual_symbol)) manual_symbol <- c()
df <- fash_tbl %>%
inner_join(strober_df %>% dplyr::select(key, ensamble_id, pvalue), by = "key") %>%
mutate(ensamble_id = dplyr::coalesce(ensamble_id, ens_id))
# min per gene
df_p <- df %>% dplyr::group_by(ensamble_id) %>% dplyr::slice_min(pvalue, n = 1, with_ties = FALSE) %>% dplyr::ungroup()
df_l <- df %>% dplyr::group_by(ensamble_id) %>% dplyr::slice_min(lfdr_adj, n = 1, with_ties = FALSE) %>% dplyr::ungroup()
g <- df_p %>% dplyr::select(ensamble_id, pvalue) %>%
inner_join(df_l %>% dplyr::select(ensamble_id, lfdr_adj), by = "ensamble_id") %>%
mutate(
gene_symbol = symbol_of(ensamble_id, gene_map),
gene_label = dplyr::coalesce(unname(manual_symbol[ensamble_id]), gene_symbol, ensamble_id),
neglog_p = -log10(pmin(pmax(pvalue, eps), 1)),
neglog_lfdr = -log10(pmin(pmax(lfdr_adj, eps), 1)),
category = dplyr::case_when(
pvalue <= pval_cutoff & lfdr_adj <= lfdr_cutoff ~ "Both",
pvalue <= pval_cutoff & lfdr_adj > lfdr_cutoff ~ "Strober only",
pvalue > pval_cutoff & lfdr_adj <= lfdr_cutoff ~ "FASH only",
TRUE ~ "Neither"
),
category = factor(category, levels = c("Both", "FASH only", "Strober only", "Neither"))
)
top_strober_only <- g %>% dplyr::filter(category == "Strober only") %>% dplyr::arrange(pvalue) %>% dplyr::slice_head(n = 10)
top_fash_only <- g %>% dplyr::filter(category == "FASH only") %>% dplyr::arrange(lfdr_adj) %>% dplyr::slice_head(n = 10)
top_both <- g %>% dplyr::filter(category == "Both") %>% dplyr::arrange(pvalue + lfdr_adj) %>% dplyr::slice_head(n = 10)
label_df <- dplyr::bind_rows(top_strober_only, top_fash_only, top_both)
col_map <- c("Both"="#7B3294","FASH only"="#1B9E77","Strober only"="#D95F02","Neither"="grey80")
ggplot(g, aes(x = neglog_p, y = neglog_lfdr)) +
geom_point(data = subset(g, category == "Neither"), color = col_map["Neither"], alpha = 0.25, size = 1.8) +
geom_point(data = subset(g, category != "Neither"), aes(color = category), alpha = 0.8, size = 2.2) +
scale_color_manual(values = col_map) +
geom_vline(xintercept = -log10(pval_cutoff), linetype = "dashed") +
geom_hline(yintercept = -log10(lfdr_cutoff), linetype = "dashed") +
geom_text_repel(
data = label_df, aes(label = gene_label, color = category),
size = 3, box.padding = 0.3, point.padding = 0.5, max.overlaps = Inf, show.legend = FALSE
) +
theme_minimal() +
theme(text = element_text(size = 16), axis.text = element_text(size = 14), legend.position = "bottom") +
labs(title = title_text, x = "-log10(p-value)", y = "-log10(lfdr)", color = NULL)
}
We use the processed (expression & genotype) data of Strober et.al, 2019 to perform the eQTL analysis.
For the association testing, we use a linear regression model for each gene-variant pair at each time point. Following the practice in Strober et.al, we adjust for the first 3 PCs.
The code to perform this step can be found in the script
dynamic_eQTL_real/00_eQTLs.R from the code directory.
After this step, we have the effect size of eQTLs for each gene-variant pair at each time point, as well as its standard error.
To fit the FASH model on \(\{\beta_i(t_j), s_{ij}\}_{i\in N,j \in [16]}\), we consider fitting two FASH models:
A FASH model based on first order IWP (testing for dynamic eQTLs: \(H_0: \beta_i(t)=c\)).
A FASH model based on second order IWP (testing for nonlinear-dynamic eQTLs: \(H_0: \beta_i(t)=c_1+c_2t\)).
The code to perform this step can be found in the script
dynamic_eQTL_real/01_fash.R from the code directory.
We will directly load the fitted FASH models from the output directory.
load(file.path(result_dir, "fash_fit1_all.RData"))
We will load the datasets from the fitted FASH object:
datasets <- fash_fit1$fash_data$data_list
S_list <- fash_fit1$fash_data$S
for (i in seq_along(datasets)) datasets[[i]]$SE <- S_list[[i]]
pair_tbl <- parse_pair_keys(names(datasets)) %>%
mutate(idx = row_number())
all_genes <- unique(pair_tbl$ens_id)
gene_map <- get_gene_map(
ens_ids = all_genes,
cache_path = file.path(result_dir, "cache_gene_map.rds")
)
In this analysis, we will focus on the FASH(1) model that assumes a first order IWP and tests for dynamic eQTLs.
Let’s take a quick overview of the fitted FASH model:
log_prec <- seq(0,10, by = 0.2)
fine_grid <- sort(c(0, exp(-0.5*log_prec)))
fash_fit1 <- fash(Y = "beta", smooth_var = "time", S = "SE", data_list = datasets,
num_basis = 20, order = 1, betaprec = 0,
pred_step = 1, penalty = 10, grid = fine_grid,
num_cores = num_cores, verbose = TRUE)
save(fash_fit1, file = "./results/fash_fit1_all.RData")
fash_fit1
Fitted fash Object
-------------------
Number of datasets: 1009173
Likelihood: gaussian
Number of PSD grid values: 52 (initial), 11 (non-trivial)
Order of Integrated Wiener Process (IWP): 1
As well as the estimated priors:
fash_fit1$prior_weights
psd prior_weight
1 0.000000000 0.410663695
2 0.006737947 0.141127785
3 0.016572675 0.012015292
4 0.049787068 0.178001757
5 0.074273578 0.071560569
6 0.110803158 0.098258453
7 0.182683524 0.037970329
8 0.246596964 0.021572272
9 0.367879441 0.007501801
10 0.406569660 0.011455909
11 1.000000000 0.009872138
If we measure the significance using the false discovery rate, then it is sensitive to the value of \(\pi_0\). The estimated \(\pi_0\) is 0.4106637, which is way too small to be realistic.
One likely reason could be due to model-misspecification under the alternative hypothesis. To account for this, we will consider the following approaches:
(i): Computing a conservative estimate of \(\pi_0\) based on the BF procedure:
fash_fit1_update <- BF_update(fash_fit1, plot = FALSE)
fash_fit1_update$prior_weights
save(fash_fit1_update, file = paste0(result_dir, "/fash_fit1_update.RData"))
The conservative estimate is 0.93568, which is much more realistic.
(ii): Instead of looking at the FDR which is based on the estimated \(\pi_0\), we can use the minimum local false sign rate (\(\text{min-lfsr}_i\)) to measure significance: \[ \text{min-lfsr}_i = \min_{t} \left\{ \text{lfsr}(W_i(t)) \right\}, \] where \(W_i(t) = \beta_i(t) - \beta_i(0)\).
Let’s compute the significance using the minimum local false sign rate (\(\text{min-lfsr}_i\)):
smooth_var_refined = seq(0,15, by = 0.1)
min_lfsr_summary1 <- min_lfsr_summary(fash_fit1, num_cores = num_cores, smooth_var = smooth_var_refined)
save(min_lfsr_summary1, file = "./results/min_lfsr_summary1.RData")
min_lfsr_summary1_update <- min_lfsr_summary(fash_fit1_update, num_cores = num_cores, smooth_var = smooth_var_refined)
save(min_lfsr_summary1_update, file = "./results/min_lfsr_summary1_update.RData")
Let’s visualize how the min-lfsr compares with the local false discovery rate (lfdr):
load(file.path(result_dir, "min_lfsr_summary1.RData"))
# sample some indices for easy visualization
sample_indices <- sample_int(length(min_lfsr_summary1$min_lfsr), size = 1e5, seed = 1)
min_lfsr1_unadj <- min_lfsr_summary1$min_lfsr[sample_indices]
lfdr1_vec_unadj <- fash_fit1$lfdr[min_lfsr_summary1$index][sample_indices]
plot(lfdr1_vec_unadj, min_lfsr1_unadj,
pch = 20, cex = 0.1,
ylim = c(0,1), xlim = c(0,1),
xlab = "lfdr", ylab = "min-lfsr", main = "lfdr vs min-lfsr (Un-adjusted)")
lines(c(0,1), c(0,1), col = "red")

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
Let’s also visualize the min-lfsr and the lfdr from the BF-updated model:
load(file.path(result_dir, "min_lfsr_summary1_update.RData"))
min_lfsr1 <- min_lfsr_summary1_update$min_lfsr[sample_indices]
lfdr1_vec <- fash_fit1_update$lfdr[min_lfsr_summary1_update$index][sample_indices]
plot(lfdr1_vec, min_lfsr1,
pch = 20, cex = 0.1,
ylim = c(0,1), xlim = c(0,1),
xlab = "lfdr", ylab = "min-lfsr", main = "lfdr vs min-lfsr (BF-adjusted)")
lines(c(0,1), c(0,1), col = "red")

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
Indeed, the min-lfsr tends to be more conservative than the lfdr, especially when \(\hat{\pi_0}\) has not been adjusted using the BF procedure.
Let’s also assess how does the BF-update of \(\pi_0\) affect the min-lfsr.
plot(min_lfsr1_unadj, min_lfsr1,
pch = 20, cex = 0.1,
ylim = c(0,1), xlim = c(0,1),
xlab = "min-lfsr (unadjusted)", ylab = "min-lfsr", main = "Effect of BF adjustment on min-lfsr")
lines(c(0,1), c(0,1), col = "red")

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
We will use the updated FASH model (1) to detect dynamic eQTLs.
alpha <- 0.05
test1 <- fdr_control(fash_fit1_update, alpha = alpha, plot = FALSE)
10290 datasets are significant at alpha level 0.05. Total datasets tested: 1009173.
fash_highlighted1 <- test1$fdr_results$index[test1$fdr_results$FDR <= alpha]
test1_before <- fdr_control(fash_fit1, alpha = alpha, plot = FALSE)
50027 datasets are significant at alpha level 0.05. Total datasets tested: 1009173.
fash_highlighted1_before <- test1_before$fdr_results$index[test1_before$fdr_results$FDR <= alpha]
How many pairs are detected as dynamic eQTLs?
pairs_highlighted1 <- names(datasets)[fash_highlighted1]
length(pairs_highlighted1)
[1] 10290
length(pairs_highlighted1)/length(datasets)
[1] 0.01019647
What is the number before the BF adjustment?
pairs_highlighted1_before <- names(datasets)[fash_highlighted1_before]
length(pairs_highlighted1_before)
[1] 50027
length(pairs_highlighted1_before)/length(datasets)
[1] 0.04957227
How many unique genes are detected?
genes_highlighted1 <- unique(pair_tbl$ens_id[pair_tbl$idx %in% fash_highlighted1])
length(genes_highlighted1)
[1] 1295
length(genes_highlighted1)/length(all_genes)
[1] 0.2035523
Before the BF adjustment?
genes_highlighted1_before <- unique(pair_tbl$ens_id[pair_tbl$idx %in% fash_highlighted1_before])
length(genes_highlighted1_before)
[1] 3417
length(genes_highlighted1_before)/length(all_genes)
[1] 0.5370953
Let’s see how many pairs and genes remain significant after controlling the min-lfsr:
fash_highlighted1_lfsr <- min_lfsr_summary1_update$index[min_lfsr_summary1_update$fsr <= alpha]
pairs_highlighted1_lfsr <- names(datasets)[fash_highlighted1_lfsr]
length(pairs_highlighted1_lfsr)
[1] 10070
length(pairs_highlighted1_lfsr)/length(datasets)
[1] 0.009978468
genes_highlighted1_lfsr <- unique(pair_tbl$ens_id[pair_tbl$idx %in% fash_highlighted1_lfsr])
length(genes_highlighted1_lfsr)
[1] 1271
length(genes_highlighted1_lfsr)/length(all_genes)
[1] 0.1997799
It seems like once \(\hat{\pi_0}\) is adjusted, there is not much difference between measuring significance using the min-lfsr or the lfdr. From now on, we will consider the pairs that are significant using the lfdr.
Visualize top-ranked pairs for some selected genes:

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
Some examples of null pairs:

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
We will compare the detected dynamic eQTLs with the results from Strober et.al.
What are the p-value cutoff for linear and non-linear methods in Strober et.al?
pval_cutoff_strober_nonlinear <- max(strober_nonlinear_highlighted$pvalue)
pval_cutoff_strober_linear <- max(strober_linear_highlighted$pvalue)
pval_cutoff_strober_nonlinear
[1] 6.829826e-05
pval_cutoff_strober_linear
[1] 0.0001702417
What are the lfdr cutoff for FASH (order 1) before and after BF adjustment?
lfdr_cutoff1_before <- max(fash_fit1$lfdr[test1$fdr_results$index[test1$fdr_results$FDR <= alpha]])
lfdr_cutoff1_after <- max(fash_fit1_update$lfdr[test1$fdr_results$index[test1$fdr_results$FDR <= alpha]])
lfdr_cutoff1_before
[1] 0.008331724
lfdr_cutoff1_after
[1] 0.1478733
Let’s take a look at the overlap between the two methods used in Strober et.al and FASH (order 1):

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
Produce another Venn diagram for the pairs detected by the three methods:

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
Produce similar Venn diagrams for genes and pairs detected by FASH without BF adjustment:

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |

| Version | Author | Date |
|---|---|---|
| 2010ecb | Ziang Zhang | 2025-12-16 |
There is a large number of genes only detected by FASH (order 1). Let’s take a look at the 4 pairs that are least significant from FASH, and have at least p-value 0.1 from Strober et.al (both linear and non-linear):

Let’s also take a look at the 4 pairs that are most significant from FASH, also with at least p-value 0.1 from Strober et.al (both linear and non-linear):

Take a look at the p-values and eFDRs from Strober et.al for some of these pairs:
select_gene_id <- gene_map$ensembl_gene_id[gene_map$hgnc_symbol %in% selected_gene_fashr_only]
select_variant_id <- c("rs35399", "rs2268744", "rs7532265", "rs2047804")
strober_linear %>%
filter(rs_id %in% select_variant_id, ensamble_id %in% select_gene_id)
# A tibble: 4 × 6
rs_id ensamble_id pvalue eFDR key gene_symbol
<chr> <chr> <dbl> <dbl> <chr> <chr>
1 rs2268744 ENSG00000156928 0.163 0.680 ENSG00000156928_rs2268744 MALSU1
2 rs2047804 ENSG00000215041 0.606 0.921 ENSG00000215041_rs2047804 NEURL4
3 rs35399 ENSG00000151388 0.685 0.944 ENSG00000151388_rs35399 ADAMTS12
4 rs7532265 ENSG00000157216 0.849 0.977 ENSG00000157216_rs7532265 SSBP3
strober_nonlinear %>%
filter(rs_id %in% select_variant_id, ensamble_id %in% select_gene_id)
# A tibble: 4 × 6
rs_id ensamble_id pvalue eFDR key gene_symbol
<chr> <chr> <dbl> <dbl> <chr> <chr>
1 rs35399 ENSG00000151388 0.116 0.680 ENSG00000151388_rs35399 ADAMTS12
2 rs2268744 ENSG00000156928 0.196 0.776 ENSG00000156928_rs2268744 MALSU1
3 rs7532265 ENSG00000157216 0.299 0.856 ENSG00000157216_rs7532265 SSBP3
4 rs2047804 ENSG00000215041 0.757 0.981 ENSG00000215041_rs2047804 NEURL4
Next, look at some genes that were detected by both FASH and Strober et.al. We will pick the most significant pair for each gene in FASH’s results:

Let’s also look at the genes that were missed by FASH, but detected by Strober et.al. In this case, we will pick the most significant pair for each gene in Strober et.al’s results:

Take a look at the same plot but with the unadjusted FASH results:

The shape of eFDR (Strober et al) and FDR (FASH):
subset_size <- 10000
subset_seq <- round(seq(1, length(test1$fdr_results$FDR), length.out = subset_size))
plot(
sort(strober_nonlinear$eFDR)[subset_seq],
x = subset_seq,
type = "l",
col = "blue",
lwd = 2,
xlab = "Ranked pairs",
ylab = "FDR",
lty = "dashed",
main = ""
)
lines(sort(test1$fdr_results$FDR)[subset_seq], x = subset_seq, col = "red", lty = "solid", lwd = 2)
lines(sort(strober_linear$eFDR)[subset_seq], x = subset_seq, col = "green", lty = "dotted", lwd = 2)
legend(
"topleft",
legend = c("Strober Quadratic", "Strober Linear", "FASH (1)"),
col = c("blue", "green", "red"),
lty = c("dashed", "dotted", "solid"),
lwd = c(2, 2, 2)
)

plot(
sort(strober_nonlinear$eFDR[strober_nonlinear$eFDR <= 0.05]),
xlim = c(0, 10000),
type = "l",
col = "blue",
lwd = 2,
lty = "dashed",
xlab = "Ranked pairs",
ylab = "FDR",
main = "(Zoomed in)"
)
lines(sort(test1$fdr_results$FDR[test1$fdr_results$FDR <= 0.05]), col = "red", lty = "solid", lwd = 2)
lines(sort(strober_linear$eFDR[strober_linear$eFDR <= 0.05]), col = "green", lty = "dotted", lwd = 2)
legend(
"topleft",
legend = c("Strober Quadratic", "Strober Linear", "FASH (1)"),
col = c("blue", "green", "red"),
lty = c("dashed", "dotted", "solid"),
lwd = c(2, 2, 2)
)

Scatterplot of lfdr and p-values from Strober et.al:
fash_tbl <- pair_tbl %>%
transmute(key, ens_id, rs_id, idx,
lfdr_raw = fash_fit1$lfdr[idx],
lfdr_adj = fash_fit1_update$lfdr[idx])
manual_symbol <- c("ENSG00000225485" = "NFATC4")
plot_lfdr_vs_p(
fash_tbl,
strober_linear,
pval_cutoff_strober_linear,
lfdr_cutoff1_after,
gene_map,
title_text = "Scatterplot of lfdr (FASH) vs p-values (Strober Linear)",
manual_symbol = manual_symbol
)

plot_lfdr_vs_p(
fash_tbl,
strober_nonlinear,
pval_cutoff_strober_nonlinear,
lfdr_cutoff1_after,
gene_map,
title_text = "Scatterplot of lfdr (FASH) vs p-values (Strober Quadratic)",
manual_symbol = manual_symbol
)

Following the definition in Strober et.al, we will classify the detected dynamic eQTLs into different categories:
Early: eQTLs with strongest effect during the first three days: \(\max_{t\leq3} |\beta(t)| - \max_{t> 3} |\beta(t)| > 0\).
Late: eQTLs with strongest effect during the last four days: \(\max_{t\geq 12} |\beta(t)| - \max_{t< 12} |\beta(t)| > 0\).
Middle: eQTLs with strongest effect during days 4-11: \(\max_{4\leq t\leq 11} |\beta(t)| - \max_{t> 11 | t< 4} |\beta(t)| > 0\).
Switch: eQTLs with effect sign switch during the time course such that \(\min\{\max\beta(t)^+,\max\beta(t)^-\}-c>0\) where \(c\) is a threshold that we set to 0.25 (which means with two alleles, the maximal difference of effect size is at least \(\geq 2\times\min\{\max\beta(t)^+,\max\beta(t)^-\}\times2 \geq 2 \times 0.25 \times 2 = 1\)).
We will take a look at the significant pairs detected by FASH (order 1), and classify them based on the false sign rate (lfsr).
smooth_var_refined = seq(0,15, by = 0.1)
functional_early <- function(x){
max(abs(x[smooth_var_refined <= 3])) - max(abs(x[smooth_var_refined > 3]))
}
testing_early_dyn <- testing_functional(functional_early,
lfsr_cal = function(x){mean(x <= 0)},
fash = fash_fit1,
indices = fash_highlighted1,
smooth_var = smooth_var_refined)
How many pairs and how many unique genes are classified as early dynamic eQTLs?
load(file.path(result_dir, "classify_dyn_eQTLs_early.RData"))
early_indices <- testing_early_dyn$indices[testing_early_dyn$cfsr <= alpha]
length(early_indices)
[1] 129
early_genes <- unique(pair_tbl$ens_id[pair_tbl$idx %in% early_indices])
length(early_genes)
[1] 18
Let’s take a look at the top-ranked early dynamic eQTLs:

functional_middle <- function(x){
max(abs(x[smooth_var_refined <= 11 & smooth_var_refined >= 4])) - max(abs(x[smooth_var_refined > 11]), abs(x[smooth_var_refined < 4]))
}
testing_middle_dyn <- testing_functional(functional_middle,
lfsr_cal = function(x){mean(x <= 0)},
fash = fash_fit1,
indices = fash_highlighted1,
num_cores = num_cores,
smooth_var = smooth_var_refined)
How many pairs and how many unique genes are classified as middle dynamic eQTLs?
load(file.path(result_dir, "classify_dyn_eQTLs_middle.RData"))
middle_indices <- testing_middle_dyn$indices[testing_middle_dyn$cfsr <= alpha]
length(middle_indices)
[1] 54
middle_genes <- unique(pair_tbl$ens_id[pair_tbl$idx %in% middle_indices])
length(middle_genes)
[1] 17
Take a look at their results:

functional_late <- function(x){
max(abs(x[smooth_var_refined >= 12])) - max(abs(x[smooth_var_refined < 12]))
}
testing_late_dyn <- testing_functional(functional_late,
lfsr_cal = function(x){mean(x <= 0)},
fash = fash_fit1,
indices = fash_highlighted1,
num_cores = num_cores,
smooth_var = smooth_var_refined)
How many pairs and how many unique genes are classified as late dynamic eQTLs?
load(file.path(result_dir, "classify_dyn_eQTLs_late.RData"))
late_indices <- testing_late_dyn$indices[testing_late_dyn$cfsr <= alpha]
length(late_indices)
[1] 50
late_genes <- unique(pair_tbl$ens_id[pair_tbl$idx %in% late_indices])
length(late_genes)
[1] 18
Let’s take a look at the top-ranked late dynamic eQTLs:

How many pairs and how many unique genes are classified as switch dynamic eQTLs?
switch_threshold <- 0.25
functional_switch <- function(x){
x_pos <- x[x > 0]
x_neg <- x[x < 0]
if(length(x_pos) == 0 || length(x_neg) == 0){
return(0)
}
min(max(abs(x_pos)), max(abs(x_neg))) - switch_threshold
}
testing_switch_dyn <- testing_functional(functional_switch,
lfsr_cal = function(x){mean(x <= 0)},
fash = fash_fit1,
indices = fash_highlighted1,
num_cores = num_cores,
smooth_var = smooth_var_refined)
load(file.path(result_dir, "classify_dyn_eQTLs_switch.RData"))
switch_indices <- testing_switch_dyn$indices[testing_switch_dyn$cfsr <= alpha]
length(switch_indices)
[1] 1284
switch_genes <- unique(pair_tbl$ens_id[pair_tbl$idx %in% switch_indices])
length(switch_genes)
[1] 285
selected_switch_genes_symbols <- c("SCN5A", "USP3", "VAMP8", "CCND1")
sym2ens_sw <- symbol_to_ens(selected_switch_genes_symbols, file.path(result_dir, "cache_sym2ens_switch.rds"))
selected_switch_genes <- sym2ens_sw$ensembl_gene_id[match(selected_switch_genes_symbols, sym2ens_sw$hgnc_symbol)]
Let’s take a look at the top-ranked switch dynamic eQTLs:

library(clusterProfiler)
library(tidyverse)
library(msigdbr)
library(org.Hs.eg.db)
library(cowplot)
m_t2g <- msigdbr(species = "Homo sapiens", category = "H") %>%
mutate(ensembl_use = dplyr::coalesce(ensembl_gene, db_ensembl_gene)) %>%
dplyr::filter(!is.na(ensembl_use)) %>%
dplyr::select(gs_name, ensembl_use) %>%
dplyr::distinct()
enrich_set <- function(genes_selected,
background_gene,
q_val_cutoff = 0.05,
pvalueCutoff = 0.05) {
genes_selected_raw <- unique(as.character(genes_selected))
background_gene_raw <- unique(as.character(background_gene))
universe_for_test <- background_gene_raw
hallmark_genes <- unique(m_t2g$ensembl_use)
bg_not_in_hallmark <- setdiff(universe_for_test, hallmark_genes)
dummy_id <- "__DUMMY_BACKGROUND__"
if (length(bg_not_in_hallmark) > 0) {
dummy_t2g <- tibble(
gs_name = dummy_id,
ensembl_use = bg_not_in_hallmark
)
TERM2GENE_full <- bind_rows(m_t2g, dummy_t2g)
} else {
TERM2GENE_full <- m_t2g
}
genes_sel_used <- intersect(genes_selected_raw, universe_for_test)
enrich_res <- enricher(
gene = genes_sel_used,
TERM2GENE = TERM2GENE_full,
universe = universe_for_test,
pAdjustMethod = "BH",
qvalueCutoff = q_val_cutoff,
pvalueCutoff = pvalueCutoff
)
if (is.null(enrich_res) || nrow(enrich_res@result) == 0L) {
return(enrich_res)
}
df <- enrich_res@result
df <- df %>% dplyr::filter(ID != dummy_id)
df$GeneRatio_orig <- df$GeneRatio
df$BgRatio_orig <- df$BgRatio
n_sel_total <- length(genes_sel_used)
n_bg_total <- length(universe_for_test)
df$GeneRatio_fixed <- paste0(df$Count, "/", n_sel_total)
df$BgRatio_fixed <- paste0(df$setSize, "/", n_bg_total)
enrich_res@result <- df
enrich_res
}
Among all the genes highlighted by FASH:
result <- enrich_set(genes_selected = genes_highlighted1, background_gene = all_genes)
result@result %>%
filter(pvalue < 0.05) %>%
dplyr::select(GeneRatio, BgRatio, pvalue, qvalue)
GeneRatio BgRatio pvalue qvalue
HALLMARK_HYPOXIA 26/1295 89/6362 0.02879958 0.382977
HALLMARK_COAGULATION 11/1295 31/6362 0.03642565 0.382977
HALLMARK_MYOGENESIS 20/1295 67/6362 0.04127072 0.382977
HALLMARK_KRAS_SIGNALING_UP 15/1295 47/6362 0.04152590 0.382977
HALLMARK_KRAS_SIGNALING_DN 10/1295 28/6362 0.04318383 0.382977
Among the genes highlighted by FASH that are classified as early dynamic eQTLs:
result <- enrich_set(genes_selected = early_genes, background_gene = all_genes)
result@result %>%
filter(pvalue < 0.05) %>%
dplyr::select(GeneRatio, BgRatio, pvalue, qvalue)
GeneRatio BgRatio pvalue qvalue
HALLMARK_P53_PATHWAY 2/18 83/6362 0.02247213 0.09461948
Among the genes highlighted by FASH that are classified as middle dynamic eQTLs:
result <- enrich_set(genes_selected = middle_genes, background_gene = all_genes)
result@result %>%
filter(pvalue < 0.05) %>%
dplyr::select(GeneRatio, BgRatio, pvalue, qvalue)
GeneRatio BgRatio pvalue qvalue
HALLMARK_COMPLEMENT 2/17 57/6362 0.009839637 0.09321762
Among the genes highlighted by FASH that are classified as late dynamic eQTLs:
result <- enrich_set(genes_selected = late_genes, background_gene = all_genes)
result@result %>%
filter(pvalue < 0.05) %>%
dplyr::select(GeneRatio, BgRatio, pvalue, qvalue)
GeneRatio BgRatio pvalue qvalue
HALLMARK_ANDROGEN_RESPONSE 3/18 57/6362 0.000505832 0.003727183
HALLMARK_OXIDATIVE_PHOSPHORYLATION 2/18 119/6362 0.043671354 0.160894461
Among the genes highlighted by FASH that are classified as switch dynamic eQTLs:
result <- enrich_set(genes_selected = switch_genes, background_gene = all_genes)
result@result %>%
filter(pvalue < 0.05) %>%
dplyr::select(GeneRatio, BgRatio, pvalue, qvalue)
GeneRatio BgRatio pvalue qvalue
HALLMARK_HYPOXIA 11/285 89/6362 0.001944003 0.04034811
HALLMARK_MYOGENESIS 9/285 67/6362 0.002780865 0.04034811
HALLMARK_P53_PATHWAY 10/285 83/6362 0.003726032 0.04034811
HALLMARK_XENOBIOTIC_METABOLISM 9/285 71/6362 0.004143860 0.04034811
HALLMARK_PI3K_AKT_MTOR_SIGNALING 7/285 55/6362 0.010785281 0.07937656
HALLMARK_APOPTOSIS 8/285 70/6362 0.012426809 0.07937656
HALLMARK_TNFA_SIGNALING_VIA_NFKB 7/285 58/6362 0.014266327 0.07937656
HALLMARK_HEDGEHOG_SIGNALING 3/285 15/6362 0.027120719 0.11736452
HALLMARK_NOTCH_SIGNALING 3/285 15/6362 0.027120719 0.11736452
HALLMARK_INTERFERON_GAMMA_RESPONSE 6/285 56/6362 0.038073617 0.14828672
HALLMARK_IL2_STAT5_SIGNALING 7/285 75/6362 0.049784481 0.17627041
sessionInfo()
R version 4.5.1 (2025-06-13)
Platform: aarch64-apple-darwin20
Running under: macOS Sequoia 15.6.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Chicago
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] cowplot_1.2.0 org.Hs.eg.db_3.21.0 AnnotationDbi_1.70.0
[4] IRanges_2.42.0 S4Vectors_0.46.0 Biobase_2.68.0
[7] BiocGenerics_0.54.1 generics_0.1.4 msigdbr_25.1.1
[10] clusterProfiler_4.16.0 lubridate_1.9.4 forcats_1.0.1
[13] readr_2.1.6 tibble_3.3.0 tidyverse_2.0.0
[16] ggVennDiagram_1.5.4 ggrepel_0.9.6 ggplot2_4.0.1
[19] purrr_1.2.0 stringr_1.6.0 tidyr_1.3.1
[22] dplyr_1.1.4 fashr_0.1.42
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 rstudioapi_0.17.1 jsonlite_2.0.0
[4] magrittr_2.0.4 ggtangle_0.0.9 farver_2.1.2
[7] rmarkdown_2.30 fs_1.6.6 vctrs_0.6.5
[10] memoise_2.0.1 ggtree_3.16.3 mixsqp_0.3-54
[13] htmltools_0.5.9 curl_7.0.0 gridGraphics_0.5-1
[16] sass_0.4.10 bslib_0.9.0 plyr_1.8.9
[19] cachem_1.1.0 TMB_1.9.19 whisker_0.4.1
[22] igraph_2.2.1 lifecycle_1.0.4 pkgconfig_2.0.3
[25] Matrix_1.7-4 R6_2.6.1 fastmap_1.2.0
[28] gson_0.1.0 GenomeInfoDbData_1.2.14 digest_0.6.39
[31] numDeriv_2016.8-1.1 aplot_0.2.9 enrichplot_1.28.4
[34] colorspace_2.1-2 patchwork_1.3.2 rprojroot_2.1.1
[37] irlba_2.3.5.1 RSQLite_2.4.5 labeling_0.4.3
[40] timechange_0.3.0 httr_1.4.7 compiler_4.5.1
[43] bit64_4.6.0-1 withr_3.0.2 S7_0.2.1
[46] BiocParallel_1.42.2 DBI_1.2.3 R.utils_2.13.0
[49] rappdirs_0.3.3 tools_4.5.1 otel_0.2.0
[52] ape_5.8-1 httpuv_1.6.16 R.oo_1.27.1
[55] glue_1.8.0 nlme_3.1-168 GOSemSim_2.34.0
[58] promises_1.5.0 grid_4.5.1 reshape2_1.4.5
[61] fgsea_1.34.2 gtable_0.3.6 tzdb_0.5.0
[64] R.methodsS3_1.8.2 data.table_1.17.8 hms_1.1.4
[67] utf8_1.2.6 XVector_0.48.0 pillar_1.11.1
[70] babelgene_22.9 yulab.utils_0.2.3 later_1.4.4
[73] splines_4.5.1 treeio_1.32.0 lattice_0.22-7
[76] bit_4.6.0 tidyselect_1.2.1 GO.db_3.21.0
[79] Biostrings_2.76.0 knitr_1.50 git2r_0.36.2
[82] xfun_0.55 LaplacesDemon_16.1.6 stringi_1.8.7
[85] UCSC.utils_1.4.0 workflowr_1.7.2 lazyeval_0.2.2
[88] ggfun_0.2.0 yaml_2.3.12 evaluate_1.0.5
[91] codetools_0.2-20 qvalue_2.40.0 ggplotify_0.1.3
[94] cli_3.6.5 jquerylib_0.1.4 dichromat_2.0-0.1
[97] Rcpp_1.1.0 GenomeInfoDb_1.44.3 png_0.1-8
[100] parallel_4.5.1 assertthat_0.2.1 blob_1.2.4
[103] DOSE_4.2.0 tidytree_0.4.6 scales_1.4.0
[106] crayon_1.5.3 rlang_1.1.6 fastmatch_1.1-6
[109] KEGGREST_1.48.1