Differential expression analysis using a topic model: illustration in mixture of FACS-purified PBMC data

Last updated: 2021-12-31

Knit directory: single-cell-topics/analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2).

R Markdown file: up-to-date
Environment: empty
Seed: set.seed(1)
Session information: recorded
Cache: none
File paths: relative
Repository version: b4ebbb3

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    data/droplet.RData
    Ignored:    data/pbmc_68k.RData
    Ignored:    data/pbmc_purified.RData
    Ignored:    data/pulseseq.RData
    Ignored:    output/droplet/diff-count-droplet.RData
    Ignored:    output/droplet/fits-droplet.RData
    Ignored:    output/droplet/rds/
    Ignored:    output/pbmc-68k/fits-pbmc-68k.RData
    Ignored:    output/pbmc-68k/rds/
    Ignored:    output/pbmc-purified/fits-pbmc-purified.RData
    Ignored:    output/pbmc-purified/rds/
    Ignored:    output/pulseseq/diff-count-pulseseq.RData
    Ignored:    output/pulseseq/fits-pulseseq.RData
    Ignored:    output/pulseseq/rds/

Untracked files:
    Untracked:  analysis/de_analysis_detailed_look_cache/
    Untracked:  analysis/de_analysis_detailed_look_more_cache/
    Untracked:  plots/

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/de_analysis_purified_pbmc.Rmd) and HTML (docs/de_analysis_purified_pbmc.html) files.

File	Version	Author	Date	Message
Rmd	b4ebbb3	Peter Carbonetto	2021-12-31	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	7533a43	Peter Carbonetto	2021-12-30	Rebuilt de_analysis_purified_pbmc page after reorg.
Rmd	8d3498d	Peter Carbonetto	2021-12-30	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
Rmd	c8832de	Peter Carbonetto	2021-12-30	Reorganized and deleted some webpages.
html	f4b9bd2	Peter Carbonetto	2021-12-10	Updated the interactive volcano plots in de_analysis_purified_pbmc
Rmd	e15c655	Peter Carbonetto	2021-12-10	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
Rmd	ef4caa1	Peter Carbonetto	2021-12-10	Small fix.
html	8232419	Peter Carbonetto	2021-12-10	Revised the volcano plots in the de_analysis_purified_pbmc analysis.
Rmd	4aac42e	Peter Carbonetto	2021-12-10	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
Rmd	c1825e6	Peter Carbonetto	2021-12-09	Small fix to ggsave call in de_analysis_purified_pbmc analysis.
html	2413615	Peter Carbonetto	2021-12-09	Revised a number of plots in the de_analysis_purified_pbmc analysis.
Rmd	7a99e6b	Peter Carbonetto	2021-12-09	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	b6a778f	Peter Carbonetto	2021-12-09	Added steps to save some plots as .eps files in
Rmd	5932942	Peter Carbonetto	2021-12-09	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	16b967f	Peter Carbonetto	2021-12-09	Added CD14+ scatterplot to de_analysis_purified_pbmc analysis.
html	24a69da	Peter Carbonetto	2021-12-09	Added CD14+ scatterplot to de_analysis_purified_pbmc analysis.
Rmd	278c1ca	Peter Carbonetto	2021-12-09	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	e13338a	Peter Carbonetto	2021-12-08	A few small fixes to the plots in the de_analysis_purified_pbmc analysis.
Rmd	e563701	Peter Carbonetto	2021-12-08	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	e7b7153	Peter Carbonetto	2021-12-08	Added volcano plots to de_analysis_purified_pbmc analysis.
Rmd	a521628	Peter Carbonetto	2021-12-08	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
Rmd	641b0d6	Peter Carbonetto	2021-12-08	Added more volcano plots to de_analysis_purified_pbmc analysis.
Rmd	63ee088	Peter Carbonetto	2021-12-06	Created exploratory script temp4.R.
html	13d5e90	Peter Carbonetto	2021-11-29	A couple small fixes to the de_analysis_purified_pbmc analysis.
Rmd	27821a6	Peter Carbonetto	2021-11-29	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	a7efe91	Peter Carbonetto	2021-11-29	Added volcano plot to de_analysis_purified_pbmc analysis, and updated
Rmd	ffdd29d	Peter Carbonetto	2021-11-29	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	1580f79	Peter Carbonetto	2021-11-24	Fixed volcano plot in de_analysis_purified_pbmc analysis.
Rmd	5825aa9	Peter Carbonetto	2021-11-24	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
html	29aada6	Peter Carbonetto	2021-11-24	Added B-cells volcano plot to de_analysis_purified_pbmc analysis.
Rmd	9fcc15c	Peter Carbonetto	2021-11-24	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
Rmd	7b34362	Peter Carbonetto	2021-11-24	Working on comparison of deseq2 and fasttopics in B cells in de_analysis_purified_pbmc analysis.
Rmd	6fbc554	Peter Carbonetto	2021-11-24	Working on deseq2 vs fasttopics comparison in de_analysis_purified_pbmc analysis.
Rmd	c1e0931	Peter Carbonetto	2021-11-23	Revised explanatory test in de_analysis_purified_pbmc analysis.
Rmd	fb81385	Peter Carbonetto	2021-11-23	Made a few improvements to the volcano plot in de_analysis_purified_pbmc analysis.
Rmd	a11826f	Peter Carbonetto	2021-11-23	Added scatterplot to de_analysis_purified_pbmc analysis.
Rmd	161d0e9	Peter Carbonetto	2021-11-23	Made a few improvements the b-cells scatterplot in the de_analysis_purified_pbmc analysis.
Rmd	760f6ad	Peter Carbonetto	2021-11-23	Added a z-score q-q plot to de_analysis_purified_pbmc analysis.
Rmd	5fa98f8	Peter Carbonetto	2021-11-23	Working on de_analysis_purified_pbmc analysis.
html	5fa98f8	Peter Carbonetto	2021-11-23	Working on de_analysis_purified_pbmc analysis.
Rmd	c30816e	Peter Carbonetto	2021-11-21	A few small edits.
html	0d0c720	Peter Carbonetto	2021-11-21	Added scatterplots assessing accuracy of Monte Carlo estimates to
Rmd	f7a5a86	Peter Carbonetto	2021-11-21	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”)
html	1917832	Peter Carbonetto	2021-11-21	Working on de_analysis_purified_pbmc analysis.
Rmd	90c6584	Peter Carbonetto	2021-11-21	workflowr::wflow_publish(“de_analysis_purified_pbmc.Rmd”, verbose = TRUE)
Rmd	3143481	Peter Carbonetto	2021-11-21	Working on the de_analysis_purified_pbmc analysis.
Rmd	a16fdf9	Peter Carbonetto	2021-11-20	Added structure plot to de_analysis_purified_pbmc analysis.
html	e1ab3a0	Peter Carbonetto	2021-11-08	Built the initial de_analysis_purified_pbmc analysis page.
html	2befadb	Peter Carbonetto	2021-11-08	Added link to overview page.
Rmd	71e267d	Peter Carbonetto	2021-11-08	workflowr::wflow_publish(“index.Rmd”)

The aim of this analysis is to understand, by way of illustration, the differences between a “classical” differential expression analysis comparing expression among cell types (here we implement this “classical” DE analysis using DESeq2) and a differential expression analysis using the topic model, which allows for grades of membership to cell types (or, more generally, cellular expression factors).

Begin by loading the packages and some function definitions used in the analysis.

library(Matrix)
library(DESeq2)
library(fastTopics)
library(ggplot2)
library(ggrepel)
library(cowplot)
source("../code/de_analysis_functions.R")

Load count data and topic model fit

Load the UMI count data for 94,655 cells and 21,952 genes.

load("../data/pbmc_purified.RData")
dim(counts)
# [1] 94655 21952

Load the $K = 6$ Poisson NMF model fit to these data, and convert the Poisson NMF model fit to a multinomial topic model fit.

fit <- readRDS(file.path("../output/pbmc-purified/rds",
                         "fit-pbmc-purified-scd-ex-k=6.rds"))$fit
fit <- poisson2multinom(fit)
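
To make the conversion step concrete, here is a minimal base-R sketch of the rescaling that maps a Poisson NMF factorization to multinomial topic model parameters. It runs on small random matrices and is an illustration only, not the fastTopics implementation; the names `L0`, `F0`, `u`, `s`, `L_mn` and `F_mn` are hypothetical.

```r
# Toy Poisson NMF factors: the Poisson rates are L0 %*% t(F0).
set.seed(1)
n  <- 5   # cells
m  <- 8   # genes
k  <- 3   # topics
L0 <- matrix(runif(n * k), n, k)
F0 <- matrix(runif(m * k), m, k)

# Rescale each column of F0 to sum to 1, so that each topic becomes a
# vector of relative expression levels across genes.
u    <- colSums(F0)
F_mn <- sweep(F0, 2, u, "/")

# Absorb the column scales into the loadings, then normalize each row
# to sum to 1, giving topic proportions for each cell.
L_scaled <- sweep(L0, 2, u, "*")
s        <- rowSums(L_scaled)   # per-cell size factors
L_mn     <- L_scaled / s

# The original Poisson rates are recovered as s * (L_mn %*% t(F_mn)).
max(abs(s * (L_mn %*% t(F_mn)) - L0 %*% t(F0)))
```

After the rescaling, each row of `L_mn` and each column of `F_mn` sums to 1, which is what allows the loadings to be read as grades of membership.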

The cells are subdivided, based on FACS sorting, into 10 “cell types”. Several of these cell types (the six T-cell subtypes) are virtually indistinguishable based on their gene expression profiles alone, so we combine them into a single “T cell” type. This results in 5 predefined cell types, with T cells accounting for the majority of the cells:

set.seed(1)
celltype <- as.character(samples$celltype)
celltype[celltype == "CD4+ T Helper2" |
         celltype == "CD4+/CD45RO+ Memory" |
         celltype == "CD8+/CD45RA+ Naive Cytotoxic" |
         celltype == "CD4+/CD45RA+/CD25- Naive T" |
         celltype == "CD8+ Cytotoxic T" |
         celltype == "CD4+/CD25 T Reg"] <- "T cell"
celltype <- factor(celltype,
                   c("CD19+ B","CD14+ Monocyte","CD34+","CD56+ NK","T cell"))
table(celltype)
# celltype
#        CD19+ B CD14+ Monocyte          CD34+       CD56+ NK         T cell 
#          10085           2612           9232           8385          64341
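
The same relabeling can be written more compactly with `%in%`. The sketch below uses a short made-up label vector in place of `samples$celltype`; `labels`, `tcell_labels` and `merged` are hypothetical names.

```r
# Toy labels standing in for samples$celltype.
labels <- c("CD19+ B", "CD4+ T Helper2", "CD56+ NK",
            "CD8+ Cytotoxic T", "CD34+", "CD4+/CD25 T Reg")

# The six FACS labels that are merged into a single "T cell" type.
tcell_labels <- c("CD4+ T Helper2", "CD4+/CD45RO+ Memory",
                  "CD8+/CD45RA+ Naive Cytotoxic",
                  "CD4+/CD45RA+/CD25- Naive T",
                  "CD8+ Cytotoxic T", "CD4+/CD25 T Reg")

merged <- labels
merged[merged %in% tcell_labels] <- "T cell"
merged <- factor(merged,
                 c("CD19+ B","CD14+ Monocyte","CD34+","CD56+ NK","T cell"))
table(merged)
```

Because the factor levels are given explicitly, a cell type that is absent from the toy data (here, CD14+ monocytes) still appears in the table with a count of zero.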

Structure plot

Next we visualize the structure inferred by the $K = 6$ topic model using a “structure plot”. The cells in this plot are arranged horizontally according to their predefined cell type to relate the topics to these predefined cell types:

topic_colors <- c("gold","forestgreen","dodgerblue","gray",
                  "darkmagenta","violet")
topics <- c(5,3,2,4,1,6)
rows <- sort(c(sample(which(celltype == "CD19+ B"),500),
               sample(which(celltype == "CD14+ Monocyte"),250),
               sample(which(celltype == "CD34+"),500),
               sample(which(celltype == "CD56+ NK"),400),
               sample(which(celltype == "T cell"),1000)))
p1 <- structure_plot(select_loadings(fit,loadings = rows),
                     grouping = celltype[rows],topics = topics,
                     colors = topic_colors,perplexity = 70,n = Inf,gap = 30,
                     num_threads = 4,verbose = FALSE)
# Running tsne on 500 x 6 matrix.
# Running tsne on 250 x 6 matrix.
# Running tsne on 500 x 6 matrix.
# Running tsne on 400 x 6 matrix.
# Running tsne on 1000 x 6 matrix.
print(p1)


Some of the topics correspond very closely to the predefined cell types. In particular, topics 1 through 4 closely correspond, respectively, to T cells, CD14+ monocytes (myeloid cells), B cells and natural killer (NK) cells.

Topic 5 (violet) closely corresponds to the “CD34+” FACS cell type, but from the structure plot we observe many cells labeled as “CD34+” with little to no contribution from topic 5, which suggests mislabeling of the CD34+ cells.

Topic 6 (magenta) does not correspond to any FACS cell type; as we will see, it captures a different characteristic of the cells, specifically the abundance of ribosomal protein genes. Therefore, the DE results for topics 1–4 are most comparable to a classical DE analysis, and we begin with these comparisons. Before doing so, we first assess the accuracy of the MCMC computations used in the topic-model-based DE analysis.

Assessing accuracy of the Monte Carlo estimates

The topic-model-based DE analysis was performed previously by simulating the posterior distribution of the LFC statistics via MCMC. Here we assess accuracy of the MCMC calculations. We performed the DE analysis twice (using different seeds, each with 100,000 Monte Carlo samples), so we can compare the posterior mean estimates and z-scores returned by the two de_analysis runs.

load("../output/pbmc-purified/de-pbmc-purified-seed=1.RData")
de1 <- de
load("../output/pbmc-purified/de-pbmc-purified-seed=2.RData")
de2 <- de
rm(de)
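
The idea behind this check can be illustrated with a toy example: run the same Monte Carlo computation twice with different seeds and see how far the two sets of estimates differ. The sketch below uses independent normal draws rather than MCMC, so it only shows the mechanics of the comparison; `true_mean` and `run_mc` are hypothetical.

```r
# For each of 50 "genes", the quantity of interest has a known mean.
true_mean <- seq(-3, 3, length.out = 50)

# One Monte Carlo run: draw ns samples per gene, then summarize each gene
# by its sample mean and by a z-score (mean divided by standard deviation).
run_mc <- function (seed, ns = 1000) {
  set.seed(seed)
  p <- length(true_mean)
  samples <- matrix(rnorm(ns * p, mean = rep(true_mean, each = ns)), ns, p)
  list(postmean = colMeans(samples),
       z        = colMeans(samples) / apply(samples, 2, sd))
}

run1 <- run_mc(seed = 1)
run2 <- run_mc(seed = 2)

# Largest disagreement between the two runs, for each summary.
max(abs(run1$postmean - run2$postmean))
max(abs(run1$z - run2$z))
```

If the two runs disagree by more than is acceptable for the downstream analysis, the usual remedy is to increase the number of Monte Carlo samples.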

The MCMC estimates of the posterior mean log-fold changes (LFCs) are largely consistent between the two runs:

pdat <- data.frame(postmean1 = as.vector(de1$postmean),
                   postmean2 = as.vector(de2$postmean))
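# Keep all points with a non-negligible LFC in either run, plus a random
# subset of 100 of the remaining (near-zero) points to reduce overplotting.
pdat1 <- subset(pdat,abs(postmean1) > 0.1 | abs(postmean2) > 0.1)
pdat2 <- subset(pdat,abs(postmean1) <= 0.1 & abs(postmean2) <= 0.1)
pdat2 <- pdat2[sample(nrow(pdat2),100),]
pdat  <- rbind(pdat1,pdat2)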
p1 <- ggplot(pdat,aes(x = postmean1,y = postmean2)) +
  geom_point(shape = 21,color = "white",fill = "black",size = 2) +
  geom_abline(intercept = 0,slope = 1,color = "magenta",linetype = "dashed") +
  labs(x = "first posterior mean",y = "second posterior mean") +
  theme_cowplot()
print(p1)


The z-scores, on the other hand, are estimated less consistently, presumably because accurately estimating uncertainty is harder. Still, the z-scores are consistent enough that it is rare for an LFC to have an lfsr less than 0.05 in one MCMC simulation and not the other (these are the red points in the scatterplot). Note that, for better visualization, z-scores larger than 100 (or smaller than -100) are shown as 100 (or -100) in this plot.

pdat <- data.frame(z1 = clamp(as.vector(de1$z),-100,+100),
                   z2 = clamp(as.vector(de2$z),-100,+100),
                   lfsr = factor((de1$lfsr < 0.05) + (de2$lfsr < 0.05)))
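# Keep all points with a non-negligible z-score in either run, plus a random
# subset of 100 of the remaining (near-zero) points to reduce overplotting.
pdat1 <- subset(pdat,abs(z1) > 0.5 | abs(z2) > 0.5)
pdat2 <- subset(pdat,abs(z1) <= 0.5 & abs(z2) <= 0.5)
pdat2 <- pdat2[sample(nrow(pdat2),100),]
pdat  <- rbind(pdat1,pdat2)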
p2 <- ggplot(pdat,aes(x = z1,y = z2,fill = lfsr)) +
  geom_point(shape = 21,color = "white",size = 2) +
  geom_abline(intercept = 0,slope = 1,color = "magenta",linetype = "dotted") +
  scale_fill_manual(values = c("darkblue","tomato","dodgerblue"),
                    na.value = "white") +
  labs(x = "first z-score",y = "second z-score",fill = "lfsr < 0.05") +
  theme_cowplot()
print(p2)
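
The `clamp` helper used above is not a base R function; it is presumably defined in the sourced file `../code/de_analysis_functions.R`, which is not shown here. A minimal sketch of what such a helper might look like (`clamp_sketch` is a hypothetical name, and the real definition may differ):

```r
# Truncate the values of x to the interval [a, b].
clamp_sketch <- function (x, a, b)
  pmax(pmin(x, b), a)

clamp_sketch(c(-250, -3, 0, 42, 1e4), -100, +100)
# [1] -100   -3    0   42  100
```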


Moving forward, when the two z-scores disagree, we use the one that is nearer to zero.

de             <- de1[c("f0","lower","postmean","upper","z","lfsr")]
class(de)      <- c("topic_model_de_analysis","list")
i              <- which(abs(de2$z) < abs(de1$z))
de$lower[i]    <- de2$lower[i]
de$postmean[i] <- de2$postmean[i]
de$upper[i]    <- de2$upper[i]
de$z[i]        <- de2$z[i]
de$lfsr[i]     <- de2$lfsr[i]
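
The rule of keeping the more conservative of two results can be checked on a small made-up example. Here `run1`, `run2` and `combined` are hypothetical stand-ins for `de1`, `de2` and `de`, with only the `z` and `lfsr` statistics included:

```r
run1 <- list(z = c(2.5, -0.4, 8.0, -3.0), lfsr = c(0.01, 0.60, 0.001, 0.004))
run2 <- list(z = c(1.9, -1.2, 9.5, -2.1), lfsr = c(0.03, 0.20, 0.001, 0.020))

# Start from run 1, then overwrite the entries for which run 2 has the
# z-score nearer to zero. All statistics are taken from the same run.
combined <- run1
i <- which(abs(run2$z) < abs(run1$z))
combined$z[i]    <- run2$z[i]
combined$lfsr[i] <- run2$lfsr[i]

i
# [1] 1 4
combined$z
# [1]  1.9 -0.4  8.0 -2.1
```

Copying the lfsr from the same run as the z-score keeps the two statistics for each gene mutually consistent.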

Load DESeq2 results for all cell types

We load the results of the DESeq2 analyses and combine them into two matrices: one for the LFC estimates (stored as postmean, to mirror the fastTopics output) and one for the z-scores.

load("../output/pbmc-purified/deseq2-pbmc-purified.RData")
celltypes <- names(deseq)
n <- length(celltypes)
p <- nrow(genes)
deseq2 <- list(postmean = matrix(0,p,n),
               z        = matrix(0,p,n))
rownames(deseq2$postmean) <- genes$ensembl
rownames(deseq2$z)        <- genes$ensembl
colnames(deseq2$postmean) <- celltypes
colnames(deseq2$z)        <- celltypes
for (i in 1:n) {
  deseq2$postmean[,i] <- deseq[[i]]$log2FoldChange
  deseq2$z[,i]        <- with(deseq[[i]],log2FoldChange/lfcSE)
}
deseq <- deseq2
rm(deseq2)
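
As a small illustration of the z-score calculation used in the loop, the sketch below builds the same two matrices from a made-up list of per-cell-type results. The column names `log2FoldChange` and `lfcSE` match those used above; the numbers, gene names and the object `toy` are hypothetical.

```r
toy <- list("CD19+ B"  = data.frame(log2FoldChange = c(2, -1, 0.5),
                                    lfcSE          = c(0.5, 0.25, 1)),
            "CD56+ NK" = data.frame(log2FoldChange = c(-3, 0, 1),
                                    lfcSE          = c(1, 0.5, 0.25)))

# One column per cell type; the z-score is the LFC estimate divided by
# its standard error.
lfc <- sapply(toy, function (x) x$log2FoldChange)
z   <- sapply(toy, function (x) x$log2FoldChange / x$lfcSE)
rownames(lfc) <- rownames(z) <- c("gene1", "gene2", "gene3")
z
#       CD19+ B CD56+ NK
# gene1     4.0       -3
# gene2    -4.0        0
# gene3     0.5        4
```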

Since we filtered out a few lowly expressed genes before running DESeq2, we subset the fastTopics results to match up with DESeq2.

rows        <- match(rownames(deseq$z),rownames(de$z))
de$f0       <- de$f0[rows]
de$lower    <- de$lower[rows,]
de$postmean <- de$postmean[rows,]
de$upper    <- de$upper[rows,]
de$z        <- de$z[rows,]
de$lfsr     <- de$lfsr[rows,]
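
The alignment relies on `match`, which returns, for each gene ID in the first argument, its row position in the second. A toy example with made-up gene IDs (`all_genes`, `kept`, `z_all` and `z_sub` are hypothetical):

```r
all_genes <- c("ENSG01", "ENSG02", "ENSG03", "ENSG04", "ENSG05")
z_all     <- matrix(1:10, 5, 2, dimnames = list(all_genes, c("k1", "k2")))

# Genes retained after filtering, in the order used by the other analysis.
kept <- c("ENSG01", "ENSG03", "ENSG04")
rows <- match(kept, rownames(z_all))

# match returns NA for IDs that are not found, so check before subsetting.
stopifnot(!anyNA(rows))
z_sub <- z_all[rows, ]
z_sub
#        k1 k2
# ENSG01  1  6
# ENSG03  3  8
# ENSG04  4  9
```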

DESeq2 vs. fastTopics

Comparing the distributions of all z-scores for B cells and CD14+ cells, we see that the DE analysis allowing for grades of membership has many more z-scores near zero, yet still allows for large z-scores, illustrating the flexibility of the model. (Because there are a few extremely large and extremely small z-scores, for better visualization z-scores larger than 20 in magnitude are counted as 20 or -20.)

pdat <- data.frame(
  deseq = quantile(clamp(deseq$z[,c("CD19+ B","CD14+ Monocyte")],-20,+20),
                   seq(0,1,length.out = 100),na.rm = TRUE),
  fasttopics = quantile(clamp(de$z[,c("k3","k2")],-20,+20),
                        seq(0,1,length.out = 100),na.rm = TRUE))
p1 <- ggplot(pdat,aes(x = deseq,y = fasttopics)) +
  geom_point() +
  geom_abline(intercept = 0,slope = 1,color = "magenta",linetype = "dotted") +
  labs(x = "DESeq2",y = "fastTopics",title = "z-score quantiles") +
  theme_cowplot()
print(p1)
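
To show how a quantile comparison of this kind reveals a distribution that is concentrated near zero but still has heavy tails, here is a toy example with simulated z-scores. The two samples `z_a` and `z_b` are made up and are not meant to mimic the DESeq2 or fastTopics results; `clamp20` is a hypothetical helper.

```r
set.seed(1)
z_a <- rnorm(5000, sd = 3)                            # uniformly spread out
z_b <- c(rnorm(4500, sd = 0.5), rnorm(500, sd = 6))   # mostly near zero, a few large

clamp20 <- function (x) pmax(pmin(x, 20), -20)
probs   <- seq(0, 1, length.out = 100)
qdat    <- data.frame(a = quantile(clamp20(z_a), probs),
                      b = quantile(clamp20(z_b), probs))

# In the middle of the distribution, the quantiles of z_b are much closer
# to zero than those of z_a.
qdat[c(25, 50, 75), ]
```

Plotting `qdat$b` against `qdat$a` gives a curve that is nearly flat around zero and steep at the extremes, which is the pattern described above for a method with many z-scores near zero and a few large ones.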


Now let’s look closely at the DESeq2 and fastTopics results for B cells and CD14+ cells. These two cell types are chosen because each closely corresponds to a topic (topics 3 and 2, respectively). Here we focus on genes for which the z-score exceeds 2 in magnitude in at least one of the analyses. Genes are colored according to the lfsr estimated in the fastTopics analysis.

i <- "CD19+ B"
k <- "k3"
pdat1 <- data.frame(gene                = genes$symbol,
                    postmean.deseq      = deseq$postmean[,i],
                    postmean.fasttopics = de$postmean[,k],
                    z.deseq             = deseq$z[,i],
                    z.fasttopics        = de$z[,k],
                    lfsr = cut(de$lfsr[,k],c(-1,0.001,0.01,0.05,Inf)),
                    stringsAsFactors = FALSE)
j <- which(pdat1$postmean.fasttopics < 8)
pdat1[j,"gene"] <- ""
pdat1 <- subset(pdat1,abs(z.deseq) > 2 | abs(z.fasttopics) > 2)
p1 <- ggplot(pdat1,aes(x = postmean.deseq,y = postmean.fasttopics,
                fill = lfsr,label = gene)) +
  geom_point(shape = 21,color = "white") +
  geom_abline(intercept = 0,slope = 1,color = "black",linetype = "dotted") +
  geom_text_repel(color = "darkgray",size = 2.25,fontface = "italic",
                  segment.color = "darkgray",segment.size = 0.25,
                  min.segment.length = 0,max.overlaps = Inf,na.rm = TRUE) +
  scale_fill_manual(values = c("deepskyblue","gold","orange","coral"),
                    na.value = "gainsboro") +
  xlim(-12.3,13.1) +
  ylim(-10,13.1) +
  labs(x = "DESeq2",y = "fastTopics",
       title = "LFC in B cells") +
  theme_cowplot(font_size = 10)
i <- "CD14+ Monocyte"
k <- "k2"
pdat2 <- data.frame(gene                = genes$symbol,
                    postmean.deseq      = deseq$postmean[,i],
                    postmean.fasttopics = de$postmean[,k],
                    z.deseq             = deseq$z[,i],
                    z.fasttopics        = de$z[,k],
                    lfsr = cut(de$lfsr[,k],c(-1,0.001,0.01,0.05,Inf)),
                    stringsAsFactors = FALSE)
j <- which(pdat2$postmean.fasttopics < 10)
pdat2[j,"gene"] <- ""
pdat2 <- subset(pdat2,abs(z.deseq) > 2 | abs(z.fasttopics) > 2)
p2 <- ggplot(pdat2,aes(x = postmean.deseq,y = postmean.fasttopics,
                fill = lfsr,label = gene)) +
  geom_point(shape = 21,color = "white") +
  geom_abline(intercept = 0,slope = 1,color = "black",linetype = "dotted") +
  geom_text_repel(color = "darkgray",size = 2.25,fontface = "italic",
                  segment.color = "darkgray",segment.size = 0.25,
                  min.segment.length = 0,max.overlaps = Inf,na.rm = TRUE) +
  scale_fill_manual(values = c("deepskyblue","gold","orange","coral"),
                    na.value = "gainsboro") +
  xlim(-8.2,15) +
  ylim(-8.2,15) +
  labs(x = "DESeq2",y = "fastTopics",
       title = "LFC in CD14+ monocytes") +
  theme_cowplot(font_size = 10)
plot_grid(p1,p2)
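
The color categories in these scatterplots come from binning the lfsr with `cut`. A quick check of how the breakpoints behave, using made-up lfsr values:

```r
lfsr <- c(0.0002, 0.001, 0.004, 0.03, 0.05, 0.4, NA)
bins <- cut(lfsr, c(-1, 0.001, 0.01, 0.05, Inf))

# By default the intervals are closed on the right, so a value exactly
# equal to a breakpoint falls into the lower bin.
as.integer(bins)
# [1]  1  1  2  3  3  4 NA
```

Missing lfsr values stay `NA` after binning, and these are the points drawn with the `na.value` color in the plots above.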


A couple of interesting themes emerge from these plots:

Many genes with large LFC in the DESeq2 analysis, particularly genes with very negative LFCs, have zero LFC in the fastTopics analysis.
DESeq2 and fastTopics largely identify the same genes with the largest expression increases (largest LFC), but fastTopics estimates larger LFCs for these genes.

The fastTopics DE analysis for all 6 topics is summarized in the following volcano plots:

my_ggplot_call <- function (dat, plot.title, max.overlaps) {
  i <- which(with(dat,zabs > 0.5 | abs(postmean) > 0.5))
  j <- sample(which(with(dat,zabs <= 1 & abs(postmean) <= 0.5)),100)
  dat <- dat[c(i,j),]
  return(volcano_plot_ggplot_call(dat,plot.title,max.overlaps))
}
p1 <- volcano_plot(de,k = "k1",ymax = 150,labels = genes$symbol,ggplot_call = my_ggplot_call)
p2 <- volcano_plot(de,k = "k2",ymax = 175,labels = genes$symbol,ggplot_call = my_ggplot_call)
p3 <- volcano_plot(de,k = "k3",ymax = 150,labels = genes$symbol,ggplot_call = my_ggplot_call)
p4 <- volcano_plot(de,k = "k4",ymax = 175,labels = genes$symbol,ggplot_call = my_ggplot_call)
p5 <- volcano_plot(de,k = "k5",ymax = 175,labels = genes$symbol,ggplot_call = my_ggplot_call)
p6 <- volcano_plot(de,k = "k6",ymax = 200,labels = genes$symbol,ggplot_call = my_ggplot_call)
plot_grid(p1,p2,p3,p4,p5,p6,nrow = 3,ncol = 2)


These same results can also be explored in interactive volcano plots:

volcano_plotly(de,k = "k1",ymax = 150,labels = genes$symbol,
               file = "volcano_plot_purified_pbmc_tcells.html",
               width = 600,height = 600)
volcano_plotly(de,k = "k2",ymax = 175,labels = genes$symbol,
               file = "volcano_plot_purified_pbmc_cd14.html",
               width = 600,height = 600)
volcano_plotly(de,k = "k3",ymax = 150,labels = genes$symbol,
               file = "volcano_plot_purified_pbmc_bcells.html",
               width = 600,height = 600)
volcano_plotly(de,k = "k4",ymax = 175,labels = genes$symbol,
               file = "volcano_plot_purified_pbmc_nkcells.html",
               width = 600,height = 600)
volcano_plotly(de,k = "k5",ymax = 175,labels = genes$symbol,
               file = "volcano_plot_purified_pbmc_cd34.html",
               width = 600,height = 600)
volcano_plotly(de,k = "k6",ymax = 200,labels = genes$symbol,
               file = "volcano_plot_purified_pbmc_ribosome.html",
               width = 600,height = 600)

You can explore these interactive volcano plots by following these links: