Last updated: 2022-02-22

Checks: 7 passed, 0 failed

Knit directory: MelanomaIMC/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20200728) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
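As a minimal illustration of why the seed matters, the sketch below resets the same seed twice and draws a random sample each time. The object names `first_draw` and `second_draw` are made up for this example and do not appear in the analysis.

```r
# Minimal sketch: re-seeding reproduces the same random draws.
set.seed(20200728)            # seed reported by workflowr for this analysis
first_draw <- sample(1:100, 5)

set.seed(20200728)            # reset to the same seed
second_draw <- sample(1:100, 5)

# Both draws are the same because the random number generator
# started from the same state.
identical(first_draw, second_draw)
```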

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version d246c15. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rproj.user/
    Ignored:    Table_S4.csv
    Ignored:    code/.DS_Store
    Ignored:    code/._.DS_Store
    Ignored:    data/.DS_Store
    Ignored:    data/._.DS_Store
    Ignored:    data/data_for_analysis/
    Ignored:    data/full_data/

Unstaged changes:
    Modified:   .gitignore
    Modified:   analysis/Supp-Figure_10.rmd
    Modified:   analysis/_site.yml
    Deleted:    analysis/license.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Supp-Figure_11.rmd) and HTML (docs/Supp-Figure_11.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File  Version  Author          Date        Message
html  73aa800  toobiwankenobi  2022-02-22  add .html for static website
Rmd   64e5fde  toobiwankenobi  2022-02-16  change order and naming of supp fig files
Rmd   b20b6fb  toobiwankenobi  2022-02-02  update code for Supp Figures
Rmd   3da15db  toobiwankenobi  2021-11-24  changes for revision

Introduction

This script generates plots for Supplementary Figure 11.

Preparations

Load libraries

First, we will load the libraries needed for this part of the analysis.

sapply(list.files("code/helper_functions", full.names = TRUE), source)
        code/helper_functions/calculateSummary.R
value   ?                                       
visible FALSE                                   
        code/helper_functions/censor_dat.R
value   ?                                 
visible FALSE                             
        code/helper_functions/detect_mRNA_expression.R
value   ?                                             
visible FALSE                                         
        code/helper_functions/DistanceToClusterCenter.R
value   ?                                              
visible FALSE                                          
        code/helper_functions/findMilieu.R code/helper_functions/findPatch.R
value   ?                                  ?                                
visible FALSE                              FALSE                            
        code/helper_functions/getInfoFromString.R
value   ?                                        
visible FALSE                                    
        code/helper_functions/getSpotnumber.R
value   ?                                    
visible FALSE                                
        code/helper_functions/plotCellCounts.R
value   ?                                     
visible FALSE                                 
        code/helper_functions/plotCellFractions.R
value   ?                                        
visible FALSE                                    
        code/helper_functions/plotDist.R code/helper_functions/read_Data.R
value   ?                                ?                                
visible FALSE                            FALSE                            
        code/helper_functions/scatter_function.R
value   ?                                       
visible FALSE                                   
        code/helper_functions/sceChecks.R
value   ?                                
visible FALSE                            
        code/helper_functions/validityChecks.R
value   ?                                     
visible FALSE                                 
library(SingleCellExperiment)
library(reshape2)
library(tidyverse)
library(dplyr)
library(data.table) 
library(ggplot2)
library(ComplexHeatmap)
library(rms)
library(ggrepel)
library(ggbeeswarm)
library(circlize)
library(ggpubr)
library(ggridges)
library(gridExtra)
library(rstatix)

Read the data

sce_rna <- readRDS(file = "data/data_for_analysis/sce_RNA.rds")
sce_prot <- readRDS(file = "data/data_for_analysis/sce_protein.rds")

sce_rna <- sce_rna[,sce_rna$Location != "CTRL"]
sce_prot <- sce_prot[,sce_prot$Location != "CTRL"]

# dysfunction stain
sce_dysfunction <- readRDS(file = "data/data_for_analysis/sce_dysfunction.rds")

Supp Figure 11A

Tumor Marker Profile for different T cell Scoring Groups

tumor_marker_protein <- c("bCatenin", "Sox9", "pERK", "p75", "Ki67", "SOX10", "PARP", "S100", "MiTF")
tumor_marker_rna <- c("Mart1", "pRB")

# rna data 
dat_rna <- data.frame(t(assay(sce_rna[tumor_marker_rna, sce_rna$celltype == "Tumor"], "asinh")))
dat_rna$cellID <- rownames(dat_rna)
dat_rna <- left_join(dat_rna, data.frame(colData(sce_rna))[,c("cellID", "Tcell_density_score_image", "Description", "MM_location", "Location")])
Joining, by = "cellID"
# filter
dat_rna <- dat_rna %>%
  filter(Location != "CTRL")

# mean per image
dat_rna <- dat_rna %>%
  dplyr::select(-cellID) %>%
  group_by(Description, Tcell_density_score_image) %>%
  summarise_if(is.numeric, mean, na.rm = TRUE)

# melt
dat_rna <- dat_rna %>%
  reshape2::melt(id.vars = c("Description", "Tcell_density_score_image"), variable.name = "channel", value.name = "asinh")

# protein data
# subset to tumor cells (a doubled comma here would leave the column filter unused)
dat_prot <- data.frame(t(assay(sce_prot[tumor_marker_protein, sce_prot$celltype == "Tumor"], "asinh")))
dat_prot$cellID <- rownames(dat_prot)
dat_prot <- left_join(dat_prot, data.frame(colData(sce_prot))[,c("cellID", "Tcell_density_score_image", "Description", "MM_location", "Location")])
Joining, by = "cellID"
# filter
dat_prot <- dat_prot %>%
  filter(Location != "CTRL")

# mean per image
dat_prot <- dat_prot %>%
  dplyr::select(-cellID) %>%
  group_by(Description, Tcell_density_score_image) %>%
  summarise_if(is.numeric, mean, na.rm = TRUE)

# melt
dat_prot <- dat_prot %>%
  reshape2::melt(id.vars = c("Description", "Tcell_density_score_image"), variable.name = "channel", value.name = "asinh")

# join both data sets
comb <- rbind(dat_prot, dat_rna)
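
The two steps used for both modalities (mean per image, then a long table with one row per image and channel) can be sketched on toy data. This is an illustration only: the data frame `cells` and its values are invented, and base R is used instead of dplyr and reshape2 so the block runs without extra packages.

```r
# Hypothetical toy data: one row per cell, two marker channels.
cells <- data.frame(
  Description = c("img1", "img1", "img2", "img2"),
  score       = c("high", "high", "absent", "absent"),
  Ki67        = c(1, 3, 2, 4),
  SOX10       = c(0, 2, 5, 7)
)

# Mean per image and score group, analogous to group_by() + summarise_if().
per_image <- aggregate(cbind(Ki67, SOX10) ~ Description + score,
                       data = cells, FUN = mean)

# Long format: one row per image and channel, analogous to reshape2::melt().
long <- data.frame(
  Description = rep(per_image$Description, times = 2),
  score       = rep(per_image$score, times = 2),
  channel     = rep(c("Ki67", "SOX10"), each = nrow(per_image)),
  asinh       = c(per_image$Ki67, per_image$SOX10)
)
long
```

Averaging per image first means each image contributes one value per channel, so images with many cells do not dominate the later group comparison.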

# adjusted wilcox.test for groups
group_comparison <- list(c("absent", "high"), c("med", "high"))

stat.test <- comb %>%
  group_by(channel) %>%
  wilcox_test(data = ., asinh ~ Tcell_density_score_image) %>%
  adjust_pvalue(method = "BH") %>%
  add_significance("p.adj",cutpoints = c(0, 1e-04, 0.001, 0.01, 0.1, 1)) %>%
  add_xy_position(x = "Tcell_density_score_image", dodge = 0.8, comparisons = group_comparison) %>%
  filter(is.na(y.position) == FALSE)
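
The idea of testing each channel separately and then correcting for multiple testing can be sketched in base R. This is a simplified illustration, not the rstatix pipeline above: it uses invented data in `toy`, only two groups per channel, and `wilcox.test()` with `p.adjust()` in place of `wilcox_test()` and `adjust_pvalue()`.

```r
set.seed(1)  # toy data only, unrelated to the analysis seed
toy <- data.frame(
  channel = rep(c("A", "B"), each = 20),
  group   = rep(rep(c("low", "high"), each = 10), times = 2),
  value   = c(rnorm(10, 0), rnorm(10, 3),   # channel A: shifted groups
              rnorm(10, 0), rnorm(10, 0))   # channel B: no shift
)

# One two-sided Wilcoxon rank-sum test per channel.
p_raw <- sapply(split(toy, toy$channel), function(d) {
  wilcox.test(value ~ group, data = d)$p.value
})

# Benjamini-Hochberg adjustment across all channels.
p_adj <- p.adjust(p_raw, method = "BH")
rbind(p_raw, p_adj)
```

The adjusted p-values are never smaller than the raw ones, which is why the significance labels in the plot are based on `p.adj` rather than `p`.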

# plot 
p <- ggplot(comb, aes(x=Tcell_density_score_image, y=asinh)) + 
  geom_boxplot(alpha=0.2, lwd=1, aes(fill=Tcell_density_score_image)) +
  geom_quasirandom(alpha=0.6, size=2, aes(col=Tcell_density_score_image)) +
  scale_color_discrete(guide = "none") +
  theme_bw() +
  theme(text = element_text(size=18),
        axis.text.x = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank()) +
  facet_wrap(~channel, scales = "free") + 
  stat_pvalue_manual(stat.test, label = "p.adj.signif", size = 7) + 
  xlab("") + 
  ylab("Mean Count per Image (asinh)") +
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.1))) +
  guides(fill=guide_legend(title="T cell Score", override.aes = c(lwd=0.5, alpha=1)))

leg <- get_legend(p)
grid.arrange(p + theme(legend.position = "none"))

Version Author Date
235386f toobiwankenobi 2022-02-22
grid.arrange(leg)


Supp Figure 11B

# only CD8+ T cells
CD8_sce <- sce_dysfunction[,sce_dysfunction$celltype %in% c("CD8_Tcell","CD8_CXCL13+_Tcell")]

pat_dat <- as_tibble(colData(sce_rna)) %>%
  filter(is.na(dysfunction_score) == FALSE) %>%
  dplyr::select(Description,PatientID) %>%
  distinct()

cur_dat <- as_tibble(colData(CD8_sce))
cur_dat <- left_join(cur_dat,pat_dat,"Description")
RNA_dat <- as_tibble(colData(sce_rna))

prot_dat_CXCL13 <- cur_dat %>%
  filter(is.na(dysfunction_score) == FALSE) %>%
  dplyr::select(PatientID,celltype) %>%
  group_by(PatientID,celltype) %>%
  dplyr::count(celltype) %>%
  ungroup() %>%
  group_by(PatientID) %>%
  mutate(total_cellcount = sum(n)) %>%
  mutate(frac = n/total_cellcount) %>%
  filter(celltype == "CD8_CXCL13+_Tcell") %>%
  dplyr::select(PatientID,frac) %>%
  distinct() %>%
  pull(frac,PatientID)
  
rna_dat <- RNA_dat %>%
  filter(is.na(dysfunction_score) == FALSE) %>%
  dplyr::select(celltype,PatientID,dysfunction_score,CXCL13) %>%
  mutate(celltype2 = paste0(celltype,"_",CXCL13)) %>%
  filter(celltype2 %in% c("CD8+ T cell_0","CD8+ T cell_1")) %>%
  group_by(PatientID,celltype, .drop = FALSE) %>%
  dplyr::count(celltype2) %>%
  ungroup() %>%
  group_by(PatientID, .drop = FALSE) %>%
  mutate(total_cellcount = sum(n),frac=n/total_cellcount) %>%
  filter(celltype2 == "CD8+ T cell_0") %>%
  mutate(frac_CD8_CXCL13 = 1-frac) %>%
  pull(frac_CD8_CXCL13,PatientID)

# check names and only select those names in RNA data that we have in protein data
identical(names(rna_dat),names(prot_dat_CXCL13))
[1] FALSE
rna_dat <- rna_dat[names(prot_dat_CXCL13)]

# merge all data
all_dat <- data.frame(PatientID = names(prot_dat_CXCL13),
                      prot_CXCL13 = as.vector(prot_dat_CXCL13),
                      rna_frac = as.vector(rna_dat))
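
The alignment step relies on indexing a named vector by name. A small sketch with invented patient IDs and fractions shows what that indexing does and adds an explicit check for missing patients, which the code above does not perform.

```r
# Hypothetical per-patient fractions from two modalities.
prot <- c(P1 = 0.10, P2 = 0.40, P3 = 0.25)
rna  <- c(P3 = 0.30, P1 = 0.05, P4 = 0.80, P2 = 0.50)

identical(names(rna), names(prot))   # FALSE: order and content differ

# Indexing by name reorders rna and drops patients absent from prot.
rna_aligned <- rna[names(prot)]

# A patient present in prot but missing from rna would appear as NA,
# so it is worth checking before merging.
stopifnot(!anyNA(rna_aligned))

merged <- data.frame(PatientID = names(prot),
                     prot = as.vector(prot),
                     rna  = as.vector(rna_aligned))
merged
```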

ggplot(all_dat, aes(x=prot_CXCL13, y=rna_frac)) + 
  geom_point(size=3) + 
  geom_smooth(method="lm") +
  stat_cor(method = "spearman") + 
  theme_bw() +
  theme(text = element_text(size=15)) +
  xlab("Fraction CD8+ CXCL13+ T cells (Protein data)") +
  ylab("Fraction CD8+ CXCL13+ T cells (RNA data)")
`geom_smooth()` using formula 'y ~ x'

cor.test(all_dat$prot_CXCL13, all_dat$rna_frac, method = "spearman")

    Spearman's rank correlation rho

data:  all_dat$prot_CXCL13 and all_dat$rna_frac
S = 214, p-value = 0.004431
alternative hypothesis: true rho is not equal to 0
sample estimates:
      rho 
0.6852941 
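
For data without ties, the statistic S reported by `cor.test` is the sum of squared rank differences, and rho follows as 1 - 6S / (n(n^2 - 1)). The sketch below plugs in the reported S. The sample size n = 16 is an inference from the reported S and rho, not a number stated in the output, and the vectors `x` and `y` are invented for the cross-check.

```r
S   <- 214   # statistic reported by cor.test above
n   <- 16    # assumption: number of patients, inferred from S and rho
rho <- 1 - 6 * S / (n * (n^2 - 1))
rho          # agrees with the reported estimate of 0.6852941

# Cross-check of the formula on made-up data without ties.
x <- c(3, 1, 4, 15, 9, 2, 6)
y <- c(2, 7, 1, 8, 28, 18, 4)
d <- rank(x) - rank(y)
rho_formula <- 1 - 6 * sum(d^2) / (length(x) * (length(x)^2 - 1))
rho_builtin <- cor(x, y, method = "spearman")
all.equal(rho_formula, rho_builtin)
```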

Supp Figure 11C

Heatmap with Community Modules

cur_dt <- data.frame(colData(sce_rna))
clust <- data.frame()

## wide table for communities 
for(i in names(cur_dt[,grepl(glob2rx("*pure"),names(cur_dt))])) {
  cur_dt_sub <- cur_dt[cur_dt[,i] > 0,]
  cur_dt_sub <- cbind(cur_dt_sub[,c(i, "Description")],
                      cur_dt_sub[,grepl(glob2rx("C*L*"),names(cur_dt_sub))])
  
  # count numbers of chemokine-expressing cells per patch
  cur_dt_sub <- cur_dt_sub %>%
    group_by(Description) %>%
    group_by_at(i, .add = TRUE) %>%
    summarise(across(everything(), sum), .groups = "drop")
  
  cur_dt_sub$cluster_type <- i
  
  cur_dt_sub <- cur_dt_sub[,-2]
  
  clust <- rbind(clust, cur_dt_sub)
}
# remove pure clusters with low abundance (CCL22, CCL4, CCL8)
clust <- clust[!(clust$cluster_type %in% c("ccl22_pure", "ccl4_pure", "ccl8_pure")),]

# number of patches by image
clust <- clust %>%
  group_by(Description, cluster_type) %>%
  summarise(n=n()) %>%
  reshape2::dcast(Description ~ cluster_type, value.var = "n", fill = 0)
`summarise()` has grouped output by 'Description'. You can override using the
`.groups` argument.
# add images with 0 clusters and add more information
to_add <- data.frame(colData(sce_rna)) %>%
  distinct(Description, .keep_all = TRUE)

clust <- left_join(to_add[,c("Description", "dysfunction_score", "Tcell_density_score_image")], clust)
Joining, by = "Description"
# replace NA with 0 
cur_dt_wide <- clust %>%
  mutate_if(is.numeric,coalesce,0)

# order according to dysfunction score
cur_dt_wide <- cur_dt_wide[order(cur_dt_wide$dysfunction_score),]

# chemokines per image
total_chemokines <- cur_dt %>%
  group_by(Description, chemokine) %>%
  summarise(n=n()) %>%
  group_by(Description) %>%
  mutate(fraction = n/sum(n)) %>%
  filter(chemokine == TRUE)
`summarise()` has grouped output by 'Description'. You can override using the
`.groups` argument.
# is a chemokine-expressing cell part of a milieu?
cur_dt$in_community <- ifelse(rowSums(cur_dt[,grepl(glob2rx("*pure"),names(cur_dt))]) > 0 & cur_dt$chemokine == TRUE, TRUE, FALSE)

# fraction in_community 1vs.0 per image
fractions <- cur_dt %>%
  filter(chemokine == TRUE) %>%
  group_by(Description, in_community) %>%
  summarise(n=n()) %>%
  group_by(Description) %>%
  mutate(fraction = n / sum(n)) %>%
  reshape2::dcast(Description ~ in_community, value.var = "fraction", fill = 0)
`summarise()` has grouped output by 'Description'. You can override using the
`.groups` argument.
names(fractions)[2:3] <- c("single", "community")
fractions[, 2:3][is.na(fractions[, 2:3])] <- 0
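
The community flag and the per-image fractions can be illustrated on a toy table. Everything in this sketch is invented for illustration (the data frame `cells`, its two `*_pure` columns, and the values), and base R `table()` is used instead of dplyr and reshape2.

```r
# Hypothetical per-cell table: two "*_pure" milieu columns and a chemokine flag.
cells <- data.frame(
  Description = c("img1", "img1", "img1", "img2", "img2"),
  cxcl10_pure = c(1, 0, 0, 0, 0),
  ccl2_pure   = c(0, 0, 2, 0, 0),
  chemokine   = c(TRUE, TRUE, FALSE, TRUE, FALSE)
)

# A cell is "in community" if it belongs to any milieu and expresses a chemokine.
pure_cols <- grepl(glob2rx("*pure"), names(cells))
cells$in_community <- rowSums(cells[, pure_cols]) > 0 & cells$chemokine

# Fraction of chemokine-expressing cells outside vs. inside a milieu, per image.
expr <- cells[cells$chemokine, ]
expr$in_community <- factor(expr$in_community, levels = c(FALSE, TRUE))
fractions <- prop.table(table(expr$Description, expr$in_community), margin = 1)
colnames(fractions) <- c("single", "community")
fractions
```

Fixing the factor levels to `FALSE` and `TRUE` guarantees that both columns exist even if no image contains cells of one kind, so renaming the columns by position stays safe.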

# chemokines per image (regardless of combination, multi-producing cells count more than once)
chemokines <- cbind(cur_dt[,c("Description", "in_community")], cur_dt[,grepl(glob2rx("C*L*"),names(cur_dt))])

# long table - chemokine / in_community info and count (n) per image
chemokines <- reshape2::melt(chemokines, id.vars = c("Description", "in_community"), variable.name = "chemokine", value.name = "n") %>%
  group_by(Description, in_community, chemokine) %>%
  summarise(total = sum(n)) %>%
  reshape2::dcast(Description + chemokine ~ in_community, value.var = "total") %>%
  replace(is.na(.), 0) %>%
  reshape2::melt(id.vars = c("Description", "chemokine"), variable.name = "in_community", value.name = "n")
`summarise()` has grouped output by 'Description', 'in_community'. You can
override using the `.groups` argument.
# combine all information
cur_dt_wide <- left_join(cur_dt_wide, fractions)
Joining, by = "Description"
cur_dt_wide <- left_join(cur_dt_wide, total_chemokines)
Joining, by = "Description"

Plot Heatmap

# remove controls and only keep images with dysfunction score
cur_dt_wide_sub <- cur_dt_wide[cur_dt_wide$Description %in% unique(sce_rna[,sce_rna$Location != "CTRL"]$Description),]
cur_dt_wide_sub <- cur_dt_wide_sub[cur_dt_wide_sub$dysfunction_score %in% c("High Dysfunction", "Low Dysfunction"),]

# define subgroups to split heatmap
subgroup = cur_dt_wide_sub[,"dysfunction_score"]

# heatmap annotation
row_ha2 = rowAnnotation("Production Mode of\nChemokine-Expressing Cells" = 
                          anno_barplot(cur_dt_wide_sub[,c("single", "community")], 
                                       gp = gpar(fill = c("#F8766D", "#00BFC4")), width = unit(1.5, "cm")),
                        "Fraction of Chemokine- \n Expressing Cells" = 
                          anno_barplot(cur_dt_wide_sub[,"fraction"],
                                       width = unit(1.5, "cm")),
                        annotation_name_rot = 90, gap = unit(3, "mm"),
                        col = list(Relapse = c("no relapse" = "orange", "relapse" = "black", "untreated/lost" = "grey")))

# function for the zoom-in plot
panel_fun_chemokines = function(index, nm) {
    image_number = cur_dt_wide_sub[index,"Description"]
    if(length(unique(image_number)) > 9){
      df = chemokines[chemokines$Description %in% image_number, ]
      g = ggplot(df, aes(x = factor(chemokine), y = log10(n+1), fill=in_community)) + 
        geom_boxplot() + 
        xlab("Chemokine") + 
        ylab("# Cells [log10(n+1)]") +
        theme_bw() +
        theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1),
              legend.position = "none") +
        ylim(0,3)
      g = grid.grabExpr(print(g))
      pushViewport(viewport())
      grid.rect()
      grid.draw(g)
      popViewport()
    }
}

# create zoom-in 
zoom = anno_zoom(align_to = subgroup, 
                 which = "row", panel_fun = panel_fun_chemokines, 
                 size = unit(6, "cm"), 
                 gap = unit(1, "cm"), 
                 width = unit(10, "cm"))

# heatmap
m <- as.matrix(cur_dt_wide_sub[,grepl(glob2rx("*pure"),names(cur_dt_wide_sub))])
# keep the part before the first underscore and convert it to upper case
colnames(m) <- toupper(sapply(str_split(colnames(m), "_"), function(x) x[1]))

col_fun = viridis::inferno(max(m)+1)

ht1 = Heatmap(m, 
        col = col_fun,
        left_annotation = row_ha2,
        right_annotation = rowAnnotation(foo = zoom, gap = unit(3,"cm")),
        row_split = subgroup,
        row_title_side = "left",
        border = T,
        row_gap = unit(3, "mm"),
        cluster_rows = T,
        cluster_columns = F,
        cluster_row_slices = F,
        show_heatmap_legend = F,
        show_row_dend = F,
        name = "Detected Patches",
        column_title = "Chemokine Milieu", 
        column_title_side = "bottom",
        column_title_gp = gpar(fontsize=20),
        show_row_names = F,
        width = unit(15,"cm"))

draw(ht1)
Warning: Removed 1 rows containing non-finite values (stat_boxplot).


Legend

# manual legends
lgd1 = Legend(labels = c("Stand-alone", "Milieu"), title = "Production Mode", legend_gp = gpar(fill = c("#F8766D", "#00BFC4")))
lgd2 = Legend(col_fun = colorRamp2(c(0:max(m)), colors = col_fun), 
              at = seq(0, max(m)+2, by=5), title = "Detected Milieus", direction = "horizontal", grid_width = unit(2, "cm"))

# Draw Legend
draw(packLegend(lgd2, lgd1, direction = "horizontal"))


Supp Figure 11D

Tumor Marker Profile for different Dysfunction Scoring Groups per Image

tumor_marker_protein <- c("pS6", "bCatenin", "H3K27me3", "HLADR", "Sox9", "pERK", "p75", "PDL1", "Ki67", "SOX10", "PARP")
tumor_marker_rna <- c("B2M")

# rna data 
dat_rna <- data.frame(t(assay(sce_rna[tumor_marker_rna, sce_rna$celltype == "Tumor"], "asinh")))
dat_rna$cellID <- rownames(dat_rna)
dat_rna <- left_join(dat_rna, data.frame(colData(sce_rna))[,c("cellID", "dysfunction_score", "Description")])
Joining, by = "cellID"
# filter
dat_rna <- dat_rna %>%
  filter(dysfunction_score %in% c("High Dysfunction", "Low Dysfunction"))

# mean per image
dat_rna <- as_tibble(dat_rna) %>%
  dplyr::select(-cellID) %>%
  group_by(Description, dysfunction_score) %>%
  summarise_if(is.numeric, mean, na.rm = TRUE)

# melt
dat_rna <- dat_rna %>%
  reshape2::melt(id.vars = c("Description", "dysfunction_score"), variable.name = "channel", value.name = "asinh")

# protein data
# subset to tumor cells (a doubled comma here would leave the column filter unused)
dat_prot <- data.frame(t(assay(sce_prot[tumor_marker_protein, sce_prot$celltype == "Tumor"], "asinh")))
dat_prot$cellID <- rownames(dat_prot)
dat_prot <- left_join(dat_prot, data.frame(colData(sce_prot))[,c("cellID", "dysfunction_score", "Description")])
Joining, by = "cellID"
# filter
dat_prot <- dat_prot %>%
  filter(dysfunction_score %in% c("High Dysfunction", "Low Dysfunction"))

# mean per image
dat_prot <- dat_prot %>%
  dplyr::select(-cellID) %>%
  group_by(Description, dysfunction_score) %>%
  summarise_if(is.numeric, mean, na.rm = TRUE)

# melt
dat_prot <- dat_prot %>%
  reshape2::melt(id.vars = c("Description", "dysfunction_score"), variable.name = "channel", value.name = "asinh")

# join both data sets
comb <- rbind(dat_prot, dat_rna)

stat.test <- comb %>%
  group_by(channel) %>%
  wilcox_test(data = ., asinh ~ dysfunction_score) %>%
  adjust_pvalue(method = "BH") %>%
  add_significance("p.adj",cutpoints = c(0, 1e-04, 0.001, 0.01, 0.1, 1)) %>%
  add_xy_position(dodge = 0.8)

# plot 
p <- ggplot(comb, aes(x=dysfunction_score, y=asinh)) + 
  geom_boxplot(alpha=0.2, lwd=1, aes(fill=dysfunction_score)) +
  geom_quasirandom(alpha=0.6, size=2, aes(col=dysfunction_score)) +
  scale_color_discrete(guide = "none") +
  theme_bw() +
  theme(text = element_text(size=18),
        axis.text.x = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank()) +
  facet_wrap(~channel, scales = "free", ncol = 4) + 
  stat_pvalue_manual(stat.test, label = "p.adj.signif", size = 7) + 
  xlab("") + 
  ylab("Mean Count per Image (asinh)") +
  scale_y_continuous(expand = expansion(mult = c(0.05, 0.2))) +
  guides(fill=guide_legend(title="Dysfunction Score", override.aes = c(lwd=0.5, alpha=1)))

leg <- get_legend(p)

grid.arrange(p + theme(legend.position = "none"))

grid.arrange(leg)


Supp Figure 11E

Define S100+ cells

y <- c(rep(1:10,16),rep(11,7))

# add the group information to the sce object
sce_rna$groups <- y[colData(sce_rna)$ImageNumber]

# now we use the function written by Nils
plotDist(sce_rna["S100", sce_rna$celltype == "Tumor"], plot_type = "ridges", 
         colour_by = "groups", split_by = "rows", 
         exprs_values = "asinh") +
  geom_vline(xintercept = 3)
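
The group vector `y` is used as a lookup table indexed by `ImageNumber`. A short sketch of what it contains is given below; it suggests that the lookup assumes image numbers between 1 and 167, since any larger index would return `NA`. The example indices are chosen only for illustration.

```r
# Same construction as above: ten repeating groups, then a final group 11.
y <- c(rep(1:10, 16), rep(11, 7))
length(y)                    # 16 * 10 + 7 entries

# Looking up the group for a few example image numbers.
example_images <- c(1, 10, 11, 161, 167)
y[example_images]

# An image number beyond the vector length has no group.
is.na(y[168])
```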

Plot

# manual gating 
sce_rna$S100 <- ifelse(assay(sce_rna["S100",], "asinh") > 3, "positive", "negative")

# fraction of S100 tumor cells per image
s100 <- data.frame(colData(sce_rna)) %>%
  filter(celltype == "Tumor") %>%
  group_by(ImageNumber, dysfunction_score, S100) %>%
  summarise(n=n()) %>%
  mutate(fraction = n/sum(n)) %>%
  reshape2::dcast(ImageNumber + dysfunction_score ~ S100, fill = 0, value.var = "fraction") %>%
  reshape2::melt(id.vars = c("ImageNumber", "dysfunction_score"), variable.name = "S100", value.name = "fraction") %>%
  filter(is.na(dysfunction_score) == F & S100 == "positive")
`summarise()` has grouped output by 'ImageNumber', 'dysfunction_score'. You can
override using the `.groups` argument.
s100$dysfunction_score <- factor(s100$dysfunction_score)
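
Manual gating followed by a per-image fraction can be sketched on a toy table. The data frame `toy` and its values are invented, and `tapply()` stands in for the dplyr and reshape2 steps above. Only the threshold of 3 is taken from the analysis.

```r
# Hypothetical asinh-transformed S100 values for tumor cells in two images.
toy <- data.frame(
  ImageNumber = c(1, 1, 1, 1, 2, 2),
  S100_asinh  = c(0.5, 3.2, 4.1, 2.9, 1.0, 2.0)
)

threshold <- 3  # same cut-off as the vertical line in the ridge plot
toy$S100 <- ifelse(toy$S100_asinh > threshold, "positive", "negative")

# Fraction of positive cells per image; an image without positive cells gets 0.
frac_pos <- tapply(toy$S100 == "positive", toy$ImageNumber, mean)
frac_pos
```

Taking the mean of a logical vector gives the positive fraction directly and keeps images with zero positive cells in the result, which is what the `fill = 0` argument achieves in the `dcast` call above.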

stat.test <- s100 %>%
  group_by(S100) %>%
  wilcox_test(data = ., fraction ~ dysfunction_score) %>%
  adjust_pvalue(method = "BH") %>%
  add_significance("p.adj",cutpoints = c(0, 1e-04, 0.001, 0.01, 0.1, 1)) %>%
  add_x_position(dodge = 0.8)

ggplot(s100, aes(x=dysfunction_score, y=fraction)) + 
  geom_boxplot(alpha=0.2, lwd=1.5, aes(fill = dysfunction_score)) + 
  geom_quasirandom(aes(col=dysfunction_score), size=3) +
  stat_pvalue_manual(stat.test, label = "p.adj.signif", size = 7, y.position = 1) + 
  xlab("") + 
  ylab("Fraction of S100+ Tumor Cells") +
  theme_bw() +
  theme(text = element_text(size=16),
        legend.position = "none") +
  ylim(0,1.05)


sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] rstatix_0.7.0               gridExtra_2.3              
 [3] ggridges_0.5.3              ggpubr_0.4.0               
 [5] circlize_0.4.13             ggbeeswarm_0.6.0           
 [7] ggrepel_0.9.1               rms_6.2-0                  
 [9] SparseM_1.81                Hmisc_4.6-0                
[11] Formula_1.2-4               survival_3.2-13            
[13] lattice_0.20-45             ComplexHeatmap_2.10.0      
[15] data.table_1.14.2           forcats_0.5.1              
[17] stringr_1.4.0               purrr_0.3.4                
[19] readr_2.1.2                 tidyr_1.2.0                
[21] tibble_3.1.6                ggplot2_3.3.5              
[23] tidyverse_1.3.1             reshape2_1.4.4             
[25] SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
[27] Biobase_2.54.0              GenomicRanges_1.46.1       
[29] GenomeInfoDb_1.30.1         IRanges_2.28.0             
[31] S4Vectors_0.32.3            BiocGenerics_0.40.0        
[33] MatrixGenerics_1.6.0        matrixStats_0.61.0         
[35] dplyr_1.0.7                 workflowr_1.7.0            

loaded via a namespace (and not attached):
  [1] readxl_1.3.1           backports_1.4.1        plyr_1.8.6            
  [4] splines_4.1.2          TH.data_1.1-0          digest_0.6.29         
  [7] foreach_1.5.2          htmltools_0.5.2        magick_2.7.3          
 [10] viridis_0.6.2          fansi_1.0.2            magrittr_2.0.2        
 [13] checkmate_2.0.0        cluster_2.1.2          doParallel_1.0.16     
 [16] tzdb_0.2.0             modelr_0.1.8           sandwich_3.0-1        
 [19] jpeg_0.1-9             colorspace_2.0-2       rvest_1.0.2           
 [22] haven_2.4.3            xfun_0.29              callr_3.7.0           
 [25] crayon_1.4.2           RCurl_1.98-1.5         jsonlite_1.7.3        
 [28] zoo_1.8-9              iterators_1.0.13       glue_1.6.1            
 [31] gtable_0.3.0           zlibbioc_1.40.0        XVector_0.34.0        
 [34] MatrixModels_0.5-0     GetoptLong_1.0.5       DelayedArray_0.20.0   
 [37] car_3.0-12             shape_1.4.6            abind_1.4-5           
 [40] scales_1.1.1           mvtnorm_1.1-3          DBI_1.1.2             
 [43] Rcpp_1.0.8             viridisLite_0.4.0      htmlTable_2.4.0       
 [46] clue_0.3-60            foreign_0.8-82         htmlwidgets_1.5.4     
 [49] httr_1.4.2             RColorBrewer_1.1-2     ellipsis_0.3.2        
 [52] farver_2.1.0           pkgconfig_2.0.3        nnet_7.3-17           
 [55] sass_0.4.0             dbplyr_2.1.1           utf8_1.2.2            
 [58] labeling_0.4.2         tidyselect_1.1.1       rlang_1.0.0           
 [61] later_1.3.0            munsell_0.5.0          cellranger_1.1.0      
 [64] tools_4.1.2            cli_3.1.1              generics_0.1.2        
 [67] broom_0.7.12           evaluate_0.14          fastmap_1.1.0         
 [70] yaml_2.2.2             processx_3.5.2         knitr_1.37            
 [73] fs_1.5.2               nlme_3.1-155           whisker_0.4           
 [76] quantreg_5.87          xml2_1.3.3             compiler_4.1.2        
 [79] rstudioapi_0.13        beeswarm_0.4.0         png_0.1-7             
 [82] ggsignif_0.6.3         reprex_2.0.1           bslib_0.3.1           
 [85] stringi_1.7.6          highr_0.9              ps_1.6.0              
 [88] Matrix_1.4-0           vctrs_0.3.8            pillar_1.7.0          
 [91] lifecycle_1.0.1        jquerylib_0.1.4        GlobalOptions_0.1.2   
 [94] bitops_1.0-7           httpuv_1.6.5           R6_2.5.1              
 [97] latticeExtra_0.6-29    promises_1.2.0.1       vipor_0.4.5           
[100] codetools_0.2-18       polspline_1.1.19       MASS_7.3-55           
[103] assertthat_0.2.1       rprojroot_2.0.2        rjson_0.2.21          
[106] withr_2.4.3            multcomp_1.4-18        GenomeInfoDbData_1.2.7
[109] mgcv_1.8-38            parallel_4.1.2         hms_1.1.1             
[112] rpart_4.1.16           rmarkdown_2.11         carData_3.0-5         
[115] git2r_0.29.0           getPass_0.2-2          lubridate_1.8.0       
[118] base64enc_0.1-3