Last updated: 2022-02-22

Checks: 7 0

Knit directory: MelanomaIMC/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20200728) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version d246c15. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rproj.user/
    Ignored:    Table_S4.csv
    Ignored:    code/.DS_Store
    Ignored:    code/._.DS_Store
    Ignored:    data/.DS_Store
    Ignored:    data/._.DS_Store
    Ignored:    data/data_for_analysis/
    Ignored:    data/full_data/

Unstaged changes:
    Modified:   .gitignore
    Modified:   analysis/Supp-Figure_10.rmd
    Modified:   analysis/_site.yml
    Deleted:    analysis/license.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/04_2_protein_classification_subclustering.rmd) and HTML (docs/04_2_protein_classification_subclustering.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd b24a994 toobiwankenobi 2022-02-08 update code to work under R 4.1.2
Rmd f9a3a83 toobiwankenobi 2022-02-08 clean repo for release
html 4109ff1 toobiwankenobi 2021-07-07 delete html files and adapt gitignore
Rmd 3203891 toobiwankenobi 2021-02-19 change celltype names
html 3203891 toobiwankenobi 2021-02-19 change celltype names
Rmd ee1595d toobiwankenobi 2021-02-12 clean repo and adapt files
html ee1595d toobiwankenobi 2021-02-12 clean repo and adapt files
html 3f5af3f toobiwankenobi 2021-02-09 add .html files
Rmd d8819f2 toobiwankenobi 2020-10-08 read new data (nuclei expansion) and adapt scripts
Rmd fb0f7cb SchulzDan 2020-08-24 more paths adapted
Rmd b68b4c3 SchulzDan 2020-08-17 added scores and coxph models. not finished

load packages

sapply(list.files("code/helper_functions/", full.names = TRUE), source)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
        code/helper_functions//calculateSummary.R
value   ?                                        
visible FALSE                                    
        code/helper_functions//censor_dat.R
value   ?                                  
visible FALSE                              
        code/helper_functions//detect_mRNA_expression.R
value   ?                                              
visible FALSE                                          
        code/helper_functions//DistanceToClusterCenter.R
value   ?                                               
visible FALSE                                           
        code/helper_functions//findMilieu.R code/helper_functions//findPatch.R
value   ?                                   ?                                 
visible FALSE                               FALSE                             
        code/helper_functions//getInfoFromString.R
value   ?                                         
visible FALSE                                     
        code/helper_functions//getSpotnumber.R
value   ?                                     
visible FALSE                                 
        code/helper_functions//plotCellCounts.R
value   ?                                      
visible FALSE                                  
        code/helper_functions//plotCellFractions.R
value   ?                                         
visible FALSE                                     
        code/helper_functions//plotDist.R code/helper_functions//read_Data.R
value   ?                                 ?                                 
visible FALSE                             FALSE                             
        code/helper_functions//scatter_function.R
value   ?                                        
visible FALSE                                    
        code/helper_functions//sceChecks.R
value   ?                                 
visible FALSE                             
        code/helper_functions//validityChecks.R
value   ?                                      
visible FALSE                                  
library(LSD)
library(SingleCellExperiment)
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: 'matrixStats'
The following object is masked from 'package:dplyr':

    count

Attaching package: 'MatrixGenerics'
The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'
The following objects are masked from 'package:dplyr':

    combine, intersect, setdiff, union
The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors

Attaching package: 'S4Vectors'
The following objects are masked from 'package:dplyr':

    first, rename
The following objects are masked from 'package:base':

    expand.grid, I, unname
Loading required package: IRanges

Attaching package: 'IRanges'
The following objects are masked from 'package:dplyr':

    collapse, desc, slice
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Attaching package: 'Biobase'
The following object is masked from 'package:MatrixGenerics':

    rowMedians
The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians
library(ggplot2)
library(scater)
Loading required package: scuttle
library(viridis)
Loading required package: viridisLite
library(igraph)

Attaching package: 'igraph'
The following object is masked from 'package:scater':

    normalize
The following object is masked from 'package:GenomicRanges':

    union
The following object is masked from 'package:IRanges':

    union
The following object is masked from 'package:S4Vectors':

    union
The following objects are masked from 'package:BiocGenerics':

    normalize, path, union
The following objects are masked from 'package:dplyr':

    as_data_frame, groups, union
The following objects are masked from 'package:stats':

    decompose, spectrum
The following object is masked from 'package:base':

    union
library(CATALYST)
library(workflowr)

load data

sce = readRDS(file = "data/data_for_analysis/sce_protein.rds")
for(i in 1:10){
scatter_x_y(sce,x = "CD8",y = "pERK",imagenumber =i, xlim = c(0,5), ylim = c(0,5))
  abline(h = 1.25)
  abline(v=1.5)
  abline(a = 0, b= 1)
}

mean expression of markers per celltype

# Aggregate the counts. calculateSummary is a function written from Nils
mean_sce <- calculateSummary(sce, split_by = "celltype", 
                             exprs_values = "counts")

# Exclude DNA and Histone
mean_sce <- mean_sce[!grepl("DNA|Histone", rownames(mean_sce)),]

# Transform and scale
assay(mean_sce, "arcsinh") <- asinh(assay(mean_sce, "meanCounts")) 
assay(mean_sce, "arcsinh_scaled") <- t(scale(t(asinh(assay(mean_sce, "meanCounts")))))

# Linear scale 
plotHeatmap(mean_sce, exprs_values = "arcsinh", 
            features = c(rownames(mean_sce)), 
            colour_columns_by = "celltype", 
            color = viridis(100), cluster_rows = FALSE, gaps_row = 10)

scaled mean expression of markers per celltype

# Linear scale 
plotHeatmap(mean_sce, exprs_values = "arcsinh_scaled", 
            features = c(rownames(mean_sce)), 
            colour_columns_by = "celltype", 
            color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
            zlim = c(-3,3), cluster_rows = TRUE, gaps_row = 10)

definition of markers used for subclustering the super-classes

# Select markers
marker_list <- list()

marker_list$Tumor <- c("Ki67", "PDL1", "PARP", "H3K27me3","CXCR2","HLADR","S100","Sox9","pERK","CD36",
                       "p75","MiTF","bCatenin","Collagen1","pS6","IDO1","SOX10")

marker_list$Neutrophil <- c("CD15", "PDL1", "MPO", "CD11b")


marker_list$`FOXP3+ T cell` <- c("IDO1", "PD1", "FOXP3", "ICOS", "CD4", "Ki67", "GrzB",
                        "CD45RA", "CD45RO")

marker_list$`CD8+ T cell` <- c("IDO1", "TOX1", "PD1", "TCF7", "ICOS", "CD8","Ki67", "GrzB",
                        "CD45RA", "CD45RO")

marker_list$`CD4+ T cell` <- c("IDO1", "TOX1", "PD1", "TCF7", 
                        "FOXP3", "ICOS", "CD4", "Ki67", "GrzB",
                        "CD45RA", "CD45RO")

marker_list$`BnT cell` <- c("IDO1", "TOX1", "PD1", "TCF7", 
                        "FOXP3", "ICOS", "CD8", "CD4", "Ki67", "GrzB",
                        "CD45RA", "CD45RO", "CD20","HLADR","H3K27me3","pS6","pERK", "CD7")

marker_list$pDC <- c("CD303","IDO1","GrzB","CXCR2","CD11b")

marker_list$`B cell` <- c("Ki67","CD45RA", "CD45RO", "CD20","HLADR","H3K27me3","pS6","pERK")

marker_list$Macrophage <- c("Caveolin1", "PDL1", "PARP", "H3K27me3","CXCR2","HLADR","CD45RO","CD45RA","CD68","pERK","CD36", "Collagen1","CD11c","pS6","IDO1","CD206")

marker_list$Stroma <- c("Caveolin1", "CD68","CD36", "Collagen1","SMA")

marker_list$unknown <- rownames(sce[rowData(sce)$good_marker,])

cluster the whole dataset with FlowSOM

FlowSOM first because it is faster

## the FlowSOM function from CATALYST needs an another column in the rowData of the sce to work properly:
rowData(sce)$marker_class <- "state"

# vector for clustering
fs_clustering <- vector(length = ncol(sce))

# create the "exprs" slot in the assay data (needed for CATALYST)
assay(sce, "exprs") <- assay(sce,"asinh")

# Macrophage, B cells, CD4+ T cell, CD8+ T cell, Tother and BnT cells will be clustered for a total of 6 clustes each
set.seed(12345)
for(i in c("Macrophage","CD4+ T cell","CD8+ T cell")){
  cur_sce <- sce[,sce$celltype == i]
  
  cur_sce <- CATALYST::cluster(cur_sce,features = marker_list[i][[1]], ydim = 2,xdim = 3,maxK = 4)
  
  fs_clustering[sce$celltype == i] <- cur_sce$cluster_id
}
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
# pDCs and Neutrophils will be clustered to 4 clusteres
for(i in c("pDC","Neutrophil","BnT cell","B cell","FOXP3+ T cell","Stroma", "unknown")){
  cur_sce <- sce[,sce$celltype == i]
  
  cur_sce <- CATALYST::cluster(cur_sce,features = marker_list[i][[1]],ydim = 2,xdim = 2,maxK = 3)
  
  fs_clustering[sce$celltype == i] <- cur_sce$cluster_id
}
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
# Tumor will be clustered into 9 clusters
cur_sce <- sce[,sce$celltype == "Tumor"]
  
cur_sce <- CATALYST::cluster(cur_sce,features = marker_list["Tumor"][[1]],ydim = 3,xdim = 3,maxK = 7)
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
fs_clustering[sce$celltype == "Tumor"] <- cur_sce$cluster_id

# Save in SCE object
colData(sce)$celltype_clustered <- as.factor(fs_clustering)

sce$celltype_clustered <- paste0(sce$celltype, "_", sce$celltype_clustered)

visulalize clustering results for FlowSOM

# Aggregate the counts
mean_sce <- calculateSummary(sce, split_by = c("celltype","celltype_clustered"), 
                             exprs_values = "counts")

# Exclude DNA and Histone
mean_sce <- mean_sce[!grepl("DNA|Histone", rownames(mean_sce)),]

# Transform and scale
assay(mean_sce, "arcsinh") <- asinh(assay(mean_sce, "meanCounts")) 
assay(mean_sce, "arcsinh_scaled") <- t(scale(t(asinh(assay(mean_sce, "meanCounts")))))

# Linear scale 
plotHeatmap(mean_sce, exprs_values = "arcsinh",
            features = rownames(sce)[rowData(sce)$good_marker],
            colour_columns_by = c("celltype"), 
            color = viridis(100), 
            labels_col = mean_sce$celltype_clustered,
            show_colnames = TRUE, annotation_legend = FALSE, borders_color = NA)

# Scaled
plotHeatmap(mean_sce, exprs_values = "arcsinh_scaled", 
            features = rownames(sce)[rowData(sce)$good_marker],
            colour_columns_by = c("celltype"), 
            labels_col = mean_sce$celltype_clustered,
            show_colnames = TRUE, annotation_legend = TRUE, borders_color = NA,
            color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
            zlim = c(-4, 4),legend = TRUE)

numbers of cells per cluster

table(sce$celltype_clustered)

       B cell_1        B cell_2        B cell_3        B cell_4      BnT cell_1 
           4315           11731           12914           21486            7024 
     BnT cell_2      BnT cell_3      BnT cell_4   CD4+ T cell_1   CD4+ T cell_2 
           7961            6527            9082           11334           11205 
  CD4+ T cell_3   CD4+ T cell_4   CD4+ T cell_5   CD4+ T cell_6   CD8+ T cell_1 
           9717           11569           13073            2997            3671 
  CD8+ T cell_2   CD8+ T cell_3   CD8+ T cell_4   CD8+ T cell_5   CD8+ T cell_6 
          11058           11808           12549            9536            4671 
FOXP3+ T cell_1 FOXP3+ T cell_2 FOXP3+ T cell_3 FOXP3+ T cell_4    Macrophage_1 
           2976            2375            2827            1293           12010 
   Macrophage_2    Macrophage_3    Macrophage_4    Macrophage_5    Macrophage_6 
           8126           10022            8439           14226            9699 
   Neutrophil_1    Neutrophil_2    Neutrophil_3    Neutrophil_4           pDC_1 
           3339            2473            1770            3612            2508 
          pDC_2           pDC_3           pDC_4        Stroma_1        Stroma_2 
           1810             891            1882           14333           17709 
       Stroma_3        Stroma_4         Tumor_1         Tumor_2         Tumor_3 
          10161           18751           84616           76055           82520 
        Tumor_4         Tumor_5         Tumor_6         Tumor_7         Tumor_8 
          51185           72087           86565           43957           51597 
        Tumor_9       unknown_1       unknown_2       unknown_3       unknown_4 
          75909            2795           10315            1678            4665 

assign names to clusters

annotations <- sce$celltype_clustered

# annotations[annotations == "other_4"] <- "Other_CD206Low_pS6Low"
# annotations[annotations == "other_2"] <- "Tumor_SOX9Low_SOX10+_MITF+_S100+_bCatenin++"
# annotations[annotations == "other_3"] <- "Other_SOX10Low_MITFLow_bCatenin+"
# annotations[annotations == "other_1"] <- "Other_CaveolinLow"
# annotations[annotations == "other_5"] <- "Vasculature_SMA++_Collagen+_Caveolin+"
# annotations[annotations == "other_6"] <- "Stroma_SMA+_Collagen++_Caveolin++_CD36++"
# 
# annotations[annotations == "Tumor_2"] <- "Tumor_SOX9++_p75+"
# annotations[annotations == "Tumor_3"] <- "Tumor_SOX10+_MITFLow_CD36Low"
# annotations[annotations == "Tumor_4"] <- "Tumor_SOX10Low_S100+"
# annotations[annotations == "Tumor_5"] <- "Tumor_SOX9++_SOX10+_MITF+_S100Low"
# annotations[annotations == "Tumor_8"] <- "Tumor_SOX9Low_SOX10Low_MITFLow_p75+"
# annotations[annotations == "Tumor_9"] <- "Tumor_SOX10+_MITF+_S100++"
# annotations[annotations == "Tumor_6"] <- "Tumor_S100+"
# annotations[annotations == "Tumor_1"] <- "Tumor_SOX10Low_MITFLow_HLADR+"
# annotations[annotations == "Tumor_7"] <- "Tumor_SOX10+_MITF++_S100+_pS6+"
# 
# annotations[annotations == "B cells_1"] <- "B cells_pErkLow_H3K27meLow"
# annotations[annotations == "B cells_4"] <- "B cells_pErkLow_H3K27meLow"
# annotations[annotations == "B cells_2"] <- "B cells_H3K27me_LowpS6Low"
# annotations[annotations == "B cells_3"] <- "B cells_CD19++_pErkLow_H3K27meLow"
# annotations[annotations == "B cells_5"] <- "B cells_Ki67++"
# annotations[annotations == "B cells_6"] <- "B cells_SMALow"
# 
# annotations[annotations == "BnT_1"] <- "B cells_CD3Low_CD4Low"
# annotations[annotations == "BnT_2"] <- "B cells_pErkLow_H3K27meLow"
# annotations[annotations == "BnT_6"] <- "BnT_CD8+"
# annotations[annotations == "BnT_4"] <- "BnT_CD8+"
# annotations[annotations == "BnT_5"] <- "BnT_CD4+_TCF7+"
# annotations[annotations == "BnT_3"] <- "BnT_CD4+_ICOS++_PD1++_Ki67+"
# 
# annotations[annotations == "T_other_4"] <- "Tcells_undefined"
# annotations[annotations == "T_other_5"] <- "Tcells_undefined"
# annotations[annotations == "T_other_3"] <- "Tcells_CD15_LowGrzB+"
# annotations[annotations == "T_other_1"] <- "Tcells_Ki67++"
# annotations[annotations == "T_other_2"] <- "Tcells_CaveolinLow_SMALow"
# annotations[annotations == "T_other_6"] <- "Tcells_undefined"
# 
# annotations[annotations == "T_helper_2"] <- "CD4+ T cell_CD11cLow_HLADRLow"
# annotations[annotations == "T_helper_5"] <- "CD4+ T cell_SMALow_CaveolinLow"
# annotations[annotations == "T_helper_1"] <- "CD4+ T cell_TCF7++"
# annotations[annotations == "T_helper_6"] <- "CD4+ T cell_FoxP3++_ICOS+_CD11cLow_CD206Low"
# annotations[annotations == "T_helper_3"] <- "CD4+ T cell_FoxP3Low_pERKLow"
# annotations[annotations == "T_helper_4"] <- "CD4+ T cell_FoxP3Low"
# 
# annotations[annotations == "T_cytotoxic_3"] <- "CD8+ T cell_PD1+_TOX1+_GrzBLow"
# annotations[annotations == "T_cytotoxic_5"] <- "CD8+ T cell_TCF7+_H3K27meLow"
# annotations[annotations == "T_cytotoxic_6"] <- "CD8+ T cell_PD1+_TOX1+_GrzB+_ICOSLow_Ki67+"
# annotations[annotations == "T_cytotoxic_1"] <- "CD8+ T cell_GrzBLow_SMALow_CaveolinLow"
# annotations[annotations == "T_cytotoxic_2"] <- "CD8+ T cell_GrzBLow_pS6Low"
# annotations[annotations == "T_cytotoxic_4"] <- "CD8+ T cell_PD1+_TOX1+_GrzBLow_ICOSLow_CD11cLow_CD206Low"
# 
# annotations[annotations == "Macrophage_4"] <- "Macrophage_CD68Low_CaveolinLow"
# annotations[annotations == "Macrophage_6"] <- "Tumor_CD68Low_S100Low_bCateninLow_MITFLow_SOX10Low_HLADR+"
# annotations[annotations == "Macrophage_2"] <- "Macrophage_CD68Low_CD206++_CD36Low"
# annotations[annotations == "Macrophage_1"] <- "Macrophage_CD68Low_pS6Low_S100Low"
# annotations[annotations == "Macrophage_3"] <- "Macrophage_CD68+_CD11c+_CD206++_CXCR2+_PDL1+"
# annotations[annotations == "Macrophage_5"] <- "Macrophage_CD68++_CD11c+_CD36+_Caveolin+_Collagen+"
# 
# annotations[annotations == "pDC_1"] <- "pDC_IDO1+_GrzB+"
# annotations[annotations == "pDC_2"] <- "pDC_IDO1+_TOX1+_CD11b+"
# annotations[annotations == "pDC_3"] <- "pDC_IDO1++_GrzB++"
# annotations[annotations == "pDC_4"] <- "pDC_Ido1+_PARP++"
# 
# annotations[annotations == "Neutrophil_2"] <- "Neutrophil_CD11b+_CD15++_MPO+_PDL1++"
# annotations[annotations == "Neutrophil_4"] <- "Neutrophil_CD11b+_CD15+_MPO+_PDL1Low"
# annotations[annotations == "Neutrophil_3"] <- "Neutrophil_CD11b++_CD15++_MPO++_CXCR2Low_PDL1+"
# annotations[annotations == "Neutrophil_1"] <- "Neutrophil_CD11b+_CD15+_MPO++"

sce$cellAnnotation <- annotations

mean_sce <- calculateSummary(sce, split_by = c("celltype","cellAnnotation"), 
                             exprs_values = "counts")

# Exclude DNA and Histone
mean_sce <- mean_sce[!grepl("DNA|Histone", rownames(mean_sce)),]

# Transform and scale
assay(mean_sce, "arcsinh") <- asinh(assay(mean_sce, "meanCounts")) 
assay(mean_sce, "arcsinh_scaled") <- t(scale(t(asinh(assay(mean_sce, "meanCounts")))))

plotHeatmap(mean_sce, exprs_values = "arcsinh_scaled", 
            features = rownames(sce)[rowData(sce)$good_marker],
            colour_columns_by = c("celltype"), 
            labels_col = mean_sce$cellAnnotation,
            show_colnames = TRUE, annotation_legend = TRUE, borders_color = NA,
            color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
            zlim = c(-4, 4),legend = TRUE)

clustering does not really pick up PDL1 positive Tumor cells (which definitely exist). Potentially we have to investigate this with a manual cut off. arcsinh > 1.75 seems to be reasonable

Additionally, the clustering did not pick up proliferating and non prolifertating tumor clusters. this should also be investigated by manual cut offs. arcsinh > 1.25 seems reasonable.

celltype_counts <- sce$celltype

table(celltype_counts)

save the single cell object with the clusters assigned to the cells

# delete the "exprs" slot from the single cell experiment again.
assay(sce,"exprs") <- NULL

saveRDS(sce,file = "data/data_for_analysis/sce_protein.rds")

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] CATALYST_1.18.1             igraph_1.2.11              
 [3] viridis_0.6.2               viridisLite_0.4.0          
 [5] scater_1.22.0               scuttle_1.4.0              
 [7] ggplot2_3.3.5               SingleCellExperiment_1.16.0
 [9] SummarizedExperiment_1.24.0 Biobase_2.54.0             
[11] GenomicRanges_1.46.1        GenomeInfoDb_1.30.1        
[13] IRanges_2.28.0              S4Vectors_0.32.3           
[15] BiocGenerics_0.40.0         MatrixGenerics_1.6.0       
[17] matrixStats_0.61.0          LSD_4.1-0                  
[19] dplyr_1.0.7                 workflowr_1.7.0            

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                  tidyselect_1.1.1           
  [3] grid_4.1.2                  BiocParallel_1.28.3        
  [5] Rtsne_0.15                  aws.signature_0.6.0        
  [7] flowCore_2.6.0              munsell_0.5.0              
  [9] ScaledMatrix_1.2.0          codetools_0.2-18           
 [11] withr_2.4.3                 colorspace_2.0-2           
 [13] highr_0.9                   knitr_1.37                 
 [15] rstudioapi_0.13             ggsignif_0.6.3             
 [17] git2r_0.29.0                GenomeInfoDbData_1.2.7     
 [19] polyclip_1.10-0             farver_2.1.0               
 [21] pheatmap_1.0.12             flowWorkspace_4.6.0        
 [23] rprojroot_2.0.2             vctrs_0.3.8                
 [25] generics_0.1.2              TH.data_1.1-0              
 [27] xfun_0.29                   R6_2.5.1                   
 [29] doParallel_1.0.16           ggbeeswarm_0.6.0           
 [31] clue_0.3-60                 rsvd_1.0.5                 
 [33] bitops_1.0-7                DelayedArray_0.20.0        
 [35] assertthat_0.2.1            promises_1.2.0.1           
 [37] scales_1.1.1                multcomp_1.4-18            
 [39] beeswarm_0.4.0              gtable_0.3.0               
 [41] beachmat_2.10.0             processx_3.5.2             
 [43] RProtoBufLib_2.6.0          sandwich_3.0-1             
 [45] rlang_1.0.0                 GlobalOptions_0.1.2        
 [47] splines_4.1.2               rstatix_0.7.0              
 [49] hexbin_1.28.2               broom_0.7.12               
 [51] reshape2_1.4.4              yaml_2.2.2                 
 [53] abind_1.4-5                 backports_1.4.1            
 [55] httpuv_1.6.5                RBGL_1.70.0                
 [57] tools_4.1.2                 ellipsis_0.3.2             
 [59] jquerylib_0.1.4             RColorBrewer_1.1-2         
 [61] ggridges_0.5.3              Rcpp_1.0.8                 
 [63] plyr_1.8.6                  base64enc_0.1-3            
 [65] sparseMatrixStats_1.6.0     zlibbioc_1.40.0            
 [67] purrr_0.3.4                 RCurl_1.98-1.5             
 [69] ps_1.6.0                    FlowSOM_2.2.0              
 [71] ggpubr_0.4.0                GetoptLong_1.0.5           
 [73] cowplot_1.1.1               zoo_1.8-9                  
 [75] ggrepel_0.9.1               cluster_2.1.2              
 [77] colorRamps_2.3              fs_1.5.2                   
 [79] magrittr_2.0.2              ncdfFlow_2.40.0            
 [81] data.table_1.14.2           scattermore_0.7            
 [83] circlize_0.4.13             mvtnorm_1.1-3              
 [85] whisker_0.4                 ggnewscale_0.4.5           
 [87] evaluate_0.14               XML_3.99-0.8               
 [89] jpeg_0.1-9                  gridExtra_2.3              
 [91] shape_1.4.6                 ggcyto_1.22.0              
 [93] compiler_4.1.2              tibble_3.1.6               
 [95] crayon_1.4.2                ggpointdensity_0.1.0       
 [97] htmltools_0.5.2             later_1.3.0                
 [99] tidyr_1.2.0                 RcppParallel_5.1.5         
[101] aws.s3_0.3.21               DBI_1.1.2                  
[103] tweenr_1.0.2                ComplexHeatmap_2.10.0      
[105] MASS_7.3-55                 Matrix_1.4-0               
[107] car_3.0-12                  cli_3.1.1                  
[109] parallel_4.1.2              pkgconfig_2.0.3            
[111] getPass_0.2-2               xml2_1.3.3                 
[113] foreach_1.5.2               vipor_0.4.5                
[115] bslib_0.3.1                 XVector_0.34.0             
[117] drc_3.0-1                   stringr_1.4.0              
[119] callr_3.7.0                 digest_0.6.29              
[121] ConsensusClusterPlus_1.58.0 graph_1.72.0               
[123] rmarkdown_2.11              DelayedMatrixStats_1.16.0  
[125] curl_4.3.2                  gtools_3.9.2               
[127] rjson_0.2.21                lifecycle_1.0.1            
[129] jsonlite_1.7.3              carData_3.0-5              
[131] BiocNeighbors_1.12.0        fansi_1.0.2                
[133] pillar_1.7.0                lattice_0.20-45            
[135] plotrix_3.8-2               fastmap_1.1.0              
[137] httr_1.4.2                  survival_3.2-13            
[139] glue_1.6.1                  png_0.1-7                  
[141] iterators_1.0.13            Rgraphviz_2.38.0           
[143] nnls_1.4                    ggforce_0.3.3              
[145] stringi_1.7.6               sass_0.4.0                 
[147] BiocSingular_1.10.0         CytoML_2.6.0               
[149] latticeExtra_0.6-29         cytolib_2.6.1              
[151] irlba_2.3.5