Last updated: 2021-02-12

Checks: 7 0

Knit directory: melanoma_publication_old_data/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20200728) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 2e443a5. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    ._.DS_Store
    Ignored:    analysis/._clinical metadata preparation.Rmd
    Ignored:    code/.DS_Store
    Ignored:    code/._.DS_Store
    Ignored:    data/.DS_Store
    Ignored:    data/._.DS_Store
    Ignored:    data/data_for_analysis/
    Ignored:    data/full_data/
    Ignored:    output/.DS_Store
    Ignored:    output/._.DS_Store
    Ignored:    output/._protein_neutrophil.png
    Ignored:    output/._rna_neutrophil.png
    Ignored:    output/PSOCKclusterOut/
    Ignored:    output/bcell_grouping.png
    Ignored:    output/dysfunction_correlation.pdf

Untracked files:
    Untracked:  analysis/00_prepare_clinical_dat.rmd
    Untracked:  code/helper_functions/findMilieu.R
    Untracked:  code/helper_functions/findPatch.R

Unstaged changes:
    Modified:   .gitignore
    Modified:   analysis/01_Protein_read_data.rmd
    Modified:   analysis/01_RNA_read_data.rmd
    Modified:   analysis/02_Protein_annotations.rmd
    Modified:   analysis/02_RNA_annotations.rmd
    Modified:   analysis/03_Protein_quality_control.rmd
    Modified:   analysis/03_RNA_quality_control.rmd
    Modified:   analysis/04_1_Protein_celltype_classification.rmd
    Modified:   analysis/04_1_RNA_celltype_classification.rmd
    Modified:   analysis/04_2_RNA_classification_subclustering.rmd
    Modified:   analysis/04_2_protein_classification_subclustering.rmd
    Modified:   analysis/05_RNA_chemokine_expressing_cells.rmd
    Modified:   analysis/06_RNA_chemokine_patch_detection.rmd
    Modified:   analysis/07_TCF7_PD1_gating.rmd
    Modified:   analysis/08_color_vectors.rmd
    Modified:   analysis/09_Tcell_Score.Rmd
    Modified:   analysis/10_Dysfunction_Score.rmd
    Modified:   analysis/11_Bcell_Score.Rmd
    Modified:   analysis/Figure_1.rmd
    Modified:   analysis/Figure_2.rmd
    Modified:   analysis/Figure_3.rmd
    Modified:   analysis/Figure_4.rmd
    Modified:   analysis/Figure_5.rmd
    Modified:   analysis/Summary_Statistics.rmd
    Modified:   analysis/Supp-Figure_1.rmd
    Modified:   analysis/Supp-Figure_2.rmd
    Modified:   analysis/Supp-Figure_3.rmd
    Modified:   analysis/Supp-Figure_4.rmd
    Modified:   analysis/Supp-Figure_5.rmd
    Modified:   analysis/XX_hazard_ratio.rmd
    Deleted:    code/findPackages.R
    Deleted:    code/helper_functions/findClusters.R
    Deleted:    code/helper_functions/findCommunity.R
    Deleted:    code/helper_functions/getCellCount.R
    Deleted:    code/helper_functions/plotBarFracCluster.R
    Deleted:    code/helper_functions/plotCellFrac.R
    Deleted:    code/helper_functions/plotCellFracGroups.R
    Deleted:    code/helper_functions/plotCellFracGroupsSubset.R
    Modified:   code/helper_functions/scatter_function.R
    Modified:   code/helper_functions/validityChecks.R
    Deleted:    data/mask_comparison/20190809_ZTMA256.1_slide2_TH_s1_p1_r15_a15_ac_full.tiff
    Deleted:    data/mask_comparison/20190809_ZTMA256.1_slide2_TH_s1_p1_r15_a15_ac_ilastik_s2_Probabilities_equalized_cellmask.tiff
    Deleted:    data/mask_comparison/20191023_ZTMA256.1_slide3_TH_s0_p10_r4_a4_ac_full.tiff
    Deleted:    data/mask_comparison/20191023_ZTMA256.1_slide3_TH_s0_p10_r4_a4_ac_ilastik_s2_Probabilities_equalized_cellmask.tiff

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


There are no past versions. Publish this analysis with wflow_publish() to start tracking its development.


load packages

sapply(list.files("code/helper_functions/", full.names = TRUE), source)
        code/helper_functions//calculateSummary.R
value   ?                                        
visible FALSE                                    
        code/helper_functions//censor_dat.R
value   ?                                  
visible FALSE                              
        code/helper_functions//detect_mRNA_expression.R
value   ?                                              
visible FALSE                                          
        code/helper_functions//DistanceToClusterCenter.R
value   ?                                               
visible FALSE                                           
        code/helper_functions//findMilieu.R code/helper_functions//findPatch.R
value   ?                                   ?                                 
visible FALSE                               FALSE                             
        code/helper_functions//getInfoFromString.R
value   ?                                         
visible FALSE                                     
        code/helper_functions//getSpotnumber.R
value   ?                                     
visible FALSE                                 
        code/helper_functions//plotCellCounts.R
value   ?                                      
visible FALSE                                  
        code/helper_functions//plotCellFractions.R
value   ?                                         
visible FALSE                                     
        code/helper_functions//plotDist.R
value   ?                                
visible FALSE                            
        code/helper_functions//scatter_function.R
value   ?                                        
visible FALSE                                    
        code/helper_functions//sceChecks.R
value   ?                                 
visible FALSE                             
        code/helper_functions//validityChecks.R
value   ?                                      
visible FALSE                                  
library(LSD)
library(SingleCellExperiment)
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: 'MatrixGenerics'
The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'
The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs
The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min
Loading required package: S4Vectors

Attaching package: 'S4Vectors'
The following object is masked from 'package:base':

    expand.grid
Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Attaching package: 'Biobase'
The following object is masked from 'package:MatrixGenerics':

    rowMedians
The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians
library(ggplot2)
library(scater)
library(viridis)
Loading required package: viridisLite
library(Rphenograph)
library(igraph)

Attaching package: 'igraph'
The following object is masked from 'package:scater':

    normalize
The following object is masked from 'package:GenomicRanges':

    union
The following object is masked from 'package:IRanges':

    union
The following object is masked from 'package:S4Vectors':

    union
The following objects are masked from 'package:BiocGenerics':

    normalize, path, union
The following objects are masked from 'package:stats':

    decompose, spectrum
The following object is masked from 'package:base':

    union
library(CATALYST)
library(workflowr)

load data

sce = readRDS(file = "data/data_for_analysis/sce_protein.rds")
for(i in 1:10){
scatter_x_y(sce,x = "CD8",y = "pERK",imagenumber =i, xlim = c(0,5), ylim = c(0,5))
  abline(h = 1.25)
  abline(v=1.5)
  abline(a = 0, b= 1)
}

mean expression of markers per celltype

# Aggregate the counts. calculateSummary is a function written from Nils
mean_sce <- calculateSummary(sce, split_by = "celltype", 
                             exprs_values = "counts")

# Exclude DNA and Histone
mean_sce <- mean_sce[!grepl("DNA|Histone", rownames(mean_sce)),]

# Transform and scale
assay(mean_sce, "arcsinh") <- asinh(assay(mean_sce, "meanCounts")) 
assay(mean_sce, "arcsinh_scaled") <- t(scale(t(asinh(assay(mean_sce, "meanCounts")))))

# Linear scale 
plotHeatmap(mean_sce, exprs_values = "arcsinh", 
            features = c(rownames(mean_sce)), 
            colour_columns_by = "celltype", 
            color = viridis(100), cluster_rows = FALSE, gaps_row = 10)

scaled mean expression of markers per celltype

# Linear scale 
plotHeatmap(mean_sce, exprs_values = "arcsinh_scaled", 
            features = c(rownames(mean_sce)), 
            colour_columns_by = "celltype", 
            color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
            zlim = c(-3,3), cluster_rows = TRUE, gaps_row = 10)

definition of markers used for subclustering the super-classes

# Select markers
marker_list <- list()

marker_list$Tumor <- c("Ki67", "PDL1", "PARP", "H3K27me3","CXCR2","HLADR","S100","Sox9","pERK","CD36",
                       "p75","MiTF","bCatenin","Collagen1","pS6","IDO1","SOX10")

marker_list$Neutrophil <- c("CD15", "PDL1", "MPO", "CD11b")


marker_list$Tregulatory <- c("IDO1", "PD1", "FOXP3", "ICOS", "CD4", "Ki67", "GrzB",
                        "CD45RA", "CD45RO")

marker_list$Tcytotoxic <- c("IDO1", "TOX1", "PD1", "TCF7", "ICOS", "CD8","Ki67", "GrzB",
                        "CD45RA", "CD45RO")

marker_list$Thelper <- c("IDO1", "TOX1", "PD1", "TCF7", 
                        "FOXP3", "ICOS", "CD4", "Ki67", "GrzB",
                        "CD45RA", "CD45RO")

marker_list$BnTcell <- c("IDO1", "TOX1", "PD1", "TCF7", 
                        "FOXP3", "ICOS", "CD8", "CD4", "Ki67", "GrzB",
                        "CD45RA", "CD45RO", "CD20","HLADR","H3K27me3","pS6","pERK", "CD7")

marker_list$pDC <- c("CD303","IDO1","GrzB","CXCR2","CD11b")

marker_list$Bcell <- c("Ki67","CD45RA", "CD45RO", "CD20","HLADR","H3K27me3","pS6","pERK")

marker_list$Macrophage <- c("Caveolin1", "PDL1", "PARP", "H3K27me3","CXCR2","HLADR","CD45RO","CD45RA","CD68","pERK","CD36", "Collagen1","CD11c","pS6","IDO1","CD206")

marker_list$Stroma <- c("Caveolin1", "CD68","CD36", "Collagen1","SMA")

marker_list$unknown <- rownames(sce[rowData(sce)$good_marker,])

cluster the whole dataset with FlowSOM

FlowSOM first because it is faster

## the FlowSOM function from CATALYST needs an another column in the rowData of the sce to work properly:
rowData(sce)$marker_class <- "state"

# vector for clustering
fs_clustering <- vector(length = ncol(sce))

# create the "exprs" slot in the assay data (needed for CATALYST)
assay(sce, "exprs") <- assay(sce,"asinh")

# Macrophage, Bcells, Thelper, Tcytotoxic, Tother and BnT cells will be clustered for a total of 6 clustes each
set.seed(12345)
for(i in c("Macrophage","Thelper","Tcytotoxic")){
  cur_sce <- sce[,sce$celltype == i]
  
  cur_sce <- cluster(cur_sce,features = marker_list[i][[1]],ydim = 2,xdim = 3,maxK = 4)
  
  fs_clustering[sce$celltype == i] <- cur_sce$cluster_id
}
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
# pDCs and Neutrophils will be clustered to 4 clusteres
for(i in c("pDC","Neutrophil","BnTcell","Bcell","Tregulatory","Stroma", "unknown")){
  cur_sce <- sce[,sce$celltype == i]
  
  cur_sce <- cluster(cur_sce,features = marker_list[i][[1]],ydim = 2,xdim = 2,maxK = 3)
  
  fs_clustering[sce$celltype == i] <- cur_sce$cluster_id
}
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
# Tumor will be clustered into 9 clusters
cur_sce <- sce[,sce$celltype == "Tumor"]
  
cur_sce <- cluster(cur_sce,features = marker_list["Tumor"][[1]],ydim = 3,xdim = 3,maxK = 7)
o running FlowSOM clustering...
o running ConsensusClusterPlus metaclustering...
fs_clustering[sce$celltype == "Tumor"] <- cur_sce$cluster_id

# Save in SCE object
colData(sce)$celltype_clustered <- as.factor(fs_clustering)

sce$celltype_clustered <- paste0(sce$celltype, "_", sce$celltype_clustered)

visulalize clustering results for FlowSOM

# Aggregate the counts
mean_sce <- calculateSummary(sce, split_by = c("celltype","celltype_clustered"), 
                             exprs_values = "counts")

# Exclude DNA and Histone
mean_sce <- mean_sce[!grepl("DNA|Histone", rownames(mean_sce)),]

# Transform and scale
assay(mean_sce, "arcsinh") <- asinh(assay(mean_sce, "meanCounts")) 
assay(mean_sce, "arcsinh_scaled") <- t(scale(t(asinh(assay(mean_sce, "meanCounts")))))

# Linear scale 
plotHeatmap(mean_sce, exprs_values = "arcsinh", 
            colour_columns_by = c("celltype"), 
            color = viridis(100), labels_col = mean_sce$celltype_clustered,
            show_colnames = TRUE, annotation_legend = FALSE, borders_color = NA)

# Scaled
plotHeatmap(mean_sce, exprs_values = "arcsinh_scaled", 
            colour_columns_by = c("celltype"), 
            labels_col = mean_sce$celltype_clustered,
            show_colnames = TRUE, annotation_legend = TRUE, borders_color = NA,
            color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
            zlim = c(-4, 4),legend = TRUE)

numbers of cells per cluster

table(sce$celltype_clustered)

      Bcell_1       Bcell_2       Bcell_3       Bcell_4     BnTcell_1 
        22407          4277         13208         10504          6449 
    BnTcell_2     BnTcell_3     BnTcell_4  Macrophage_1  Macrophage_2 
         6905          8722          9202         12808         14507 
 Macrophage_3  Macrophage_4  Macrophage_5  Macrophage_6  Neutrophil_1 
        11276          7769          7634          8465          3483 
 Neutrophil_2  Neutrophil_3  Neutrophil_4         pDC_1         pDC_2 
         2949          2105          2479          2667          1837 
        pDC_3         pDC_4      Stroma_1      Stroma_2      Stroma_3 
         1816           943          9888         14920         17433 
     Stroma_4  Tcytotoxic_1  Tcytotoxic_2  Tcytotoxic_3  Tcytotoxic_4 
        18504          7484          9623          6782          8039 
 Tcytotoxic_5  Tcytotoxic_6     Thelper_1     Thelper_2     Thelper_3 
        10739         10535         11094          9694         11068 
    Thelper_4     Thelper_5     Thelper_6 Tregulatory_1 Tregulatory_2 
         3019         12192         12970          3081          2345 
Tregulatory_3 Tregulatory_4       Tumor_1       Tumor_2       Tumor_3 
         2705          1275         76990         78446         90295 
      Tumor_4       Tumor_5       Tumor_6       Tumor_7       Tumor_8 
        52280         72756         79784         45770         53527 
      Tumor_9     unknown_1     unknown_2     unknown_3     unknown_4 
        74583          2672          1570         10194          4735 

assign names to clusters

annotations <- sce$celltype_clustered

# annotations[annotations == "other_4"] <- "Other_CD206Low_pS6Low"
# annotations[annotations == "other_2"] <- "Tumor_SOX9Low_SOX10+_MITF+_S100+_bCatenin++"
# annotations[annotations == "other_3"] <- "Other_SOX10Low_MITFLow_bCatenin+"
# annotations[annotations == "other_1"] <- "Other_CaveolinLow"
# annotations[annotations == "other_5"] <- "Vasculature_SMA++_Collagen+_Caveolin+"
# annotations[annotations == "other_6"] <- "Stroma_SMA+_Collagen++_Caveolin++_CD36++"
# 
# annotations[annotations == "Tumor_2"] <- "Tumor_SOX9++_p75+"
# annotations[annotations == "Tumor_3"] <- "Tumor_SOX10+_MITFLow_CD36Low"
# annotations[annotations == "Tumor_4"] <- "Tumor_SOX10Low_S100+"
# annotations[annotations == "Tumor_5"] <- "Tumor_SOX9++_SOX10+_MITF+_S100Low"
# annotations[annotations == "Tumor_8"] <- "Tumor_SOX9Low_SOX10Low_MITFLow_p75+"
# annotations[annotations == "Tumor_9"] <- "Tumor_SOX10+_MITF+_S100++"
# annotations[annotations == "Tumor_6"] <- "Tumor_S100+"
# annotations[annotations == "Tumor_1"] <- "Tumor_SOX10Low_MITFLow_HLADR+"
# annotations[annotations == "Tumor_7"] <- "Tumor_SOX10+_MITF++_S100+_pS6+"
# 
# annotations[annotations == "Bcells_1"] <- "Bcells_pErkLow_H3K27meLow"
# annotations[annotations == "Bcells_4"] <- "Bcells_pErkLow_H3K27meLow"
# annotations[annotations == "Bcells_2"] <- "Bcells_H3K27me_LowpS6Low"
# annotations[annotations == "Bcells_3"] <- "Bcells_CD19++_pErkLow_H3K27meLow"
# annotations[annotations == "Bcells_5"] <- "Bcells_Ki67++"
# annotations[annotations == "Bcells_6"] <- "Bcells_SMALow"
# 
# annotations[annotations == "BnT_1"] <- "Bcells_CD3Low_CD4Low"
# annotations[annotations == "BnT_2"] <- "Bcells_pErkLow_H3K27meLow"
# annotations[annotations == "BnT_6"] <- "BnT_CD8+"
# annotations[annotations == "BnT_4"] <- "BnT_CD8+"
# annotations[annotations == "BnT_5"] <- "BnT_CD4+_TCF7+"
# annotations[annotations == "BnT_3"] <- "BnT_CD4+_ICOS++_PD1++_Ki67+"
# 
# annotations[annotations == "T_other_4"] <- "Tcells_undefined"
# annotations[annotations == "T_other_5"] <- "Tcells_undefined"
# annotations[annotations == "T_other_3"] <- "Tcells_CD15_LowGrzB+"
# annotations[annotations == "T_other_1"] <- "Tcells_Ki67++"
# annotations[annotations == "T_other_2"] <- "Tcells_CaveolinLow_SMALow"
# annotations[annotations == "T_other_6"] <- "Tcells_undefined"
# 
# annotations[annotations == "T_helper_2"] <- "Thelper_CD11cLow_HLADRLow"
# annotations[annotations == "T_helper_5"] <- "Thelper_SMALow_CaveolinLow"
# annotations[annotations == "T_helper_1"] <- "Thelper_TCF7++"
# annotations[annotations == "T_helper_6"] <- "Thelper_FoxP3++_ICOS+_CD11cLow_CD206Low"
# annotations[annotations == "T_helper_3"] <- "Thelper_FoxP3Low_pERKLow"
# annotations[annotations == "T_helper_4"] <- "Thelper_FoxP3Low"
# 
# annotations[annotations == "T_cytotoxic_3"] <- "Tcytotoxic_PD1+_TOX1+_GrzBLow"
# annotations[annotations == "T_cytotoxic_5"] <- "Tcytotoxic_TCF7+_H3K27meLow"
# annotations[annotations == "T_cytotoxic_6"] <- "Tcytotoxic_PD1+_TOX1+_GrzB+_ICOSLow_Ki67+"
# annotations[annotations == "T_cytotoxic_1"] <- "Tcytotoxic_GrzBLow_SMALow_CaveolinLow"
# annotations[annotations == "T_cytotoxic_2"] <- "Tcytotoxic_GrzBLow_pS6Low"
# annotations[annotations == "T_cytotoxic_4"] <- "Tcytotoxic_PD1+_TOX1+_GrzBLow_ICOSLow_CD11cLow_CD206Low"
# 
# annotations[annotations == "Macrophage_4"] <- "Macrophage_CD68Low_CaveolinLow"
# annotations[annotations == "Macrophage_6"] <- "Tumor_CD68Low_S100Low_bCateninLow_MITFLow_SOX10Low_HLADR+"
# annotations[annotations == "Macrophage_2"] <- "Macrophage_CD68Low_CD206++_CD36Low"
# annotations[annotations == "Macrophage_1"] <- "Macrophage_CD68Low_pS6Low_S100Low"
# annotations[annotations == "Macrophage_3"] <- "Macrophage_CD68+_CD11c+_CD206++_CXCR2+_PDL1+"
# annotations[annotations == "Macrophage_5"] <- "Macrophage_CD68++_CD11c+_CD36+_Caveolin+_Collagen+"
# 
# annotations[annotations == "pDC_1"] <- "pDC_IDO1+_GrzB+"
# annotations[annotations == "pDC_2"] <- "pDC_IDO1+_TOX1+_CD11b+"
# annotations[annotations == "pDC_3"] <- "pDC_IDO1++_GrzB++"
# annotations[annotations == "pDC_4"] <- "pDC_Ido1+_PARP++"
# 
# annotations[annotations == "Neutrophil_2"] <- "Neutrophil_CD11b+_CD15++_MPO+_PDL1++"
# annotations[annotations == "Neutrophil_4"] <- "Neutrophil_CD11b+_CD15+_MPO+_PDL1Low"
# annotations[annotations == "Neutrophil_3"] <- "Neutrophil_CD11b++_CD15++_MPO++_CXCR2Low_PDL1+"
# annotations[annotations == "Neutrophil_1"] <- "Neutrophil_CD11b+_CD15+_MPO++"

sce$cellAnnotation <- annotations

mean_sce <- calculateSummary(sce, split_by = c("celltype","cellAnnotation"), 
                             exprs_values = "counts")

# Exclude DNA and Histone
mean_sce <- mean_sce[!grepl("DNA|Histone", rownames(mean_sce)),]

# Transform and scale
assay(mean_sce, "arcsinh") <- asinh(assay(mean_sce, "meanCounts")) 
assay(mean_sce, "arcsinh_scaled") <- t(scale(t(asinh(assay(mean_sce, "meanCounts")))))

plotHeatmap(mean_sce, exprs_values = "arcsinh_scaled", 
            colour_columns_by = c("celltype"), 
            labels_col = mean_sce$cellAnnotation,
            show_colnames = TRUE, annotation_legend = TRUE, borders_color = NA,
            color = colorRampPalette(c("dark blue", "white", "dark red"))(100),
            zlim = c(-4, 4),legend = TRUE)

clustering does not really pick up PDL1 positive Tumor cells (which definitely exist). Potentially we have to investigate this with a manual cut off. arcsinh > 1.75 seems to be reasonable

Additionally, the clustering did not pick up proliferating and non prolifertating tumor clusters. this should also be investigated by manual cut offs. arcsinh > 1.25 seems reasonable.

celltype_counts <- sce$celltype

table(celltype_counts)

save the single cell object with the clusters assigned to the cells

# delete the "exprs" slot from the single cell experiment again.
assay(sce,"exprs") <- NULL

saveRDS(sce,file = "data/data_for_analysis/sce_protein.rds")

sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] CATALYST_1.12.2             igraph_1.2.6               
 [3] Rphenograph_0.99.1.9003     viridis_0.5.1              
 [5] viridisLite_0.3.0           scater_1.16.2              
 [7] ggplot2_3.3.3               SingleCellExperiment_1.12.0
 [9] SummarizedExperiment_1.20.0 Biobase_2.50.0             
[11] GenomicRanges_1.42.0        GenomeInfoDb_1.26.2        
[13] IRanges_2.24.1              S4Vectors_0.28.1           
[15] BiocGenerics_0.36.0         MatrixGenerics_1.2.0       
[17] matrixStats_0.57.0          LSD_4.1-0                  
[19] workflowr_1.6.2            

loaded via a namespace (and not attached):
  [1] readxl_1.3.1                circlize_0.4.12            
  [3] drc_3.0-1                   plyr_1.8.6                 
  [5] ConsensusClusterPlus_1.52.0 splines_4.0.3              
  [7] flowCore_2.0.1              BiocParallel_1.22.0        
  [9] TH.data_1.0-10              digest_0.6.27              
 [11] htmltools_0.5.0             magrittr_2.0.1             
 [13] CytoML_2.0.5                cluster_2.1.0              
 [15] openxlsx_4.2.3              ComplexHeatmap_2.4.3       
 [17] RcppParallel_5.0.2          sandwich_3.0-0             
 [19] flowWorkspace_4.0.6         cytolib_2.0.3              
 [21] jpeg_0.1-8.1                colorspace_2.0-0           
 [23] ggrepel_0.9.0               haven_2.3.1                
 [25] xfun_0.20                   dplyr_1.0.2                
 [27] crayon_1.3.4                RCurl_1.98-1.2             
 [29] jsonlite_1.7.2              hexbin_1.28.2              
 [31] graph_1.66.0                survival_3.2-7             
 [33] zoo_1.8-8                   glue_1.4.2                 
 [35] gtable_0.3.0                nnls_1.4                   
 [37] zlibbioc_1.36.0             XVector_0.30.0             
 [39] GetoptLong_1.0.5            DelayedArray_0.16.0        
 [41] ggcyto_1.16.0               car_3.0-10                 
 [43] BiocSingular_1.4.0          Rgraphviz_2.32.0           
 [45] shape_1.4.5                 abind_1.4-5                
 [47] scales_1.1.1                pheatmap_1.0.12            
 [49] mvtnorm_1.1-1               Rcpp_1.0.5                 
 [51] plotrix_3.7-8               clue_0.3-58                
 [53] foreign_0.8-81              rsvd_1.0.3                 
 [55] FlowSOM_1.20.0              tsne_0.1-3                 
 [57] RColorBrewer_1.1-2          ellipsis_0.3.1             
 [59] farver_2.0.3                pkgconfig_2.0.3            
 [61] XML_3.99-0.5                reshape2_1.4.4             
 [63] tidyselect_1.1.0            rlang_0.4.10               
 [65] later_1.1.0.1               munsell_0.5.0              
 [67] cellranger_1.1.0            tools_4.0.3                
 [69] generics_0.1.0              ggridges_0.5.3             
 [71] evaluate_0.14               stringr_1.4.0              
 [73] yaml_2.2.1                  knitr_1.30                 
 [75] fs_1.5.0                    zip_2.1.1                  
 [77] purrr_0.3.4                 RANN_2.6.1                 
 [79] RBGL_1.64.0                 xml2_1.3.2                 
 [81] compiler_4.0.3              rstudioapi_0.13            
 [83] beeswarm_0.2.3              curl_4.3                   
 [85] png_0.1-7                   tibble_3.0.4               
 [87] stringi_1.5.3               forcats_0.5.0              
 [89] lattice_0.20-41             Matrix_1.3-2               
 [91] vctrs_0.3.6                 pillar_1.4.7               
 [93] lifecycle_0.2.0             GlobalOptions_0.1.2        
 [95] BiocNeighbors_1.6.0         data.table_1.13.6          
 [97] cowplot_1.1.1               bitops_1.0-6               
 [99] irlba_2.3.3                 httpuv_1.5.4               
[101] R6_2.5.0                    latticeExtra_0.6-29        
[103] promises_1.1.1              gridExtra_2.3              
[105] RProtoBufLib_2.0.0          rio_0.5.16                 
[107] vipor_0.4.5                 codetools_0.2-18           
[109] MASS_7.3-53                 gtools_3.8.2               
[111] rprojroot_2.0.2             rjson_0.2.20               
[113] withr_2.3.0                 multcomp_1.4-15            
[115] GenomeInfoDbData_1.2.4      hms_0.5.3                  
[117] ncdfFlow_2.34.0             grid_4.0.3                 
[119] rmarkdown_2.6               DelayedMatrixStats_1.10.1  
[121] carData_3.0-4               Rtsne_0.15                 
[123] git2r_0.28.0                base64enc_0.1-3            
[125] ggbeeswarm_0.6.0