Last updated: 2021-02-04

Checks: 7 passed, 0 failed

Knit directory: melanoma_publication_old_data/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.

R Markdown file: up-to-date

Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Environment: empty

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

Seed: set.seed(20200728)

The command set.seed(20200728) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
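
As a toy illustration of this point (not part of the original analysis), resetting the same seed before a random draw reproduces that draw exactly. The variable names below are made up for the example.

```r
# Toy example: the same seed yields the same random subsample.
set.seed(20200728)
first_draw <- sample(1:100, 5)

set.seed(20200728)
second_draw <- sample(1:100, 5)

# Both draws contain the same values in the same order
identical(first_draw, second_draw)
```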

Session information: recorded

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Cache: none

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

File paths: relative

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Repository version: 20a1458

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 20a1458. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    ._.DS_Store
    Ignored:    code/.DS_Store
    Ignored:    code/._.DS_Store
    Ignored:    code/helper_functions/._findCommunity.R
    Ignored:    data/.DS_Store
    Ignored:    data/._.DS_Store
    Ignored:    data/._density_infiltration_BlockID.csv
    Ignored:    data/._layer_1_classification_rna.csv
    Ignored:    data/._manual_infiltration_scoring.csv
    Ignored:    data/._manual_infiltration_scoring_BlockID.csv
    Ignored:    data/._manual_infiltration_scoring_RC.csv
    Ignored:    data/._manual_infiltration_scoring_TH.csv
    Ignored:    data/._survdat_for_modelling.csv
    Ignored:    data/12plex_validation/
    Ignored:    data/200323_TMA_256_Hot Cold_Clinical Data_Updated Response Data_For Collaborators_latest updated_Mar_2020_for_Coxph_modeling.csv
    Ignored:    data/colour_vector.rds
    Ignored:    data/density_infiltration_BlockID.csv
    Ignored:    data/fraction_and_infiltration_scoring.csv
    Ignored:    data/layer_1_classification_protein.csv
    Ignored:    data/layer_1_classification_protein.rds
    Ignored:    data/layer_1_classification_rna.csv
    Ignored:    data/manual_infiltration_scoring.csv
    Ignored:    data/manual_infiltration_scoring_BlockID.csv
    Ignored:    data/manual_infiltration_scoring_RC.csv
    Ignored:    data/manual_infiltration_scoring_TH.csv
    Ignored:    data/protein/
    Ignored:    data/rna/
    Ignored:    data/safety_copy_SCE/
    Ignored:    data/sce_RNA.rds
    Ignored:    data/sce_protein.rds
    Ignored:    data/survdat_for_modelling.csv
    Ignored:    output/.DS_Store
    Ignored:    output/._.DS_Store
    Ignored:    output/._protein_neutrophil.png
    Ignored:    output/._rna_neutrophil.png
    Ignored:    output/PSOCKclusterOut/
    Ignored:    output/bcell_grouping.png
    Ignored:    output/dysfunction_correlation.pdf

Unstaged changes:
    Modified:   .gitignore

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.

These are the previous versions of the repository in which changes were made to the R Markdown (analysis/07_RNA_chemokine_community_clustering.Rmd) and HTML (docs/07_RNA_chemokine_community_clustering.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File	Version	Author	Date	Message
Rmd	9442cb9	toobiwankenobi	2020-12-22	add all new files
Rmd	c3e8a47	toobiwankenobi	2020-10-12	clean branch

Introduction

Preparations

knitr::opts_chunk$set(echo = TRUE, message = FALSE)
knitr::opts_knit$set(root.dir = rprojroot::find_rstudio_root_file())

Load libraries

First, we will load the libraries needed for this part of the analysis.

library(SingleCellExperiment)
library(reshape2)
library(tidyverse)
library(dplyr)
library(data.table) 
library(fpc)

Read the data

sce = readRDS(file = "data/sce_RNA.rds")

Community Analysis

Community analysis - Chemokine-producing cells

# Subset the sce object to cells assigned to a chemokine cluster (chemokine_cluster != 0)
cur_sce <- sce[,sce$chemokine_cluster != 0]

# keep the cluster ID and the chemokine columns (column names matching the glob "C*L*")
cur_dt <- as.data.table(colData(cur_sce))
cur_dt <- cbind(cur_dt[,chemokine_cluster], cur_dt[,grepl(glob2rx("C*L*"),names(cur_dt)), with=FALSE])
colnames(cur_dt)[1] <- "chemokine_cluster"

# sum each chemokine column per chemokine cluster
# (across() replaces the deprecated summarise_each(funs(sum)))
cur_dt <- cur_dt %>%
  group_by(chemokine_cluster) %>%
  summarise(across(everything(), sum))

# wide format of frequencies
freqs_wide <- t(scale(t(as.matrix(cur_dt[,-1])), center = FALSE, 
               scale = rowSums(cur_dt[,-1])))
freqs_wide <- cbind(cur_dt[,1],freqs_wide)

# long format of frequencies
freqs_long <- melt(as.data.table(freqs_wide), id.vars = "chemokine_cluster")
colnames(freqs_long) <- c("chemokine_cluster", "celltype", "fraction")
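
The wide-format step above divides each cluster's chemokine counts by that cluster's total, so every row sums to 1. The sketch below repeats the same row-wise scaling on made-up counts. The matrix `toy_counts`, its values, and its row and column labels are hypothetical. `prop.table()` with `margin = 1` should give an equivalent result and may be easier to read.

```r
# Hypothetical counts: rows are chemokine clusters, columns are chemokines
toy_counts <- matrix(c(2, 1, 1,
                       0, 3, 1),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("cluster_1", "cluster_2"),
                                     c("CCL2", "CXCL9", "CXCL10")))

# Same approach as above: scale() works column-wise, so transpose,
# divide each column by the matching row total, and transpose back
toy_fracs <- t(scale(t(toy_counts), center = FALSE,
                     scale = rowSums(toy_counts)))

# Base R shortcut: proportions within each row
toy_fracs_alt <- prop.table(toy_counts, margin = 1)

# Each cluster's fractions add up to 1
rowSums(toy_fracs)
```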

k-means clustering to group similar chemokine clusters

# clustering (based on chemokine abundance within a cluster)
estimate <- clusterboot(data=freqs_wide[,-1], B=100, 
                        bootmethod = c("boot"), 
                        distances = FALSE, 
                        krange = 2:20, 
                        clustermethod = kmeansCBI,
                        seed = 12345)

boot 1 
boot 2 
[...]
boot 100
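
`clusterboot()` judges cluster stability by re-clustering bootstrap resamples and comparing each original cluster with its most similar resampled cluster using the Jaccard similarity. The block below is a simplified base R sketch of that idea on made-up data. It is not fpc's implementation, and `toy`, `jaccard()`, `n_boot` and the other names are assumptions chosen for illustration.

```r
set.seed(1)

# Hypothetical toy data: two well-separated groups of 20 points in 2D
toy <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
             matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2))

# Jaccard similarity between two sets of row indices
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

k <- 2
n_boot <- 20
orig <- kmeans(toy, centers = k, nstart = 10)$cluster

stability <- matrix(NA_real_, nrow = n_boot, ncol = k)
for (b in seq_len(n_boot)) {
  # draw rows with replacement and cluster the resample
  bs <- sample(nrow(toy), replace = TRUE)
  boot_cl <- kmeans(toy[bs, ], centers = k, nstart = 10)$cluster

  for (i in seq_len(k)) {
    # members of original cluster i that were drawn in this resample
    orig_members <- intersect(which(orig == i), bs)
    # similarity to every resampled cluster; keep the best match
    sims <- sapply(seq_len(k), function(j) {
      jaccard(orig_members, unique(bs[boot_cl == j]))
    })
    stability[b, i] <- max(sims)
  }
}

# Mean best-match Jaccard similarity per original cluster
mean_stability <- colMeans(stability)
mean_stability
```

Values close to 1 suggest a cluster that is recovered in most resamples. In the fpc output, the comparable per-cluster summary is stored in `estimate$bootmean`.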

# add cluster ID to frequencies
freq_kmeans <- as.data.table(cbind(estimate$result$partition, freqs_wide[,1]))
colnames(freq_kmeans) <- c("kmean_cluster", "chemokine_cluster")
freq_kmeans$kmean_cluster <- factor(freq_kmeans$kmean_cluster, levels = 0:length(unique(freq_kmeans$kmean_cluster)))

cur_dt <- as.data.table(colData(sce))
cur_dt <- left_join(cur_dt, freq_kmeans, by = "chemokine_cluster")

# add community module to sce object
cur_dt[is.na(kmean_cluster), kmean_cluster := "0"]
colData(sce)$community_module <- cur_dt$kmean_cluster
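
The join above gives every cell the k-means module of its chemokine cluster, and cells without a matching cluster fall back to module 0. A base R sketch of the same lookup on made-up values follows. The vector `cell_cluster` and the table `lookup` are hypothetical.

```r
# Hypothetical per-cell chemokine cluster IDs (0 = not in any cluster)
cell_cluster <- c(0, 1, 2, 1, 3)

# Hypothetical mapping from chemokine cluster to k-means module
lookup <- data.frame(chemokine_cluster = c(1, 2, 3),
                     kmean_cluster     = c(2, 1, 2))

# Look up the module for each cell; unmatched cells give NA
module <- lookup$kmean_cluster[match(cell_cluster, lookup$chemokine_cluster)]

# Cells outside any chemokine cluster get module 0
module[is.na(module)] <- 0
module
```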

Save RDS

# write back to the file that was read above (paths are case-sensitive on Linux)
saveRDS(sce, file = "data/sce_RNA.rds")

sessionInfo()

R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] fpc_2.2-9                   data.table_1.13.6          
 [3] forcats_0.5.0               stringr_1.4.0              
 [5] dplyr_1.0.2                 purrr_0.3.4                
 [7] readr_1.4.0                 tidyr_1.1.2                
 [9] tibble_3.0.4                ggplot2_3.3.3              
[11] tidyverse_1.3.0             reshape2_1.4.4             
[13] SingleCellExperiment_1.12.0 SummarizedExperiment_1.20.0
[15] Biobase_2.50.0              GenomicRanges_1.42.0       
[17] GenomeInfoDb_1.26.2         IRanges_2.24.1             
[19] S4Vectors_0.28.1            BiocGenerics_0.36.0        
[21] MatrixGenerics_1.2.0        matrixStats_0.57.0         
[23] workflowr_1.6.2            

loaded via a namespace (and not attached):
 [1] bitops_1.0-6           fs_1.5.0               lubridate_1.7.9.2     
 [4] httr_1.4.2             prabclus_2.3-2         rprojroot_2.0.2       
 [7] tools_4.0.3            backports_1.2.1        R6_2.5.0              
[10] DBI_1.1.0              colorspace_2.0-0       nnet_7.3-14           
[13] withr_2.3.0            tidyselect_1.1.0       compiler_4.0.3        
[16] git2r_0.28.0           cli_2.2.0              rvest_0.3.6           
[19] xml2_1.3.2             DelayedArray_0.16.0    diptest_0.75-7        
[22] scales_1.1.1           DEoptimR_1.0-8         robustbase_0.93-7     
[25] digest_0.6.27          rmarkdown_2.6          XVector_0.30.0        
[28] pkgconfig_2.0.3        htmltools_0.5.0        dbplyr_2.0.0          
[31] rlang_0.4.10           readxl_1.3.1           rstudioapi_0.13       
[34] generics_0.1.0         jsonlite_1.7.2         mclust_5.4.7          
[37] RCurl_1.98-1.2         magrittr_2.0.1         modeltools_0.2-23     
[40] GenomeInfoDbData_1.2.4 Matrix_1.3-2           Rcpp_1.0.5            
[43] munsell_0.5.0          fansi_0.4.1            lifecycle_0.2.0       
[46] stringi_1.5.3          whisker_0.4            yaml_2.2.1            
[49] MASS_7.3-53            zlibbioc_1.36.0        flexmix_2.3-17        
[52] plyr_1.8.6             grid_4.0.3             promises_1.1.1        
[55] crayon_1.3.4           lattice_0.20-41        haven_2.3.1           
[58] hms_0.5.3              knitr_1.30             pillar_1.4.7          
[61] reprex_0.3.0           glue_1.4.2             evaluate_0.14         
[64] modelr_0.1.8           vctrs_0.3.6            httpuv_1.5.4          
[67] cellranger_1.1.0       gtable_0.3.0           kernlab_0.9-29        
[70] assertthat_0.2.1       xfun_0.20              broom_0.7.3           
[73] later_1.1.0.1          class_7.3-17           cluster_2.1.0         
[76] ellipsis_0.3.1

07_chemokine_community_analysis

Tobias Hoch

2020-05-12