  • Introduction
  • Load packages and data
  • Density plots
  • Normalization
    • Choosing quantile interval
    • Method 1
    • Method 2
  • Session information

Last updated: 2017-12-11

Code version: a5ec074


Introduction

Previously, when investigating measurement variation (GFP/RFP/DAPI), we learned that there is significant variation between batches in the distributions of background-corrected pixel intensities. See here.

 

Approach:

In this document, I apply quantile normalization to intensity measurements on a per-channel basis. The approach is as follows:

  1. Construct a reference intensity. Estimate $k$-quantiles of the reference intensities, $Q_{R,k} = (q_{R,k}[1], \dots, q_{R,k}[n_k])$.

  2. For each plate $i$, estimate $l$-quantiles of intensities on a per-plate basis, $Q_{i,l} = (q_{i,l}[1], \dots, q_{i,l}[n_l])$.

  3. For each plate $i$, compare the intensity value $F_{ij}$ with the quantile values $(q_{i,l}[1], \dots, q_{i,l}[n_l])$ and assign image/well $j$ to the quantile with the closest intensity value, say $q_{i,l}[m]$, where $m = \arg\min_{m' \in \{1, \dots, n_l\}} |F_{ij} - q_{i,l}[m']|$. Then substitute the intensity value with the $m$-th quantile value of the reference intensity, $q_{R,k}[m]$.

 

I tried two methods for constructing the reference intensity vector, and the results differ greatly depending on the method chosen.

Method 1: Aggregate intensity values across plates.

Method 2: Take the average of n-quantiles across plates.

 

Results:

  1. We chose a quantile interval of .005 (1/.005 = 200 quantiles) for all three channels. See the document for our exploratory analysis of intensities from all three channels.

  2. Method 1 versus Method 2: In Method 1, the distribution of Green/Red is denser toward low- and high-valued intensities, so the normalized values lie closer to the boundaries.

  3. Method 2 of constructing the reference produces better results. The relationship between Green and Red is preserved after normalization.

  4. In Method 2, after normalization, the range of intensities is the same between plates for each of the three channels (Green, Red, DAPI). As a result, many of the images/wells with low intensities decreased in intensity values.

  5. Because of 4, the distances between samples in many plates increase rather than decrease. We were looking for a decrease in the distances between samples, i.e., tighter clusters or smaller within-cluster distance…


Load packages and data

ints <- readRDS(file="/project2/gilad/joycehsiao/fucci-seq/data/intensity.rds")

 

Density plots

 

First, look at the distribution of all batches combined versus each batch.

 


 

Normalization

 

Code for a single sample

 

my_quantnorm <- function(reference, sample, span=.01) {
  # probabilities at which quantiles are evaluated
  probs <- seq(0, 1, span)
  # quantiles of the reference intensities
  quants_reference <- quantile(reference, probs=probs)
  # quantiles of the intensities for a given plate
  quants_sample <- quantile(sample, probs=probs)
  # empty vector for normalized values
  sample_normed <- vector("numeric", length=length(sample))

  for (index in seq_along(sample)) {
    # for each intensity value, find the position of the closest sample quantile
    sample_order <- which.min(abs(sample[index]-quants_sample))
    # assign the reference quantile value at the same position
    # (taken from quants_reference, not from the raw reference vector)
    sample_normed[index] <- quants_reference[[sample_order]]
  }
  return(sample_normed)
}
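
The nearest-quantile mapping described in the approach can also be written more compactly. The following is a minimal sketch, not the code used to produce the results in this document; the function name `quantnorm_sketch` and the simulated inputs are assumptions made for illustration. It maps each value to its nearest sample quantile and returns the reference quantile at the same position.

```r
# Hypothetical compact sketch of nearest-quantile normalization.
quantnorm_sketch <- function(reference, sample, span = .01) {
  probs <- seq(0, 1, span)
  quants_reference <- quantile(reference, probs = probs, names = FALSE)
  quants_sample <- quantile(sample, probs = probs, names = FALSE)
  # for each value, position of the closest sample quantile
  nearest <- vapply(sample,
                    function(x) which.min(abs(x - quants_sample)),
                    integer(1))
  # reference quantile at the matching position
  quants_reference[nearest]
}

set.seed(1)
reference <- rnorm(1000, mean = 0, sd = 1)  # simulated reference intensities
plate <- rnorm(200, mean = 2, sd = 3)       # simulated plate on a different scale
plate_normed <- quantnorm_sketch(reference, plate)
range(plate_normed)                         # lies within range(reference)
```

Because the output can only take values from the reference quantiles, a plate normalized this way has at most `length(seq(0, 1, span))` distinct values.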

 

Choosing quantile interval
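
As a rough illustration of what the interval controls: it sets how many quantile points are estimated, and therefore how many distinct values a normalized plate can take. An interval of .01 gives 101 points and .005 gives 201. The sketch below uses simulated intensities (an assumption, not the data analyzed here), and the candidate intervals are chosen only for illustration.

```r
# Number of quantile points implied by each candidate interval
spans <- c(.1, .05, .01, .005)
n_points <- sapply(spans, function(s) length(seq(0, 1, s)))
names(n_points) <- spans
n_points

# A coarser interval leaves fewer distinct values after mapping each
# observation to its nearest quantile (simulated data for illustration)
set.seed(1)
x <- rexp(500)
n_distinct <- sapply(spans, function(s) {
  q <- quantile(x, probs = seq(0, 1, s), names = FALSE)
  nearest <- vapply(x, function(v) which.min(abs(v - q)), integer(1))
  length(unique(q[nearest]))
})
n_distinct
```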

 

RFP

 

 

GFP

 

 

DAPI

 

 

Method 1

Method 1 constructs a vector of reference intensities by aggregating all image intensity values across plates.
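
A minimal sketch of this construction, assuming intensities are held in a data frame with one row per image/well and a plate label. The column names `plate` and `rfp` and the simulated values are hypothetical and do not come from `intensity.rds`.

```r
set.seed(1)
# simulated stand-in for one channel: three plates on different scales
sim <- data.frame(
  plate = rep(c("p1", "p2", "p3"), each = 96),
  rfp = c(rnorm(96, 1, .5), rnorm(96, 2, 1), rnorm(96, 3, 2))
)

# Method 1: the reference is every intensity value, pooled across plates
reference_pooled <- sim$rfp

# quantiles of the pooled reference at an interval of .005
probs <- seq(0, 1, .005)
quants_reference <- quantile(reference_pooled, probs = probs)
```

In a pooled reference each plate contributes in proportion to its number of images, so the shape of the reference follows whichever plates supply the most values at each intensity level.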

 

 

Distribution of the reference

 

 

After normalization

 

 

Green versus Red intensities by plate, labeled by DAPI

 

 

Green versus Red intensities by individual, labeled by DAPI

 

 

Green

 

 

Red

 

 

DAPI

 


 

Method 2

 

Reference intensity vector: average of quantile values across plates.
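
A minimal sketch of this construction, again on simulated data. The column names `plate` and `rfp` are hypothetical, and the interval of .005 follows the choice stated in the results.

```r
set.seed(1)
# simulated stand-in for one channel: three plates on different scales
sim <- data.frame(
  plate = rep(c("p1", "p2", "p3"), each = 96),
  rfp = c(rnorm(96, 1, .5), rnorm(96, 2, 1), rnorm(96, 3, 2))
)

probs <- seq(0, 1, .005)
# one column of quantiles per plate (rows are quantile positions)
quants_by_plate <- sapply(split(sim$rfp, sim$plate),
                          quantile, probs = probs, names = FALSE)
# Method 2: average each quantile position across plates
reference_avg <- rowMeans(quants_by_plate)
```

Under the nearest-quantile mapping, each plate's minimum and maximum are sent to the first and last entries of the reference, which is consistent with the equal post-normalization ranges noted in the results.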

 

 

After normalization

 

 

Green versus Red intensities by plate, labeled by DAPI

 

 

Green versus Red intensities by individual, labeled by DAPI

 

 

Green

 

 

Red

 

 

DAPI

 


Session information

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.2 (Nitrogen)

Matrix products: default
BLAS: /home/joycehsiao/miniconda3/envs/fucci-seq/lib/R/lib/libRblas.so
LAPACK: /home/joycehsiao/miniconda3/envs/fucci-seq/lib/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
[1] Biobase_2.38.0      BiocGenerics_0.24.0 RColorBrewer_1.1-2 
[4] wesanderson_0.3.2   cowplot_0.8.0       ggplot2_2.2.1      
[7] dplyr_0.7.0         data.table_1.10.4  

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14     knitr_1.16       magrittr_1.5     munsell_0.4.3   
 [5] colorspace_1.3-2 R6_2.2.0         rlang_0.1.2      stringr_1.2.0   
 [9] plyr_1.8.4       tools_3.4.1      grid_3.4.1       gtable_0.2.0    
[13] git2r_0.19.0     htmltools_0.3.6  lazyeval_0.2.0   yaml_2.1.14     
[17] rprojroot_1.2    digest_0.6.12    assertthat_0.1   tibble_1.3.3    
[21] glue_1.1.1       evaluate_0.10.1  rmarkdown_1.6    labeling_0.3    
[25] stringi_1.1.2    compiler_3.4.1   scales_0.4.1     backports_1.0.5 
