Model training
method-train-classifiers-non101.html
Explore approaches to fitting cyclical trends
Cell cycle signal in gene expression data
- We investigated cell cycle signals in the sequencing data alone.
- We then assigned categorical cell cycle labels and explored the expression profiles of these categories.
- We ordered cells on a circle using FUCCI intensities alone.
- We used nonparametric methods to identify genes that may be cyclical across cell cycle phases.
- Fit smash and kernel regression for circular variables on a subset of genes with detection rate > 0.8.
- Fit trendfilter on a subset of 5 genes that visually appear to have a cyclical pattern. trendfilter is robust to a small proportion of undetected cells, approximately 2 to 3%. In simulations where the proportion of undetected cells was increased to 20%, the fitted expression trend was a flat line for genes previously identified as tending toward a cyclical pattern.
- Next, we fit trendfilter on all genes after transforming the data to follow a standard normal distribution. Permutation-based p-values for the proportion of variance explained (PVE) were used to select 101 significant cyclical genes.
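The last step above (transform to standard normal, fit a cyclical trend, test PVE by permutation) can be sketched as follows. This is a minimal illustration, not the analysis code behind this site: it is written in Python, it swaps trendfilter for a simple least-squares fit on cos and sin of the cell angle, and it assumes a rank-based inverse normal transform. All function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def to_standard_normal(y):
    """Rank-based inverse normal transform (an assumed choice of transform)."""
    y = np.asarray(y, dtype=float)
    ranks = np.argsort(np.argsort(y)) + 1      # ranks 1..n, ties broken by order
    quantiles = (ranks - 0.5) / len(y)         # strictly inside (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(q) for q in quantiles])


def cyclic_fit(theta, y):
    """Stand-in for trendfilter: least-squares fit on cos(theta) and sin(theta)."""
    X = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef


def pve(theta, y):
    """Proportion of variance in y explained by the cyclical fit."""
    resid = y - cyclic_fit(theta, y)
    return 1.0 - np.var(resid) / np.var(y)


def permutation_pvalue(theta, y, n_perm=1000, seed=0):
    """Compare the observed PVE with PVEs obtained after shuffling expression."""
    rng = np.random.default_rng(seed)
    observed = pve(theta, y)
    null = np.array([pve(theta, rng.permutation(y)) for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly above zero.
    pval = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, pval
```

For one gene, `permutation_pvalue(theta, to_standard_normal(expr))` returns the observed PVE and its p-value, where `theta` holds cell angles in radians and `expr` holds that gene's expression across cells. Genes would then be ranked or thresholded on the p-value.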
RNA-seq data preprocessing
- The first step in preprocessing RNA-seq data consists of QC and filtering.
- Sample QC and filtering
- Gene QC and filtering
- We then analyzed and corrected for batch effects due to C1 plate in the sequencing data.
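This page does not state the gene filtering criteria. One common criterion, and the one used elsewhere on this page to choose genes for smash, is the detection rate: the fraction of cells in which a gene has a nonzero count. A minimal Python sketch, assuming a genes-by-cells count matrix and hypothetical function names:

```python
import numpy as np


def detection_rate(counts):
    """Fraction of cells (columns) in which each gene (row) has a nonzero count."""
    counts = np.asarray(counts)
    return (counts > 0).mean(axis=1)


def filter_genes(counts, min_rate=0.8):
    """Keep genes whose detection rate exceeds min_rate (0.8 is an assumed default)."""
    counts = np.asarray(counts)
    keep = detection_rate(counts) > min_rate
    return counts[keep], keep
```

`filter_genes(counts)` returns the filtered matrix together with the boolean mask, so the same mask can be applied to gene annotations.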
Microscopy image analysis
We evaluated and pre-processed the results of image analysis as follows:
- We visually inspected images detected to have no nucleus or more than one nucleus. Where the detected count was inconsistent with visual inspection, we corrected the number of nuclei.
- We applied background correction to the intensity measurements of GFP, RFP and DAPI based on the following analyses.
- We analyzed intensity variation across individuals and batches and considered approaches for removing batch effects in the data.
- We investigated the cell time estimates based on FUCCI intensities.
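One way to turn two FUCCI channels into a cell time estimate is to place each cell on a circle and read off its angle. The Python sketch below is an assumption about how such an estimate could be computed, not the method documented on the linked pages: it standardizes the GFP and RFP intensities (assumed already background-corrected and log-transformed), projects them onto their two principal components, and takes the polar angle. The function name is hypothetical.

```python
import numpy as np


def fucci_angles(gfp, rfp):
    """Return one angle per cell, in radians, from GFP and RFP intensities."""
    X = np.column_stack([gfp, rfp]).astype(float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)    # center and scale each channel
    # PCA via SVD: rows of U * S are the principal component scores per cell.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    scores = U * S
    # Polar angle of each cell in the PC1-PC2 plane, wrapped to [0, 2*pi).
    return np.mod(np.arctan2(scores[:, 1], scores[:, 0]), 2 * np.pi)
```

The resulting angles are defined only up to a rotation and a reflection, because the sign and order of principal components are arbitrary. The origin and direction of the cycle have to be fixed separately, for example with known marker genes.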
This R Markdown site was created with workflowr