Last updated: 2018-08-08
workflowr checks: (Click a bullet for more information) ✔ R Markdown file: up-to-date
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
✔ Environment: empty
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
✔ Seed:
set.seed(12345)
The command set.seed(12345)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
✔ Session information: recorded
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
✔ Repository version: 457f59b
wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .Rhistory
Ignored: .Rproj.user/
Untracked files:
Untracked: analysis/table-s1.txt
Untracked: analysis/table-s2.txt
Untracked: code/tb-scratch.R
Untracked: data/counts_per_sample.txt
Untracked: docs/table-s1.txt
Untracked: docs/table-s2.txt
Untracked: factorial-dox.rds
Unstaged changes:
Modified: data/cll-eset.rds
Modified: data/cll-fit2.rds
Modified: data/cll.RData
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | 457f59b | John Blischak | 2018-08-08 | Organize analysis of Arabidopsis used in slides to demo 2x2 factorial study. |
2x2 design to study effect of low temperature in plants:
library(Biobase)
library(GEOquery)
library(limma)
library(stringr)
rds <- "../data/arabidopsis-eset.rds"
if (!file.exists(rds)) {
gset <- getGEO("GSE53990", GSEMatrix = TRUE, getGPL = FALSE)
if (length(gset) > 1) idx <- grep("GPL198", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]
eset <- gset
dim(eset)
plotDensities(eset, legend = FALSE)
# RMA normalization already applied
#
# > Raw chip data were analyzed with R/Bioconductor. Only perfect match (PM)
# > intensities were used. RMA function as implemented in the affy package was
# > used for background adjustment, normalization and summarization.
sum(rowMeans(exprs(eset)) > 5)
plotDensities(eset[rowMeans(exprs(eset)) > 5, ], legend = FALSE)
eset <- eset[rowMeans(exprs(eset)) > 5, ]
pData(eset) <- pData(eset)[, c("title", "genotype:ch1", "lt treatment time:ch1")]
colnames(pData(eset)) <- c("title", "type", "temp")
# Remove 48h sample. More noticeable effect at 120h (authors note that 48 hour
# timepoint is more interesting to them since it is more likely to give insight
# into mechanism since by 120h lots of downstream singaling has started.
# However, the effect is much more minimal, and thus not as useful for my
# pedagological needs)
eset <- eset[, pData(eset)[, "temp"] != "48h"]
# Clean up names
pData(eset)[, "type"] <- tolower(pData(eset)[, "type"])
pData(eset)[, "temp"] <- ifelse(pData(eset)[, "temp"] == "0h", "normal", "low")
pData(eset)[, "rep"] <- sprintf("r%d",
as.integer(str_sub(pData(eset)[, "title"], -1, -1)))
pData(eset)[, "title"] <- NULL
colnames(eset) <- paste(pData(eset)[, "type"],
pData(eset)[, "temp"],
pData(eset)[, "rep"], sep = "_")
head(pData(eset))
saveRDS(eset, file = "../data/arabidopsis-eset.rds")
} else {
eset <- readRDS(rds)
}
dim(eset)
Features Samples
11871 12
table(pData(eset)[, c("type", "temp")])
temp
type low normal
col 3 3
vte2 3 3
# Create single variable
group <- with(pData(eset), paste(type, temp, sep = "."))
group <- factor(group)
# Create design matrix with no intercept
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)
head(design, 3)
col.low col.normal vte2.low vte2.normal
1 0 1 0 0
2 0 1 0 0
3 0 1 0 0
# Count the number of samples modeled by each coefficient
colSums(design)
col.low col.normal vte2.low vte2.normal
3 3 3 3
# Create a contrasts matrix
cm <- makeContrasts(type_normal = vte2.normal - col.normal,
type_low = vte2.low - col.low,
temp_vte2 = vte2.low - vte2.normal,
temp_col = col.low - col.normal,
interaction = (vte2.low - vte2.normal) - (col.low - col.normal),
levels = design)
# View the contrasts matrix
cm
Contrasts
Levels type_normal type_low temp_vte2 temp_col interaction
col.low 0 -1 0 1 -1
col.normal -1 0 0 -1 1
vte2.low 0 1 1 0 1
vte2.normal 1 0 -1 0 -1
# Fit the model
fit <- lmFit(eset, design)
# Fit the contrasts
fit2 <- contrasts.fit(fit, contrasts = cm)
# Calculate the t-statistics for the contrasts
fit2 <- eBayes(fit2)
# Summarize results
results <- decideTests(fit2)
summary(results)
type_normal type_low temp_vte2 temp_col interaction
-1 0 466 1635 1885 128
0 11871 10915 7635 6989 11640
1 0 490 2601 2997 103
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] stringr_1.3.1 limma_3.32.2 GEOquery_2.42.0
[4] Biobase_2.36.2 BiocGenerics_0.22.1
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 knitr_1.20 whisker_0.3-2
[4] magrittr_1.5 workflowr_1.1.1.9000 R6_2.2.2
[7] httr_1.3.1 tools_3.4.4 R.oo_1.22.0
[10] git2r_0.23.0 htmltools_0.3.6 yaml_2.2.0
[13] rprojroot_1.3-2 digest_0.6.15 R.utils_2.6.0
[16] bitops_1.0-6 RCurl_1.95-4.11 evaluate_0.11
[19] rmarkdown_1.10 stringi_1.2.4 compiler_3.4.4
[22] backports_1.1.2-9000 R.methodsS3_1.7.1 XML_3.98-1.11
This reproducible R Markdown analysis was created with workflowr 1.1.1.9000