Last updated: 2018-06-01
This analysis looks at initial mapping QC for the two mappers I am using, Subjunc and STAR.
I created a CSV file with the number of reads, the number of mapped reads, and the proportion of reads mapped per library.
subj_map= read.csv("../data/reads_mapped_three_prime_seq.csv", header=TRUE, stringsAsFactors = FALSE)
Summaries of each column (total reads, mapped reads, proportion mapped):
             Min.     1st Qu.  Median   Mean     3rd Qu.  Max.
reads        5103350  8030068  8776602  8670328  9341566  10931074
mapped       3575191  5688940  6268228  6091626  6394260  7788593
prop_mapped  0.6658   0.6932   0.7034   0.7025   0.7135   0.7343
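The calls that produced these summaries are not shown above. A minimal sketch of one way to get them in a single step, assuming the column names `reads`, `mapped`, and `prop_mapped` used later in the plotting code. The small data frame here is hypothetical and only stands in for `subj_map`:

```r
# Hypothetical stand-in for subj_map; the values are made up for illustration
toy_map <- data.frame(
  reads  = c(1000, 2000, 3000),
  mapped = c(700, 1400, 2100)
)
toy_map$prop_mapped <- toy_map$mapped / toy_map$reads

# summary() on each column; sapply() binds the results into one matrix
# with rows Min., 1st Qu., Median, Mean, 3rd Qu., Max.
summ <- sapply(toy_map[, c("reads", "mapped", "prop_mapped")], summary)
print(summ)
```

With the real data, replacing `toy_map` with `subj_map` should give the three rows of numbers listed above, one column per variable.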
Look at this graphically:
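# melt() comes from reshape2, %>% and filter() from dplyr, ggplot() from ggplot2
library(reshape2)
library(dplyr)
library(ggplot2)
# Reshape to long format, then keep only the proportion-mapped rows
subj_melt=melt(subj_map, id.vars=c("line", "fraction"), measure.vars = c("reads", "mapped", "prop_mapped"))
subj_prop_mapped= subj_melt %>% filter(variable=="prop_mapped")
# Dodged bars per line and fraction, with a horizontal line at the mean proportion
subjplot=ggplot(subj_prop_mapped, aes(y=value, x=line, fill=fraction)) +
  geom_bar(stat="identity", position="dodge") +
  labs(title="Proportion of reads mapped with Subjunc") +
  ylab("Proportion mapped") +
  geom_hline(yintercept = mean(subj_prop_mapped$value)) +
  annotate("text", x=4, y=mean(subj_prop_mapped$value) - .1, vjust = -1, label = "Mean mapping proportion= .702")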
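The mean in the plot label is typed in by hand, so it can drift from the line drawn by `geom_hline()` if the CSV changes. A small sketch of building the label from the data instead. The vector `prop` is hypothetical and stands in for `subj_prop_mapped$value`, and `mean_label` is a name I made up for this example:

```r
# Hypothetical proportions standing in for subj_prop_mapped$value
prop <- c(0.66, 0.70, 0.74)

# Compute the mean once and reuse it for both the line and the label
mean_prop <- mean(prop)
mean_label <- sprintf("Mean mapping proportion = %.3f", mean_prop)
print(mean_label)

# mean_prop could then be used as the yintercept in geom_hline(), and
# mean_label as the label argument in annotate("text", ...)
```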
I added two columns to the CSV file, star_mapped and star_prop_mapped, with the STAR mapping stats for each line.
star_map= read.csv("../data/reads_mapped_three_prime_seq.csv", header=TRUE, stringsAsFactors = FALSE)
Summaries of the STAR columns (mapped reads, proportion mapped):
                  Min.     1st Qu.  Median   Mean     3rd Qu.  Max.
star_mapped       3326506  5426888  5868012  5834521  6314488  7814874
star_prop_mapped  0.5648   0.6452   0.6558   0.6719   0.7144   0.7827
Look at this graphically:
# Reshape to long format, then keep only the STAR proportion-mapped rows
star_melt=melt(star_map, id.vars=c("line", "fraction"), measure.vars = c("reads", "star_mapped", "star_prop_mapped"))
star_prop_mapped= star_melt %>% filter(variable=="star_prop_mapped")
# Same layout as the Subjunc plot so the two can be compared directly
starplot=ggplot(star_prop_mapped, aes(y=value, x=line, fill=fraction)) +
  geom_bar(stat="identity", position="dodge") +
  labs(title="Proportion of reads mapped with Star") +
  ylab("Proportion mapped") +
  geom_hline(yintercept = mean(star_prop_mapped$value)) +
  annotate("text", x=4, y=mean(star_prop_mapped$value) - .1, vjust = -1, label = "Mean mapping proportion= .672")
Compare the plots:
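Alongside the plots, the two mappers can also be compared numerically, library by library. This is a rough sketch rather than part of the original analysis. It assumes the `prop_mapped` and `star_prop_mapped` columns sit in the same data frame, and it uses a hypothetical toy table (the line IDs, fraction labels, and values are all made up):

```r
# Hypothetical stand-in for the combined mapping table
toy_cmp <- data.frame(
  line             = c("lineA", "lineA", "lineB", "lineB"),
  fraction         = c("total", "nuclear", "total", "nuclear"),
  prop_mapped      = c(0.70, 0.68, 0.72, 0.70),
  star_prop_mapped = c(0.75, 0.60, 0.77, 0.62),
  stringsAsFactors = FALSE
)

# Positive values mean STAR mapped a larger share of reads than Subjunc
toy_cmp$star_minus_subjunc <- toy_cmp$star_prop_mapped - toy_cmp$prop_mapped

# Mean difference within each fraction
by_fraction <- aggregate(star_minus_subjunc ~ fraction, data = toy_cmp, FUN = mean)
print(by_fraction)
```

Splitting by fraction is a guess at what might matter here: the STAR proportions have a wider range (0.56 to 0.78) than the Subjunc ones (0.67 to 0.73), and a per-fraction breakdown is one way to check whether that spread follows the fraction.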
