Last updated: 2020-01-14
Checks: 7 0
Knit directory: Comparative_APA/analysis/
This reproducible R Markdown analysis was created with workflowr (version 1.5.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.
The command set.seed(20190902) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: code/chimp_log/
Ignored: code/human_log/
Ignored: data/.DS_Store
Ignored: data/
Untracked files:
Untracked: ._.DS_Store
Untracked: Chimp/
Untracked: Human/
Untracked: analysis/CrossChimpThreePrime.Rmd
Untracked: analysis/DiffTransProtvsExpression.Rmd
Untracked: analysis/assessReadQual.Rmd
Untracked: analysis/diffExpressionPantro6.Rmd
Untracked: analysis/mediation.Rmd
Untracked: code/
Untracked: code/._Config_chimp.yaml
Untracked: code/._Config_chimp_full.yaml
Untracked: code/._Config_human.yaml
Untracked: code/._ReverseLiftFilter.R
Untracked: code/
Untracked: code/._Snakefile
Untracked: code/._SnakefilePAS
Untracked: code/._SnakefilePASfilt
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/._buildIndecpantro5
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/._cluster.json
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/._extraSnakefiltpas
Untracked: code/._snakemake.batch
Untracked: code/._snakemakePAS.batch
Untracked: code/._snakemakePASchimp.batch
Untracked: code/._snakemakePAShuman.batch
Untracked: code/._snakemake_chimp.batch
Untracked: code/._snakemake_human.batch
Untracked: code/._snakemakefiltPAS.batch
Untracked: code/._snakemakefiltPAS_chimp
Untracked: code/.snakemake/
Untracked: code/
Untracked: code/Config_chimp.yaml
Untracked: code/Config_chimp_full.yaml
Untracked: code/Config_human.yaml
Untracked: code/ConvertJunc2Bed.err
Untracked: code/ConvertJunc2Bed.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/CrossmapChimp3prime.err
Untracked: code/CrossmapChimp3prime.out
Untracked: code/CrossmapChimpRNA.err
Untracked: code/CrossmapChimpRNA.out
Untracked: code/DiffSplice.err
Untracked: code/DiffSplice.out
Untracked: code/
Untracked: code/DiffSplicePlots.err
Untracked: code/DiffSplicePlots.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/DiffSplice_removebad.err
Untracked: code/DiffSplice_removebad.out
Untracked: code/
Untracked: code/FilterReverseLift.err
Untracked: code/FilterReverseLift.out
Untracked: code/FindIntronForDomPAS.err
Untracked: code/FindIntronForDomPAS.out
Untracked: code/
Untracked: code/GencodeDiffSplice.err
Untracked: code/GencodeDiffSplice.out
Untracked: code/
Untracked: code/
Untracked: code/HchromOrder.err
Untracked: code/HchromOrder.out
Untracked: code/JunctionLift.err
Untracked: code/JunctionLift.out
Untracked: code/JunctionLiftFinalChimp.err
Untracked: code/JunctionLiftFinalChimp.out
Untracked: code/
Untracked: code/Lift5perPASbed.err
Untracked: code/Lift5perPASbed.out
Untracked: code/LiftClustersFirst.err
Untracked: code/LiftClustersFirst.out
Untracked: code/LiftClustersFirst_remove.err
Untracked: code/LiftClustersFirst_remove.out
Untracked: code/LiftClustersSecond.err
Untracked: code/LiftClustersSecond.out
Untracked: code/LiftClustersSecond_remove.err
Untracked: code/LiftClustersSecond_remove.out
Untracked: code/
Untracked: code/
Untracked: code/LiftorthoPAS.err
Untracked: code/LiftorthoPASt.out
Untracked: code/Log.out
Untracked: code/MapBadSamples.err
Untracked: code/MapBadSamples.out
Untracked: code/
Untracked: code/MapStats.err
Untracked: code/MapStats.out
Untracked: code/MergeClusters.err
Untracked: code/MergeClusters.out
Untracked: code/
Untracked: code/PAS_ATTAAA.err
Untracked: code/PAS_ATTAAA.out
Untracked: code/
Untracked: code/PAS_sequence.err
Untracked: code/PAS_sequence.out
Untracked: code/
Untracked: code/QuantMergeClusters
Untracked: code/QuantMergeClusters.err
Untracked: code/QuantMergeClusters.out
Untracked: code/
Untracked: code/Rev_liftoverPAShg19to38.err
Untracked: code/Rev_liftoverPAShg19to38.out
Untracked: code/ReverseLiftFilter.R
Untracked: code/RunFixCluster.err
Untracked: code/RunFixCluster.out
Untracked: code/
Untracked: code/
Untracked: code/Snakefile
Untracked: code/SnakefilePAS
Untracked: code/SnakefilePASfilt
Untracked: code/SortIndexBadSamples.err
Untracked: code/SortIndexBadSamples.out
Untracked: code/
Untracked: code/TotalTranscriptDTplot.err
Untracked: code/TotalTranscriptDTplot.out
Untracked: code/
Untracked: code/apaQTLsnake.err
Untracked: code/apaQTLsnake.out
Untracked: code/apaQTLsnakePAS.err
Untracked: code/apaQTLsnakePAS.out
Untracked: code/apaQTLsnakePAShuman.err
Untracked: code/bam2junc.err
Untracked: code/bam2junc.out
Untracked: code/bam2junc_remove.err
Untracked: code/bam2junc_remove.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/buildIndecpantro5
Untracked: code/
Untracked: code/buildLeafviz.err
Untracked: code/buildLeafviz.out
Untracked: code/
Untracked: code/
Untracked: code/buildLeafviz_leafanno.err
Untracked: code/buildLeafviz_leafanno.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/chromOrder.err
Untracked: code/chromOrder.out
Untracked: code/classifyLeafviz.err
Untracked: code/classifyLeafviz.out
Untracked: code/
Untracked: code/cluster.json
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/clusterPAS.json
Untracked: code/clusterfiltPAS.json
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/environment.yaml
Untracked: code/extraSnakefiltpas
Untracked: code/filter5perc.R
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/filterJuncChroms.err
Untracked: code/filterJuncChroms.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/filterSortBedbyCleanedBed_gen.R
Untracked: code/generateStarIndex.err
Untracked: code/generateStarIndex.out
Untracked: code/generateStarIndexHuman.err
Untracked: code/generateStarIndexHuman.out
Untracked: code/
Untracked: code/hg19MapStats.err
Untracked: code/hg19MapStats.out
Untracked: code/
Untracked: code/
Untracked: code/humanFiles
Untracked: code/intersectAnno.err
Untracked: code/intersectAnno.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/liftoverPAShg19to38.err
Untracked: code/liftoverPAShg19to38.out
Untracked: code/log/
Untracked: code/maphg19.err
Untracked: code/maphg19.out
Untracked: code/
Untracked: code/maphg19_new.err
Untracked: code/maphg19_new.out
Untracked: code/maphg19_sub.err
Untracked: code/maphg19_sub.out
Untracked: code/
Untracked: code/merge.err
Untracked: code/
Untracked: code/
Untracked: code/mergeandsort_ChimpinHuman.err
Untracked: code/mergeandsort_ChimpinHuman.out
Untracked: code/
Untracked: code/mergedbam2bw.err
Untracked: code/mergedbam2bw.out
Untracked: code/
Untracked: code/
Untracked: code/nuclearTranscriptDTplot.err
Untracked: code/nuclearTranscriptDTplot.out
Untracked: code/
Untracked: code/overlapPAS.err
Untracked: code/overlapPAS.out
Untracked: code/
Untracked: code/
Untracked: code/pheno2countonly.R
Untracked: code/prepareAnnoLeafviz.err
Untracked: code/prepareAnnoLeafviz.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/primaryLift.err
Untracked: code/primaryLift.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/quantLiftedPAS.err
Untracked: code/quantLiftedPAS.out
Untracked: code/
Untracked: code/quatJunc.err
Untracked: code/quatJunc.out
Untracked: code/recChimpback2Human.err
Untracked: code/recChimpback2Human.out
Untracked: code/
Untracked: code/revLift.err
Untracked: code/revLift.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/runCountNucleotides.err
Untracked: code/runCountNucleotides.out
Untracked: code/
Untracked: code/runCountNucleotidesPantro6.err
Untracked: code/runCountNucleotidesPantro6.out
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/run_Chimpleafcutter_ds.err
Untracked: code/run_Chimpleafcutter_ds.out
Untracked: code/run_Chimpverifybam.err
Untracked: code/run_Chimpverifybam.out
Untracked: code/run_Humanleafcutter_ds.err
Untracked: code/run_Humanleafcutter_ds.out
Untracked: code/run_Nuclearleafcutter_ds.err
Untracked: code/run_Nuclearleafcutter_ds.out
Untracked: code/run_Totalleafcutter_ds.err
Untracked: code/run_Totalleafcutter_ds.out
Untracked: code/
Untracked: code/
Untracked: code/run_verifybam.err
Untracked: code/run_verifybam.out
Untracked: code/slurm-62824013.out
Untracked: code/slurm-62825841.out
Untracked: code/slurm-62826116.out
Untracked: code/slurm-64108209.out
Untracked: code/slurm-64108521.out
Untracked: code/slurm-64108557.out
Untracked: code/snakePASChimp.err
Untracked: code/snakePASChimp.out
Untracked: code/snakePAShuman.out
Untracked: code/snakemake.batch
Untracked: code/snakemakeChimp.err
Untracked: code/snakemakeChimp.out
Untracked: code/snakemakeHuman.err
Untracked: code/snakemakeHuman.out
Untracked: code/snakemakePAS.batch
Untracked: code/snakemakePASFiltChimp.err
Untracked: code/snakemakePASFiltChimp.out
Untracked: code/snakemakePASFiltHuman.err
Untracked: code/snakemakePASFiltHuman.out
Untracked: code/snakemakePASchimp.batch
Untracked: code/snakemakePAShuman.batch
Untracked: code/snakemake_chimp.batch
Untracked: code/snakemake_human.batch
Untracked: code/snakemakefiltPAS.batch
Untracked: code/test
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/
Untracked: code/verifybam4973.err
Untracked: code/verifybam4973.out
Untracked: code/verifybam4973HumanMap.err
Untracked: code/verifybam4973HumanMap.out
Untracked: code/wrap_Chimpverifybam.err
Untracked: code/wrap_Chimpverifybam.out
Untracked: code/
Untracked: code/
Untracked: code/wrap_verifybam.err
Untracked: code/wrap_verifybam.out
Untracked: code/
Untracked: data/._.DS_Store
Untracked: data/._HC_filenames.txt
Untracked: data/
Untracked: data/._HC_filenames.xlsx
Untracked: data/._MapPantro6_meta.txt
Untracked: data/
Untracked: data/._MapPantro6_meta.xlsx
Untracked: data/._OppositeSpeciesMap.txt
Untracked: data/
Untracked: data/._OppositeSpeciesMap.xlsx
Untracked: data/._RNASEQ_metadata.txt
Untracked: data/
Untracked: data/
Untracked: data/._RNASEQ_metadata_2Removed.txt
Untracked: data/
Untracked: data/._RNASEQ_metadata_2Removed.xlsx
Untracked: data/._RNASEQ_metadata_stranded.txt
Untracked: data/
Untracked: data/
Untracked: data/
Untracked: data/._RNASEQ_metadata_stranded.xlsx
Untracked: data/._metadata_HCpanel.txt
Untracked: data/
Untracked: data/
Untracked: data/
Untracked: data/._metadata_HCpanel.xlsx
Untracked: data/._metadata_HCpanel_frompantro5.xlsx
Untracked: data/._~$RNASEQ_metadata.xlsx
Untracked: data/._~$metadata_HCpanel.xlsx
Untracked: data/._.xlsx
Untracked: data/CompapaQTLpas/
Untracked: data/DTmatrix/
Untracked: data/DiffExpression/
Untracked: data/DiffIso_Nuclear/
Untracked: data/DiffIso_Total/
Untracked: data/DiffSplice/
Untracked: data/DiffSplice_liftedJunc/
Untracked: data/DiffSplice_removeBad/
Untracked: data/DominantPAS/
Untracked: data/EvalPantro5/
Untracked: data/HC_filenames.txt
Untracked: data/HC_filenames.xlsx
Untracked: data/Khan_prot/
Untracked: data/Li_eqtls/
Untracked: data/MapPantro6_meta.txt
Untracked: data/MapPantro6_meta.xlsx
Untracked: data/MapStats/
Untracked: data/NuclearHvC/
Untracked: data/OppositeSpeciesMap.txt
Untracked: data/OppositeSpeciesMap.xlsx
Untracked: data/PAS/
Untracked: data/Peaks_5perc/
Untracked: data/Pheno_5perc/
Untracked: data/Pheno_5perc_nuclear/
Untracked: data/Pheno_5perc_total/
Untracked: data/RNASEQ_metadata.txt
Untracked: data/RNASEQ_metadata_2Removed.txt
Untracked: data/RNASEQ_metadata_2Removed.xlsx
Untracked: data/RNASEQ_metadata_stranded.txt
Untracked: data/
Untracked: data/RNASEQ_metadata_stranded.xlsx
Untracked: data/SignalSites/
Untracked: data/TotalHvC/
Untracked: data/TwoBadSampleAnalysis/
Untracked: data/Wang_ribo/
Untracked: data/chainFiles/
Untracked: data/cleanPeaks_anno/
Untracked: data/cleanPeaks_byspecies/
Untracked: data/cleanPeaks_lifted/
Untracked: data/leafviz/
Untracked: data/liftover_files/
Untracked: data/metadata_HCpanel.txt
Untracked: data/metadata_HCpanel.xlsx
Untracked: data/metadata_HCpanel_frompantro5.txt
Untracked: data/metadata_HCpanel_frompantro5.xlsx
Untracked: data/primaryLift/
Untracked: data/reverseLift/
Untracked: data/~$RNASEQ_metadata.xlsx
Untracked: data/~$metadata_HCpanel.xlsx
Untracked: data/.xlsx
Untracked: output/dtPlots/
Untracked: projectNotes.Rmd
Unstaged changes:
Modified: analysis/OppositeMap.Rmd
Modified: analysis/annotationInfo.Rmd
Modified: analysis/investigatePantro5.Rmd
Modified: analysis/multiMap.Rmd
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.
File | Version | Author | Date | Message |
Rmd | 81336e4 | brimittleman | 2020-01-14 | add sig test |
html | 4c056b5 | brimittleman | 2020-01-09 | Build site. |
Rmd | abcafc4 | brimittleman | 2020-01-09 | wflow_publish("analysis/DominantPASintronLoc.Rmd") |
html | f34d597 | brimittleman | 2020-01-09 | Build site. |
Rmd | 07697b2 | brimittleman | 2020-01-09 | add distance distribution |
html | 6d9e369 | brimittleman | 2020-01-08 | Build site. |
Rmd | 203eae5 | brimittleman | 2020-01-08 | first steps for intron loc analysis |
This is workflowr version 1.5.0
Run ?workflowr for help getting started
── Attaching packages ─────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.1.1 ✔ purrr 0.3.2
✔ tibble 2.1.1 ✔ dplyr
✔ tidyr 0.8.3 ✔ stringr 1.3.1
✔ readr 1.3.1 ✔ forcats 0.3.0
── Conflicts ────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
The goal of this analysis is to compare the distribution of intronic locations between dominant PAS that are shared across species and those that are not. This will help me determine whether the pattern is due to annotation. In this analysis I will use the nuclear results.
The first step is assigning each PAS to the intron it comes from.
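# Genes where the human dominant PAS is intronic and the chimp dominant PAS is in the 3' UTR
HumanIntronicChimpUTR=read.table("../data/DominantPAS/Nuclear_HumanIntronicChimpUTR.txt", header = TRUE, stringsAsFactors = FALSE) %>% dplyr::select(gene, HumanPAS, HumanMean)
# Genes where the same intronic PAS is dominant
SameDomIntron=read.table("../data/DominantPAS/SameDominantIntronic.txt", header = TRUE, stringsAsFactors = FALSE) %>% dplyr::select(gene, HumanPAS, HumanMean)
# Human PAS coordinates from the 5perc phenotype file; rename PAS so it can be joined on HumanPAS
HumanPAS=read.table("../data/Pheno_5perc/Human_Pheno_5perc.txt", header = TRUE, stringsAsFactors = FALSE) %>% dplyr::select(chr, start, end, gene, strand, PAS) %>% dplyr::rename(HumanPAS = PAS)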
Subset the PAS file to the PAS in each set and select the columns in BED order (chr, start, end, name, score, strand) for overlap with the intron file. I will use the human mean as the score for now.
HumanPAS_samedom=HumanPAS %>% inner_join(SameDomIntron, by="HumanPAS") %>% mutate(GenePAS=paste(gene.x, HumanPAS, sep="_")) %>% dplyr::select(chr, start, end, GenePAS, HumanMean,strand)
HumanPAS_diffDom=HumanPAS %>% inner_join(HumanIntronicChimpUTR, by="HumanPAS") %>% mutate(GenePAS=paste(gene.x, HumanPAS, sep="_")) %>% dplyr::select(chr, start, end, GenePAS, HumanMean,strand)
I can write these out as bed files.
write.table(HumanPAS_samedom, "../data/DominantPAS/SameDominantPAS_intronic.bed", quote = F, row.names = F, col.names = F,sep = "\t")
write.table(HumanPAS_diffDom, "../data/DominantPAS/DifferentDominantPAS_intronic.bed", quote = F, row.names = F, col.names = F,sep = "\t")
I can use bedtools intersect to find the intron these are in.
bedtools intersect -s -sorted -loj -a <PAS bed file> -b <intron bed file> > <output file>
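`bedtools intersect -sorted` expects both input files to be position-sorted (chromosome, then start) with the same chromosome order. The sketch below shows one way a PAS table could be sorted before it is written out. The toy data frame `pas_bed` and the temporary output path are stand-ins I made up, not files from this project. `method = "radix"` is used so that chromosome names sort in C-locale order, which should match `LC_ALL=C sort -k1,1 -k2,2n` on the intron file.

```r
# Toy BED-like table standing in for HumanPAS_samedom (all values are made up)
pas_bed <- data.frame(
  chr = c("chr2", "chr10", "chr1", "chr1"),
  start = c(500L, 40L, 900L, 100L),
  end = c(700L, 240L, 1100L, 300L),
  GenePAS = c("geneC_pas4", "geneB_pas3", "geneA_pas2", "geneA_pas1"),
  HumanMean = c(0.4, 0.5, 0.7, 0.6),
  strand = c("+", "+", "-", "-"),
  stringsAsFactors = FALSE
)

# Sort by chromosome, then start; radix ordering sorts strings in C-locale order
pas_sorted <- pas_bed[order(pas_bed$chr, pas_bed$start, method = "radix"), ]

# Write a headerless, tab-separated BED file (temporary path, for illustration only)
out_file <- tempfile(fileext = ".bed")
write.table(pas_sorted, out_file, quote = FALSE, row.names = FALSE,
            col.names = FALSE, sep = "\t")

print(pas_sorted$GenePAS)
```

Storing the coordinates as integers also keeps `write.table` from printing large positions in scientific notation.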
In some places multiple transcripts include the same intron, so the same PAS shows up multiple times (I can look for unique intron locations to take care of this).
SameDomIntron=read.table("../data/DominantPAS/SameDominantPAS_intronic_mapped2Intron.txt", stringsAsFactors = F, col.names = c("PASchr", "PASstart", "PASend", "PASname", "PASusage", "PASstrand", "Intronchr", "IntronStart", "IntronEnd", "IntronName", "IntronScore", "IntronStrand"))
# group by PAS name and keep the first intron hit for each PAS
SameDomIntronOne=SameDomIntron %>% group_by(PASname) %>% slice(1) %>% ungroup()
DiffDomIntron=read.table("../data/DominantPAS/DifferentDominantPAS_intronic_mapped2Intron.txt", stringsAsFactors = F, col.names = c("PASchr", "PASstart", "PASend", "PASname", "PASusage", "PASstrand", "Intronchr", "IntronStart", "IntronEnd", "IntronName", "IntronScore", "IntronStrand"))
DiffDomIntronOne=DiffDomIntron %>% group_by(PASname) %>% slice(1) %>% ungroup()
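As a check on how much the "keep the first hit" step matters, the sketch below collapses hits to unique intron coordinates per PAS and lists the PAS that still map to more than one distinct intron. It uses base R on a made-up `hits` table whose column names mirror the ones above; the values and transcript names are hypothetical.

```r
# Toy intersect output: geneA_pas1 hits the same intron in two transcripts
# and a longer, different intron in a third (all values are made up)
hits <- data.frame(
  PASname = c("geneA_pas1", "geneA_pas1", "geneA_pas1", "geneB_pas2"),
  IntronStart = c(100L, 100L, 50L, 900L),
  IntronEnd = c(1000L, 1000L, 1000L, 2000L),
  IntronName = c("tx1", "tx2", "tx3", "tx9"),
  stringsAsFactors = FALSE
)

# Keep one row per unique (PAS, intron start, intron end) combination
key_cols <- c("PASname", "IntronStart", "IntronEnd")
uniq_hits <- hits[!duplicated(hits[, key_cols]), ]

# Count distinct introns per PAS and flag PAS with more than one
n_introns <- table(uniq_hits$PASname)
ambiguous <- names(n_introns)[n_introns > 1]

print(uniq_hits)
print(ambiguous)
```

If `ambiguous` is empty on the real data, taking the first row per PAS loses nothing; otherwise the choice of intron affects the distance for those PAS.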
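Now I need to find the midpoint of each PAS (taken as PASstart + 100 in the code below) and its distance from the start of the intron, both as an absolute distance and as a proportion of the intron length. I will use the intron strand because this is the genomic strand.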
SameDomIntronOne_dist=SameDomIntronOne %>% mutate(centerPAS=PASstart +100, intronLength=IntronEnd-IntronStart, distance2PAS=ifelse(IntronStrand=="+", centerPAS -IntronStart, IntronEnd-centerPAS),propIntron=distance2PAS/intronLength)
DiffDomIntronOne_dist=DiffDomIntronOne %>% mutate(centerPAS=PASstart +100, intronLength=IntronEnd-IntronStart, distance2PAS=ifelse(IntronStrand=="+", centerPAS -IntronStart, IntronEnd-centerPAS),propIntron=distance2PAS/intronLength)
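The position calculation can be wrapped in a small function so that both sets go through exactly the same code. This is only a sketch: `intron_position()` and its `half_width` argument are names I introduced, and the default of 100 simply mirrors the `PASstart + 100` used above.

```r
# Compute the PAS midpoint, intron length, strand-aware distance from the
# intron start, and that distance as a proportion of the intron length.
# half_width = 100 mirrors the PASstart + 100 midpoint used in the analysis.
intron_position <- function(pas_start, intron_start, intron_end, strand,
                            half_width = 100) {
  center <- pas_start + half_width
  intron_length <- intron_end - intron_start
  # On the + strand the intron starts at IntronStart; on the - strand at IntronEnd
  dist <- ifelse(strand == "+", center - intron_start, intron_end - center)
  data.frame(
    centerPAS = center,
    intronLength = intron_length,
    distance2PAS = dist,
    propIntron = dist / intron_length
  )
}

# Worked example: the same coordinates on each strand (made-up values)
ex <- intron_position(
  pas_start = c(1100, 1100),
  intron_start = c(1000, 1000),
  intron_end = c(2000, 2000),
  strand = c("+", "-")
)
print(ex)
# + strand: midpoint 1200, distance 200, proportion 0.2
# - strand: midpoint 1200, distance 800, proportion 0.8
```

With the real tables this could be called as `intron_position(SameDomIntronOne$PASstart, SameDomIntronOne$IntronStart, SameDomIntronOne$IntronEnd, SameDomIntronOne$IntronStrand)`. Proportions outside 0 to 1 would point to PAS windows that extend past an intron boundary or to rows without a real intron match.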
Plot both distributions, as absolute distance and as proportion of the intron.
ggplot(SameDomIntronOne_dist, aes(x=distance2PAS)) + geom_histogram(bins=100) + labs(x="Distance from intron start to PAS", title="Same Dominant Intron, absolute distance")
ggplot(SameDomIntronOne_dist, aes(x=propIntron)) + geom_histogram(bins=100) +labs(x="Proportion of intron", title="Same Dominant Intron, proportion of intron")
SameDomIntronOne_dist_filt= SameDomIntronOne_dist %>% filter(distance2PAS<=100000)
ggplot(SameDomIntronOne_dist_filt, aes(x=distance2PAS)) + geom_histogram(bins=100) +labs(x="Distance from intron start to PAS", title="Same Dominant Intron, absolute distance (less than 100kb)")
ggplot(DiffDomIntronOne_dist, aes(x=distance2PAS)) + geom_histogram(bins=100)+ labs(x="Distance from intron start to PAS", title="Human Dominant Intronic, Chimp Dominant 3' UTR, absolute distance")
ggplot(DiffDomIntronOne_dist, aes(x=propIntron)) + geom_histogram(bins=100) +labs(x="Proportion of intron", title="Human Dominant Intronic, Chimp Dominant 3' UTR, Proportion of intron")
DiffDomIntronOne_dist_filt= DiffDomIntronOne_dist %>% filter(distance2PAS<=100000)
ggplot(DiffDomIntronOne_dist_filt, aes(x=distance2PAS)) + geom_histogram(bins=100) + labs(x="Distance from intron start to PAS", title="Human Dominant Intronic, Chimp Dominant 3' UTR \nabsolute distance (less than 100kb)")
Distributions look pretty similar.
ggplot(DiffDomIntronOne_dist, aes(x=propIntron)) + stat_ecdf(aes(colour="Different Dom"), geom = "step") + stat_ecdf(data=SameDomIntronOne_dist, aes(colour="Same Dom"), geom = "step") + scale_colour_manual(name = 'Intron Set', values = c('Different Dom'='red', 'Same Dom'='blue')) + labs(x="PAS location by proportion of intron", title = "ecdf for Intronic location, red=Different Dom, blue=Same dom")
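The ECDF comparison can also be summarized numerically. As an addition that is not in the original analysis, the sketch below evaluates both empirical CDFs with base R's `ecdf()` and reports the largest vertical gap between them. The two short vectors are made-up proportions, not project data.

```r
# Made-up proportions of intron for the two sets
same_dom <- c(0.1, 0.2, 0.4, 0.8)
diff_dom <- c(0.3, 0.5, 0.6, 0.9)

# Empirical CDFs as step functions
F_same <- ecdf(same_dom)
F_diff <- ecdf(diff_dom)

# Evaluate both at every observed value and take the largest vertical gap
grid <- sort(unique(c(same_dom, diff_dom)))
gap <- abs(F_same(grid) - F_diff(grid))
max_gap <- max(gap)
where_max <- grid[which.max(gap)]

print(max_gap)    # 0.5 for these toy values
print(where_max)  # first grid point where the gap is largest: 0.2
```

A small maximum gap on the real data would agree with the visual impression that the two curves nearly overlap.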
wilcox.test(DiffDomIntronOne_dist$propIntron, SameDomIntronOne_dist$propIntron)
Wilcoxon rank sum test with continuity correction
data: DiffDomIntronOne_dist$propIntron and SameDomIntronOne_dist$propIntron
W = 949110, p-value = 0.04153
alternative hypothesis: true location shift is not equal to 0
wilcox.test(DiffDomIntronOne_dist$distance2PAS, SameDomIntronOne_dist$distance2PAS)
Wilcoxon rank sum test with continuity correction
data: DiffDomIntronOne_dist$distance2PAS and SameDomIntronOne_dist$distance2PAS
W = 887280, p-value = 0.3558
alternative hypothesis: true location shift is not equal to 0
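The difference in proportion of intron is nominally significant (p = 0.042), but the difference in absolute distance is not (p = 0.36), so there is no strong evidence that these distributions differ.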
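When reading the two tests it can help to look at group sizes and medians next to the p-value. The sketch below does this on simulated values. The vectors `same_dom` and `diff_dom` are random stand-ins, not the project data, so the numbers it prints say nothing about the real result.

```r
# Simulated proportions standing in for the two propIntron vectors
set.seed(20190902)
same_dom <- runif(300)
diff_dom <- rbeta(200, 2, 2.5)

# Two-sided Wilcoxon rank sum test, as in the analysis above
res <- wilcox.test(diff_dom, same_dom)

# Group sizes and medians to read alongside the p-value
summary_tbl <- data.frame(
  set = c("different dominant", "same dominant"),
  n = c(length(diff_dom), length(same_dom)),
  median = c(median(diff_dom), median(same_dom)),
  stringsAsFactors = FALSE
)

print(summary_tbl)
print(res$p.value)
```

On the real data, `diff_dom` and `same_dom` would be replaced by `DiffDomIntronOne_dist$propIntron` and `SameDomIntronOne_dist$propIntron`.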
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Scientific Linux 7.4 (Nitrogen)
Matrix products: default
BLAS/LAPACK: /software/openblas-0.2.19-el7-x86_64/lib/
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] forcats_0.3.0 stringr_1.3.1 dplyr_0.8.0.1 purrr_0.3.2
[5] readr_1.3.1 tidyr_0.8.3 tibble_2.1.1 ggplot2_3.1.1
[9] tidyverse_1.2.1 workflowr_1.5.0
loaded via a namespace (and not attached):
[1] Rcpp_1.0.2 cellranger_1.1.0 plyr_1.8.4 compiler_3.5.1
[5] pillar_1.3.1 later_0.7.5 git2r_0.26.1 tools_3.5.1
[9] digest_0.6.18 lubridate_1.7.4 jsonlite_1.6 evaluate_0.12
[13] nlme_3.1-137 gtable_0.2.0 lattice_0.20-38 pkgconfig_2.0.2
[17] rlang_0.4.0 cli_1.1.0 rstudioapi_0.10 yaml_2.2.0
[21] haven_1.1.2 withr_2.1.2 xml2_1.2.0 httr_1.3.1
[25] knitr_1.20 hms_0.4.2 generics_0.0.2 fs_1.3.1
[29] rprojroot_1.3-2 grid_3.5.1 tidyselect_0.2.5 glue_1.3.0
[33] R6_2.3.0 readxl_1.1.0 rmarkdown_1.10 modelr_0.1.2
[37] magrittr_1.5 whisker_0.3-2 backports_1.1.2 scales_1.0.0
[41] promises_1.0.1 htmltools_0.3.6 rvest_0.3.2 assertthat_0.2.0
[45] colorspace_1.3-2 httpuv_1.4.5 labeling_0.3 stringi_1.2.4
[49] lazyeval_0.2.1 munsell_0.5.0 broom_0.5.1 crayon_1.3.4