Last updated: 2023-05-11

Knit directory: NMD-analysis/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20230314) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 6025d0f. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    analysis/Differential-transcript-usage.nb.html
    Ignored:    analysis/Enichment-analysis-fgsea.nb.html
    Ignored:    analysis/Enichment-analysis-goseq.nb.html

Untracked files:
    Untracked:  PCA.png
    Untracked:  PCA_plot.pdf
    Untracked:  PCA_transcript.png
    Untracked:  analysis/Differential-transcript-usage.Rmd
    Untracked:  analysis/UPF3B_KD.Rmd
    Untracked:  analysis/transcript-preprocessing.Rmd
    Untracked:  code/eisaR.R
    Untracked:  code/external_code/
    Untracked:  data/LTK_Sample Metafile_V3.txt
    Untracked:  data/Mus_musculus.GRCm39.105__nifs.tsv
    Untracked:  data/data.txt
    Untracked:  data/data2.txt
    Untracked:  data/fastqc/
    Untracked:  data/nif_output/
    Untracked:  data/samples.txt
    Untracked:  output/DEG-limma-results.Rda
    Untracked:  output/DEG-list.Rda
    Untracked:  output/DEG/
    Untracked:  output/EISA/
    Untracked:  output/ISAR/
    Untracked:  output/QC/
    Untracked:  output/Transcript/
    Untracked:  output/isoformSwitchAnalyzeR_isoform_AA_complete.fasta
    Untracked:  output/isoformSwitchAnalyzeR_isoform_AA_subset_1_of_3.fasta
    Untracked:  output/isoformSwitchAnalyzeR_isoform_AA_subset_2_of_3.fasta
    Untracked:  output/isoformSwitchAnalyzeR_isoform_AA_subset_3_of_3.fasta
    Untracked:  output/isoformSwitchAnalyzeR_isoform_nt.fasta
    Untracked:  output/limma-matrices.Rda
    Untracked:  tmp/

Unstaged changes:
    Modified:   analysis/RNA-stability.Rmd
    Modified:   analysis/_site.yml
    Modified:   code/functions.R
    Modified:   code/libraries.R

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/Enichment-analysis-goseq.Rmd) and HTML (docs/Enichment-analysis-goseq.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 6025d0f unawaz1996 2023-05-11 wflow_publish(c("analysis/index.Rmd", "analysis/Differential-transcript-expression.Rmd",
html 650317e unawaz1996 2023-03-22 Build site.
Rmd f2ed11a unawaz1996 2023-03-22 wflow_publish(c("analysis/index.Rmd", "analysis/Enichment-analysis-fgsea.Rmd",
html 1f4ee28 unawaz1996 2023-03-22 Build site.
Rmd 917d8df unawaz1996 2023-03-22 wflow_publish(c("analysis/index.Rmd", "analysis/Enichment-analysis-fgsea.Rmd",
html 0e23a3b unawaz1996 2023-03-17 Build site.
Rmd 84be588 unawaz1996 2023-03-17 Enrichment analysis

In this workbook, we are testing for enrichment within discrete sets of DE genes as defined in the DEG analysis.

In order to do this, we will be using goseq, an R package that performs enrichment analysis whilst taking length bias into account. It does so by calculating a Probability Weighting Function (PWF), which gives the probability that a gene will be differentially expressed based on length alone.

The PWF is calculated by fitting a monotonic spline to the binary data series of differential expression (1 = DE, 0 = not DE) as a function of gene length. The PWF is used to weight the chance of selecting each gene when forming a null distribution for GO category membership. Because the PWF is calculated directly from the dataset under consideration, the approach is robust: it corrects only for the length bias actually present in the data.
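The length bias that the PWF captures can be illustrated with simulated data. The sketch below is not goseq's implementation: it uses base R's isoreg() (isotonic regression) as a simple stand-in for the monotonic spline, and the gene lengths, DE probabilities and variable names are all invented for illustration.

```r
# Simulated data only: lengths and DE calls are invented for illustration.
set.seed(1)
n_genes <- 2000
gene_length <- sort(sample(200:20000, n_genes))  # unique lengths (bp), increasing

# Build in a length bias: longer genes are more likely to be called DE.
p_de <- plogis(-3 + (log(gene_length) - log(2000)))
is_de <- rbinom(n_genes, size = 1, prob = p_de)  # 1 = DE, 0 = not DE

# Monotone (non-decreasing) fit of DE status against gene length.
# isoreg() stands in for goseq's monotonic spline here.
fit <- isoreg(gene_length, is_de)
pwf_sketch <- fit$yf  # fitted P(DE | length), aligned with the sorted lengths

# Observed proportion of DE genes in five length-ordered bins.
length_bin <- cut(seq_len(n_genes), breaks = 5, labels = paste0("bin", 1:5))
prop_de_by_bin <- tapply(is_de, length_bin, mean)
print(round(prop_de_by_bin, 3))
```

If the proportion of DE genes rises across the bins, a test that ignores length would tend to favour categories made up of long genes, which is the bias the PWF is meant to correct.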

Results can be interpreted as follows:

  • over-represented p-value: over-representation in this analysis means that there are more DE genes in the category than expected. The p-value is the probability of observing at least this number of DE genes in the category by chance
  • numDEinCat: number of DE genes in the category
  • Expected: expected number of genes in the category
  • adjP: Bonferroni adjustment of the over-represented p-value
  • FDR: FDR adjustment of the over-represented p-value
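As a rough guide to how these columns relate, the sketch below computes an expected count and an over-representation p-value for three hypothetical categories using the plain hypergeometric distribution, i.e. without the length-bias correction described above, and then adjusts the p-values. All counts are invented, and reading "FDR" as the Benjamini-Hochberg method is an assumption.

```r
# Invented counts for illustration only.
n_genes_tested <- 10000   # genes in the background
n_de <- 500               # DE genes overall
res <- data.frame(
  category   = c("set_A", "set_B", "set_C"),   # hypothetical gene sets
  numInCat   = c(100, 50, 200),
  numDEinCat = c(15, 3, 10)
)

# Expected number of DE genes per category if DE genes were spread at random.
res$Expected <- res$numInCat * n_de / n_genes_tested

# P(X >= numDEinCat) under the hypergeometric distribution.
# NOTE: no length-bias correction here, unlike a PWF-weighted test.
res$over_represented_pvalue <- phyper(
  res$numDEinCat - 1,
  m = res$numInCat,
  n = n_genes_tested - res$numInCat,
  k = n_de,
  lower.tail = FALSE
)

# Multiple-testing adjustments across all tested categories.
res$adjP <- p.adjust(res$over_represented_pvalue, method = "bonferroni")
res$FDR  <- p.adjust(res$over_represented_pvalue, method = "BH")
print(res)
```

The Bonferroni value is never smaller than the Benjamini-Hochberg value for the same category, so adjP is the more conservative of the two columns.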

Databases used for testing

Data was sourced using the msigdbr package.

Hallmark gene sets

Mappings from gene to pathway were required, and Ensembl identifiers were used for this mapping. A total of 4,391 Ensembl IDs were mapped to pathways from the Hallmark set.
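goseq accepts the gene-to-category mapping as a named list keyed by gene ID. A minimal sketch of building such a list from a long-format table follows. The toy table and its gene IDs are placeholders, and the column names gs_name and ensembl_gene are assumed to match what msigdbr 7.5.1 returns.

```r
# Toy stand-in for a table such as the one returned by
# msigdbr::msigdbr(species = "Mus musculus", category = "H") (assumed call).
hallmark_tbl <- data.frame(
  gs_name = c("HALLMARK_HYPOXIA", "HALLMARK_APOPTOSIS",
              "HALLMARK_HYPOXIA", "HALLMARK_APOPTOSIS", "HALLMARK_HYPOXIA"),
  ensembl_gene = c("gene1", "gene1", "gene2", "gene3", "gene3"),  # placeholder IDs
  stringsAsFactors = FALSE
)

# Drop duplicate gene/pathway pairs, then split pathway names by gene.
hallmark_tbl <- unique(hallmark_tbl[, c("gs_name", "ensembl_gene")])
gene2cat <- split(hallmark_tbl$gs_name, hallmark_tbl$ensembl_gene)

# Number of distinct gene IDs mapped to at least one pathway.
n_mapped <- length(gene2cat)
print(n_mapped)
```

Counting the names of this list is one way a figure like the number of mapped Ensembl IDs could be obtained.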

C2 gene set

The same mapping process was applied to gene sets from the C2 collection. For this analysis, only WikiPathways, KEGG and Reactome gene sets were retrieved, and a total of 11,774 Ensembl IDs were mapped to C2 gene sets.

Gene Ontology Gene set

For the analysis of gene sets from the GO database, gene sets were restricted to those with 3 or more steps back to the ontology root terms.
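A filter of this kind might look like the following sketch. The term IDs, path lengths and gene-to-term mapping are invented, and using the shortest path as the measure of "steps back to the root" is an assumption, since the text does not say which path length was used.

```r
# Hypothetical GO summary table; term IDs and path lengths are invented.
go_summary <- data.frame(
  id = c("term1", "term2", "term3", "term4"),
  shortest_path = c(1, 2, 3, 5),
  stringsAsFactors = FALSE
)

# Hypothetical gene-to-term mapping.
gene2go <- list(
  gene1 = c("term1", "term3"),
  gene2 = c("term2"),
  gene3 = c("term3", "term4")
)

# Keep terms that sit at least three steps from the root.
min_steps <- 3
keep_terms <- go_summary$id[go_summary$shortest_path >= min_steps]

# Restrict each gene's terms to the retained set, then drop genes left empty.
gene2go_filtered <- lapply(gene2go, intersect, keep_terms)
gene2go_filtered <- gene2go_filtered[lengths(gene2go_filtered) > 0]
print(gene2go_filtered)
```

Dropping terms close to the root removes very broad categories, which tend to be large and uninformative.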

Enrichment in the DE Gene Set

The first step of analysis using goseq, regardless of the gene set, is estimation of the probability weighting function (PWF), which quantifies the probability of a gene being considered as DE based on a single covariate (here, gene length).
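The two goseq steps can be sketched on simulated inputs as follows. This is an illustration under stated assumptions rather than the code used for this analysis: the gene IDs, lengths, DE calls and gene sets are all invented, and the goseq calls are skipped if the package is not installed.

```r
# Simulated inputs; gene IDs, lengths, DE calls and gene sets are invented.
set.seed(1)
gene_ids <- paste0("gene", 1:1000)
gene_lengths <- sample(200:20000, 1000)

# Named 0/1 vector of DE calls, with longer genes more likely to be DE.
de_calls <- setNames(
  rbinom(1000, size = 1, prob = plogis(-2 + log(gene_lengths / 2000))),
  gene_ids
)

# Each gene is assigned to one of three hypothetical gene sets.
gene2cat <- setNames(
  lapply(seq_along(gene_ids), function(i) sample(c("set_A", "set_B", "set_C"), 1)),
  gene_ids
)

if (requireNamespace("goseq", quietly = TRUE)) {
  # Step 1: fit the PWF using gene length as the bias covariate.
  pwf <- goseq::nullp(de_calls, bias.data = gene_lengths, plot.fit = FALSE)
  # Step 2: test each category, weighting genes by the PWF.
  res <- goseq::goseq(pwf, gene2cat = gene2cat)
  print(head(res))
}
```

Supplying bias.data directly avoids relying on goseq's built-in length databases, which is convenient when lengths come from the same annotation used for quantification.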

GO terms

For gene ontology analysis, we will be using the GO summaries method. Essentially, this analysis involves creating a graph for each ontology and removing the node "all", as this is redundant. For each GO term, we get the ontology it belongs to, the shortest path back to the root node, the longest path back to the root node, and whether the GO term is a terminal node.
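To make these per-term summaries concrete, the sketch below computes them on a small hand-made graph in base R. The graph, its term names and the helper path_to_root() are hypothetical; this is not the code used to generate the GO summaries in this analysis.

```r
# Toy ontology: each term lists its parent terms; "root" has none.
parents <- list(
  root = character(0),
  A = "root",
  B = "root",
  C = "A",
  D = c("C", "B"),
  E = "D"
)

# Steps from a term back to the root, taking either the shortest (min)
# or the longest (max) route through its parents.
path_to_root <- function(term, summarise) {
  if (length(parents[[term]]) == 0) return(0)  # the root itself
  1 + summarise(vapply(parents[[term]], path_to_root, numeric(1),
                       summarise = summarise))
}

terms <- setdiff(names(parents), "root")
go_summary <- data.frame(
  id = terms,
  shortest_path = unname(vapply(terms, path_to_root, numeric(1), summarise = min)),
  longest_path  = unname(vapply(terms, path_to_root, numeric(1), summarise = max)),
  # A terminal node is a term that is not the parent of any other term.
  terminal_node = !terms %in% unlist(parents),
  stringsAsFactors = FALSE
)
print(go_summary)
```

Term D shows why both path lengths are recorded: it reaches the root in two steps via B but in three steps via C and A.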

UPF3B KD vs Controls

UPF3A KD vs Controls

Double KD

UPF3A OE

UPF3A OE UPF3B KD

Summary plot of gene ontology terms

C2 database

  1. UPF3B KD vs Controls
  2. UPF3A KD vs Controls
  3. Double KDs vs Controls
  4. UPF3A OE vs Controls
  5. UPF3A OE UPF3B KD vs Controls

Results summary

Overlap of terms between enrichment analyses of different groups

Hallmark datasets

  1. UPF3B KD vs Controls
  2. UPF3A KD vs Controls
  3. Double KD vs Controls
  4. UPF3A OE vs Controls
  5. UPF3A OE UPF3B KD vs Controls

Results summary

  • Overlap of Hallmark gene sets

  • Summary plots

Shared transcripts between the KDs


R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
 [1] grid      stats4    tools     stats     graphics  grDevices utils    
 [8] datasets  methods   base     

other attached packages:
 [1] IsoformSwitchAnalyzeR_2.01.04 pfamAnalyzeR_0.99.0          
 [3] sva_3.46.0                    genefilter_1.80.3            
 [5] mgcv_1.8-42                   nlme_3.1-162                 
 [7] satuRn_1.6.0                  DEXSeq_1.44.0                
 [9] BiocParallel_1.32.6           ggrepel_0.9.3                
[11] pander_0.6.5                  msigdbr_7.5.1                
[13] cowplot_1.1.1                 ngsReports_2.0.3             
[15] patchwork_1.1.2               VennDiagram_1.7.3            
[17] futile.logger_1.4.3           UpSetR_1.4.0                 
[19] fgsea_1.24.0                  GOplot_1.0.2                 
[21] RColorBrewer_1.1-3            gridExtra_2.3                
[23] ggdendro_0.1.23               AnnotationHub_3.6.0          
[25] BiocFileCache_2.6.1           dbplyr_2.3.2                 
[27] openxlsx_4.2.5.2              ggiraph_0.8.7                
[29] wasabi_1.0.1                  sleuth_0.30.1                
[31] DT_0.27                       VennDetail_1.14.0            
[33] msigdb_1.6.0                  GSEABase_1.60.0              
[35] graph_1.76.0                  annotate_1.76.0              
[37] XML_3.99-0.14                 pheatmap_1.0.12              
[39] ggvenn_0.1.10                 MetBrewer_0.2.0              
[41] ggpubr_0.6.0                  venn_1.11                    
[43] viridis_0.6.2                 viridisLite_0.4.1            
[45] tximeta_1.16.1                tximport_1.26.1              
[47] goseq_1.50.0                  geneLenDataBase_1.34.0       
[49] BiasedUrn_2.0.9               org.Mm.eg.db_3.16.0          
[51] EnsDb.Mmusculus.v79_2.99.0    ensembldb_2.22.0             
[53] AnnotationFilter_1.22.0       GenomicFeatures_1.50.4       
[55] AnnotationDbi_1.60.2          biomaRt_2.54.1               
[57] edgeR_3.40.2                  limma_3.54.2                 
[59] DESeq2_1.38.3                 SummarizedExperiment_1.28.0  
[61] Biobase_2.58.0                MatrixGenerics_1.10.0        
[63] matrixStats_0.63.0            GenomicRanges_1.50.2         
[65] GenomeInfoDb_1.34.9           IRanges_2.32.0               
[67] S4Vectors_0.36.2              BiocGenerics_0.44.0          
[69] corrplot_0.92                 lubridate_1.9.2              
[71] forcats_1.0.0                 purrr_1.0.1                  
[73] readr_2.1.4                   tidyverse_2.0.0              
[75] stringr_1.5.0                 tidyr_1.3.0                  
[77] scales_1.2.1                  data.table_1.14.8            
[79] readxl_1.4.2                  tibble_3.2.1                 
[81] magrittr_2.0.3                reshape2_1.4.4               
[83] ggplot2_3.4.2                 dplyr_1.1.1                  
[85] workflowr_1.7.0              

loaded via a namespace (and not attached):
  [1] rappdirs_0.3.3                rtracklayer_1.58.0           
  [3] bit64_4.0.5                   knitr_1.42                   
  [5] DelayedArray_0.24.0           hwriter_1.3.2.1              
  [7] KEGGREST_1.38.0               RCurl_1.98-1.12              
  [9] generics_0.1.3                callr_3.7.3                  
 [11] lambda.r_1.2.4                RSQLite_2.3.1                
 [13] bit_4.0.5                     tzdb_0.3.0                   
 [15] xml2_1.3.3                    httpuv_1.6.9                 
 [17] xfun_0.38                     hms_1.1.3                    
 [19] jquerylib_0.1.4               babelgene_22.9               
 [21] evaluate_0.20                 promises_1.2.0.1             
 [23] fansi_1.0.4                   restfulr_0.0.15              
 [25] progress_1.2.2                DBI_1.1.3                    
 [27] geneplotter_1.76.0            htmlwidgets_1.6.2            
 [29] ellipsis_0.3.2                crosstalk_1.2.0              
 [31] backports_1.4.1               locfdr_1.1-8                 
 [33] vctrs_0.6.1                   abind_1.4-5                  
 [35] cachem_1.0.7                  withr_2.5.0                  
 [37] BSgenome_1.66.3               GenomicAlignments_1.34.1     
 [39] prettyunits_1.1.1             lazyeval_0.2.2               
 [41] crayon_1.5.2                  labeling_0.4.2               
 [43] pkgconfig_2.0.3               ProtGenerics_1.30.0          
 [45] rlang_1.1.0                   lifecycle_1.0.3              
 [47] filelock_1.0.2                cellranger_1.1.0             
 [49] rprojroot_2.0.3               Matrix_1.5-3                 
 [51] carData_3.0-5                 boot_1.3-28.1                
 [53] Rhdf5lib_1.20.0               zoo_1.8-11                   
 [55] whisker_0.4.1                 processx_3.8.0               
 [57] png_0.1-8                     rjson_0.2.21                 
 [59] bitops_1.0-7                  getPass_0.2-2                
 [61] rhdf5filters_1.10.1           Biostrings_2.66.0            
 [63] blob_1.2.4                    rstatix_0.7.2                
 [65] ggsignif_0.6.4                memoise_2.0.1                
 [67] plyr_1.8.8                    zlibbioc_1.44.0              
 [69] compiler_4.2.2                BiocIO_1.8.0                 
 [71] Rsamtools_2.14.0              cli_3.6.1                    
 [73] XVector_0.38.0                pbapply_1.7-0                
 [75] ps_1.7.4                      formatR_1.14                 
 [77] MASS_7.3-58.3                 tidyselect_1.2.0             
 [79] stringi_1.7.12                highr_0.10                   
 [81] yaml_2.3.7                    locfit_1.5-9.7               
 [83] sass_0.4.5                    fastmatch_1.1-3              
 [85] timechange_0.2.0              parallel_4.2.2               
 [87] rstudioapi_0.14               uuid_1.1-0                   
 [89] git2r_0.31.0                  farver_2.1.1                 
 [91] digest_0.6.31                 BiocManager_1.30.20          
 [93] shiny_1.7.4                   Rcpp_1.0.10                  
 [95] car_3.1-2                     broom_1.0.4                  
 [97] BiocVersion_3.16.0            later_1.3.0                  
 [99] httr_1.4.5                    colorspace_2.1-0             
[101] fs_1.6.1                      splines_4.2.2                
[103] statmod_1.5.0                 plotly_4.10.1                
[105] systemfonts_1.0.4             xtable_1.8-4                 
[107] jsonlite_1.8.4                futile.options_1.0.1         
[109] R6_2.5.1                      pillar_1.9.0                 
[111] htmltools_0.5.5               mime_0.12                    
[113] glue_1.6.2                    fastmap_1.1.1                
[115] interactiveDisplayBase_1.36.0 codetools_0.2-19             
[117] utf8_1.2.3                    lattice_0.20-45              
[119] bslib_0.4.2                   curl_5.0.0                   
[121] zip_2.2.2                     GO.db_3.16.0                 
[123] survival_3.5-5                admisc_0.31                  
[125] rmarkdown_2.21                munsell_0.5.0                
[127] rhdf5_2.42.0                  GenomeInfoDbData_1.2.9       
[129] gtable_0.3.3