
Last updated: 2020-04-20


Knit directory: Bgee/

This reproducible R Markdown analysis was created with workflowr (version 1.6.1).



The command set.seed(20200417) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.


The following chunks had caches available:
  • download-data
  • download-subset
  • download-subset-ann
  • download-subset-ann-multi
  • download-subset-stg
  • load-libs
  • session-info-chunk-inserted-by-workflowr

To ensure reproducibility of the results, delete the cache directory downloaddata_cache and re-run the analysis. To have workflowr automatically delete the cache directory prior to building the file, set delete_cache = TRUE when running wflow_build() or wflow_publish().


The results in this page were generated with repository version 9073f83.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Rhistory
    Ignored:    analysis/downloaddata_cache/
    Ignored:    analysis/extractinfo_cache/
    Ignored:    analysis/processdata_cache/

Untracked files:
    Untracked:  Bos_taurus_Bgee_14_1/
    Untracked:  Drosophila_melanogaster_Bgee_14_1/
    Untracked:  README.html
    Untracked:  release.tsv
    Untracked:  species_Bgee_14_1.tsv

Unstaged changes:
    Modified:   README.md
    Modified:   analysis/_site.yml
    Deleted:    analysis/about.Rmd
    Modified:   analysis/index.Rmd
    Deleted:    analysis/license.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/downloaddata.Rmd) and HTML (docs/downloaddata.html) files.

File  Version  Author         Date        Message
Rmd   9073f83  SFonsecaCosta  2020-04-20  add analysis

All experiments

If you do not want to filter by experiment, as shown in the previous section (Extract Information), you can download all the data available for a species with the getData() function. The data type to retrieve from Bgee is set through the dataType argument when the Bgee object is created with Bgee$new().

DrosMelRNASeq <- Bgee$new(species = "Drosophila_melanogaster", dataType = "rna_seq")

Querying Bgee to get release information...

Building URL to query species in Bgee release 14_1...

Submitting URL to Bgee webservice... (https://r.bgee.org/bgee14_1/?page=r_package&action=get_all_species&display_type=tsv&source=BgeeDB_R_package&source_version=2.12.1)

Query to Bgee webservice successful!

API key built: 342617eeb6e728567eeaf07855efc4fa274e6ad21dde329ecce7bd3668c4efa9f23c364808e45b7b772594a592dfe6855ff75027e7983a848ff43a25416ebe6f

dataBgee_DM <- getData(DrosMelRNASeq)

The experiment is not defined. Hence taking all rna_seq experiments available for Drosophila_melanogaster.
Downloading expression data...
Saved expression data file in/Users/sfonseca1/Bgee/Drosophila_melanogaster_Bgee_14_1 folder. Now untar file...
Finished uncompress tar files
Saving all data in .rds file...

## Number of experiments that exist in BgeeDB
length(dataBgee_DM)
[1] 11
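
Since length() returns 11 here, the result of an unfiltered getData() call behaves like a list with one element per experiment. The following is a minimal sketch of how such a list could be inspected and stacked. It assumes each element is a data frame with an Experiment.ID column, and it runs on a small mock list rather than on real Bgee data. The object names, the experiment "EXP_HYPOTHETICAL", and its values are made up for illustration.

```r
# Mock stand-in for the list returned by getData(); real data would come from Bgee.
mock_experiments <- list(
  data.frame(Experiment.ID = "SRP002072",
             Gene.ID = c("FBgn0000003", "FBgn0000008"),
             TPM = c(7931.226169, 5.951876),
             stringsAsFactors = FALSE),
  data.frame(Experiment.ID = "EXP_HYPOTHETICAL",  # hypothetical experiment ID
             Gene.ID = c("FBgn0000003", "FBgn0000014", "FBgn0000015"),
             TPM = c(10.5, 0.6, 0.8),              # hypothetical values
             stringsAsFactors = FALSE)
)

# Name each list element after the experiment it contains.
names(mock_experiments) <- vapply(
  mock_experiments,
  function(df) unique(df$Experiment.ID),
  character(1)
)

# Number of rows available per experiment.
rows_per_experiment <- vapply(mock_experiments, nrow, integer(1))
print(rows_per_experiment)

# Stack all experiments into a single data frame for downstream analysis.
all_rows <- do.call(rbind, mock_experiments)
print(nrow(all_rows))
```

The same pattern should carry over to dataBgee_DM if its elements share the same columns, which is worth checking before calling rbind.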

Subset of the data

If you are only interested in a particular experiment (as shown before), anatomical entity, or developmental stage, you can download just that data by setting the corresponding getData() arguments: experimentId, anatEntityId, and stageId.

Per experiment

Take the experiment ID obtained from the filtering in the previous section and pass it to the experimentId argument of getData() to download only that experiment.

DrosMelRNASeq_SRP002072 <- getData(DrosMelRNASeq, experimentId = "SRP002072")

Downloading expression data for the experiment SRP002072...
Saved expression data file in /Users/sfonseca1/Bgee/Drosophila_melanogaster_Bgee_14_1 folder. Now untar file...
Finished uncompress tar files
Saving all data in .rds file...

head(DrosMelRNASeq_SRP002072)
  Experiment.ID Library.ID Library.type     Gene.ID Anatomical.entity.ID
1     SRP002072  SRX019645       paired FBgn0000003       UBERON:0000922
2     SRP002072  SRX019645       paired FBgn0000008       UBERON:0000922
3     SRP002072  SRX019645       paired FBgn0000014       UBERON:0000922
4     SRP002072  SRX019645       paired FBgn0000015       UBERON:0000922
5     SRP002072  SRX019645       paired FBgn0000017       UBERON:0000922
6     SRP002072  SRX019645       paired FBgn0000018       UBERON:0000922
  Anatomical.entity.name      Stage.ID                     Stage.name  Sex
1                 embryo FBdv:00005306 embryonic stage 4 (Drosophila) <NA>
2                 embryo FBdv:00005306 embryonic stage 4 (Drosophila) <NA>
3                 embryo FBdv:00005306 embryonic stage 4 (Drosophila) <NA>
4                 embryo FBdv:00005306 embryonic stage 4 (Drosophila) <NA>
5                 embryo FBdv:00005306 embryonic stage 4 (Drosophila) <NA>
6                 embryo FBdv:00005306 embryonic stage 4 (Drosophila) <NA>
  Strain Read.count         TPM         FPKM Detection.flag Detection.quality
1     NA  5071.0000 7931.226169 10518.520218        present      high quality
2     NA   172.0002    5.951876     7.893474        present      high quality
3     NA    16.0000    0.604627     0.801867        present      high quality
4     NA    19.0000    0.793388     1.052205        present      high quality
5     NA   841.1847   22.824416    30.270109        present      high quality
6     NA    94.0000    9.765199    12.950765        present      high quality
   State.in.Bgee
1 Part of a call
2 Part of a call
3 Part of a call
4 Part of a call
5 Part of a call
6 Part of a call
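
Once an experiment has been downloaded, the resulting data frame can be filtered with base R. The sketch below rebuilds the six rows printed above, using only the Gene.ID, TPM, and Detection.flag columns, and keeps the genes flagged as present with a TPM above 1. The threshold of 1 and the object names are illustrative assumptions, not values recommended by Bgee.

```r
# The six rows shown by head(), reduced to three columns for brevity.
expr <- data.frame(
  Gene.ID = c("FBgn0000003", "FBgn0000008", "FBgn0000014",
              "FBgn0000015", "FBgn0000017", "FBgn0000018"),
  TPM = c(7931.226169, 5.951876, 0.604627, 0.793388, 22.824416, 9.765199),
  Detection.flag = "present",
  stringsAsFactors = FALSE
)

# Keep genes called present with TPM above an arbitrary, illustrative cutoff.
tpm_cutoff <- 1
keep <- expr$Detection.flag == "present" & expr$TPM > tpm_cutoff
expressed <- expr[keep, ]

# Order the remaining genes from highest to lowest TPM.
expressed <- expressed[order(expressed$TPM, decreasing = TRUE), ]
print(expressed)
```

On the real object, replacing expr with DrosMelRNASeq_SRP002072 would apply the same filter to the full experiment.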

Specify an anatomical entity

You can also retrieve data for a specific anatomical entity within this experiment. Here we target the adult organism (UBERON:0007023) as an example.

DrosMelRNASeq_annEnt <- getData(DrosMelRNASeq, experimentId = "SRP002072", anatEntityId = "UBERON:0007023")
head(DrosMelRNASeq_annEnt)

Note that you can specify more than one anatomical entity in the anatEntityId argument by passing a character vector.

DrosMelRNASeq_annEnt_plus <- getData(DrosMelRNASeq, experimentId = "SRP002072", anatEntityId = c("UBERON:0007023","UBERON:0000922","UBERON:0002548"))
head(DrosMelRNASeq_annEnt_plus)
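
Because head() only shows the first six rows, it cannot confirm that every requested anatomical entity is present in the result. A possible check, sketched here on mock data, is to count rows per requested ID. It assumes the result carries an Anatomical.entity.ID column, as in the output shown earlier. The mock rows and object names are hypothetical and do not reflect what Bgee returns for these entities.

```r
# Anatomical entities requested in the getData() call.
requested <- c("UBERON:0007023", "UBERON:0000922", "UBERON:0002548")

# Mock stand-in for the downloaded subset (hypothetical rows).
mock_subset <- data.frame(
  Gene.ID = c("FBgn0000003", "FBgn0000003", "FBgn0000008", "FBgn0000008"),
  Anatomical.entity.ID = c("UBERON:0000922", "UBERON:0007023",
                           "UBERON:0000922", "UBERON:0007023"),
  stringsAsFactors = FALSE
)

# Count rows per requested entity; entities with no rows are reported as 0.
rows_per_entity <- table(factor(mock_subset$Anatomical.entity.ID,
                                levels = requested))
print(rows_per_entity)

# Requested entities that returned no rows at all.
missing_entities <- names(rows_per_entity)[as.vector(rows_per_entity) == 0]
print(missing_entities)
```

Using factor() with explicit levels keeps requested IDs that have no rows in the table, so gaps are visible instead of being silently dropped.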

Specify a developmental stage

As with experiments and anatomical entities, you can also download data for specific developmental stages by passing them to the stageId argument. Note that the call below does not set experimentId, so it is not restricted to experiment SRP002072.

DrosMelRNASeq_SRP002072_stg <- getData(DrosMelRNASeq, stageId = c("UBERON:0000068","FBdv:00005341"))
head(DrosMelRNASeq_SRP002072_stg)

sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] BgeeDB_2.12.1        tidyr_1.0.2          topGO_2.38.1        
 [4] SparseM_1.78         GO.db_3.10.0         AnnotationDbi_1.48.0
 [7] IRanges_2.20.2       S4Vectors_0.24.4     Biobase_2.46.0      
[10] graph_1.64.0         BiocGenerics_0.32.0  workflowr_1.6.1     

loaded via a namespace (and not attached):
 [1] tidyselect_1.0.0   xfun_0.13          purrr_0.3.3        lattice_0.20-41   
 [5] vctrs_0.2.4        htmltools_0.4.0    yaml_2.2.1         blob_1.2.1        
 [9] rlang_0.4.5        later_1.0.0        pillar_1.4.3       glue_1.4.0        
[13] DBI_1.1.0          bit64_0.9-7        matrixStats_0.56.0 lifecycle_0.2.0   
[17] stringr_1.4.0      codetools_0.2-16   memoise_1.1.0      evaluate_0.14     
[21] knitr_1.28         httpuv_1.5.2       fansi_0.4.1        Rcpp_1.0.4.6      
[25] promises_1.1.0     backports_1.1.6    fs_1.4.1           bit_1.1-15.2      
[29] digest_0.6.25      stringi_1.4.6      dplyr_0.8.5        rprojroot_1.3-2   
[33] grid_3.6.0         cli_2.0.2          tools_3.6.0        bitops_1.0-6      
[37] magrittr_1.5       RCurl_1.98-1.1     RSQLite_2.2.0      tibble_3.0.0      
[41] crayon_1.3.4       pkgconfig_2.0.3    ellipsis_0.3.0     data.table_1.12.8 
[45] assertthat_0.2.1   rmarkdown_2.1      R6_2.4.1           git2r_0.26.1      
[49] compiler_3.6.0