Last updated: 2020-06-23


Knit directory: PSYMETAB/

This reproducible R Markdown analysis was created with workflowr (version 1.6.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility. The version displayed above was the version of the Git repository at the time these results were generated.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    ._docs
    Ignored:    .drake/
    Ignored:    analysis/.Rhistory
    Ignored:    analysis/._GWAS.Rmd
    Ignored:    analysis/._data_processing_in_genomestudio.Rmd
    Ignored:    analysis/._quality_control.Rmd
    Ignored:    analysis/GWAS/
    Ignored:    analysis/PRS/
    Ignored:    analysis/QC/
    Ignored:    analysis_prep_1_clustermq.out
    Ignored:    analysis_prep_2_clustermq.out
    Ignored:    analysis_prep_3_clustermq.out
    Ignored:    analysis_prep_4_clustermq.out
    Ignored:    data/processed/
    Ignored:    data/raw/
    Ignored:    download_impute_1_clustermq.out
    Ignored:    init_analysis_1_clustermq.out
    Ignored:    init_analysis_2_clustermq.out
    Ignored:    init_analysis_3_clustermq.out
    Ignored:    init_analysis_4_clustermq.out
    Ignored:    init_analysis_5_clustermq.out
    Ignored:    init_analysis_6_clustermq.out
    Ignored:    packrat/lib-R/
    Ignored:    packrat/lib-ext/
    Ignored:    packrat/lib/
    Ignored:    post_impute_1_clustermq.out
    Ignored:    pre_impute_qc_1_clustermq.out
    Ignored:    process_init_10_clustermq.out
    Ignored:    process_init_11_clustermq.out
    Ignored:    process_init_12_clustermq.out
    Ignored:    process_init_13_clustermq.out
    Ignored:    process_init_14_clustermq.out
    Ignored:    process_init_15_clustermq.out
    Ignored:    process_init_16_clustermq.out
    Ignored:    process_init_17_clustermq.out
    Ignored:    process_init_18_clustermq.out
    Ignored:    process_init_19_clustermq.out
    Ignored:    process_init_1_clustermq.out
    Ignored:    process_init_20_clustermq.out
    Ignored:    process_init_21_clustermq.out
    Ignored:    process_init_22_clustermq.out
    Ignored:    process_init_23_clustermq.out
    Ignored:    process_init_24_clustermq.out
    Ignored:    process_init_25_clustermq.out
    Ignored:    process_init_26_clustermq.out
    Ignored:    process_init_27_clustermq.out
    Ignored:    process_init_28_clustermq.out
    Ignored:    process_init_29_clustermq.out
    Ignored:    process_init_2_clustermq.out
    Ignored:    process_init_30_clustermq.out
    Ignored:    process_init_31_clustermq.out
    Ignored:    process_init_3_clustermq.out
    Ignored:    process_init_4_clustermq.out
    Ignored:    process_init_5_clustermq.out
    Ignored:    process_init_6_clustermq.out
    Ignored:    process_init_7_clustermq.out
    Ignored:    process_init_8_clustermq.out
    Ignored:    process_init_9_clustermq.out
    Ignored:    prs_1_clustermq.out
    Ignored:    prs_2_clustermq.out
    Ignored:    prs_3_clustermq.out
    Ignored:    prs_4_clustermq.out

Untracked files:
    Untracked:  analysis/genetic_quality_control.Rmd
    Untracked:  analysis/plans.Rmd
    Untracked:  analysis_prep.log
    Untracked:  download_impute.log
    Untracked:  grs.log
    Untracked:  init_analysis.log
    Untracked:  process_init.log
    Untracked:  prs.log

Unstaged changes:
    Modified:   analysis/GWAS.Rmd
    Modified:   analysis/data_sources.Rmd
    Modified:   analysis/index.Rmd
    Modified:   analysis/pheno_quality_control.Rmd
    Deleted:    analysis/project.Rmd
    Modified:   analysis/quality_control.Rmd
    Modified:   cache_log.csv
    Modified:   post_impute.log
    Modified:   slurm_clustermq.tmpl

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the R Markdown and HTML files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view them.

File  Version  Author         Date        Message
Rmd   e6f7fb5  Jenny          2019-12-17  improve website
html  9f1ba5e  Jenny Sjaarda  2019-12-06  Build site.
Rmd   e724bd6  Jenny          2019-12-04  add notes on processing in genome studio
html  125be8c  Jenny Sjaarda  2019-12-02  build website
Rmd   0dd02a7  Jenny          2019-12-02  modify website

Creating GenomeStudio files:

  • Instructions can be found here.
  • Required files:
    • Sample sheet: as csv file.
    • Data repository: as idat files.
    • Manifest file: as bpm file.
    • Cluster file: as egt file.
  • Data were provided by Mylene Docquier, copied from sftp and saved here: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\data.
  • Create new IDs based on GPCR randomization (see /scripts/format_QC_input.r), and save to above folder as: Eap0819_1t26_27to29corrected_7b9b_randomizedID.csv.
  • Note that the original IDs can be found in the same folder, in the file Eap0819_1t26_27to29corrected_7b9.csv, if needed.
  • Create an empty folder here: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS, named: GS_project_26092019 (date of creation).
  • Using the new IDs, create a GenomeStudio project as follows:
    1. Open GenomeStudio.
    2. Select: File > New Genotyping Project.
    3. Select L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS as project repository.
    4. Under project name: use “GS_project_26092019” and click “Next”.
    5. Select “Use sample sheet to load intensities” and click “Next”.
    6. Select sample, data and manifests as specified below and click “Next”:
      • Sample sheet: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\data\Eap0819_1t26_27to29corrected_7b9b_randomizedID.csv,
      • Data repository: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\data,
      • Manifest repository: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\data.
    7. Select “Import cluster positions from cluster file” and choose cluster file located here: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\data\GSPMA24v1_0-A_4349HNR_Samples.egt and click “Finish”.
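
Before starting the steps above, it can help to confirm that the data folder actually contains every file type listed under "Required files". The following is a minimal Python sketch of such a check, not part of the project's own scripts (which are in R). The function name `missing_inputs` is hypothetical, and the check only looks at file extensions, not file contents.

```python
from pathlib import Path

# File types GenomeStudio needs, taken from the "Required files" list above.
REQUIRED_EXTENSIONS = {
    ".csv": "sample sheet (.csv)",
    ".idat": "intensity data (.idat)",
    ".bpm": "manifest file (.bpm)",
    ".egt": "cluster file (.egt)",
}


def missing_inputs(data_dir):
    """Return descriptions of required file types not found under data_dir.

    Subfolders are searched too, since idat files are often stored in
    per-chip subdirectories. Extension matching is case-insensitive.
    """
    found = {p.suffix.lower() for p in Path(data_dir).rglob("*") if p.is_file()}
    return [desc for ext, desc in REQUIRED_EXTENSIONS.items() if ext not in found]
```

Calling `missing_inputs` on the data folder returns an empty list when all four file types are present, and otherwise names the types that are missing.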

Notes and Updates

  • Initial data were received on April 8, 2019, and the final two plates were received on May 17, 2019.
  • Processing began with initial files.
  • July 18, 2019 update:
    • It came to our attention that 15 participants who had not consented were genotyped.
    • This list was sent to Mylene to be removed.
    • Plates 27 to 29 were re-provided on 08/08/2019 without these 15 individuals (list provided by Severine in email - see scripts/format_QC_input.r for creation of list of IDs to remove in PLINK).
    • Until the new file was provided, these participants were removed using PLINK to avoid any further analysis of these individuals.
    • The new GenomeStudio file was copied to: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\PSYMETAB_GS2\Plates27to29_0819.
    • The same process as above was followed (data opened in GS, cluster positions imported, and data saved to Plates27to29_0819_cluster and PLINK_270819_0457).
    • Old files were deleted to remove all data containing these individuals.
    • Updated files were then copied to PSYMETAB_GS1.
  • August 28, 2019 update:
    • Mylene provided a single GenomeStudio file with all samples (excluding the list from Severine).
    • These files were copied to UPPC folders and custom clustering was re-performed.
    • These changes are reflected in the above description.
    • All old files were subsequently deleted to ensure the data from these participants is completely removed from all databases.
    • As of September 3, 2019, all clustering was complete and the final PLINK files (PLINK_030919_0149) were copied to the SGG directory (names of plink files according to parent directory: DATA).
  • September 6, 2019 update:
    • It was decided that all IDs that are part of PSYMETAB should be randomized to ensure they are not identifiable.
    • We had a meeting to discuss (Celine, Fred, Chin, Nermine, and Claire), and decided to use a CHUV program (GPCR) for the randomization process.
    • We asked Mylene to create a new project with the new IDs, but she suggested that we create our own GS project.
    • She provided all relevant data to create our own GS project.
    • The description above reflects these changes.
  • As of October 11, all GS files were created, clustered and exported as PLINK files, and subsequently moved to the sgg server.
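
The July 18, 2019 update mentions temporarily removing the non-consenting participants with PLINK. The project's actual list was built in scripts/format_QC_input.r; the Python sketch below only illustrates the general pattern. PLINK's `--remove` option takes a text file with one family ID and individual ID pair per line. The function names and all file names used here are hypothetical, and the sketch assumes a binary fileset (`--bfile`), which may not match the format exported from GenomeStudio.

```python
import shlex


def write_remove_list(sample_ids, path):
    """Write a PLINK --remove file: one 'FID IID' pair per line.

    sample_ids is an iterable of (family_id, individual_id) tuples.
    """
    with open(path, "w") as handle:
        for fid, iid in sample_ids:
            handle.write(f"{fid} {iid}\n")


def plink_remove_command(bfile, remove_path, out):
    """Build (but do not run) a PLINK command that drops the listed samples.

    Assumes a binary fileset prefix (bed/bim/fam); names are placeholders.
    """
    args = ["plink", "--bfile", bfile, "--remove", remove_path,
            "--make-bed", "--out", out]
    return " ".join(shlex.quote(a) for a in args)
```

The command string is returned rather than executed so that it can be inspected, logged, or submitted to a scheduler before anything is run.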

Creating full data table for use in penncnv

  • The instructions for creating penncnv input data, found here, were followed.
  • GS_project_26092019_cluster.bsc, located here: L:\PCN\UBPC\ANALYSES_RECHERCHE\Jenny\PSYMETAB_GWAS\GS_project_26092019_cluster, was used to create the full data table.
  • Output file was named the default name of Full Data Table.txt and saved in the same location.
  • The output was then moved to the server and run with penncnv.
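
Before running penncnv on a large export, a quick look at the header of Full Data Table.txt can confirm that each sample has the signal columns penncnv relies on. The Python sketch below is illustrative only. It assumes the table is tab-delimited and that per-sample columns are named `<sample>.Log R Ratio` and `<sample>.B Allele Freq`; both are assumptions that should be checked against the actual export. The function names are hypothetical.

```python
def samples_in_header(header_line, suffix=".Log R Ratio"):
    """Return the sample IDs that have a column ending in `suffix`."""
    columns = header_line.rstrip("\r\n").split("\t")
    return [c[: -len(suffix)] for c in columns if c.endswith(suffix)]


def check_full_data_table(path):
    """Read only the header line of the table.

    Returns (samples with a Log R Ratio column,
             those samples that lack a B Allele Freq column).
    """
    with open(path, encoding="utf-8") as handle:
        header = handle.readline()
    lrr = samples_in_header(header, ".Log R Ratio")
    baf = set(samples_in_header(header, ".B Allele Freq"))
    return lrr, [s for s in lrr if s not in baf]
```

Only the first line is read, so the check stays fast even when the table holds many markers and samples.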