Last updated: 2022-10-09
Knit directory: Bio322/
This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 0c8b7c6. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .RData
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: analysis/.DS_Store
Untracked files:
Untracked: 210922_genome expression_epigenetics.pptx
Untracked: 210927.module1.2.RNAseq_on_Galaxy.pdf
Untracked: 220317_Advanced topics in genomics.docx
Untracked: 220317_Advanced topics in genomics_MarieSai.docx
Untracked: BIO322_Teaching plan BIO322 2021.docx
Untracked: Bio322.09132021.pdf
Untracked: Bio322.09132021.pptx
Untracked: Bio322.09152021.backup.pptx
Untracked: Bio322.09152021.pdf
Untracked: Bio322.09152021.pptx
Untracked: Bio322.09202021.pdf
Untracked: Bio322.09202021.pptx
Untracked: Bio322.09272021.pptx
Untracked: Bio322.09272021/
Untracked: Bio322scRNAseq.tsv
Untracked: Galaxy1-[intestinalData.tsv].tabular
Untracked: Galaxy2.txt
Untracked: Group.csv
Untracked: analysis/Evolution_for_lab.Rmd
Untracked: analysis/_site/
Untracked: analysis/genomebrowser_for_lab.Rmd
Untracked: analysis/tutorial.RNAseq.foradults.xlsx
Untracked: bio322.xlsx
Untracked: bio322_2022.pptx
Untracked: chr15_inversion-v1.0.0.zip
Untracked: gene_regulation_bio322_2022.pptx
Untracked: gwassim.txt
Untracked: intestinalData.tsv
Untracked: main_workflow.ga
Untracked: markdown_test/
Untracked: mouse_intestine_scRNAseq.txt
Untracked: oharring-chr15_inversion-9615456/
Untracked: science.abg0718_data_s1_to_s8.zip
Untracked: science.abg0718_data_s1_to_s8/
Untracked: scrna_tenx.ga
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/2022_genome.function2.Rmd) and HTML (docs/2022_genome.function2.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 0c87f62 | mariesaitou | 2022-10-09 | Build site. |
Rmd | 59378af | mariesaitou | 2022-10-09 | wflow_publish("analysis/2022_genome.function2.Rmd") |
html | 664b5c6 | mariesaitou | 2022-10-09 | Build site. |
Rmd | b56d9ff | mariesaitou | 2022-10-09 | wflow_publish("analysis/2022_genome.function2.Rmd") |
Goal: Today, we will learn how to analyze genetic variants and their association with phenotypes.
Genome-wide association study - theory
Genome-wide association study - browse real data
Relevant review papers if you want to learn more:
Depending on your interest:
Prologue - Current topics in genomics
Human genome sequence
Nobel Prize 2022
We will learn about:
Methods and significance of functional genomics
Challenges of human genomics
Now, we will investigate real GWAS data at GWAS ATLAS, which contains 4,756 human GWAS across 3,302 traits.
Let’s explore the human GWAS database GWAS ATLAS. I did not explain everything in the tutorial, so please explore the site yourself and get familiarised with it.
Go to “Browse GWAS” in the black bar at the top, search for a trait related to eyesight, and look at the results.
Q1-1. Examine the Manhattan plot. Where do you see the peak? How can we interpret this?
Q1-2. See the left table. How many individuals were investigated?
Q1-3. How many variants were investigated?
Q1-4. See the bottom Manhattan plot with gene annotations. What is the top-associated gene?
Q1-5. (Discuss within groups) - Do a literature search on the top-associated gene and hypothesize how this gene might be associated with the trait, “Reason for glasses/contact lenses: For short-sightedness”.
https://atlas.ctglab.nl/traitDB/3539
A1-1. There are multiple peaks, on chromosomes 1, 2, 4, 6, 8, 10, 15… This observation implies that multiple loci are associated with this trait (the trait is polygenic).
A1-2. 78,647 samples (See the left box, “N”)
A1-3. 9,223,534 SNPs (See the left box, “Nsnps”)
A1-4. Reference: T. Alternative: TAGACACTGTCTACCGAAATGTAGACACTGTCTACCGAAATG
A1-5. Discuss within groups.
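To make the reading of a Manhattan plot in A1-1 more concrete, here is a minimal sketch of how peaks could be counted from summary statistics. The SNP values are invented toy data, not GWAS ATLAS results, and the function names `manhattan_height` and `significant_chromosomes` are hypothetical. The sketch assumes the conventional genome-wide significance threshold of P < 5e-8 and uses the -log10(P) scale shown on the y-axis of a Manhattan plot.

```python
import math

# Conventional genome-wide significance threshold (an assumption of this sketch).
GENOME_WIDE_P = 5e-8

# Toy summary statistics: (chromosome, position, P-value). All values are invented.
toy_snps = [
    (1, 1_200_000, 3e-12),
    (1, 1_250_000, 8e-9),
    (2, 5_400_000, 0.04),
    (4, 9_100_000, 1e-10),
    (6, 2_300_000, 0.5),
    (15, 7_700_000, 2e-9),
]


def manhattan_height(p_value):
    """Return -log10(P), the y-axis value of a SNP in a Manhattan plot."""
    return -math.log10(p_value)


def significant_chromosomes(snps, threshold=GENOME_WIDE_P):
    """Return the sorted chromosomes carrying at least one SNP with P < threshold."""
    return sorted({chrom for chrom, _pos, p in snps if p < threshold})


hits = significant_chromosomes(toy_snps)
print("Chromosomes with a significant peak:", hits)
print("More than one associated chromosome:", len(hits) > 1)
```

In this toy data, significant SNPs fall on more than one chromosome, which is the pattern expected for a polygenic trait. A trait driven by one strong locus would instead return a single chromosome.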
Go to “Browse GWAS” in the black bar at the top, search for a trait related to hair morphology, and look at the results.
Q2-1. Examine the Manhattan plot. Where do you see the peak? How can we interpret this?
Q2-2. See the left table. How many individuals were investigated?
Q2-3. How many variants were investigated?
Q2-4. See the table under the Manhattan plot with gene annotations. What is the ID of the top-associated SNP? (SNP ID: rsXXXXX)
Q2-5. (Discuss within groups) - Go to the GGV map browser and enter the SNP ID above to see the global distribution of this variant. “A” is the straight hair allele, and “G” is the curly hair allele. What can we infer from the map?
https://atlas.ctglab.nl/traitDB/4023
A2-1. There is one big peak on chromosome 2. It is plausible that a single genetic factor with a very strong effect is associated with this trait.
A2-2. 4,878 individuals
A2-3. 560,921 SNPs
A2-4. rs260643
A2-5. Discuss within groups.
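For the discussion in Q2-5, it may help to recall how an allele frequency in one population is obtained from genotype counts. The following is a minimal sketch for a biallelic A/G SNP. The genotype counts are invented, the population labels `pop_A` and `pop_B` are hypothetical, and the function name `allele_frequency` is introduced only for illustration.

```python
def allele_frequency(n_aa, n_ag, n_gg):
    """Return the frequency of the G allele from genotype counts at an A/G SNP."""
    total_alleles = 2 * (n_aa + n_ag + n_gg)  # each individual carries two alleles
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    g_alleles = 2 * n_gg + n_ag  # GG carries two G alleles, AG carries one
    return g_alleles / total_alleles


# Invented genotype counts (AA, AG, GG) for two hypothetical populations.
toy_counts = {
    "pop_A": (81, 18, 1),
    "pop_B": (4, 32, 64),
}

for population, counts in toy_counts.items():
    freq_g = allele_frequency(*counts)
    print(f"{population}: G = {freq_g:.2f}, A = {1 - freq_g:.2f}")
```

A large frequency difference between populations, as in this toy example, is the kind of pattern to look for on the map.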