gcta64
to estimate the genetic variances and heritabilities for each fitness trait
Last updated: 2021-10-01
Checks: 7 0
Knit directory: fitnessGWAS/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20180914)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version 01226ab. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rapp.history
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: .httr-oauth
Ignored: .pversion
Ignored: analysis/.DS_Store
Ignored: analysis/correlations_SNP_effects_cache/
Ignored: code/.DS_Store
Ignored: code/Drosophila_GWAS.Rmd
Ignored: data/.DS_Store
Ignored: data/derived/
Ignored: data/input/.DS_Store
Ignored: data/input/.pversion
Ignored: data/input/dgrp.fb557.annot.txt
Ignored: data/input/dgrp2.bed
Ignored: data/input/dgrp2.bim
Ignored: data/input/dgrp2.fam
Ignored: data/input/huang_transcriptome/
Ignored: figures/.DS_Store
Ignored: figures/fig1_inkscape.svg
Ignored: figures/figure1a.pdf
Ignored: figures/figure1b.pdf
Untracked files:
Untracked: big_model.rds
Untracked: code/quant_gen_1.R
Untracked: data/input/genomic_relatedness_matrix.rds
Untracked: old_analyses/
Unstaged changes:
Modified: .gitignore
Modified: figures/GWAS_stats_figure.pdf
Modified: figures/SNP_effect_ED.pdf
Modified: figures/TWAS_stats_figure.pdf
Modified: figures/antagonism_ratios.pdf
Modified: figures/boyle_plot.pdf
Modified: figures/composite_mixture_figure.pdf
Modified: figures/eff_size_histos.pdf
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/gcta_quant_genetics.Rmd
) and HTML (docs/gcta_quant_genetics.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
html | 8d14298 | lukeholman | 2021-09-26 | Build site. |
Rmd | af15dd6 | lukeholman | 2021-09-26 | Commit Sept 2021 |
library(tidyverse)
library(glue)
library(kableExtra)
library(brms)
gcta <- file.path(getwd(), "code/gcta64")
options(readr.show_col_types = FALSE)
# helper function to pass commands to the terminal
# Note that we set `intern = TRUE`, and pass the result of `system()` to `cat()`,
# ensuring that the Terminal output will be printed in this knitr report.
run_command <- function(shell_command, wd = getwd(), path = ""){
cat(system(glue("cd ", wd, path, "\n", shell_command), intern = TRUE), sep = '\n')
}
# Load the line mean fitness values calculated by brms
line_means <- read_csv("data/derived/predicted_line_means.csv") %>%
mutate(line = factor(str_remove_all(line, "_"))) %>%
filter(!is.na(male.fitness.early)) %>%
mutate(`Female fitness early` = female.fitness.early,
`Female fitness late` = female.fitness.late,
`Male fitness early` = male.fitness.early,
`Male fitness late` = male.fitness.late) %>%
rename(female_fitness_early = female.fitness.early,
female_fitness_late = female.fitness.late,
male_fitness_early = male.fitness.early,
male_fitness_late = male.fitness.late)
The table shows the posterior estimate of the intra-class correlation coefficient (ICC, which measures the proportion of variance within and between lines), using the models in the script get_predicted_line_means.Rmd
to adjust for block effects (and one covariate in the male late-life fitness analysis). ICC was calculated using ICC::ICCbare
. This value approximates the broad-sense heritability; fitness appears highly heritable, except for male late-life fitness.
read.csv('data/derived/ICC.csv') %>%
select(Fitness_trait, Estimate, Est.Error, Q2.5, Q97.5) %>%
rename(`Fitness trait` = Fitness_trait, `Est. Error` = Est.Error, `Lower 95% CI` = Q2.5, `Upper 95% CI` = Q97.5) %>%
kable(digits = 2) %>% kable_styling(full_width = FALSE)
Fitness trait | Estimate | Est. Error | Lower 95% CI | Upper 95% CI |
---|---|---|---|---|
Female early-life fitness | 0.63 | 0.01 | 0.61 | 0.64 |
Female late-life fitness | 0.56 | 0.01 | 0.54 | 0.58 |
Male early-life fitness | 0.45 | 0.01 | 0.43 | 0.46 |
Male late-life fitness | 0.14 | 0.01 | 0.12 | 0.15 |
line_means %>% select(contains(" ")) %>% cor() %>% kable(digits =3) %>% kable_styling()
Female fitness early | Female fitness late | Male fitness early | Male fitness late | |
---|---|---|---|---|
Female fitness early | 1.000 | 0.766 | 0.228 | 0.172 |
Female fitness late | 0.766 | 1.000 | 0.317 | 0.317 |
Male fitness early | 0.228 | 0.317 | 1.000 | 0.864 |
Male fitness late | 0.172 | 0.317 | 0.864 | 1.000 |
gcta64
to estimate the genetic variances and heritabilities for each fitness traitThe below uses the gcta64
command --reml
to estimate the genetic variance (V(G)
), i.e. the variance explained by the GRM), environmental variance (V(e)
), their sum (Vp
), and the so-called “SNP heritability” (V(G)/Vp
). Note that the genetic variance is very low for male late-life fitness. This is a bit puzzling because the same is not true for male early-life fitness, which is strongly correlated across lines with male late-life fitness.
do_univariate_greml <- function(trait1){
# First make 'focal_data', a 2-column data frame with the line and the focal phenotype value
focal_data <- line_means %>%
select(line, !! trait1)
names(focal_data)[2] <- c("focal_pheno")
# Write a file with the line and phenotype data called phenotype.txt, for gcta64
pheno_data <- focal_data %>%
# filter(!(line %in% lines_to_prune)) %>%
mutate(line_copy = line) %>%
select(line, line_copy, focal_pheno) %>%
as.matrix()
pheno_data %>%
write.table(row.names = FALSE, col.names = FALSE,
file = "data/derived/phenotype.txt", quote = FALSE)
console_output <- suppressWarnings(capture.output(
run_command(glue("{gcta} --reml --reml-maxit 10000 --grm gcta_GRM --pheno phenotype.txt --out delete_me"),
path = "/data/derived/")))
errors <- console_output[str_detect(console_output, "[Ee]rror")]
if(length(errors) == 0) errors <- NA
if(length(errors) > 1) errors <- paste(errors, sep = "; ")
rbind(read.delim("data/derived/delete_me.hsq", sep = "\t") ,
data.frame(Source = "Errors", Variance = errors, SE = NA)) %>%
mutate(Trait = trait1) %>% select(Trait, everything())
}
bind_rows(do_univariate_greml("Female fitness early"),
do_univariate_greml("Female fitness late"),
do_univariate_greml("Male fitness early"),
do_univariate_greml("Male fitness late")) %>%
filter(Source %in% c("V(G)", "V(e)", "Vp", "V(G)/Vp")) %>%
mutate(Variance = as.numeric(Variance)) %>%
rename(Parameter = Source) %>%
kable(digits = 3) %>%
kable_styling(full_width = FALSE)
Trait | Parameter | Variance | SE |
---|---|---|---|
Female fitness early | V(G) | 0.121 | 0.221 |
Female fitness early | V(e) | 0.765 | 0.443 |
Female fitness early | Vp | 0.886 | 0.240 |
Female fitness early | V(G)/Vp | 0.137 | 0.281 |
Female fitness late | V(G) | 0.388 | 0.202 |
Female fitness late | V(e) | 0.226 | 0.362 |
Female fitness late | Vp | 0.614 | 0.183 |
Female fitness late | V(G)/Vp | 0.632 | 0.489 |
Male fitness early | V(G) | 0.192 | 0.194 |
Male fitness early | V(e) | 0.613 | 0.381 |
Male fitness early | Vp | 0.805 | 0.207 |
Male fitness early | V(G)/Vp | 0.238 | 0.293 |
Male fitness late | V(G) | 0.000 | 0.233 |
Male fitness late | V(e) | 1.019 | 0.485 |
Male fitness late | Vp | 1.019 | 0.268 |
Male fitness late | V(G)/Vp | 0.000 | 0.229 |
The below uses the gcta64
command --reml-bivar
to estimate the genetic covariances (C(G)_tr12
), i.e. the co-variance explained by the GRM), and the genetic correlation (rG
), in addition to the univariate parameters for trait 1 and trait 2 mentioned above. Note that because the genetic variance is low for male late-life fitness, the model cannot reliably estimate genetic covariance or correlations for pairs of traits that involve male late-life fitness.
do_bivar_greml <- function(trait1, trait2){
# First make 'focal_data', a 2-column data frame with the line and the focal phenotype value
focal_data <- line_means %>%
select(line, !! trait1, !! trait2)
names(focal_data)[2:3] <- c("focal_pheno1", "focal_pheno2")
# Write a file with the line and phenotype data called phenotype.txt, for gcta64
pheno_data <- focal_data %>%
# filter(!(line %in% lines_to_prune)) %>%
mutate(line_copy = line) %>%
select(line, line_copy, focal_pheno1, focal_pheno2) %>%
as.matrix()
pheno_data %>%
write.table(row.names = FALSE, col.names = FALSE,
file = "data/derived/phenotype_pair.txt", quote = FALSE)
# Note that the warnings come when gcta could not fit the model properly. I save these errors especially, so no need to print them.
console_output <- suppressWarnings(capture.output(
run_command(glue("{gcta} --reml-bivar --reml-bivar-lrt-rg 0 --reml-maxit 10000 --grm gcta_GRM --pheno phenotype_pair.txt --out delete_me"),
path = "/data/derived/")))
errors <- console_output[str_detect(console_output, "[Ee]rror")]
if(length(errors) == 0) errors <- NA
if(length(errors) > 1) errors <- paste(errors, sep = "; ")
traits <- sort(c(trait1, trait2))
rbind(read.delim("data/derived/delete_me.hsq", sep = "\t") ,
data.frame(Source = "Errors", Variance = errors, SE = NA)) %>%
mutate(`Trait 1` = traits[1], `Trait 2` = traits[2]) %>% select(`Trait 1`, `Trait 2`, everything())
}
results <- bind_rows(do_bivar_greml("Female fitness early", "Female fitness late"),
do_bivar_greml("Male fitness early", "Male fitness late"),
do_bivar_greml("Female fitness early", "Male fitness early"),
do_bivar_greml("Female fitness late", "Male fitness late"),
do_bivar_greml("Female fitness early", "Male fitness late"),
do_bivar_greml("Female fitness late", "Male fitness early"))
unlink(c("data/derived/delete_me.log", "data/derived/phenotype_pair.txt", "data/derived/delete_me.hsq"))
unlink(list.files("data/derived/", pattern = "gcta_GRM", full.names = TRUE))
results %>%
filter(Source %in% c("V(G)_tr1", "V(e)_tr1", "Vp_tr1", "V(G)/Vp_tr1",
"V(G)_tr2", "V(e)_tr2", "Vp_tr2", "V(G)/Vp_tr2",
"C(G)_tr12", "rG")) %>%
mutate(Variance = as.numeric(Variance)) %>%
rename(Parameter = Source) %>%
kable(digit = 3) %>%
kable_styling(full_width = FALSE)
Trait 1 | Trait 2 | Parameter | Variance | SE |
---|---|---|---|---|
Female fitness early | Female fitness late | V(G)_tr1 | 0.167 | 0.208 |
Female fitness early | Female fitness late | V(G)_tr2 | 0.433 | 0.170 |
Female fitness early | Female fitness late | C(G)_tr12 | 0.223 | 0.173 |
Female fitness early | Female fitness late | V(e)_tr1 | 0.668 | 0.410 |
Female fitness early | Female fitness late | V(e)_tr2 | 0.153 | 0.292 |
Female fitness early | Female fitness late | Vp_tr1 | 0.836 | 0.221 |
Female fitness early | Female fitness late | Vp_tr2 | 0.586 | 0.151 |
Female fitness early | Female fitness late | V(G)/Vp_tr1 | 0.200 | 0.294 |
Female fitness early | Female fitness late | V(G)/Vp_tr2 | 0.739 | 0.439 |
Female fitness early | Female fitness late | rG | 0.827 | 0.230 |
Male fitness early | Male fitness late | V(G)_tr1 | 0.110 | 0.186 |
Male fitness early | Male fitness late | V(G)_tr2 | 0.001 | 0.234 |
Male fitness early | Male fitness late | C(G)_tr12 | -0.010 | 0.197 |
Male fitness early | Male fitness late | V(e)_tr1 | 0.781 | 0.383 |
Male fitness early | Male fitness late | V(e)_tr2 | 1.001 | 0.486 |
Male fitness early | Male fitness late | Vp_tr1 | 0.891 | 0.216 |
Male fitness early | Male fitness late | Vp_tr2 | 1.002 | 0.268 |
Male fitness early | Male fitness late | V(G)/Vp_tr1 | 0.124 | 0.234 |
Male fitness early | Male fitness late | V(G)/Vp_tr2 | 0.001 | 0.234 |
Male fitness early | Male fitness late | rG | -1.000 | 147.954 |
Female fitness early | Male fitness early | V(G)_tr1 | 0.123 | 0.222 |
Female fitness early | Male fitness early | V(G)_tr2 | 0.192 | 0.195 |
Female fitness early | Male fitness early | C(G)_tr12 | 0.047 | 0.153 |
Female fitness early | Male fitness early | V(e)_tr1 | 0.760 | 0.445 |
Female fitness early | Male fitness early | V(e)_tr2 | 0.612 | 0.382 |
Female fitness early | Male fitness early | Vp_tr1 | 0.883 | 0.241 |
Female fitness early | Male fitness early | Vp_tr2 | 0.805 | 0.208 |
Female fitness early | Male fitness early | V(G)/Vp_tr1 | 0.139 | 0.284 |
Female fitness early | Male fitness early | V(G)/Vp_tr2 | 0.239 | 0.294 |
Female fitness early | Male fitness early | rG | 0.306 | 0.888 |
Female fitness late | Male fitness late | V(G)_tr1 | 0.351 | 0.140 |
Female fitness late | Male fitness late | V(G)_tr2 | 0.071 | 0.282 |
Female fitness late | Male fitness late | C(G)_tr12 | 0.158 | 0.139 |
Female fitness late | Male fitness late | V(e)_tr1 | 0.119 | 0.244 |
Female fitness late | Male fitness late | V(e)_tr2 | 0.963 | 0.570 |
Female fitness late | Male fitness late | Vp_tr1 | 0.470 | 0.122 |
Female fitness late | Male fitness late | Vp_tr2 | 1.034 | 0.306 |
Female fitness late | Male fitness late | V(G)/Vp_tr1 | 0.747 | 0.459 |
Female fitness late | Male fitness late | V(G)/Vp_tr2 | 0.069 | 0.291 |
Female fitness late | Male fitness late | rG | 1.000 | 2.158 |
Female fitness early | Male fitness late | V(G)_tr1 | 0.124 | 0.218 |
Female fitness early | Male fitness late | V(G)_tr2 | 0.001 | 0.233 |
Female fitness early | Male fitness late | C(G)_tr12 | -0.012 | 0.159 |
Female fitness early | Male fitness late | V(e)_tr1 | 0.756 | 0.437 |
Female fitness early | Male fitness late | V(e)_tr2 | 1.016 | 0.484 |
Female fitness early | Male fitness late | Vp_tr1 | 0.880 | 0.237 |
Female fitness early | Male fitness late | Vp_tr2 | 1.017 | 0.267 |
Female fitness early | Male fitness late | V(G)/Vp_tr1 | 0.141 | 0.281 |
Female fitness early | Male fitness late | V(G)/Vp_tr2 | 0.001 | 0.229 |
Female fitness early | Male fitness late | rG | -1.000 | 95.966 |
Female fitness late | Male fitness early | V(G)_tr1 | 0.495 | 0.186 |
Female fitness late | Male fitness early | V(G)_tr2 | 0.241 | 0.164 |
Female fitness late | Male fitness early | C(G)_tr12 | 0.266 | 0.122 |
Female fitness late | Male fitness early | V(e)_tr1 | 0.062 | 0.311 |
Female fitness late | Male fitness early | V(e)_tr2 | 0.490 | 0.311 |
Female fitness late | Male fitness early | Vp_tr1 | 0.557 | 0.154 |
Female fitness late | Male fitness early | Vp_tr2 | 0.731 | 0.170 |
Female fitness late | Male fitness early | V(G)/Vp_tr1 | 0.889 | 0.531 |
Female fitness late | Male fitness early | V(G)/Vp_tr2 | 0.329 | 0.285 |
Female fitness late | Male fitness early | rG | 0.770 | 0.353 |
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] brms_2.14.4 Rcpp_1.0.4.6 kableExtra_1.3.4 glue_1.4.2
[5] forcats_0.5.0 stringr_1.4.0 dplyr_1.0.0 purrr_0.3.4
[9] readr_2.0.0 tidyr_1.1.0 tibble_3.0.1 ggplot2_3.3.2
[13] tidyverse_1.3.0 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] readxl_1.3.1 backports_1.1.7 systemfonts_0.2.2
[4] plyr_1.8.6 igraph_1.2.5 splines_4.0.3
[7] crosstalk_1.1.0.1 TH.data_1.0-10 rstantools_2.1.1
[10] inline_0.3.15 digest_0.6.25 htmltools_0.5.0
[13] rsconnect_0.8.16 fansi_0.4.1 magrittr_2.0.1
[16] tzdb_0.1.2 modelr_0.1.8 RcppParallel_5.0.1
[19] matrixStats_0.56.0 vroom_1.5.3 xts_0.12-0
[22] sandwich_2.5-1 svglite_1.2.3 prettyunits_1.1.1
[25] colorspace_1.4-1 blob_1.2.1 rvest_0.3.5
[28] haven_2.3.1 xfun_0.22 callr_3.4.3
[31] crayon_1.3.4 jsonlite_1.7.0 lme4_1.1-23
[34] survival_3.2-7 zoo_1.8-8 gtable_0.3.0
[37] emmeans_1.4.7 webshot_0.5.2 V8_3.4.0
[40] pkgbuild_1.0.8 rstan_2.21.2 abind_1.4-5
[43] scales_1.1.1 mvtnorm_1.1-0 DBI_1.1.0
[46] miniUI_0.1.1.1 viridisLite_0.3.0 xtable_1.8-4
[49] bit_1.1-15.2 stats4_4.0.3 StanHeaders_2.21.0-3
[52] DT_0.13 htmlwidgets_1.5.1 httr_1.4.1
[55] threejs_0.3.3 ellipsis_0.3.1 pkgconfig_2.0.3
[58] loo_2.3.1 dbplyr_1.4.4 tidyselect_1.1.0
[61] rlang_0.4.6 reshape2_1.4.4 later_1.0.0
[64] munsell_0.5.0 cellranger_1.1.0 tools_4.0.3
[67] cli_2.0.2 generics_0.0.2 broom_0.5.6
[70] ggridges_0.5.2 evaluate_0.14 fastmap_1.0.1
[73] yaml_2.2.1 bit64_0.9-7 processx_3.4.2
[76] knitr_1.32 fs_1.4.1 nlme_3.1-149
[79] whisker_0.4 mime_0.9 projpred_2.0.2
[82] xml2_1.3.2 compiler_4.0.3 bayesplot_1.7.2
[85] shinythemes_1.1.2 rstudioapi_0.11 gamm4_0.2-6
[88] curl_4.3 reprex_0.3.0 statmod_1.4.34
[91] stringi_1.5.3 highr_0.8 ps_1.3.3
[94] Brobdingnag_1.2-6 gdtools_0.2.2 lattice_0.20-41
[97] Matrix_1.2-18 nloptr_1.2.2.1 markdown_1.1
[100] shinyjs_1.1 vctrs_0.3.0 pillar_1.4.4
[103] lifecycle_0.2.0 bridgesampling_1.0-0 estimability_1.3
[106] httpuv_1.5.3.1 R6_2.4.1 promises_1.1.0
[109] gridExtra_2.3 codetools_0.2-16 boot_1.3-25
[112] colourpicker_1.0 MASS_7.3-53 gtools_3.8.2
[115] assertthat_0.2.1 rprojroot_1.3-2 withr_2.2.0
[118] shinystan_2.5.0 multcomp_1.4-13 mgcv_1.8-33
[121] parallel_4.0.3 hms_0.5.3 grid_4.0.3
[124] coda_0.19-3 minqa_1.2.4 rmarkdown_2.5
[127] git2r_0.27.1 shiny_1.4.0.2 lubridate_1.7.10
[130] base64enc_0.1-3 dygraphs_1.1.1.6