mashr
Last updated: 2021-03-04
Checks: 6 1
Knit directory: fitnessGWAS/
This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.
Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.
Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.
The command set.seed(20180914)
was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.
Great job! Recording the operating system, R version, and package versions is critical for reproducibility.
To ensure reproducibility of the results, delete the cache directory plot_models_variant_effects_cache
and re-run the analysis. To have workflowr automatically delete the cache directory prior to building the file, set delete_cache = TRUE
when running wflow_build()
or wflow_publish()
.
Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.
Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.
The results in this page were generated with repository version c606d3d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.
Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish
or wflow_git_commit
). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:
Ignored files:
Ignored: .DS_Store
Ignored: .Rapp.history
Ignored: .Rhistory
Ignored: .Rproj.user/
Ignored: .httr-oauth
Ignored: .pversion
Ignored: analysis/.DS_Store
Ignored: analysis/correlations_SNP_effects_cache/
Ignored: analysis/plot_models_variant_effects_cache/
Ignored: code/.DS_Store
Ignored: code/Drosophila_GWAS.Rmd
Ignored: data/.DS_Store
Ignored: data/derived/
Ignored: data/input/.DS_Store
Ignored: data/input/.pversion
Ignored: data/input/dgrp.fb557.annot.txt
Ignored: data/input/dgrp2.bed
Ignored: data/input/dgrp2.bim
Ignored: data/input/dgrp2.fam
Ignored: data/input/huang_transcriptome/
Ignored: figures/.DS_Store
Untracked files:
Untracked: Rplots.pdf
Untracked: analysis/GO_KEGG_enrichment.Rmd
Untracked: analysis/TWAS_tables.Rmd
Untracked: code/GO_and_KEGG_gsea.R
Untracked: code/eQTL_analysis.Rmd
Untracked: code/make_sites_files_for_ARGweaver.R
Untracked: code/plotting_results.Rmd
Untracked: code/run_argweaver.R
Untracked: code/run_mashr.R
Untracked: data/argweaver/
Untracked: data/input/TF_binding_sites.csv
Untracked: figures/GWAS_mixture_proportions.rds
Untracked: figures/GWAS_stats_figure.pdf
Untracked: figures/TWAS_mixture_proportions.rds
Untracked: figures/TWAS_stats_figure.pdf
Untracked: figures/antagonism_ratios.pdf
Untracked: figures/composite_mixture_figure.pdf
Untracked: figures/eff_size_histos.pdf
Untracked: figures/stats_figure.pdf
Untracked: manuscript/
Unstaged changes:
Modified: .gitignore
Modified: README.md
Modified: analysis/GWAS_tables.Rmd
Modified: analysis/TWAS.Rmd
Modified: analysis/_site.yml
Deleted: analysis/about.Rmd
Deleted: analysis/eQTL_analysis.Rmd
Modified: analysis/gwas_adaptive_shrinkage.Rmd
Modified: analysis/license.Rmd
Modified: analysis/make_annotation_database.Rmd
Modified: analysis/plot_line_means.Rmd
Modified: analysis/plotting_results.Rmd
Deleted: code/gwas_adaptive_shrinkage.R
Modified: data/input/female_fitness.csv
Modified: data/input/female_fitness_CLEANED.csv
Modified: data/input/male_fitness.csv
Modified: data/input/male_fitness_CLEANED.csv
Modified: figures/figure1.eps
Deleted: figures/figure1.png
Modified: figures/figure2.eps
Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.
These are the previous versions of the repository in which changes were made to the R Markdown (analysis/plot_models_variant_effects.Rmd
) and HTML (docs/plot_models_variant_effects.html
) files. If you’ve configured a remote Git repository (see ?wflow_git_remote
), click on the hyperlinks in the table below to view the files as they were in that past version.
File | Version | Author | Date | Message |
---|---|---|---|---|
Rmd | c606d3d | lukeholman | 2021-03-04 | big first commit 2021 |
library(tidyverse)
library(gridExtra)
library(qqman)
library(ggbeeswarm)
library(Hmisc)
library(showtext) # For fancy Google font in figures
library(mashr)
library(kableExtra)
library(cowplot)
font_add_google(name = "Raleway", family = "Raleway", regular.wt = 400, bold.wt = 700) # Install font from Google Fonts
showtext_auto()
db <- DBI::dbConnect(RSQLite::SQLite(),
"data/derived/annotations.sqlite3")
# Results for all 1,613,615 SNPs, even those that are in 100% LD with others (these are grouped up by the SNP_clump column)
all_snps <- tbl(db, "univariate_lmm_results")
# All SNPs and SNP groups that are in <100% LD with one another (n = 1,207,357)
SNP_clumps <- all_snps %>% select(-SNP) %>% distinct() %>% collect(n = Inf)
Modify column names, reorder columns, filter based on the p-value, etc.
# Univariate analysis, mashr-adjusted, data-driven approach
univariate_lmm_results <- tbl(db, "univariate_lmm_results") %>%
select(-contains("canonical"),
-contains("raw")) %>%
inner_join(tbl(db, "variants") %>%
select(SNP, FBID, site.class, distance.to.gene, MAF),
by = "SNP") %>%
left_join(
tbl(db, "genes") %>%
select(FBID, gene_name), by = "FBID") %>%
collect(n = Inf) %>%
rename_all(~ gsub("beta_", "", .x)) %>%
rename_all(~ gsub("_mashr_ED", "", .x))
univariate_lmm_results <- univariate_lmm_results %>%
mutate(site.class = gsub("5_", "5-", site.class),
site.class = gsub("3_", "3-", site.class),
site.class = gsub("NON_", "NON-", site.class),
site.class = gsub("_", " ", site.class),
site.class = capitalize(tolower(site.class)),
site.class = gsub("Utr", "UTR", site.class)
)
# Results of one multivariate linear mixed model run in GEMMA:
# multivariate_lmm_results <-
# read_tsv("data/derived/output/all_four_traits.assoc.txt") %>%
# filter(p_fdr < 0.05) %>%
# inner_join(tbl(db, "variants") %>%
# select(-chr, -position) %>%
# collect(), by = "SNP") %>%
# arrange(p_wald) %>%
# select(-contains("Vbeta")) %>%
# left_join(
# tbl(db, "genes") %>%
# select(FBID, gene_name) %>%
# collect(n=Inf), by = "FBID") %>%
# mutate(gene_name = replace(gene_name, is.na(FBID), NA))
#
# multivariate_lmm_results <- multivariate_lmm_results[,c(1:8, 14, 9:13)]
#
# multivariate_lmm_results <- multivariate_lmm_results %>%
# mutate_at(vars(starts_with("p_")), ~ -log10(.x)) %>%
# mutate_if(is.numeric, ~ format(round(.x, 2), nsmall = 2)) %>%
# left_join(tbl(db, "univariate_lmm_results") %>%
# select(SNP, SNP_clump) %>%
# collect(),
# by = "SNP") %>%
# distinct(SNP_clump, .keep_all = TRUE) %>%
# select(SNP, SNP_clump, everything())
Here are some quantile-quantile plots, which are commonly used to check GWAS results, and to test the hypothesis that there are more SNPs than expected showing large effects on the trait of interest.
# univariate_lmm_pvals <- tbl(db, "univariate_lmm_results") %>%
# select(contains("LFSR")) %>% select(contains("ED")) %>% collect(n=Inf) %>% as.data.frame()
# qqman::qq(univariate_lmm_results$LFSR_female_early_mashr_ED)
univariate_lmm_pvals <- tbl(db, "univariate_lmm_results") %>%
select(contains("pvalue")) %>% collect(n=Inf) %>% as.data.frame()
qqman::qq(univariate_lmm_pvals$pvalue_female_early_raw)
Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE)
at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson
for each individual chunk that is cached. Using either autodep
or dependson
will remove this warning. See the knitr cache options for more details.
#qqman::qq(univariate_lmm_results$LFSR_female_late_mashr_ED)
qqman::qq(univariate_lmm_pvals$pvalue_female_late_raw)
Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE)
at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson
for each individual chunk that is cached. Using either autodep
or dependson
will remove this warning. See the knitr cache options for more details.
# qqman::qq(univariate_lmm_results$LFSR_male_early_mashr_ED)
qqman::qq(univariate_lmm_pvals$pvalue_male_early_raw)
Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE)
at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson
for each individual chunk that is cached. Using either autodep
or dependson
will remove this warning. See the knitr cache options for more details.
# qqman::qq(univariate_lmm_results$LFSR_male_late_mashr_ED)
qqman::qq(univariate_lmm_pvals$pvalue_male_late_raw)
Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE)
at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson
for each individual chunk that is cached. Using either autodep
or dependson
will remove this warning. See the knitr cache options for more details.
hex_plot <- function(x, y, xlab, ylab, title,
xlim = c(-0.6, 0.45),
ylim = c(-0.3, 0.41)
){
dat <- SNP_clumps %>%
mutate(facet = title)
ggplot(dat, aes_string(x, y)) +
geom_abline(linetype = 2) +
geom_vline(xintercept = 0, linetype = 3) +
geom_hline(yintercept = 0, linetype = 3) +
stat_binhex(bins = 50) +
geom_density_2d(colour = "white", alpha = 0.6) +
scale_fill_viridis_c() +
coord_cartesian(xlim = xlim, ylim = ylim) +
facet_wrap(~ facet) +
theme_bw() + xlab(xlab) + ylab(ylab) +
theme(legend.position = "none",
strip.background = element_blank(),
strip.text = element_text(hjust=0)) +
theme(text = element_text(family = "Raleway", size = 12))
}
grid.arrange(
hex_plot("beta_female_early_mashr_ED",
"beta_male_early_mashr_ED",
"Effect on female fitness",
NULL,
"A. Early life fitness"),
hex_plot("beta_female_late_mashr_ED",
"beta_male_late_mashr_ED",
"Effect on female fitness",
NULL,
"B. Late life fitness"),
ncol = 2, left = "Effect on male fitness")
SNP_clumps %>%
select(contains("mashr_ED")) %>%
select(contains("beta")) %>%
rename_all(~ paste(ifelse(str_detect(.x, "female"), "Female", "Male"),
ifelse(str_detect(.x, "early"), "early", "late"))) %>%
cor(method = "spearman") %>%
kable(digits = 3) %>% kable_styling(full_width = FALSE)
Female early | Female late | Male early | Male late | |
---|---|---|---|---|
Female early | 1.000 | 1.000 | 0.690 | 0.754 |
Female late | 1.000 | 1.000 | 0.686 | 0.751 |
Male early | 0.690 | 0.686 | 1.000 | 0.991 |
Male late | 0.754 | 0.751 | 0.991 | 1.000 |
The unusual distribution here reflects the fact that this analysis imposed constraints on the effects of each locus. For example, loci assigned to the ‘Female-specific’ category with high probability necessarily have an effect size close to zero in males. The figure should be interpreted cautiously - the main purpose of the canonical analysis is to inference the mixture proportions, and to assign mixture probabilities to each locus, not to accurately estimate the true effect size of each variant.
grid.arrange(
hex_plot("beta_female_early_mashr_canonical",
"beta_male_early_mashr_canonical",
"Effect on female fitness",
NULL,
"A. Early life fitness"),
hex_plot("beta_female_late_mashr_canonical",
"beta_male_late_mashr_canonical",
"Effect on female fitness",
NULL,
"B. Late life fitness"),
ncol = 2, left = "Effect on male fitness")
SNP_clumps %>%
select(contains("mashr_canonical")) %>%
select(contains("beta")) %>%
rename_all(~ paste(ifelse(str_detect(.x, "female"), "Female", "Male"),
ifelse(str_detect(.x, "early"), "early", "late"))) %>%
cor(method = "spearman") %>%
kable(digits = 3) %>% kable_styling(full_width = FALSE)
Female early | Female late | Male early | Male late | |
---|---|---|---|---|
Female early | 1.000 | 1.000 | 0.862 | 0.862 |
Female late | 1.000 | 1.000 | 0.862 | 0.862 |
Male early | 0.862 | 0.862 | 1.000 | 1.000 |
Male late | 0.862 | 0.862 | 1.000 | 1.000 |
Uses the variant effect sizes output by GEMMA, without applying any shrinkage (i.e. this is the data that was adjusted using mashr
).
grid.arrange(
hex_plot("beta_female_early_raw",
"beta_male_early_raw",
"Effect on female fitness",
NULL,
"A. Early life fitness",
xlim = c(-1, 1),
ylim = c(-1, 1)),
hex_plot("beta_female_late_raw",
"beta_male_late_raw",
"Effect on female fitness",
NULL,
"B. Late life fitness",
xlim = c(-1, 1),
ylim = c(-1, 1)),
ncol = 2, left = "Effect on male fitness")
SNP_clumps %>%
select(contains("raw")) %>%
select(contains("beta")) %>%
rename_all(~ paste(ifelse(str_detect(.x, "female"), "Female", "Male"),
ifelse(str_detect(.x, "early"), "early", "late"))) %>%
cor(method = "spearman") %>%
kable(digits = 3) %>% kable_styling(full_width = FALSE)
Female early | Female late | Male early | Male late | |
---|---|---|---|---|
Female early | 1.000 | 0.547 | 0.209 | 0.111 |
Female late | 0.547 | 1.000 | 0.204 | 0.157 |
Male early | 0.209 | 0.204 | 1.000 | 0.412 |
Male late | 0.111 | 0.157 | 0.412 | 1.000 |
mashr
# Make plot for the GWAS
mashr_results_canonical <- read_rds("data/derived/mashr_results_canonical.rds")
mashr_2L <- readRDS("data/derived/mashr_results_canonical_chr2L.rds")
mashr_2R <- readRDS("data/derived/mashr_results_canonical_chr2R.rds")
mashr_3L <- readRDS("data/derived/mashr_results_canonical_chr3L.rds")
mashr_3R <- readRDS("data/derived/mashr_results_canonical_chr3R.rds")
mashr_X <- readRDS("data/derived/mashr_results_canonical_chrX.rds")
mix <- bind_rows(
enframe(sort(get_estimated_pi(mashr_results_canonical))) %>%
mutate(Chromosome = "All"),
enframe(sort(get_estimated_pi(mashr_2L))) %>%
mutate(Chromosome = "2L"),
enframe(sort(get_estimated_pi(mashr_2R))) %>%
mutate(Chromosome = "2R"),
enframe(sort(get_estimated_pi(mashr_3L))) %>%
mutate(Chromosome = "3L"),
enframe(sort(get_estimated_pi(mashr_3R))) %>%
mutate(Chromosome = "3R"),
enframe(sort(get_estimated_pi(mashr_X))) %>%
mutate(Chromosome = "X")) %>%
rename(Mixture_component = name)
to_keep <- mix %>%
group_by(Mixture_component) %>%
summarise(value = max(value), .groups = "drop") %>%
filter(value > 0.01) %>%
pull(Mixture_component)
mix <- mix %>%
filter(Mixture_component %in% to_keep) %>%
spread(Mixture_component, value) %>%
rename(`Sexually concordant effect` = equal_effects,
`Female-specific effect` = Female_specific_1,
`Male-specific effect` = Male_specific_1,
`Sexually antagonistic effect` = Sex_antag_0.25,
`No effect on fitness` = null) %>%
gather(Mixture_component, value, -Chromosome) %>%
arrange(-value)
chr_levels <- mix %>%
filter(Mixture_component == "Sexually antagonistic effect") %>%
arrange(value) %>% pull(Chromosome)
mix <- mix %>%
mutate(Chromosome = factor(Chromosome, chr_levels),
Mixture_component = factor(Mixture_component,
c("Sexually antagonistic effect",
"Sexually concordant effect",
"Female-specific effect",
"Male-specific effect",
"No effect on fitness")))
p1 <- mix %>%
ggplot(aes(Chromosome, 100 * value)) +
geom_bar(stat = "identity",aes(fill = Chromosome), colour = "grey10", size = 0.3) +
scale_fill_brewer(palette = "Spectral", direction = -1) +
coord_flip() +
theme_bw() +
scale_y_continuous(expand = c(0,0), limits = c(0, 65)) +
scale_x_discrete(expand = c(0.14, 0.14)) +
theme(axis.ticks.y = element_blank(),
legend.position = "none",
panel.border = element_rect(size = 0.8),
text = element_text(family = "Raleway", size = 12),
strip.background = element_blank(),
panel.grid.major.y = element_blank()) +
ylab("Estimated % of loci") +
facet_wrap(~ Mixture_component, ncol = 1)
# Make plot for the TWAS
mashr_results_canonical <- readRDS("data/derived/TWAS/TWAS_canonical.rds")
mashr_2L <- readRDS("data/derived/TWAS/TWAS_canonical_2L.rds")
mashr_2R <- readRDS("data/derived/TWAS/TWAS_canonical_2R.rds")
mashr_3L <- readRDS("data/derived/TWAS/TWAS_canonical_3L.rds")
mashr_3R <- readRDS("data/derived/TWAS/TWAS_canonical_3R.rds")
mashr_X <- readRDS("data/derived/TWAS/TWAS_canonical_X.rds")
mix <- bind_rows(
enframe(sort(get_estimated_pi(mashr_results_canonical))) %>%
mutate(Chromosome = "All"),
enframe(sort(get_estimated_pi(mashr_2L))) %>%
mutate(Chromosome = "2L"),
enframe(sort(get_estimated_pi(mashr_2R))) %>%
mutate(Chromosome = "2R"),
enframe(sort(get_estimated_pi(mashr_3L))) %>%
mutate(Chromosome = "3L"),
enframe(sort(get_estimated_pi(mashr_3R))) %>%
mutate(Chromosome = "3R"),
enframe(sort(get_estimated_pi(mashr_X))) %>%
mutate(Chromosome = "X")) %>%
rename(Mixture_component = name) %>%
mutate(Mixture_component = str_remove_all(Mixture_component, "_0.25"),
Mixture_component = str_remove_all(Mixture_component, "_0.5"),
Mixture_component = str_remove_all(Mixture_component, "_0.75"),
Mixture_component = str_remove_all(Mixture_component, "_1.0")) %>%
group_by(Mixture_component, Chromosome) %>%
summarise(value = sum(value), .groups = "drop")
to_keep <- mix %>%
group_by(Mixture_component) %>%
summarise(value = max(value), .groups = "drop") %>%
filter(value > 0.01) %>%
pull(Mixture_component)
mix <- mix %>%
filter(Mixture_component %in% to_keep) %>%
spread(Mixture_component, value) %>%
rename(`Sexually concordant effect` = equal_effects,
`Female-specific effect` = Female_specific_1,
`Male-specific effect` = Male_specific_1,
`Sexually antagonistic effect` = Sex_antag,
`No effect on fitness` = null) %>%
gather(Mixture_component, value, -Chromosome) %>%
arrange(-value)
chr_levels <- rev(c("X", "3L", "3R", "All", "2R", "2L"))
mix <- mix %>%
mutate(Chromosome = factor(Chromosome, chr_levels),
Mixture_component = factor(Mixture_component,
c("Sexually antagonistic effect",
"Sexually concordant effect",
"Female-specific effect",
"Male-specific effect",
"No effect on fitness")))
p2 <- mix %>%
ggplot(aes(Chromosome, 100 * value)) +
geom_bar(stat = "identity",aes(fill = Chromosome), colour = "grey10", size = 0.3) +
scale_fill_brewer(palette = "Spectral", direction = -1) +
coord_flip() +
theme_bw() +
scale_y_continuous(expand = c(0,0), limits = c(0, 65)) + #
scale_x_discrete(expand = c(0.14, 0.14)) +
theme(axis.ticks.y = element_blank(),
legend.position = "none",
panel.border = element_rect(size = 0.8),
text = element_text(family = "Raleway", size = 12),
strip.background = element_blank(),
panel.grid.major.y = element_blank()) +
ylab("Estimated % of transcripts") +
xlab(" ") +
facet_wrap(~ Mixture_component, ncol = 1)
# Save composite figure of the GWAS and TWAS mixture proportions
ggsave(filename = "figures/composite_mixture_figure.pdf",
cowplot::plot_grid(p1, p2, labels = c('A', 'B'), label_size = 12),
width = 5, height = 8)
cowplot::plot_grid(p1, p2, labels = c('A', 'B'), label_size = 12)
Figure X: Proportions of each type of locus, as estimated using the mixture model computed by mashr
in the analysis using canonical covariance matrices. This analysis involved a number of pre-specified covariance matrices, each corresponding to a type of locus that we hypothesised to exist (shown in panel headings). The analysis fit some other matrix types not shown here, because the corresponding locus type was inferred to be rare/absent (these included neutral loci, estimated to comprise 0.5-1% of those tested, and age-antagonistic loci, none of which were detected). The analysis was run either using all 1,207,357 loci for which data were available (labeled ‘All’) or for all loci on each of the chromosomes (chromosomes 4 and Y had insufficient data). Notably, sexually-antagonistic loci were inferred to be especially common on the X chromosome, loci that affected males only were inferred to be rarer than those affecting females only, and chromosome 2R had many more female-specific than male-specific loci.
The following models test whether site class, MAF (minor allele frequency) and chromosome predict how each locus affected our four phenotypes.
These analyses use the mixture assignment probabilities calculated by mashr
(in the “canonical” analysis). For every locus, the mashr
analysis calculated the probability that each locus was A) sexually antagonistic (meaning it had opposite effects on male and female fitness that were consistent across the two age categories), B) female-specific (meaning it affect female fitness but not male), C) male-specific (affecting male fitness only), and D) had equal effects on all four phenotypes (mean the allele that associated with higher fitness in males also was associated with higher fitness in females). Other types of loci were also considered in the model (e.g. null effect loci and ones with age-specific effects), but these were found to be quite rare, precluding analysis here.
Here, we use these mixture assignment probabilities as the response variable, in order to ask whether e.g. sexually antagonistic loci tend to appear in any particular type of site or chromosome, and whether they had a higher or lower MAF than average. Since these probabilities are bounded between zero and one, we use beta regression (a form of GLM designed to model variables in the range [0,1]). We fit a full model with site class, MAF and chromosome as predictor variables, as well as all the simpler nested models, and compare them using AICc model selection. We also present the results of the full model, and plot the parameter estimates from each of the four models below.
The dataset used in this model is
library(betareg)
library(MuMIn)
library("lmtest")
LFSR_cutoff <- 0.05
dat <- univariate_lmm_results %>%
filter(LFSR_female_early < LFSR_cutoff | LFSR_female_late < LFSR_cutoff |
LFSR_male_early < LFSR_cutoff | LFSR_male_late < LFSR_cutoff) %>%
mutate(chr = gsub("_", "", substr(SNP, 1, 2))) %>%
select(SNP, SNP_clump, starts_with("P_"), MAF, site.class, chr) %>%
distinct()
# Focus only on the commonest site classes:
dat <- dat %>%
filter(site.class %in% c("Intron", "Intergenic", "Downstream", "Upstream",
"Synonymous coding", "Non-synonymous coding", "UTR 3-prime",
"Exon", "UTR 5-prime"))
# Remove chromosome 4 (too few sites)
dat <- dat %>% filter(chr != "4")
# If there are multiple site classes for a SNP, or multiple SNPs
# in the same 100% LD clump, pick one SNP and/or 1 site class at random
set.seed(1)
dat <- dat[sample(nrow(dat),nrow(dat)), ] %>%
split(.$SNP_clump) %>%
map_df(~ .x[1, ])
dat <- dat %>% arrange(site.class)
dat$site.class <- relevel(factor(dat$site.class), ref = "Synonymous coding")
n_loci <- prettyNum(nrow(dat), big.mark = ",", scientific = FALSE)
compare_mods <- function(MAF_and_Chromosome_and_site_class){
MAF_and_Chromosome <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class)
MAF_and_siteclass <- update(MAF_and_Chromosome_and_site_class, ~ . -chr)
Chromosome_and_siteclass <- update(MAF_and_Chromosome_and_site_class, ~ . -chr)
MAF <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class - chr)
Chromosome <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class - MAF)
siteclass <- update(MAF_and_Chromosome_and_site_class, ~ . - MAF -chr)
Null_model <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class - MAF -chr)
AICc(MAF_and_Chromosome_and_site_class,
MAF_and_Chromosome,
MAF_and_siteclass,
Chromosome_and_siteclass,
MAF, Chromosome, siteclass,
Null_model) %>%
rownames_to_column("Model") %>%
arrange(AICc) %>%
mutate(delta = AICc - AICc[1],
Model = str_replace_all(Model, "_and+_", " + "),
Weight = round(exp(-0.5 * delta) / sum(exp(-0.5 * delta)), 3),
delta = round(delta, 2)) %>%
kable(digits = 2) %>% kable_styling(full_width = FALSE)
}
In each of the following analyses, the response variable is the mixture assignment probability to type i for each of the 68,655 loci that affected at least one of the phenotypes significantly (defined as LFSR < 0.05), where i is one of the four variant types shown in the above figure. In cases where multiple SNP or indel loci were in 100% linkage with one another, we picked a single locus at random and discarded the others.
Three predictors were available for each locus: the minor allele frequency (MAF) in the overall DGRP, the site class of the variant, and chromosome. Loci that were annotated with more than one site class (e.g. intron as well as exon, due to an overlap of genes) were assigned one of these site classes at random.
To evaluate the effects of three predictors on the response variable, we fit 8 nested models and compared them using AICc. We also present the summary()
output for the top-ranked model according to AICc.
betareg(P_sex_antag ~ MAF + chr + site.class,
data = dat) %>% compare_mods()
Model | df | AICc | delta | Weight |
---|---|---|---|---|
MAF + Chromosome + site_class | 15 | -241307.6 | 0.00 | 1 |
MAF + Chromosome | 7 | -241296.6 | 10.98 | 0 |
MAF + siteclass | 11 | -241134.7 | 172.89 | 0 |
Chromosome + siteclass | 11 | -241134.7 | 172.89 | 0 |
MAF | 3 | -241122.7 | 184.92 | 0 |
Chromosome | 6 | -239763.5 | 1544.12 | 0 |
siteclass | 10 | -239576.6 | 1731.02 | 0 |
Null_model | 2 | -239566.2 | 1741.41 | 0 |
p1 <- summary(betareg(P_sex_antag ~ MAF + chr + site.class, data = dat), type = "deviance")
p1
Call:
betareg(formula = P_sex_antag ~ MAF + chr + site.class, data = dat)
Deviance residuals:
Min 1Q Median 3Q Max
-3.7291 -0.3614 -0.2404 0.5699 9.5453
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.798139 0.010795 -259.199 < 2e-16 ***
MAF 0.775745 0.019949 38.887 < 2e-16 ***
chr2R 0.069455 0.007259 9.568 < 2e-16 ***
chr3L 0.048502 0.007160 6.774 1.25e-11 ***
chr3R 0.084534 0.007546 11.202 < 2e-16 ***
chrX 0.086050 0.008029 10.718 < 2e-16 ***
site.classDownstream -0.001505 0.011094 -0.136 0.8921
site.classExon -0.014302 0.024914 -0.574 0.5659
site.classIntergenic 0.023060 0.009426 2.447 0.0144 *
site.classIntron 0.017865 0.008451 2.114 0.0345 *
site.classNon-synonymous coding -0.018356 0.017827 -1.030 0.3032
site.classUpstream 0.005090 0.011299 0.450 0.6524
site.classUTR 3-prime -0.010520 0.018416 -0.571 0.5678
site.classUTR 5-prime -0.051042 0.024760 -2.062 0.0393 *
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 29.3662 0.1622 181.1 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.207e+05 on 15 Df
Pseudo R-squared: 0.02869
Number of iterations: 23 (BFGS) + 2 (Fisher scoring)
betareg(P_female_specific ~ MAF + chr + site.class,
data = dat) %>% compare_mods()
Model | df | AICc | delta | Weight |
---|---|---|---|---|
MAF + Chromosome + site_class | 15 | -81289.74 | 0.00 | 1 |
MAF + Chromosome | 7 | -81271.61 | 18.13 | 0 |
Chromosome | 6 | -81253.26 | 36.48 | 0 |
MAF + siteclass | 11 | -80724.88 | 564.86 | 0 |
Chromosome + siteclass | 11 | -80724.88 | 564.86 | 0 |
MAF | 3 | -80712.02 | 577.72 | 0 |
siteclass | 10 | -80712.01 | 577.74 | 0 |
Null_model | 2 | -80698.62 | 591.12 | 0 |
p2 <- summary(betareg(P_female_specific ~ MAF + chr + site.class, data = dat), type = "deviance")
p2
Call:
betareg(formula = P_female_specific ~ MAF + chr + site.class, data = dat)
Deviance residuals:
Min 1Q Median 3Q Max
-1.7683 -0.5677 -0.3655 0.8111 5.3896
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.458023 0.016079 -90.679 < 2e-16 ***
MAF -0.132799 0.029881 -4.444 8.82e-06 ***
chr2R 0.239335 0.010850 22.058 < 2e-16 ***
chr3L 0.123513 0.010722 11.520 < 2e-16 ***
chr3R 0.193970 0.011316 17.141 < 2e-16 ***
chrX 0.179009 0.012085 14.813 < 2e-16 ***
site.classDownstream 0.018385 0.016622 1.106 0.2687
site.classExon -0.002719 0.037132 -0.073 0.9416
site.classIntergenic 0.055367 0.014154 3.912 9.16e-05 ***
site.classIntron 0.024304 0.012698 1.914 0.0556 .
site.classNon-synonymous coding -0.031334 0.026626 -1.177 0.2393
site.classUpstream 0.018190 0.016945 1.073 0.2831
site.classUTR 3-prime -0.018787 0.027549 -0.682 0.4953
site.classUTR 5-prime -0.059117 0.036726 -1.610 0.1075
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 4.17855 0.02205 189.5 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 4.066e+04 on 15 Df
Pseudo R-squared: 0.009817
Number of iterations: 23 (BFGS) + 2 (Fisher scoring)
betareg(P_male_specific ~ MAF + chr + site.class,
data = dat) %>% compare_mods()
Model | df | AICc | delta | Weight |
---|---|---|---|---|
MAF + Chromosome | 7 | -381128.0 | 0.00 | 0.98 |
MAF + Chromosome + site_class | 15 | -381120.2 | 7.77 | 0.02 |
Chromosome | 6 | -381055.3 | 72.69 | 0.00 |
MAF | 3 | -380995.0 | 132.99 | 0.00 |
MAF + siteclass | 11 | -380984.5 | 143.55 | 0.00 |
Chromosome + siteclass | 11 | -380984.5 | 143.55 | 0.00 |
Null_model | 2 | -380928.0 | 199.97 | 0.00 |
siteclass | 10 | -380918.0 | 209.97 | 0.00 |
p3 <- summary(betareg(P_male_specific ~ MAF + chr + site.class, data = dat), type = "deviance")
p3
Call:
betareg(formula = P_male_specific ~ MAF + chr + site.class, data = dat)
Deviance residuals:
Min 1Q Median 3Q Max
-2.1176 -0.8374 -0.7227 -0.5800 9.2003
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.747522 0.014364 -260.905 < 2e-16 ***
MAF 0.227406 0.026558 8.563 < 2e-16 ***
chr2R -0.114192 0.009635 -11.851 < 2e-16 ***
chr3L -0.041503 0.009396 -4.417 1.00e-05 ***
chr3R -0.052498 0.009978 -5.261 1.43e-07 ***
chrX -0.056485 0.010663 -5.297 1.18e-07 ***
site.classDownstream -0.014242 0.014758 -0.965 0.335
site.classExon -0.011391 0.032946 -0.346 0.730
site.classIntergenic -0.018763 0.012568 -1.493 0.135
site.classIntron 0.002749 0.011250 0.244 0.807
site.classNon-synonymous coding 0.012947 0.023487 0.551 0.581
site.classUpstream -0.002264 0.015030 -0.151 0.880
site.classUTR 3-prime 0.002971 0.024382 0.122 0.903
site.classUTR 5-prime -0.014438 0.032531 -0.444 0.657
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 34.7149 0.2147 161.7 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.906e+05 on 15 Df
Pseudo R-squared: 0.006034
Number of iterations: 29 (BFGS) + 2 (Fisher scoring)
betareg(P_equal_effects ~ MAF + chr + site.class,
data = dat) %>% compare_mods()
Model | df | AICc | delta | Weight |
---|---|---|---|---|
MAF + Chromosome + site_class | 15 | -37638.22 | 0.00 | 1 |
MAF + Chromosome | 7 | -37620.58 | 17.63 | 0 |
Chromosome | 6 | -37419.94 | 218.28 | 0 |
MAF + siteclass | 11 | -37318.73 | 319.49 | 0 |
Chromosome + siteclass | 11 | -37318.73 | 319.49 | 0 |
MAF | 3 | -37301.67 | 336.54 | 0 |
siteclass | 10 | -37105.80 | 532.42 | 0 |
Null_model | 2 | -37090.34 | 547.87 | 0 |
p4 <- summary(betareg(P_equal_effects ~ MAF + chr + site.class, data = dat), type = "deviance")
p4
Call:
betareg(formula = P_equal_effects ~ MAF + chr + site.class, data = dat)
Deviance residuals:
Min 1Q Median 3Q Max
-5.1450 -0.6580 0.1940 0.3527 2.0818
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.022064 0.015466 66.083 < 2e-16 ***
MAF -0.412143 0.028894 -14.264 < 2e-16 ***
chr2R -0.154896 0.010488 -14.769 < 2e-16 ***
chr3L -0.094204 0.010334 -9.116 < 2e-16 ***
chr3R -0.157356 0.010922 -14.407 < 2e-16 ***
chrX -0.156126 0.011655 -13.396 < 2e-16 ***
site.classDownstream -0.008109 0.016067 -0.505 0.613774
site.classExon 0.017694 0.035920 0.493 0.622298
site.classIntergenic -0.045783 0.013680 -3.347 0.000817 ***
site.classIntron -0.031409 0.012266 -2.561 0.010446 *
site.classNon-synonymous coding 0.031656 0.025722 1.231 0.218448
site.classUpstream -0.011775 0.016378 -0.719 0.472181
site.classUTR 3-prime 0.004774 0.026596 0.179 0.857551
site.classUTR 5-prime 0.068546 0.035500 1.931 0.053496 .
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 3.99001 0.01987 200.8 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.883e+04 on 15 Df
Pseudo R-squared: 0.009275
Number of iterations: 23 (BFGS) + 2 (Fisher scoring)
twas <- readRDS("data/derived/TWAS/TWAS_mixture_assignment_probabilities.rds") %>%
left_join(tbl(db, "genes") %>% select(FBID, chromosome) %>% collect(), by = "FBID") %>%
left_join(read_csv("data/derived/gene_expression_by_sex.csv"), by = "FBID") %>%
filter(chromosome %in% c("2L", "2R", "3L", "3R", "X")) %>%
mutate(h2 = (female_narrow_heritability + male_narrow_heritability) / 2)
compare_mods <- function(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability){
Chromosome_and_Expression_level_and_Heritability <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -male_bias_in_expression)
Sex_bias_and_Chromosome_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -h2)
Sex_bias_and_Expression_level_and_Heritability <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -chromosome)
Sex_bias_and_Chromosome_and_Heritability <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -AveExpr)
Chromosome_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, ~ . -male_bias_in_expression -h2)
Sex_bias_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, ~ . -chromosome -h2)
Sex_bias_and_Chromosome <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, ~ . -AveExpr -h2)
Heritability_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -male_bias_in_expression -chromosome)
Heritability_and_Sex_bias <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -chromosome -AveExpr)
Heritability_and_Chromosome <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
~ . -male_bias_in_expression -AveExpr)
Chromosome <- update(Chromosome_and_Expression_level, ~ . -AveExpr)
Sex_bias <- update(Sex_bias_and_Expression_level, ~ . -AveExpr)
Expression_level <- update(Sex_bias_and_Expression_level, ~ . - male_bias_in_expression)
Heritability <- update(Heritability_and_Sex_bias, ~ . - male_bias_in_expression)
Null_model <- update(Chromosome, ~ . -chromosome)
AICc(
Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
Chromosome_and_Expression_level_and_Heritability,
Sex_bias_and_Chromosome_and_Expression_level,
Sex_bias_and_Expression_level_and_Heritability,
Sex_bias_and_Chromosome_and_Heritability,
Chromosome_and_Expression_level,
Sex_bias_and_Expression_level,
Sex_bias_and_Chromosome,
Heritability_and_Expression_level,
Heritability_and_Sex_bias,
Heritability_and_Chromosome,
Chromosome, Sex_bias, Expression_level, Heritability,
Null_model) %>%
rownames_to_column("Model") %>%
arrange(AICc) %>%
mutate(delta = AICc - AICc[1],
Model = str_replace_all(Model, "_and+_", " + "),
Weight = round(exp(-0.5 * delta) / sum(exp(-0.5 * delta)), 3),
delta = round(delta, 2))
}
betareg(P_sex_antag ~ male_bias_in_expression + chromosome + AveExpr + h2,
data = twas) %>% compare_mods()
Model df AICc delta
1 Sex_bias + Expression_level 4 -30441.41 0.00
2 Sex_bias + Chromosome + Expression_level 8 -30441.06 0.34
3 Sex_bias + Expression_level + Heritability 5 -30439.52 1.89
4 Sex_bias + Chromosome + Expression_level + Heritability 9 -30439.15 2.26
5 Expression_level 3 -30422.02 19.38
6 Chromosome + Expression_level 7 -30421.95 19.46
7 Heritability + Expression_level 4 -30421.12 20.28
8 Chromosome + Expression_level + Heritability 8 -30420.90 20.51
9 Sex_bias + Chromosome + Heritability 8 -30392.54 48.86
10 Heritability + Sex_bias 4 -30392.47 48.94
11 Sex_bias 3 -30388.94 52.46
12 Sex_bias + Chromosome 7 -30388.74 52.67
13 Heritability + Chromosome 7 -30359.08 82.33
14 Heritability 3 -30358.16 83.25
15 Chromosome 6 -30357.43 83.98
16 Null_model 2 -30356.87 84.54
Weight
1 0.391
2 0.330
3 0.152
4 0.127
5 0.000
6 0.000
7 0.000
8 0.000
9 0.000
10 0.000
11 0.000
12 0.000
13 0.000
14 0.000
15 0.000
16 0.000
q1 <- summary(betareg(P_sex_antag ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q1
Call:
betareg(formula = P_sex_antag ~ male_bias_in_expression + AveExpr + chromosome +
h2, data = twas)
Deviance residuals:
Min 1Q Median 3Q Max
-8.4438 -0.3042 0.0289 0.2959 9.1754
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 0.296825 0.036808 8.064 7.37e-16 ***
male_bias_in_expression 0.029206 0.006488 4.502 6.74e-06 ***
AveExpr -0.044069 0.006316 -6.977 3.02e-12 ***
chromosome2R 0.011626 0.008820 1.318 0.187
chromosome3L 0.002737 0.008925 0.307 0.759
chromosome3R -0.010594 0.008469 -1.251 0.211
chromosomeX -0.003335 0.009569 -0.349 0.727
h2 0.007332 0.024456 0.300 0.764
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 35.4527 0.4151 85.41 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.523e+04 on 9 Df
Pseudo R-squared: 0.006467
Number of iterations: 14 (BFGS) + 2 (Fisher scoring)
betareg(P_female_specific ~ male_bias_in_expression + chromosome + AveExpr + h2,
data = twas) %>% compare_mods()
Model df AICc delta
1 Sex_bias + Chromosome + Expression_level + Heritability 9 -49193.02 0.00
2 Sex_bias + Expression_level + Heritability 5 -49183.00 10.02
3 Chromosome + Expression_level + Heritability 8 -49180.74 12.28
4 Heritability + Expression_level 4 -49172.53 20.49
5 Sex_bias + Chromosome + Heritability 8 -49140.68 52.34
6 Heritability + Chromosome 7 -49136.83 56.19
7 Heritability + Sex_bias 4 -49126.38 66.64
8 Heritability 3 -49124.07 68.95
9 Sex_bias + Chromosome + Expression_level 8 -49098.44 94.58
10 Chromosome + Expression_level 7 -49094.93 98.09
11 Sex_bias + Chromosome 7 -49084.10 108.92
12 Chromosome 6 -49082.96 110.06
13 Sex_bias + Expression_level 4 -49081.44 111.58
14 Expression_level 3 -49079.47 113.55
15 Null_model 2 -49065.62 127.40
16 Sex_bias 3 -49065.52 127.50
Weight
1 0.991
2 0.007
3 0.002
4 0.000
5 0.000
6 0.000
7 0.000
8 0.000
9 0.000
10 0.000
11 0.000
12 0.000
13 0.000
14 0.000
15 0.000
16 0.000
q2 <- summary(betareg(P_female_specific ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q2
Call:
betareg(formula = P_female_specific ~ male_bias_in_expression + AveExpr +
chromosome + h2, data = twas)
Deviance residuals:
Min 1Q Median 3Q Max
-5.8656 -0.2861 -0.2321 0.3132 9.7894
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.83425 0.06386 -44.384 < 2e-16 ***
male_bias_in_expression 0.04314 0.01121 3.848 0.000119 ***
AveExpr 0.08013 0.01094 7.327 2.34e-13 ***
chromosome2R 0.01133 0.01535 0.738 0.460479
chromosome3L 0.02644 0.01550 1.706 0.088002 .
chromosome3R 0.01641 0.01474 1.114 0.265364
chromosomeX 0.06539 0.01646 3.972 7.14e-05 ***
h2 -0.40886 0.04342 -9.417 < 2e-16 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 33.4005 0.4021 83.06 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 2.461e+04 on 9 Df
Pseudo R-squared: 0.008763
Number of iterations: 17 (BFGS) + 2 (Fisher scoring)
betareg(P_male_specific ~ male_bias_in_expression + chromosome + AveExpr + h2,
data = twas) %>% compare_mods()
Model df AICc delta
1 Sex_bias + Expression_level + Heritability 5 -47455.26 0.00
2 Sex_bias + Chromosome + Expression_level + Heritability 9 -47450.05 5.21
3 Heritability + Expression_level 4 -47444.90 10.36
4 Chromosome + Expression_level + Heritability 8 -47439.34 15.91
5 Sex_bias + Expression_level 4 -47439.03 16.23
6 Sex_bias + Chromosome + Expression_level 8 -47433.74 21.52
7 Expression_level 3 -47423.39 31.87
8 Chromosome + Expression_level 7 -47417.98 37.27
9 Heritability + Sex_bias 4 -47414.16 41.10
10 Sex_bias 3 -47412.03 43.23
11 Sex_bias + Chromosome + Heritability 8 -47408.91 46.34
12 Sex_bias + Chromosome 7 -47406.82 48.44
13 Heritability 3 -47392.28 62.98
14 Null_model 2 -47388.24 67.02
15 Heritability + Chromosome 7 -47387.03 68.22
16 Chromosome 6 -47383.19 72.07
Weight
1 0.926
2 0.068
3 0.005
4 0.000
5 0.000
6 0.000
7 0.000
8 0.000
9 0.000
10 0.000
11 0.000
12 0.000
13 0.000
14 0.000
15 0.000
16 0.000
q3 <- summary(betareg(P_male_specific ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q3
Call:
betareg(formula = P_male_specific ~ male_bias_in_expression + AveExpr +
chromosome + h2, data = twas)
Deviance residuals:
Min 1Q Median 3Q Max
-4.6317 -0.3338 -0.2835 0.2417 7.8801
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.845465 0.070036 -40.629 < 2e-16 ***
male_bias_in_expression -0.043538 0.012423 -3.505 0.000457 ***
AveExpr 0.078981 0.011989 6.588 4.46e-11 ***
chromosome2R -0.020552 0.016788 -1.224 0.220868
chromosome3L -0.012103 0.016969 -0.713 0.475699
chromosome3R 0.003115 0.016070 0.194 0.846299
chromosomeX -0.008758 0.018145 -0.483 0.629336
h2 -0.193851 0.046887 -4.134 3.56e-05 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 27.0203 0.3273 82.54 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 2.373e+04 on 9 Df
Pseudo R-squared: 0.005672
Number of iterations: 16 (BFGS) + 2 (Fisher scoring)
betareg(P_equal_effects ~ male_bias_in_expression + chromosome + AveExpr + chromosome + h2,
data = twas) %>% compare_mods()
Model df AICc delta
1 Sex_bias + Chromosome + Expression_level + Heritability 9 -28799.43 0.00
2 Sex_bias + Expression_level + Heritability 5 -28794.12 5.30
3 Chromosome + Expression_level + Heritability 8 -28789.58 9.85
4 Sex_bias + Chromosome + Expression_level 8 -28783.96 15.47
5 Heritability + Expression_level 4 -28783.81 15.62
6 Sex_bias + Expression_level 4 -28777.46 21.97
7 Chromosome + Expression_level 7 -28769.30 30.13
8 Expression_level 3 -28761.81 37.62
9 Sex_bias + Chromosome 7 -28689.41 110.02
10 Sex_bias + Chromosome + Heritability 8 -28687.47 111.96
11 Sex_bias 3 -28682.03 117.39
12 Heritability + Sex_bias 4 -28680.19 119.24
13 Chromosome 6 -28660.46 138.97
14 Heritability + Chromosome 7 -28658.93 140.50
15 Null_model 2 -28651.16 148.27
16 Heritability 3 -28649.93 149.49
Weight
1 0.927
2 0.065
3 0.007
4 0.000
5 0.000
6 0.000
7 0.000
8 0.000
9 0.000
10 0.000
11 0.000
12 0.000
13 0.000
14 0.000
15 0.000
16 0.000
q4 <- summary(betareg(P_equal_effects ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q4
Call:
betareg(formula = P_equal_effects ~ male_bias_in_expression + AveExpr +
chromosome + h2, data = twas)
Deviance residuals:
Min 1Q Median 3Q Max
-4.7139 -0.2446 -0.1877 0.2974 7.6173
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.209333 0.067199 -32.877 < 2e-16 ***
male_bias_in_expression -0.040995 0.011892 -3.447 0.000566 ***
AveExpr 0.122788 0.011499 10.678 < 2e-16 ***
chromosome2R -0.024335 0.016146 -1.507 0.131758
chromosome3L -0.001919 0.016297 -0.118 0.906247
chromosome3R 0.026460 0.015421 1.716 0.086181 .
chromosomeX 0.020194 0.017384 1.162 0.245382
h2 -0.183550 0.044779 -4.099 4.15e-05 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 15.8446 0.1861 85.16 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.441e+04 on 9 Df
Pseudo R-squared: 0.01127
Number of iterations: 16 (BFGS) + 2 (Fisher scoring)
betareg(P_null ~ male_bias_in_expression + chromosome + AveExpr + chromosome + h2,
data = twas) %>% compare_mods()
Model df AICc delta
1 Sex_bias + Chromosome + Heritability 8 -34144.13 0.00
2 Sex_bias + Chromosome + Expression_level + Heritability 9 -34144.06 0.07
3 Sex_bias + Expression_level + Heritability 5 -34131.89 12.24
4 Heritability + Sex_bias 4 -34131.37 12.76
5 Heritability + Chromosome 7 -34046.97 97.16
6 Chromosome + Expression_level + Heritability 8 -34045.38 98.75
7 Heritability 3 -34040.36 103.77
8 Heritability + Expression_level 4 -34038.56 105.57
9 Sex_bias + Chromosome + Expression_level 8 -33977.32 166.82
10 Sex_bias + Chromosome 7 -33967.50 176.63
11 Sex_bias + Expression_level 4 -33960.34 183.79
12 Sex_bias 3 -33951.77 192.36
13 Chromosome + Expression_level 7 -33913.19 230.94
14 Expression_level 3 -33902.16 241.97
15 Chromosome 6 -33892.38 251.75
16 Null_model 2 -33883.30 260.83
Weight
1 0.508
2 0.490
3 0.001
4 0.001
5 0.000
6 0.000
7 0.000
8 0.000
9 0.000
10 0.000
11 0.000
12 0.000
13 0.000
14 0.000
15 0.000
16 0.000
q5 <- summary(betareg(P_null ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q5
Call:
betareg(formula = P_null ~ male_bias_in_expression + AveExpr + chromosome +
h2, data = twas)
Deviance residuals:
Min 1Q Median 3Q Max
-5.7312 -0.2524 0.2238 0.5991 4.1395
Coefficients (mean model with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.83405 0.06705 -27.352 < 2e-16 ***
male_bias_in_expression 0.11931 0.01170 10.200 < 2e-16 ***
AveExpr 0.01597 0.01151 1.388 0.165237
chromosome2R 0.05831 0.01607 3.630 0.000284 ***
chromosome3L 0.02796 0.01632 1.713 0.086675 .
chromosome3R 0.02435 0.01550 1.571 0.116190
chromosomeX 0.06474 0.01739 3.723 0.000197 ***
h2 -0.58302 0.04607 -12.656 < 2e-16 ***
Phi coefficients (precision model with identity link):
Estimate Std. Error z value Pr(>|z|)
(phi) 18.8218 0.2231 84.36 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.708e+04 on 9 Df
Pseudo R-squared: 0.01273
Number of iterations: 19 (BFGS) + 2 (Fisher scoring)
gwas_stats_fig <- bind_rows(p1$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(sexually antagonistic)"),
p2$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(female-specific)"),
p3$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(male-specific)"),
p4$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(equal effects)")) %>%
mutate(Parameter = str_replace_all(Parameter, "site[.]class", "Site class: "),
Parameter = str_replace_all(Parameter, "chr", "Chromosome: "),
sig = ifelse(`Pr(>|z|)` < 0.05, "yes", "no")) %>%
filter(Parameter != "(Intercept)") %>%
arrange(Parameter) %>%
mutate(Parameter = factor(Parameter, rev(unique(Parameter)))) %>%
ggplot(aes(Parameter, Estimate, colour = sig)) +
geom_hline(yintercept = 0, linetype = 2) +
geom_errorbar(aes(ymin = Estimate - `Std. Error`*1.96, ymax = Estimate + `Std. Error`*1.96), width = 0) +
geom_point() +
coord_flip() +
facet_wrap(~type) +
theme_bw() +
theme(legend.position = "none",
strip.background = element_blank(),
text = element_text(family = "Raleway", size = 12)) +
scale_color_manual(values = c("black", "tomato")) +
ylab("Estimate \u00B1 95% CIs")
twas_stats_fig <- bind_rows(q1$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(sexually antagonistic)"),
q2$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(female-specific)"),
q3$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(male-specific)"),
q4$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(equal effects)"),
q5$coefficients$mean %>%
as.data.frame() %>%
rownames_to_column("Parameter") %>%
mutate(type = "P(null effect)")) %>%
mutate(Parameter = str_replace_all(Parameter, "AveExpr", "Average expression level"),
Parameter = str_replace_all(Parameter, "male_bias_in_expression", "Male bias in expression"),
Parameter = str_replace_all(Parameter, "chromosome", "Chromosome: "),
sig = ifelse(`Pr(>|z|)` < 0.05, "yes", "no"),
type = factor(type, c("P(null effect)", "P(equal effects)", "P(sexually antagonistic)",
"P(female-specific)", "P(male-specific)"))) %>%
filter(Parameter != "(Intercept)") %>%
arrange(Parameter) %>%
mutate(Parameter = factor(Parameter, rev(unique(Parameter)))) %>%
ggplot(aes(Parameter, Estimate, colour = sig)) +
geom_hline(yintercept = 0, linetype = 2) +
geom_errorbar(aes(ymin = Estimate - `Std. Error`*1.96, ymax = Estimate + `Std. Error`*1.96), width = 0) +
geom_point() +
coord_flip() +
facet_wrap(~type) +
theme_bw() +
theme(legend.position = "none",
strip.background = element_blank(),
text = element_text(family = "Raleway", size = 12)) +
scale_color_manual(values = c("black", "tomato")) +
ylab("Estimate \u00B1 95% CIs")
ggsave("figures/GWAS_stats_figure.pdf", gwas_stats_fig, width = 6, height = 6)
ggsave("figures/TWAS_stats_figure.pdf", twas_stats_fig, width = 7, height = 6)
gwas_stats_fig
Figure XX: The figure shows estimated parameters from four separate beta regression models seeking predictors of the mixture assignment probabilities calculated using mashr
on the GWAS results. Positive values indicate that the predictor is associated with an elevated probability across loci; estimates that are non-zero with 95% confidence are shown in red. The clearest result is the strong, positive relationship between minor allele frequency (MAF) and the probability that the locus was assigned to the ‘sexually antagonistic’ mixture component. There was also a strong, negative relationship between MAF and the chance that the locus was assigned to the ‘equal effects’ mixture component (i.e. that the locus had concordant effects on male and female fitness). Loci with a relatively high probability of being female-specific tended to have lower MAF, while loci with a relatively high probability of being male-specific tended to have higher MAF. Chromosome 2L seems to be enriched for loci with male-specific effects, and de-enriched for loci that affect females relative to the other chromosomes. Finally, there were some associations between site class and mixture assignment probability, which (though sometimes statistically significant) were small and subtle.
twas_stats_fig
Figure XX: The figure shows estimated parameters from five separate beta regression models seeking predictors of the mixture assignment probabilities calculated using mashr
on the TWAS results. Positive values indicate that the predictor is associated with an elevated probability across transcripts; estimates that are non-zero with 95% confidence are shown in red. The clearest result is the strong, negative relationship between the heritability of the transcript’s expression level (calculated by Huang et al. 2015) and the probability that the locus was assigned to any mixture component other than ‘sexually antagonistic’. Transcripts with strongly male-biased expression were more likely to have no relationship with fitness (i.e. to be assigned to the ‘null’ mixture component); sex bias was a significant but weak predictor for the other mixture components. Transcripts with high expression levels were more likely to have a sexually concordant relationship with fitness, and less likely to be sexually antagonistic. Transcripts from genes located on the X chromosome were slightly more likely to have a female-specific correlation with fitness, or no correlation.
Evolutionary theory makes various testable predictions about the effects of the major and minor alleles on phenotypes, especially phenotypes that are closely correlated with fitness. For example, the minor allele should generally be associated with the lower-fitness phenotype (and the major allele with the higher-fitness phenotype), due to selection removing the lower-fitness allele. Furthermore, it is interesting to ask whether the female-beneficial allele or the male-beneficial allele tends to be the major allele at sexually-antagonistic loci? Lastly, we might expect different results at loci where the minor allele is quite rare vs loci where the minor allele is only slightly rarer than the major allele; loci with comparatively low MAF are more likely to reflect new polymorphisms (since most mutations are neutral or deleterious, and it takes time for the minor allele to drift towards high frequencies), and are more likely to be under selection (since selection against the minor allele keeps it rare).
To address these questions graphically, we here plot the effects on male and female early-life fitness of each allele, that was assigned to the female-specific, male-specific, or sexually antagonistic mixture component by mashr
, with an assignment probability in the highest 0.1% across all the loci analysed. Since there were 1,602,873 loci, each column plots the male and female effect sizes for 1,602 loci.
Among loci with female-specific effects on fitness (first column), the minor allele was usually associated with reduced female fitness, especially among loci where the minor allele frequency was below 0.2, as expected if alleles that harm female fitness are removed by selection. There was no similar finding for loci with male-specific effects (second column): the minor allele was equally likely to increase or reduce fitness, regardless of MAF. For loci with sexually-antagonistic effects on fitness (third column), the minor allele tended to be the one that reduced female fitness and elevated male fitness; this pattern was clearest among loci with a MAF < 0.2. Finally, among loci inferred to be under sexually concordant selection, the effect sizes were strongest among loci with MAF < 0.2, as expected if selection prevents alleles with strong, detrimental effects on the fitness of both sexes from becoming common.
histo_data <- tbl(db, "univariate_lmm_results") %>%
inner_join(tbl(db, "variants") %>%
select(-chr, -position),
by = "SNP") %>%
collect(n=Inf) %>%
mutate(class =
case_when(
P_sex_antag > quantile(P_sex_antag, probs=0.999) ~ "Top 0.1%\nsexually antagonistic loci",
P_equal_effects > quantile(P_equal_effects, probs=0.999) ~ "Top 0.1%\nsexually concordant loci",
P_female_specific > quantile(P_female_specific, probs=0.999) ~ "Top 0.1%\nfemale-specific loci",
P_male_specific > quantile(P_male_specific, probs=0.999) ~ "Top 0.1%\nmale-specific loci",
)) %>%
mutate(MAF = ifelse(MAF < 0.2, "0.05 < MAF < 0.2", "0.2 < MAF < 0.5")) %>%
filter(!is.na(class)) %>%
select(beta_female_early_mashr_ED, beta_male_early_mashr_ED, class, MAF)
eff_size_histos <- ggplot(histo_data, aes(beta_female_early_mashr_ED)) +
geom_vline(xintercept = 0, linetype = 2) +
geom_histogram(bins=30,fill="tomato",colour=NA, alpha = 0.5) +
geom_histogram(aes(beta_male_early_mashr_ED), bins=30,fill="steelblue",colour=NA, alpha = 0.5) +
geom_histogram(bins=30,fill=NA,colour="black") +
geom_histogram(aes(beta_male_early_mashr_ED), bins=30,fill=NA,colour="black") +
facet_grid(MAF ~ class, scales = "free_y") +
xlab("Effect of the minor allele on early-life fitness\n(females:red, males:blue)") +
theme_bw() +
theme(legend.position = "none",
strip.background = element_blank(),
text = element_text(family = "Raleway", size = 12)) +
ylab("Number of loci")
ggsave("figures/eff_size_histos.pdf", width = 8, height = 5)
eff_size_histos
manhattan_data <- tbl(db, "univariate_lmm_results") %>%
select(SNP, starts_with("P_"),
beta_female_early_mashr_ED, beta_male_early_mashr_ED) %>%
distinct() %>%
collect(n=Inf) %>%
mutate(position = str_split(SNP, "_"),
chr = map_chr(position, ~ .x[1]),
position = as.numeric(map_chr(position, ~ .x[2])),
P_pleiotropy = 1 - P_null - P_female_specific - P_male_specific,
SA_not_concord = P_sex_antag > P_equal_effects,
top_SA = P_sex_antag >= quantile(P_sex_antag, probs=0.9999)) %>%
mutate(beta_female_early_mashr_ED = abs(beta_female_early_mashr_ED),
beta_male_early_mashr_ED = abs(beta_male_early_mashr_ED)) %>%
# mutate(beta_male_early_mashr_ED = replace(
# beta_male_early_mashr_ED,
# beta_female_early_mashr_ED < 0,
# beta_male_early_mashr_ED[beta_female_early_mashr_ED < 0] * -1),
# beta_female_early_mashr_ED = replace(
# beta_female_early_mashr_ED,
# beta_female_early_mashr_ED < 0,
# beta_female_early_mashr_ED[beta_female_early_mashr_ED < 0] * -1),
# ) %>%
filter(chr != "4")
max_pos <- manhattan_data %>%
group_by(chr) %>%
summarise(max_pos = max(position), .groups = "drop") %>%
as.data.frame()
max_pos$max_pos <- c(0, cumsum(max_pos$max_pos[1:4]))
manhattan_data <- manhattan_data %>%
left_join(max_pos, by = "chr") %>%
mutate(position = position + max_pos)
# cols <- c("blue", "skyblue", "red", "pink", "orange", "yellow",
# "purple", "steelblue", "green", "tomato")
# manhattan_data %>%
# filter(P_pleiotropy > 0.7485376) %>%
# sample_n(100000) %>%
# ggplot(aes(position, -1 * log10(1 - P_pleiotropy), colour = chr, stroke = 0.2)) +
# geom_point(size = 0.5, pch = 1) +
# # scale_color_manual(values = cols) +
# # scale_fill_brewer(palette = "Paired", name = "Chromosome") +
# scale_colour_brewer(palette = "Paired", name = "Chromosome") +
# ylab("-log10(Posterior probability of null effect)") +
# theme_bw() +
# theme(axis.text.x = element_blank(),
# axis.ticks.x = element_blank())
p1 <- manhattan_data %>%
ggplot(aes(position, beta_female_early_mashr_ED, group = chr, colour = chr, stroke = 0.2)) +
geom_point(size = 0.5) +
geom_point(data = manhattan_data %>% filter(top_SA),
aes(y = 0.43), size = 3, pch = 6) +
scale_colour_brewer(palette = "Paired", name = "Chromosome") +
ylab("Effect on female fitness") +
xlab("") +
theme_bw() +
theme(axis.text.x = element_blank(),
panel.border = element_blank(),
axis.ticks.x = element_blank())
p2 <- manhattan_data %>%
ggplot(aes(position, beta_male_early_mashr_ED, group = chr, colour = chr, stroke = 0.2)) +
geom_point(size = 0.5) +
scale_colour_brewer(palette = "Paired", name = "Chromosome") +
ylab("Effect on male fitness") +
xlab("Position") +
scale_y_reverse() +
theme_bw() +
theme(axis.text.x = element_blank(),
panel.border = element_blank(),
axis.ticks.x = element_blank())
grid.arrange(p1, p2)
make_evidence_ratio_plot <- function(dat, ymax, ylab){
# Argument needs to be dataframe of loci or transcripts, which are identified by col called "identifier"
# The FE, FL,ME, ML effect sizes need to be called pheno1-pheno4
# The corresponding LFSR columns need to be called LFSR1-LFSR4
antagonism_evidence_ratios <- dat %>%
# Convert the LFSR to probability that beta is positive
# Here, we use the ED results because they are agnostic to our expectations for the results
# However, running this with the Canonocial results instead gives an essentially identical graph
mutate(pp_female_early = ifelse(pheno1 > 0, LFSR1, 1 - LFSR1),
pp_female_late = ifelse(pheno2 > 0, LFSR2, 1 - LFSR2),
pp_male_early = ifelse(pheno3 > 0, LFSR3, 1 - LFSR3),
pp_male_late = ifelse(pheno4 > 0, LFSR4, 1 - LFSR4)) %>%
# Calculate the probabilities that beta_i and beta_j have the same/opposite signs
mutate(p_sex_concord_early = pp_female_early * pp_male_early +
(1 - pp_female_early) * (1 - pp_male_early),
p_sex_antag_early = pp_female_early * (1 - pp_male_early) +
(1 - pp_female_early) * pp_male_early,
p_sex_concord_late = pp_female_late * pp_male_late +
(1 - pp_female_late) * (1 - pp_male_late),
p_sex_antag_late = pp_female_late * (1 - pp_male_late) +
(1 - pp_female_late) * pp_male_late,
p_age_concord_females = pp_female_early * pp_female_late +
(1 - pp_female_early) * (1 - pp_female_late),
p_age_antag_females = pp_female_early * (1 - pp_female_late) +
(1 - pp_female_early) * pp_female_late,
p_age_concord_males = pp_male_early * pp_male_late + (1 - pp_male_early) * (1 - pp_male_late),
p_age_antag_males = pp_male_early * (1 - pp_male_late) + (1 - pp_male_early) * pp_male_late) %>%
# Find the ratios of some of these probabilities (i.e. "evidence ratios")
mutate(`Inter-sex (early life)` = p_sex_concord_early / p_sex_antag_early,
`Inter-sex (late life)` = p_sex_concord_late / p_sex_antag_late,
`Inter-age (females)` = p_age_concord_females / p_age_antag_females,
`Inter-age (males)` = p_age_concord_males / p_age_antag_males) %>%
select(identifier, starts_with("Inter")) %>%
gather(trait, evidence_ratio, -identifier) %>%
mutate(trait = factor(trait, c("Inter-sex (early life)",
"Inter-sex (late life)",
"Inter-age (females)",
"Inter-age (males)")))
antagonism_evidence_ratios %>%
ggplot(aes(log2(evidence_ratio))) +
geom_histogram(data=subset(antagonism_evidence_ratios, evidence_ratio < 1),
bins = 500, fill = "#FF635C") +
geom_histogram(data=subset(antagonism_evidence_ratios, evidence_ratio > 1),
bins = 500, fill = "#5B8AFD") +
coord_cartesian(xlim = c(-10, 10), ylim = c(0, ymax)) +
scale_x_continuous(breaks = c(-10, -6, -2, 2, 6, 10),
labels = c(paste("1/",2 ^ c(10, 6, 2), sep = ""), 2 ^ c(2,6,10))) +
facet_wrap(~ trait) +
xlab("Evidence ratio (log2 scale)") + ylab(ylab) +
theme_bw() +
theme(panel.border = element_rect(size = 0.8),
# text = element_text(family = "Raleway", size = 12),
axis.text.x = element_text(angle = 45, vjust = 1, hjust=1),
strip.background = element_blank())
}
GWAS_ratios_plot <- SNP_clumps %>%
rename(pheno1 = beta_female_early_mashr_ED,
pheno2 = beta_female_late_mashr_ED,
pheno3 = beta_male_early_mashr_ED,
pheno4 = beta_male_late_mashr_ED,
LFSR1 = LFSR_female_early_mashr_canonical,
LFSR2 = LFSR_female_late_mashr_canonical,
LFSR3 = LFSR_male_early_mashr_canonical,
LFSR4 = LFSR_male_late_mashr_canonical,
identifier = SNP_clump) %>%
make_evidence_ratio_plot(ymax = 10000, ylab = "Number of loci")
TWAS_ED <- readRDS("data/derived/TWAS/TWAS_ED.rds")
TWAS_canonical <- readRDS("data/derived/TWAS/TWAS_canonical.rds")
TWAS_mashr_results <-
data.frame(
FBID = read_delim("data/input/huang_transcriptome/dgrp.array.exp.female.txt", delim = " ") %>% pull(gene),
as.data.frame(get_pm(TWAS_canonical)) %>% rename_all(~str_c("canonical_", .)),
as.data.frame(get_lfsr(TWAS_canonical)) %>%
rename_all(~str_replace_all(., "beta", "LFSR")) %>% rename_all(~str_c("canonical_", .))) %>%
as_tibble() %>%
bind_cols(
data.frame(
as.data.frame(get_pm(TWAS_ED)) %>% rename_all(~str_c("ED_", .)),
as.data.frame(get_lfsr(TWAS_ED)) %>%
rename_all(~str_replace_all(., "beta", "LFSR")) %>% rename_all(~str_c("ED_", .))) %>%
as_tibble())
TWAS_ratios_plot <- TWAS_mashr_results %>%
rename(pheno1 = ED_beta_FE,
pheno2 = ED_beta_FL,
pheno3 = ED_beta_ME,
pheno4 = ED_beta_ML,
LFSR1 = ED_LFSR_FE,
LFSR2 = ED_LFSR_FL,
LFSR3 = ED_LFSR_ME,
LFSR4 = ED_LFSR_ML,
identifier = FBID) %>%
make_evidence_ratio_plot(ymax = 500, ylab = "Number of transcripts")
pp <- plot_grid(GWAS_ratios_plot, TWAS_ratios_plot,
labels = c('A', 'B'), label_size = 12)
# ggdraw(add_sub(pp, "Evidence ratio (log scale)",
# vpadding=grid::unit(0,"lines"),
# y=6.3, x=0.5, vjust=6))
ggsave("figures/antagonism_ratios.pdf", pp, width = 8.5, height = 4.9)
pp
sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] lmtest_0.9-37 zoo_1.8-8 MuMIn_1.43.17 betareg_3.1-4
[5] cowplot_1.0.0 kableExtra_1.1.0 mashr_0.2.38 ashr_2.2-47
[9] showtext_0.9-1 showtextdb_3.0 sysfonts_0.8.2 Hmisc_4.4-0
[13] Formula_1.2-3 survival_3.2-7 lattice_0.20-41 ggbeeswarm_0.6.0
[17] qqman_0.1.4 gridExtra_2.3 forcats_0.5.0 stringr_1.4.0
[21] dplyr_1.0.0 purrr_0.3.4 readr_1.3.1 tidyr_1.1.0
[25] tibble_3.0.1 ggplot2_3.3.2 tidyverse_1.3.0 workflowr_1.6.2
loaded via a namespace (and not attached):
[1] colorspace_1.4-1 modeltools_0.2-23 ellipsis_0.3.1
[4] rprojroot_1.3-2 htmlTable_1.13.3 base64enc_0.1-3
[7] fs_1.4.1 rstudioapi_0.11 hexbin_1.28.1
[10] farver_2.0.3 flexmix_2.3-15 bit64_0.9-7
[13] fansi_0.4.1 mvtnorm_1.1-0 lubridate_1.7.8
[16] xml2_1.3.2 splines_4.0.3 knitr_1.30
[19] jsonlite_1.7.0 broom_0.5.6 cluster_2.1.0
[22] dbplyr_1.4.4 png_0.1-7 compiler_4.0.3
[25] httr_1.4.1 backports_1.1.7 assertthat_0.2.1
[28] Matrix_1.2-18 cli_2.0.2 later_1.0.0
[31] acepack_1.4.1 htmltools_0.5.0 tools_4.0.3
[34] gtable_0.3.0 glue_1.4.2 Rcpp_1.0.4.6
[37] cellranger_1.1.0 vctrs_0.3.0 nlme_3.1-149
[40] xfun_0.19 rvest_0.3.5 lifecycle_0.2.0
[43] irlba_2.3.3 MASS_7.3-53 scales_1.1.1
[46] hms_0.5.3 promises_1.1.0 sandwich_2.5-1
[49] RColorBrewer_1.1-2 yaml_2.2.1 curl_4.3
[52] memoise_1.1.0 rpart_4.1-15 latticeExtra_0.6-29
[55] calibrate_1.7.7 stringi_1.5.3 RSQLite_2.2.0
[58] SQUAREM_2020.2 rmeta_3.0 highr_0.8
[61] checkmate_2.0.0 truncnorm_1.0-8 rlang_0.4.6
[64] pkgconfig_2.0.3 evaluate_0.14 invgamma_1.1
[67] labeling_0.3 htmlwidgets_1.5.1 bit_1.1-15.2
[70] tidyselect_1.1.0 plyr_1.8.6 magrittr_2.0.1
[73] R6_2.4.1 generics_0.0.2 DBI_1.1.0
[76] pillar_1.4.4 haven_2.3.1 whisker_0.4
[79] foreign_0.8-80 withr_2.2.0 abind_1.4-5
[82] mixsqp_0.3-43 nnet_7.3-14 modelr_0.1.8
[85] crayon_1.3.4 rmarkdown_2.5 jpeg_0.1-8.1
[88] isoband_0.2.1 grid_4.0.3 readxl_1.3.1
[91] data.table_1.12.8 blob_1.2.1 git2r_0.27.1
[94] reprex_0.3.0 digest_0.6.25 webshot_0.5.2
[97] httpuv_1.5.3.1 stats4_4.0.3 munsell_0.5.0
[100] beeswarm_0.2.3 viridisLite_0.3.0 vipor_0.4.5