Loading [MathJax]/jax/output/HTML-CSS/jax.js
  • Load and clean the variant effect sizes
  • Q-Q plots
  • Hex bin plots and correlations in variant effect sizes
    • Effect sizes adjusted using ‘data-driven’ adaptive shrinkage
    • Effect sizes adjusted using ‘canonical’ adaptive shrinkage
    • Unadjusted effect sizes
  • Plot the estimated mixture proportions from mashr
  • Statistical models: effects of site class, MAF and chromosome
    • Tables of statistical results (models of GWAS effects)
    • Tables of statistical results (models of TWAS effects)
    • Plots showing the statistical results
  • Effect of the minor allele
  • Manhattan plot
  • Antagonism ratios figure

Last updated: 2021-03-04

Checks: 6 1

Knit directory: fitnessGWAS/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20180914) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

The following chunks had caches available:
  • qqplot1
  • qqplot2
  • qqplot3
  • qqplot4

To ensure reproducibility of the results, delete the cache directory plot_models_variant_effects_cache and re-run the analysis. To have workflowr automatically delete the cache directory prior to building the file, set delete_cache = TRUE when running wflow_build() or wflow_publish().

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version c606d3d. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .DS_Store
    Ignored:    .Rapp.history
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    .httr-oauth
    Ignored:    .pversion
    Ignored:    analysis/.DS_Store
    Ignored:    analysis/correlations_SNP_effects_cache/
    Ignored:    analysis/plot_models_variant_effects_cache/
    Ignored:    code/.DS_Store
    Ignored:    code/Drosophila_GWAS.Rmd
    Ignored:    data/.DS_Store
    Ignored:    data/derived/
    Ignored:    data/input/.DS_Store
    Ignored:    data/input/.pversion
    Ignored:    data/input/dgrp.fb557.annot.txt
    Ignored:    data/input/dgrp2.bed
    Ignored:    data/input/dgrp2.bim
    Ignored:    data/input/dgrp2.fam
    Ignored:    data/input/huang_transcriptome/
    Ignored:    figures/.DS_Store

Untracked files:
    Untracked:  Rplots.pdf
    Untracked:  analysis/GO_KEGG_enrichment.Rmd
    Untracked:  analysis/TWAS_tables.Rmd
    Untracked:  code/GO_and_KEGG_gsea.R
    Untracked:  code/eQTL_analysis.Rmd
    Untracked:  code/make_sites_files_for_ARGweaver.R
    Untracked:  code/plotting_results.Rmd
    Untracked:  code/run_argweaver.R
    Untracked:  code/run_mashr.R
    Untracked:  data/argweaver/
    Untracked:  data/input/TF_binding_sites.csv
    Untracked:  figures/GWAS_mixture_proportions.rds
    Untracked:  figures/GWAS_stats_figure.pdf
    Untracked:  figures/TWAS_mixture_proportions.rds
    Untracked:  figures/TWAS_stats_figure.pdf
    Untracked:  figures/antagonism_ratios.pdf
    Untracked:  figures/composite_mixture_figure.pdf
    Untracked:  figures/eff_size_histos.pdf
    Untracked:  figures/stats_figure.pdf
    Untracked:  manuscript/

Unstaged changes:
    Modified:   .gitignore
    Modified:   README.md
    Modified:   analysis/GWAS_tables.Rmd
    Modified:   analysis/TWAS.Rmd
    Modified:   analysis/_site.yml
    Deleted:    analysis/about.Rmd
    Deleted:    analysis/eQTL_analysis.Rmd
    Modified:   analysis/gwas_adaptive_shrinkage.Rmd
    Modified:   analysis/license.Rmd
    Modified:   analysis/make_annotation_database.Rmd
    Modified:   analysis/plot_line_means.Rmd
    Modified:   analysis/plotting_results.Rmd
    Deleted:    code/gwas_adaptive_shrinkage.R
    Modified:   data/input/female_fitness.csv
    Modified:   data/input/female_fitness_CLEANED.csv
    Modified:   data/input/male_fitness.csv
    Modified:   data/input/male_fitness_CLEANED.csv
    Modified:   figures/figure1.eps
    Deleted:    figures/figure1.png
    Modified:   figures/figure2.eps

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/plot_models_variant_effects.Rmd) and HTML (docs/plot_models_variant_effects.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd c606d3d lukeholman 2021-03-04 big first commit 2021

library(tidyverse)
library(gridExtra)
library(qqman)
library(ggbeeswarm)
library(Hmisc)
library(showtext) # For fancy Google font in figures
library(mashr)
library(kableExtra)
library(cowplot)
font_add_google(name = "Raleway", family = "Raleway", regular.wt = 400, bold.wt = 700) # Install font from Google Fonts
showtext_auto()

db <- DBI::dbConnect(RSQLite::SQLite(), 
                     "data/derived/annotations.sqlite3")

# Results for all 1,613,615 SNPs, even those that are in 100% LD with others (these are grouped up by the SNP_clump column)
all_snps <- tbl(db, "univariate_lmm_results")

# All SNPs and SNP groups that are in <100% LD with one another (n = 1,207,357)
SNP_clumps <- all_snps %>% select(-SNP) %>% distinct() %>% collect(n = Inf)

Load and clean the variant effect sizes

Modify column names, reorder columns, filter based on the p-value, etc.

# Univariate analysis, mashr-adjusted, data-driven approach
univariate_lmm_results <- tbl(db, "univariate_lmm_results") %>% 
  select(-contains("canonical"), 
         -contains("raw")) %>% 
  inner_join(tbl(db, "variants") %>% 
               select(SNP, FBID, site.class, distance.to.gene, MAF), 
             by = "SNP") %>%
  left_join(
    tbl(db, "genes") %>% 
      select(FBID, gene_name), by = "FBID") %>%
  collect(n = Inf) %>%
  rename_all(~ gsub("beta_", "", .x)) %>%
  rename_all(~ gsub("_mashr_ED", "", .x))

univariate_lmm_results <- univariate_lmm_results %>%
  mutate(site.class = gsub("5_", "5-", site.class),
         site.class = gsub("3_", "3-", site.class),
         site.class = gsub("NON_", "NON-", site.class),
         site.class = gsub("_", " ", site.class),
         site.class = capitalize(tolower(site.class)),
         site.class = gsub("Utr", "UTR", site.class)
  )

# Results of one multivariate linear mixed model run in GEMMA:
# multivariate_lmm_results <- 
#   read_tsv("data/derived/output/all_four_traits.assoc.txt") %>%
#   filter(p_fdr < 0.05) %>% 
#   inner_join(tbl(db, "variants") %>% 
#                select(-chr, -position) %>%
#                collect(), by = "SNP") %>% 
#   arrange(p_wald) %>% 
#   select(-contains("Vbeta")) %>%
#   left_join(
#     tbl(db, "genes") %>% 
#       select(FBID, gene_name) %>% 
#       collect(n=Inf), by = "FBID") %>%
#   mutate(gene_name = replace(gene_name, is.na(FBID), NA)) 
# 
# multivariate_lmm_results <- multivariate_lmm_results[,c(1:8, 14, 9:13)]
# 
# multivariate_lmm_results <- multivariate_lmm_results %>%
#   mutate_at(vars(starts_with("p_")), ~ -log10(.x)) %>%
#   mutate_if(is.numeric, ~ format(round(.x, 2), nsmall = 2)) %>%
#   left_join(tbl(db, "univariate_lmm_results") %>% 
#               select(SNP, SNP_clump) %>% 
#               collect(),
#             by = "SNP") %>%
#   distinct(SNP_clump, .keep_all = TRUE) %>%
#   select(SNP, SNP_clump, everything())

Q-Q plots

Here are some quantile-quantile plots, which are commonly used to check GWAS results, and to test the hypothesis that there are more SNPs than expected showing large effects on the trait of interest.

# univariate_lmm_pvals <- tbl(db, "univariate_lmm_results") %>%
#   select(contains("LFSR")) %>% select(contains("ED")) %>% collect(n=Inf) %>% as.data.frame()
# qqman::qq(univariate_lmm_results$LFSR_female_early_mashr_ED)

univariate_lmm_pvals <- tbl(db, "univariate_lmm_results") %>%
  select(contains("pvalue")) %>% collect(n=Inf) %>% as.data.frame()
qqman::qq(univariate_lmm_pvals$pvalue_female_early_raw)

Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE) at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson for each individual chunk that is cached. Using either autodep or dependson will remove this warning. See the knitr cache options for more details.

#qqman::qq(univariate_lmm_results$LFSR_female_late_mashr_ED)
qqman::qq(univariate_lmm_pvals$pvalue_female_late_raw)

Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE) at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson for each individual chunk that is cached. Using either autodep or dependson will remove this warning. See the knitr cache options for more details.

# qqman::qq(univariate_lmm_results$LFSR_male_early_mashr_ED)
qqman::qq(univariate_lmm_pvals$pvalue_male_early_raw)

Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE) at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson for each individual chunk that is cached. Using either autodep or dependson will remove this warning. See the knitr cache options for more details.

# qqman::qq(univariate_lmm_results$LFSR_male_late_mashr_ED)
qqman::qq(univariate_lmm_pvals$pvalue_male_late_raw)

Warning: The above code chunk cached its results, but it won’t be re-run if previous chunks it depends on are updated. If you need to use caching, it is highly recommended to also set knitr::opts_chunk$set(autodep = TRUE) at the top of the file (in a chunk that is not cached). Alternatively, you can customize the option dependson for each individual chunk that is cached. Using either autodep or dependson will remove this warning. See the knitr cache options for more details.

Hex bin plots and correlations in variant effect sizes

Effect sizes adjusted using ‘data-driven’ adaptive shrinkage

hex_plot <- function(x, y, xlab, ylab, title,
                     xlim = c(-0.6, 0.45),
                     ylim = c(-0.3, 0.41)
                     ){
  dat <- SNP_clumps %>%
    mutate(facet = title)
  ggplot(dat, aes_string(x, y)) + 
    geom_abline(linetype = 2) + 
    geom_vline(xintercept = 0, linetype = 3) +
    geom_hline(yintercept = 0, linetype = 3) +
    stat_binhex(bins = 50)  +
    geom_density_2d(colour = "white", alpha = 0.6) + 
    scale_fill_viridis_c() + 
    coord_cartesian(xlim = xlim, ylim = ylim) + 
    facet_wrap(~ facet) +
    theme_bw() + xlab(xlab) + ylab(ylab) +
    theme(legend.position = "none",
          strip.background = element_blank(),
          strip.text = element_text(hjust=0))   + 
    theme(text = element_text(family = "Raleway", size = 12))

}

grid.arrange(
  hex_plot("beta_female_early_mashr_ED", 
           "beta_male_early_mashr_ED", 
           "Effect on female fitness", 
           NULL,
           "A. Early life fitness"),
  hex_plot("beta_female_late_mashr_ED", 
           "beta_male_late_mashr_ED", 
           "Effect on female fitness", 
           NULL,
           "B. Late life fitness"),
  ncol = 2, left = "Effect on male fitness")

SNP_clumps %>%
  select(contains("mashr_ED")) %>%
  select(contains("beta")) %>%
  rename_all(~ paste(ifelse(str_detect(.x, "female"), "Female", "Male"), 
                     ifelse(str_detect(.x, "early"), "early", "late"))) %>%
  cor(method = "spearman") %>% 
  kable(digits = 3) %>% kable_styling(full_width = FALSE)
Female early Female late Male early Male late
Female early 1.000 1.000 0.690 0.754
Female late 1.000 1.000 0.686 0.751
Male early 0.690 0.686 1.000 0.991
Male late 0.754 0.751 0.991 1.000

Effect sizes adjusted using ‘canonical’ adaptive shrinkage

The unusual distribution here reflects the fact that this analysis imposed constraints on the effects of each locus. For example, loci assigned to the ‘Female-specific’ category with high probability necessarily have an effect size close to zero in males. The figure should be interpreted cautiously - the main purpose of the canonical analysis is to inference the mixture proportions, and to assign mixture probabilities to each locus, not to accurately estimate the true effect size of each variant.

grid.arrange(
  hex_plot("beta_female_early_mashr_canonical", 
           "beta_male_early_mashr_canonical", 
           "Effect on female fitness", 
           NULL,
           "A. Early life fitness"),
  hex_plot("beta_female_late_mashr_canonical", 
           "beta_male_late_mashr_canonical", 
           "Effect on female fitness", 
           NULL,
           "B. Late life fitness"),
  ncol = 2, left = "Effect on male fitness")

SNP_clumps %>%
  select(contains("mashr_canonical")) %>%
  select(contains("beta")) %>%
  rename_all(~ paste(ifelse(str_detect(.x, "female"), "Female", "Male"), 
                     ifelse(str_detect(.x, "early"), "early", "late"))) %>%
  cor(method = "spearman") %>% 
  kable(digits = 3) %>% kable_styling(full_width = FALSE)
Female early Female late Male early Male late
Female early 1.000 1.000 0.862 0.862
Female late 1.000 1.000 0.862 0.862
Male early 0.862 0.862 1.000 1.000
Male late 0.862 0.862 1.000 1.000

Unadjusted effect sizes

Uses the variant effect sizes output by GEMMA, without applying any shrinkage (i.e. this is the data that was adjusted using mashr).

grid.arrange(
  hex_plot("beta_female_early_raw", 
           "beta_male_early_raw", 
           "Effect on female fitness", 
           NULL,
           "A. Early life fitness",
           xlim = c(-1, 1),
           ylim = c(-1, 1)),
  hex_plot("beta_female_late_raw", 
           "beta_male_late_raw", 
           "Effect on female fitness", 
           NULL,
           "B. Late life fitness",
           xlim = c(-1, 1),
           ylim = c(-1, 1)),
  ncol = 2, left = "Effect on male fitness")

SNP_clumps %>%
  select(contains("raw")) %>%
  select(contains("beta")) %>%
  rename_all(~ paste(ifelse(str_detect(.x, "female"), "Female", "Male"), 
                     ifelse(str_detect(.x, "early"), "early", "late"))) %>%
  cor(method = "spearman") %>% 
  kable(digits = 3) %>% kable_styling(full_width = FALSE)
Female early Female late Male early Male late
Female early 1.000 0.547 0.209 0.111
Female late 0.547 1.000 0.204 0.157
Male early 0.209 0.204 1.000 0.412
Male late 0.111 0.157 0.412 1.000

Plot the estimated mixture proportions from mashr

# Make plot for the GWAS
mashr_results_canonical <- read_rds("data/derived/mashr_results_canonical.rds")
mashr_2L <- readRDS("data/derived/mashr_results_canonical_chr2L.rds")
mashr_2R <- readRDS("data/derived/mashr_results_canonical_chr2R.rds")
mashr_3L <- readRDS("data/derived/mashr_results_canonical_chr3L.rds")
mashr_3R <- readRDS("data/derived/mashr_results_canonical_chr3R.rds")
mashr_X <- readRDS("data/derived/mashr_results_canonical_chrX.rds")

mix <- bind_rows(
  enframe(sort(get_estimated_pi(mashr_results_canonical))) %>%
    mutate(Chromosome = "All"),
  enframe(sort(get_estimated_pi(mashr_2L))) %>%
    mutate(Chromosome = "2L"),
  enframe(sort(get_estimated_pi(mashr_2R))) %>%
    mutate(Chromosome = "2R"),
  enframe(sort(get_estimated_pi(mashr_3L))) %>%
    mutate(Chromosome = "3L"),
  enframe(sort(get_estimated_pi(mashr_3R))) %>%
    mutate(Chromosome = "3R"),
  enframe(sort(get_estimated_pi(mashr_X))) %>%
    mutate(Chromosome = "X")) %>% 
  rename(Mixture_component = name) 

to_keep <- mix %>%
  group_by(Mixture_component) %>%
  summarise(value = max(value), .groups = "drop") %>%
  filter(value > 0.01) %>%
  pull(Mixture_component)

mix <- mix %>%
  filter(Mixture_component %in% to_keep) %>%
  spread(Mixture_component, value) %>%
  rename(`Sexually concordant effect` = equal_effects,
         `Female-specific effect` = Female_specific_1,
         `Male-specific effect` = Male_specific_1,
         `Sexually antagonistic effect` = Sex_antag_0.25,
         `No effect on fitness` = null) %>%
  gather(Mixture_component, value, -Chromosome) %>%
  arrange(-value) 

chr_levels <- mix %>% 
  filter(Mixture_component == "Sexually antagonistic effect") %>%
  arrange(value) %>% pull(Chromosome)

mix <- mix %>%
  mutate(Chromosome = factor(Chromosome, chr_levels),
         Mixture_component = factor(Mixture_component, 
                                    c("Sexually antagonistic effect",
                                      "Sexually concordant effect",
                                      "Female-specific effect",
                                      "Male-specific effect",
                                      "No effect on fitness"))) 


p1 <- mix %>%
  ggplot(aes(Chromosome, 100 * value)) + 
  geom_bar(stat = "identity",aes(fill = Chromosome), colour = "grey10",  size = 0.3) + 
  scale_fill_brewer(palette = "Spectral", direction = -1) + 
  coord_flip() + 
  theme_bw() + 
  scale_y_continuous(expand = c(0,0), limits = c(0, 65)) + 
  scale_x_discrete(expand = c(0.14, 0.14)) + 
  theme(axis.ticks.y = element_blank(),
        legend.position = "none",
        panel.border = element_rect(size = 0.8),
        text = element_text(family = "Raleway", size = 12),
        strip.background = element_blank(),
        panel.grid.major.y = element_blank()) +
  ylab("Estimated % of loci") +
  facet_wrap(~ Mixture_component, ncol = 1)


# Make plot for the TWAS
mashr_results_canonical <- readRDS("data/derived/TWAS/TWAS_canonical.rds")
mashr_2L <- readRDS("data/derived/TWAS/TWAS_canonical_2L.rds")
mashr_2R <- readRDS("data/derived/TWAS/TWAS_canonical_2R.rds")
mashr_3L <- readRDS("data/derived/TWAS/TWAS_canonical_3L.rds")
mashr_3R <- readRDS("data/derived/TWAS/TWAS_canonical_3R.rds")
mashr_X <- readRDS("data/derived/TWAS/TWAS_canonical_X.rds")

mix <- bind_rows(
  enframe(sort(get_estimated_pi(mashr_results_canonical))) %>%
    mutate(Chromosome = "All"),
  enframe(sort(get_estimated_pi(mashr_2L))) %>%
    mutate(Chromosome = "2L"),
  enframe(sort(get_estimated_pi(mashr_2R))) %>%
    mutate(Chromosome = "2R"),
  enframe(sort(get_estimated_pi(mashr_3L))) %>%
    mutate(Chromosome = "3L"),
  enframe(sort(get_estimated_pi(mashr_3R))) %>%
    mutate(Chromosome = "3R"),
  enframe(sort(get_estimated_pi(mashr_X))) %>%
    mutate(Chromosome = "X")) %>% 
  rename(Mixture_component = name) %>%
  mutate(Mixture_component = str_remove_all(Mixture_component, "_0.25"),
         Mixture_component = str_remove_all(Mixture_component, "_0.5"),
         Mixture_component = str_remove_all(Mixture_component, "_0.75"),
         Mixture_component = str_remove_all(Mixture_component, "_1.0")) %>%
  group_by(Mixture_component, Chromosome) %>%
  summarise(value = sum(value), .groups = "drop")

to_keep <- mix %>%
  group_by(Mixture_component) %>%
  summarise(value = max(value), .groups = "drop") %>%
  filter(value > 0.01) %>%
  pull(Mixture_component)

mix <- mix %>%
  filter(Mixture_component %in% to_keep) %>%
  spread(Mixture_component, value) %>%
  rename(`Sexually concordant effect` = equal_effects,
         `Female-specific effect` = Female_specific_1,
         `Male-specific effect` = Male_specific_1,
         `Sexually antagonistic effect` = Sex_antag,
         `No effect on fitness` = null) %>%
  gather(Mixture_component, value, -Chromosome) %>%
  arrange(-value) 

chr_levels <- rev(c("X", "3L", "3R", "All", "2R", "2L"))

mix <- mix %>%
  mutate(Chromosome = factor(Chromosome, chr_levels),
         Mixture_component = factor(Mixture_component, 
                                    c("Sexually antagonistic effect",
                                      "Sexually concordant effect",
                                      "Female-specific effect",
                                      "Male-specific effect",
                                      "No effect on fitness"))) 

p2 <- mix %>%
  ggplot(aes(Chromosome, 100 * value)) + 
  geom_bar(stat = "identity",aes(fill = Chromosome), colour = "grey10",  size = 0.3) + 
  scale_fill_brewer(palette = "Spectral", direction = -1) + 
  coord_flip() + 
  theme_bw() + 
  scale_y_continuous(expand = c(0,0), limits = c(0, 65)) + # 
  scale_x_discrete(expand = c(0.14, 0.14)) + 
  theme(axis.ticks.y = element_blank(),
        legend.position = "none",
        panel.border = element_rect(size = 0.8),
        text = element_text(family = "Raleway", size = 12),
        strip.background = element_blank(),
        panel.grid.major.y = element_blank()) +
  ylab("Estimated % of transcripts") +
  xlab(" ") + 
  facet_wrap(~ Mixture_component, ncol = 1)


# Save composite figure of the GWAS and TWAS mixture proportions
ggsave(filename = "figures/composite_mixture_figure.pdf", 
       cowplot::plot_grid(p1, p2, labels = c('A', 'B'), label_size = 12),
       width = 5, height = 8)

cowplot::plot_grid(p1, p2, labels = c('A', 'B'), label_size = 12)

Figure X: Proportions of each type of locus, as estimated using the mixture model computed by mashr in the analysis using canonical covariance matrices. This analysis involved a number of pre-specified covariance matrices, each corresponding to a type of locus that we hypothesised to exist (shown in panel headings). The analysis fit some other matrix types not shown here, because the corresponding locus type was inferred to be rare/absent (these included neutral loci, estimated to comprise 0.5-1% of those tested, and age-antagonistic loci, none of which were detected). The analysis was run either using all 1,207,357 loci for which data were available (labeled ‘All’) or for all loci on each of the chromosomes (chromosomes 4 and Y had insufficient data). Notably, sexually-antagonistic loci were inferred to be especially common on the X chromosome, loci that affected males only were inferred to be rarer than those affecting females only, and chromosome 2R had many more female-specific than male-specific loci.

Statistical models: effects of site class, MAF and chromosome

The following models test whether site class, MAF (minor allele frequency) and chromosome predict how each locus affected our four phenotypes.

These analyses use the mixture assignment probabilities calculated by mashr (in the “canonical” analysis). For every locus, the mashr analysis calculated the probability that each locus was A) sexually antagonistic (meaning it had opposite effects on male and female fitness that were consistent across the two age categories), B) female-specific (meaning it affect female fitness but not male), C) male-specific (affecting male fitness only), and D) had equal effects on all four phenotypes (mean the allele that associated with higher fitness in males also was associated with higher fitness in females). Other types of loci were also considered in the model (e.g. null effect loci and ones with age-specific effects), but these were found to be quite rare, precluding analysis here.

Here, we use these mixture assignment probabilities as the response variable, in order to ask whether e.g. sexually antagonistic loci tend to appear in any particular type of site or chromosome, and whether they had a higher or lower MAF than average. Since these probabilities are bounded between zero and one, we use beta regression (a form of GLM designed to model variables in the range [0,1]). We fit a full model with site class, MAF and chromosome as predictor variables, as well as all the simpler nested models, and compare them using AICc model selection. We also present the results of the full model, and plot the parameter estimates from each of the four models below.

The dataset used in this model is

Tables of statistical results (models of GWAS effects)

library(betareg)
library(MuMIn)
library("lmtest")

LFSR_cutoff <- 0.05

dat <- univariate_lmm_results %>%
  filter(LFSR_female_early < LFSR_cutoff | LFSR_female_late < LFSR_cutoff | 
           LFSR_male_early < LFSR_cutoff | LFSR_male_late < LFSR_cutoff) %>%
  mutate(chr = gsub("_", "", substr(SNP, 1, 2))) %>%
  select(SNP, SNP_clump, starts_with("P_"), MAF, site.class, chr) %>%
  distinct() 

# Focus only on the commonest site classes:
dat <- dat %>% 
  filter(site.class %in% c("Intron", "Intergenic", "Downstream", "Upstream",
                           "Synonymous coding", "Non-synonymous coding", "UTR 3-prime",
                           "Exon", "UTR 5-prime")) 

# Remove chromosome 4 (too few sites)
dat <- dat %>% filter(chr != "4") 

# If there are multiple site classes for a SNP, or multiple SNPs 
# in the same 100% LD clump, pick one SNP and/or 1 site class at random
set.seed(1)
dat <- dat[sample(nrow(dat),nrow(dat)), ] %>%
  split(.$SNP_clump) %>%
  map_df(~ .x[1, ])

dat <- dat %>% arrange(site.class)
dat$site.class <- relevel(factor(dat$site.class), ref = "Synonymous coding")

n_loci <- prettyNum(nrow(dat), big.mark = ",", scientific = FALSE)

compare_mods <- function(MAF_and_Chromosome_and_site_class){
  
  MAF_and_Chromosome <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class)
  MAF_and_siteclass <- update(MAF_and_Chromosome_and_site_class, ~ . -chr)
  Chromosome_and_siteclass <- update(MAF_and_Chromosome_and_site_class, ~ . -chr)
  
  MAF <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class - chr)
  Chromosome <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class - MAF)
  siteclass <- update(MAF_and_Chromosome_and_site_class, ~ .  - MAF -chr)
  Null_model <- update(MAF_and_Chromosome_and_site_class, ~ . -site.class - MAF -chr)
  
  AICc(MAF_and_Chromosome_and_site_class, 
       MAF_and_Chromosome, 
       MAF_and_siteclass,
       Chromosome_and_siteclass,
       MAF, Chromosome, siteclass, 
       Null_model) %>% 
    rownames_to_column("Model") %>%
    arrange(AICc) %>%
    mutate(delta = AICc - AICc[1],
           Model = str_replace_all(Model, "_and+_", " + "),
           Weight = round(exp(-0.5 * delta) / sum(exp(-0.5 * delta)), 3),
           delta = round(delta, 2)) %>%
    kable(digits = 2) %>% kable_styling(full_width = FALSE)
}

In each of the following analyses, the response variable is the mixture assignment probability to type i for each of the 68,655 loci that affected at least one of the phenotypes significantly (defined as LFSR < 0.05), where i is one of the four variant types shown in the above figure. In cases where multiple SNP or indel loci were in 100% linkage with one another, we picked a single locus at random and discarded the others.

Three predictors were available for each locus: the minor allele frequency (MAF) in the overall DGRP, the site class of the variant, and chromosome. Loci that were annotated with more than one site class (e.g. intron as well as exon, due to an overlap of genes) were assigned one of these site classes at random.

To evaluate the effects of three predictors on the response variable, we fit 8 nested models and compared them using AICc. We also present the summary() output for the top-ranked model according to AICc.

Probability of sexual antagonism

betareg(P_sex_antag ~ MAF + chr + site.class, 
        data = dat) %>% compare_mods()
Model df AICc delta Weight
MAF + Chromosome + site_class 15 -241307.6 0.00 1
MAF + Chromosome 7 -241296.6 10.98 0
MAF + siteclass 11 -241134.7 172.89 0
Chromosome + siteclass 11 -241134.7 172.89 0
MAF 3 -241122.7 184.92 0
Chromosome 6 -239763.5 1544.12 0
siteclass 10 -239576.6 1731.02 0
Null_model 2 -239566.2 1741.41 0
p1 <- summary(betareg(P_sex_antag ~ MAF + chr + site.class, data = dat), type = "deviance")
p1

Call:
betareg(formula = P_sex_antag ~ MAF + chr + site.class, data = dat)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-3.7291 -0.3614 -0.2404  0.5699  9.5453 

Coefficients (mean model with logit link):
                                 Estimate Std. Error  z value Pr(>|z|)    
(Intercept)                     -2.798139   0.010795 -259.199  < 2e-16 ***
MAF                              0.775745   0.019949   38.887  < 2e-16 ***
chr2R                            0.069455   0.007259    9.568  < 2e-16 ***
chr3L                            0.048502   0.007160    6.774 1.25e-11 ***
chr3R                            0.084534   0.007546   11.202  < 2e-16 ***
chrX                             0.086050   0.008029   10.718  < 2e-16 ***
site.classDownstream            -0.001505   0.011094   -0.136   0.8921    
site.classExon                  -0.014302   0.024914   -0.574   0.5659    
site.classIntergenic             0.023060   0.009426    2.447   0.0144 *  
site.classIntron                 0.017865   0.008451    2.114   0.0345 *  
site.classNon-synonymous coding -0.018356   0.017827   -1.030   0.3032    
site.classUpstream               0.005090   0.011299    0.450   0.6524    
site.classUTR 3-prime           -0.010520   0.018416   -0.571   0.5678    
site.classUTR 5-prime           -0.051042   0.024760   -2.062   0.0393 *  

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  29.3662     0.1622   181.1   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.207e+05 on 15 Df
Pseudo R-squared: 0.02869
Number of iterations: 23 (BFGS) + 2 (Fisher scoring) 

Probability of being female-specific

betareg(P_female_specific ~ MAF + chr + site.class, 
        data = dat) %>% compare_mods()
Model df AICc delta Weight
MAF + Chromosome + site_class 15 -81289.74 0.00 1
MAF + Chromosome 7 -81271.61 18.13 0
Chromosome 6 -81253.26 36.48 0
MAF + siteclass 11 -80724.88 564.86 0
Chromosome + siteclass 11 -80724.88 564.86 0
MAF 3 -80712.02 577.72 0
siteclass 10 -80712.01 577.74 0
Null_model 2 -80698.62 591.12 0
p2 <- summary(betareg(P_female_specific ~ MAF + chr + site.class, data = dat), type = "deviance")
p2 

Call:
betareg(formula = P_female_specific ~ MAF + chr + site.class, data = dat)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-1.7683 -0.5677 -0.3655  0.8111  5.3896 

Coefficients (mean model with logit link):
                                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)                     -1.458023   0.016079 -90.679  < 2e-16 ***
MAF                             -0.132799   0.029881  -4.444 8.82e-06 ***
chr2R                            0.239335   0.010850  22.058  < 2e-16 ***
chr3L                            0.123513   0.010722  11.520  < 2e-16 ***
chr3R                            0.193970   0.011316  17.141  < 2e-16 ***
chrX                             0.179009   0.012085  14.813  < 2e-16 ***
site.classDownstream             0.018385   0.016622   1.106   0.2687    
site.classExon                  -0.002719   0.037132  -0.073   0.9416    
site.classIntergenic             0.055367   0.014154   3.912 9.16e-05 ***
site.classIntron                 0.024304   0.012698   1.914   0.0556 .  
site.classNon-synonymous coding -0.031334   0.026626  -1.177   0.2393    
site.classUpstream               0.018190   0.016945   1.073   0.2831    
site.classUTR 3-prime           -0.018787   0.027549  -0.682   0.4953    
site.classUTR 5-prime           -0.059117   0.036726  -1.610   0.1075    

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  4.17855    0.02205   189.5   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 4.066e+04 on 15 Df
Pseudo R-squared: 0.009817
Number of iterations: 23 (BFGS) + 2 (Fisher scoring) 

Probability of being male-specific

betareg(P_male_specific ~ MAF + chr + site.class, 
        data = dat) %>% compare_mods()
Model df AICc delta Weight
MAF + Chromosome 7 -381128.0 0.00 0.98
MAF + Chromosome + site_class 15 -381120.2 7.77 0.02
Chromosome 6 -381055.3 72.69 0.00
MAF 3 -380995.0 132.99 0.00
MAF + siteclass 11 -380984.5 143.55 0.00
Chromosome + siteclass 11 -380984.5 143.55 0.00
Null_model 2 -380928.0 199.97 0.00
siteclass 10 -380918.0 209.97 0.00
p3 <- summary(betareg(P_male_specific ~ MAF + chr + site.class, data = dat), type = "deviance")
p3

Call:
betareg(formula = P_male_specific ~ MAF + chr + site.class, data = dat)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-2.1176 -0.8374 -0.7227 -0.5800  9.2003 

Coefficients (mean model with logit link):
                                 Estimate Std. Error  z value Pr(>|z|)    
(Intercept)                     -3.747522   0.014364 -260.905  < 2e-16 ***
MAF                              0.227406   0.026558    8.563  < 2e-16 ***
chr2R                           -0.114192   0.009635  -11.851  < 2e-16 ***
chr3L                           -0.041503   0.009396   -4.417 1.00e-05 ***
chr3R                           -0.052498   0.009978   -5.261 1.43e-07 ***
chrX                            -0.056485   0.010663   -5.297 1.18e-07 ***
site.classDownstream            -0.014242   0.014758   -0.965    0.335    
site.classExon                  -0.011391   0.032946   -0.346    0.730    
site.classIntergenic            -0.018763   0.012568   -1.493    0.135    
site.classIntron                 0.002749   0.011250    0.244    0.807    
site.classNon-synonymous coding  0.012947   0.023487    0.551    0.581    
site.classUpstream              -0.002264   0.015030   -0.151    0.880    
site.classUTR 3-prime            0.002971   0.024382    0.122    0.903    
site.classUTR 5-prime           -0.014438   0.032531   -0.444    0.657    

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  34.7149     0.2147   161.7   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.906e+05 on 15 Df
Pseudo R-squared: 0.006034
Number of iterations: 29 (BFGS) + 2 (Fisher scoring) 

Probability of equal effects

betareg(P_equal_effects ~ MAF + chr + site.class, 
        data = dat) %>% compare_mods()
Model df AICc delta Weight
MAF + Chromosome + site_class 15 -37638.22 0.00 1
MAF + Chromosome 7 -37620.58 17.63 0
Chromosome 6 -37419.94 218.28 0
MAF + siteclass 11 -37318.73 319.49 0
Chromosome + siteclass 11 -37318.73 319.49 0
MAF 3 -37301.67 336.54 0
siteclass 10 -37105.80 532.42 0
Null_model 2 -37090.34 547.87 0
p4 <- summary(betareg(P_equal_effects ~ MAF + chr + site.class, data = dat), type = "deviance")
p4

Call:
betareg(formula = P_equal_effects ~ MAF + chr + site.class, data = dat)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-5.1450 -0.6580  0.1940  0.3527  2.0818 

Coefficients (mean model with logit link):
                                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)                      1.022064   0.015466  66.083  < 2e-16 ***
MAF                             -0.412143   0.028894 -14.264  < 2e-16 ***
chr2R                           -0.154896   0.010488 -14.769  < 2e-16 ***
chr3L                           -0.094204   0.010334  -9.116  < 2e-16 ***
chr3R                           -0.157356   0.010922 -14.407  < 2e-16 ***
chrX                            -0.156126   0.011655 -13.396  < 2e-16 ***
site.classDownstream            -0.008109   0.016067  -0.505 0.613774    
site.classExon                   0.017694   0.035920   0.493 0.622298    
site.classIntergenic            -0.045783   0.013680  -3.347 0.000817 ***
site.classIntron                -0.031409   0.012266  -2.561 0.010446 *  
site.classNon-synonymous coding  0.031656   0.025722   1.231 0.218448    
site.classUpstream              -0.011775   0.016378  -0.719 0.472181    
site.classUTR 3-prime            0.004774   0.026596   0.179 0.857551    
site.classUTR 5-prime            0.068546   0.035500   1.931 0.053496 .  

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  3.99001    0.01987   200.8   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.883e+04 on 15 Df
Pseudo R-squared: 0.009275
Number of iterations: 23 (BFGS) + 2 (Fisher scoring) 

Tables of statistical results (models of TWAS effects)

twas <- readRDS("data/derived/TWAS/TWAS_mixture_assignment_probabilities.rds") %>%
  left_join(tbl(db, "genes") %>% select(FBID, chromosome) %>% collect(), by = "FBID") %>%
  left_join(read_csv("data/derived/gene_expression_by_sex.csv"), by = "FBID") %>%
  filter(chromosome %in% c("2L", "2R", "3L", "3R", "X")) %>%
  mutate(h2 = (female_narrow_heritability + male_narrow_heritability) / 2)


compare_mods <- function(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability){
  
  Chromosome_and_Expression_level_and_Heritability <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                                             ~ . -male_bias_in_expression)
  Sex_bias_and_Chromosome_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                                             ~ . -h2)
  Sex_bias_and_Expression_level_and_Heritability <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                                             ~ . -chromosome)
  Sex_bias_and_Chromosome_and_Heritability <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                                             ~ . -AveExpr)
  
  Chromosome_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, ~ . -male_bias_in_expression -h2)
  Sex_bias_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, ~ . -chromosome -h2)
  Sex_bias_and_Chromosome <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, ~ . -AveExpr -h2)
  
  Heritability_and_Expression_level <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                              ~ . -male_bias_in_expression -chromosome)
  Heritability_and_Sex_bias <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                      ~ . -chromosome -AveExpr)
  Heritability_and_Chromosome <- update(Sex_bias_and_Chromosome_and_Expression_level_and_Heritability, 
                                        ~ . -male_bias_in_expression -AveExpr)
  
  Chromosome <- update(Chromosome_and_Expression_level, ~ . -AveExpr)
  Sex_bias <- update(Sex_bias_and_Expression_level, ~ . -AveExpr)
  Expression_level <- update(Sex_bias_and_Expression_level, ~ .  - male_bias_in_expression)
  Heritability <- update(Heritability_and_Sex_bias, ~ .  - male_bias_in_expression)
  Null_model <- update(Chromosome, ~ . -chromosome)
  
  AICc(
    Sex_bias_and_Chromosome_and_Expression_level_and_Heritability,
    Chromosome_and_Expression_level_and_Heritability,
    Sex_bias_and_Chromosome_and_Expression_level,
    Sex_bias_and_Expression_level_and_Heritability,
    Sex_bias_and_Chromosome_and_Heritability,
    Chromosome_and_Expression_level, 
    Sex_bias_and_Expression_level,
    Sex_bias_and_Chromosome,
    Heritability_and_Expression_level,
    Heritability_and_Sex_bias,
    Heritability_and_Chromosome,
    Chromosome, Sex_bias, Expression_level, Heritability,
    Null_model) %>% 
    rownames_to_column("Model") %>%
    arrange(AICc) %>%
    mutate(delta = AICc - AICc[1],
           Model = str_replace_all(Model, "_and+_", " + "),
           Weight = round(exp(-0.5 * delta) / sum(exp(-0.5 * delta)), 3),
           delta = round(delta, 2))
}

Probability of sexual antagonism

betareg(P_sex_antag ~ male_bias_in_expression + chromosome + AveExpr + h2, 
        data = twas) %>% compare_mods()
                                                     Model df      AICc delta
1                              Sex_bias + Expression_level  4 -30441.41  0.00
2                 Sex_bias + Chromosome + Expression_level  8 -30441.06  0.34
3               Sex_bias + Expression_level + Heritability  5 -30439.52  1.89
4  Sex_bias + Chromosome + Expression_level + Heritability  9 -30439.15  2.26
5                                         Expression_level  3 -30422.02 19.38
6                            Chromosome + Expression_level  7 -30421.95 19.46
7                          Heritability + Expression_level  4 -30421.12 20.28
8             Chromosome + Expression_level + Heritability  8 -30420.90 20.51
9                     Sex_bias + Chromosome + Heritability  8 -30392.54 48.86
10                                 Heritability + Sex_bias  4 -30392.47 48.94
11                                                Sex_bias  3 -30388.94 52.46
12                                   Sex_bias + Chromosome  7 -30388.74 52.67
13                               Heritability + Chromosome  7 -30359.08 82.33
14                                            Heritability  3 -30358.16 83.25
15                                              Chromosome  6 -30357.43 83.98
16                                              Null_model  2 -30356.87 84.54
   Weight
1   0.391
2   0.330
3   0.152
4   0.127
5   0.000
6   0.000
7   0.000
8   0.000
9   0.000
10  0.000
11  0.000
12  0.000
13  0.000
14  0.000
15  0.000
16  0.000
q1 <- summary(betareg(P_sex_antag ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q1

Call:
betareg(formula = P_sex_antag ~ male_bias_in_expression + AveExpr + chromosome + 
    h2, data = twas)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-8.4438 -0.3042  0.0289  0.2959  9.1754 

Coefficients (mean model with logit link):
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)              0.296825   0.036808   8.064 7.37e-16 ***
male_bias_in_expression  0.029206   0.006488   4.502 6.74e-06 ***
AveExpr                 -0.044069   0.006316  -6.977 3.02e-12 ***
chromosome2R             0.011626   0.008820   1.318    0.187    
chromosome3L             0.002737   0.008925   0.307    0.759    
chromosome3R            -0.010594   0.008469  -1.251    0.211    
chromosomeX             -0.003335   0.009569  -0.349    0.727    
h2                       0.007332   0.024456   0.300    0.764    

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  35.4527     0.4151   85.41   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.523e+04 on 9 Df
Pseudo R-squared: 0.006467
Number of iterations: 14 (BFGS) + 2 (Fisher scoring) 

Probability of being female-specific

betareg(P_female_specific ~ male_bias_in_expression + chromosome + AveExpr + h2, 
        data = twas) %>% compare_mods()
                                                     Model df      AICc  delta
1  Sex_bias + Chromosome + Expression_level + Heritability  9 -49193.02   0.00
2               Sex_bias + Expression_level + Heritability  5 -49183.00  10.02
3             Chromosome + Expression_level + Heritability  8 -49180.74  12.28
4                          Heritability + Expression_level  4 -49172.53  20.49
5                     Sex_bias + Chromosome + Heritability  8 -49140.68  52.34
6                                Heritability + Chromosome  7 -49136.83  56.19
7                                  Heritability + Sex_bias  4 -49126.38  66.64
8                                             Heritability  3 -49124.07  68.95
9                 Sex_bias + Chromosome + Expression_level  8 -49098.44  94.58
10                           Chromosome + Expression_level  7 -49094.93  98.09
11                                   Sex_bias + Chromosome  7 -49084.10 108.92
12                                              Chromosome  6 -49082.96 110.06
13                             Sex_bias + Expression_level  4 -49081.44 111.58
14                                        Expression_level  3 -49079.47 113.55
15                                              Null_model  2 -49065.62 127.40
16                                                Sex_bias  3 -49065.52 127.50
   Weight
1   0.991
2   0.007
3   0.002
4   0.000
5   0.000
6   0.000
7   0.000
8   0.000
9   0.000
10  0.000
11  0.000
12  0.000
13  0.000
14  0.000
15  0.000
16  0.000
q2 <- summary(betareg(P_female_specific ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q2

Call:
betareg(formula = P_female_specific ~ male_bias_in_expression + AveExpr + 
    chromosome + h2, data = twas)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-5.8656 -0.2861 -0.2321  0.3132  9.7894 

Coefficients (mean model with logit link):
                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)             -2.83425    0.06386 -44.384  < 2e-16 ***
male_bias_in_expression  0.04314    0.01121   3.848 0.000119 ***
AveExpr                  0.08013    0.01094   7.327 2.34e-13 ***
chromosome2R             0.01133    0.01535   0.738 0.460479    
chromosome3L             0.02644    0.01550   1.706 0.088002 .  
chromosome3R             0.01641    0.01474   1.114 0.265364    
chromosomeX              0.06539    0.01646   3.972 7.14e-05 ***
h2                      -0.40886    0.04342  -9.417  < 2e-16 ***

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  33.4005     0.4021   83.06   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 2.461e+04 on 9 Df
Pseudo R-squared: 0.008763
Number of iterations: 17 (BFGS) + 2 (Fisher scoring) 

Probability of being female-specific

betareg(P_male_specific ~ male_bias_in_expression + chromosome + AveExpr + h2,
        data = twas) %>% compare_mods()
                                                     Model df      AICc delta
1               Sex_bias + Expression_level + Heritability  5 -47455.26  0.00
2  Sex_bias + Chromosome + Expression_level + Heritability  9 -47450.05  5.21
3                          Heritability + Expression_level  4 -47444.90 10.36
4             Chromosome + Expression_level + Heritability  8 -47439.34 15.91
5                              Sex_bias + Expression_level  4 -47439.03 16.23
6                 Sex_bias + Chromosome + Expression_level  8 -47433.74 21.52
7                                         Expression_level  3 -47423.39 31.87
8                            Chromosome + Expression_level  7 -47417.98 37.27
9                                  Heritability + Sex_bias  4 -47414.16 41.10
10                                                Sex_bias  3 -47412.03 43.23
11                    Sex_bias + Chromosome + Heritability  8 -47408.91 46.34
12                                   Sex_bias + Chromosome  7 -47406.82 48.44
13                                            Heritability  3 -47392.28 62.98
14                                              Null_model  2 -47388.24 67.02
15                               Heritability + Chromosome  7 -47387.03 68.22
16                                              Chromosome  6 -47383.19 72.07
   Weight
1   0.926
2   0.068
3   0.005
4   0.000
5   0.000
6   0.000
7   0.000
8   0.000
9   0.000
10  0.000
11  0.000
12  0.000
13  0.000
14  0.000
15  0.000
16  0.000
q3 <- summary(betareg(P_male_specific ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q3

Call:
betareg(formula = P_male_specific ~ male_bias_in_expression + AveExpr + 
    chromosome + h2, data = twas)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-4.6317 -0.3338 -0.2835  0.2417  7.8801 

Coefficients (mean model with logit link):
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)             -2.845465   0.070036 -40.629  < 2e-16 ***
male_bias_in_expression -0.043538   0.012423  -3.505 0.000457 ***
AveExpr                  0.078981   0.011989   6.588 4.46e-11 ***
chromosome2R            -0.020552   0.016788  -1.224 0.220868    
chromosome3L            -0.012103   0.016969  -0.713 0.475699    
chromosome3R             0.003115   0.016070   0.194 0.846299    
chromosomeX             -0.008758   0.018145  -0.483 0.629336    
h2                      -0.193851   0.046887  -4.134 3.56e-05 ***

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  27.0203     0.3273   82.54   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 2.373e+04 on 9 Df
Pseudo R-squared: 0.005672
Number of iterations: 16 (BFGS) + 2 (Fisher scoring) 

Probability of equal effects

betareg(P_equal_effects ~ male_bias_in_expression + chromosome + AveExpr + chromosome + h2, 
        data = twas) %>% compare_mods()
                                                     Model df      AICc  delta
1  Sex_bias + Chromosome + Expression_level + Heritability  9 -28799.43   0.00
2               Sex_bias + Expression_level + Heritability  5 -28794.12   5.30
3             Chromosome + Expression_level + Heritability  8 -28789.58   9.85
4                 Sex_bias + Chromosome + Expression_level  8 -28783.96  15.47
5                          Heritability + Expression_level  4 -28783.81  15.62
6                              Sex_bias + Expression_level  4 -28777.46  21.97
7                            Chromosome + Expression_level  7 -28769.30  30.13
8                                         Expression_level  3 -28761.81  37.62
9                                    Sex_bias + Chromosome  7 -28689.41 110.02
10                    Sex_bias + Chromosome + Heritability  8 -28687.47 111.96
11                                                Sex_bias  3 -28682.03 117.39
12                                 Heritability + Sex_bias  4 -28680.19 119.24
13                                              Chromosome  6 -28660.46 138.97
14                               Heritability + Chromosome  7 -28658.93 140.50
15                                              Null_model  2 -28651.16 148.27
16                                            Heritability  3 -28649.93 149.49
   Weight
1   0.927
2   0.065
3   0.007
4   0.000
5   0.000
6   0.000
7   0.000
8   0.000
9   0.000
10  0.000
11  0.000
12  0.000
13  0.000
14  0.000
15  0.000
16  0.000
q4 <- summary(betareg(P_equal_effects ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q4

Call:
betareg(formula = P_equal_effects ~ male_bias_in_expression + AveExpr + 
    chromosome + h2, data = twas)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-4.7139 -0.2446 -0.1877  0.2974  7.6173 

Coefficients (mean model with logit link):
                         Estimate Std. Error z value Pr(>|z|)    
(Intercept)             -2.209333   0.067199 -32.877  < 2e-16 ***
male_bias_in_expression -0.040995   0.011892  -3.447 0.000566 ***
AveExpr                  0.122788   0.011499  10.678  < 2e-16 ***
chromosome2R            -0.024335   0.016146  -1.507 0.131758    
chromosome3L            -0.001919   0.016297  -0.118 0.906247    
chromosome3R             0.026460   0.015421   1.716 0.086181 .  
chromosomeX              0.020194   0.017384   1.162 0.245382    
h2                      -0.183550   0.044779  -4.099 4.15e-05 ***

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  15.8446     0.1861   85.16   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.441e+04 on 9 Df
Pseudo R-squared: 0.01127
Number of iterations: 16 (BFGS) + 2 (Fisher scoring) 

Probability of null effect

betareg(P_null ~ male_bias_in_expression + chromosome + AveExpr + chromosome + h2,
        data = twas) %>% compare_mods()
                                                     Model df      AICc  delta
1                     Sex_bias + Chromosome + Heritability  8 -34144.13   0.00
2  Sex_bias + Chromosome + Expression_level + Heritability  9 -34144.06   0.07
3               Sex_bias + Expression_level + Heritability  5 -34131.89  12.24
4                                  Heritability + Sex_bias  4 -34131.37  12.76
5                                Heritability + Chromosome  7 -34046.97  97.16
6             Chromosome + Expression_level + Heritability  8 -34045.38  98.75
7                                             Heritability  3 -34040.36 103.77
8                          Heritability + Expression_level  4 -34038.56 105.57
9                 Sex_bias + Chromosome + Expression_level  8 -33977.32 166.82
10                                   Sex_bias + Chromosome  7 -33967.50 176.63
11                             Sex_bias + Expression_level  4 -33960.34 183.79
12                                                Sex_bias  3 -33951.77 192.36
13                           Chromosome + Expression_level  7 -33913.19 230.94
14                                        Expression_level  3 -33902.16 241.97
15                                              Chromosome  6 -33892.38 251.75
16                                              Null_model  2 -33883.30 260.83
   Weight
1   0.508
2   0.490
3   0.001
4   0.001
5   0.000
6   0.000
7   0.000
8   0.000
9   0.000
10  0.000
11  0.000
12  0.000
13  0.000
14  0.000
15  0.000
16  0.000
q5 <- summary(betareg(P_null ~ male_bias_in_expression + AveExpr + chromosome + h2, data = twas), type = "deviance")
q5

Call:
betareg(formula = P_null ~ male_bias_in_expression + AveExpr + chromosome + 
    h2, data = twas)

Deviance residuals:
    Min      1Q  Median      3Q     Max 
-5.7312 -0.2524  0.2238  0.5991  4.1395 

Coefficients (mean model with logit link):
                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)             -1.83405    0.06705 -27.352  < 2e-16 ***
male_bias_in_expression  0.11931    0.01170  10.200  < 2e-16 ***
AveExpr                  0.01597    0.01151   1.388 0.165237    
chromosome2R             0.05831    0.01607   3.630 0.000284 ***
chromosome3L             0.02796    0.01632   1.713 0.086675 .  
chromosome3R             0.02435    0.01550   1.571 0.116190    
chromosomeX              0.06474    0.01739   3.723 0.000197 ***
h2                      -0.58302    0.04607 -12.656  < 2e-16 ***

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)    
(phi)  18.8218     0.2231   84.36   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Type of estimator: ML (maximum likelihood)
Log-likelihood: 1.708e+04 on 9 Df
Pseudo R-squared: 0.01273
Number of iterations: 19 (BFGS) + 2 (Fisher scoring) 

Plots showing the statistical results

gwas_stats_fig <- bind_rows(p1$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(sexually antagonistic)"),
          p2$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(female-specific)"),
          p3$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(male-specific)"),
          p4$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(equal effects)")) %>%
  mutate(Parameter = str_replace_all(Parameter, "site[.]class", "Site class: "),
         Parameter = str_replace_all(Parameter, "chr", "Chromosome: "),
         sig = ifelse(`Pr(>|z|)` < 0.05, "yes", "no")) %>%
  filter(Parameter != "(Intercept)") %>%
  arrange(Parameter) %>%
  mutate(Parameter = factor(Parameter, rev(unique(Parameter)))) %>%
  ggplot(aes(Parameter, Estimate, colour = sig)) +
  geom_hline(yintercept = 0, linetype = 2) +
  geom_errorbar(aes(ymin = Estimate - `Std. Error`*1.96, ymax = Estimate + `Std. Error`*1.96), width = 0) + 
  geom_point() + 
  coord_flip() +
  facet_wrap(~type) +
  theme_bw() +
  theme(legend.position = "none",
        strip.background = element_blank(), 
        text = element_text(family = "Raleway", size = 12)) +
  scale_color_manual(values = c("black", "tomato")) + 
  ylab("Estimate \u00B1 95% CIs") 

twas_stats_fig <- bind_rows(q1$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(sexually antagonistic)"),
          q2$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(female-specific)"),
          q3$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(male-specific)"),
          q4$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(equal effects)"),
          q5$coefficients$mean %>%
            as.data.frame() %>%
            rownames_to_column("Parameter") %>%
            mutate(type = "P(null effect)")) %>%
  mutate(Parameter = str_replace_all(Parameter, "AveExpr", "Average expression level"),
         Parameter = str_replace_all(Parameter, "male_bias_in_expression", "Male bias in expression"),
         Parameter = str_replace_all(Parameter, "chromosome", "Chromosome: "),
         sig = ifelse(`Pr(>|z|)` < 0.05, "yes", "no"),
         type = factor(type, c("P(null effect)", "P(equal effects)", "P(sexually antagonistic)",
                               "P(female-specific)", "P(male-specific)"))) %>%
  filter(Parameter != "(Intercept)") %>%
  arrange(Parameter) %>%
  mutate(Parameter = factor(Parameter, rev(unique(Parameter)))) %>%
  ggplot(aes(Parameter, Estimate, colour = sig)) +
  geom_hline(yintercept = 0, linetype = 2) +
  geom_errorbar(aes(ymin = Estimate - `Std. Error`*1.96, ymax = Estimate + `Std. Error`*1.96), width = 0) + 
  geom_point() + 
  coord_flip() +
  facet_wrap(~type) +
  theme_bw() +
  theme(legend.position = "none",
        strip.background = element_blank(), 
        text = element_text(family = "Raleway", size = 12)) +
  scale_color_manual(values = c("black", "tomato")) + 
  ylab("Estimate \u00B1 95% CIs") 


ggsave("figures/GWAS_stats_figure.pdf", gwas_stats_fig, width = 6, height = 6)
ggsave("figures/TWAS_stats_figure.pdf", twas_stats_fig, width = 7, height = 6)

Effect sizes for the GWAS analysis

gwas_stats_fig

Figure XX: The figure shows estimated parameters from four separate beta regression models seeking predictors of the mixture assignment probabilities calculated using mashr on the GWAS results. Positive values indicate that the predictor is associated with an elevated probability across loci; estimates that are non-zero with 95% confidence are shown in red. The clearest result is the strong, positive relationship between minor allele frequency (MAF) and the probability that the locus was assigned to the ‘sexually antagonistic’ mixture component. There was also a strong, negative relationship between MAF and the chance that the locus was assigned to the ‘equal effects’ mixture component (i.e. that the locus had concordant effects on male and female fitness). Loci with a relatively high probability of being female-specific tended to have lower MAF, while loci with a relatively high probability of being male-specific tended to have higher MAF. Chromosome 2L seems to be enriched for loci with male-specific effects, and de-enriched for loci that affect females relative to the other chromosomes. Finally, there were some associations between site class and mixture assignment probability, which (though sometimes statistically significant) were small and subtle.

Effect sizes for the TWAS analysis

twas_stats_fig

Figure XX: The figure shows estimated parameters from five separate beta regression models seeking predictors of the mixture assignment probabilities calculated using mashr on the TWAS results. Positive values indicate that the predictor is associated with an elevated probability across transcripts; estimates that are non-zero with 95% confidence are shown in red. The clearest result is the strong, negative relationship between the heritability of the transcript’s expression level (calculated by Huang et al. 2015) and the probability that the locus was assigned to any mixture component other than ‘sexually antagonistic’. Transcripts with strongly male-biased expression were more likely to have no relationship with fitness (i.e. to be assigned to the ‘null’ mixture component); sex bias was a significant but weak predictor for the other mixture components. Transcripts with high expression levels were more likely to have a sexually concordant relationship with fitness, and less likely to be sexually antagonistic. Transcripts from genes located on the X chromosome were slightly more likely to have a female-specific correlation with fitness, or no correlation.

Effect of the minor allele

Evolutionary theory makes various testable predictions about the effects of the major and minor alleles on phenotypes, especially phenotypes that are closely correlated with fitness. For example, the minor allele should generally be associated with the lower-fitness phenotype (and the major allele with the higher-fitness phenotype), due to selection removing the lower-fitness allele. Furthermore, it is interesting to ask whether the female-beneficial allele or the male-beneficial allele tends to be the major allele at sexually-antagonistic loci? Lastly, we might expect different results at loci where the minor allele is quite rare vs loci where the minor allele is only slightly rarer than the major allele; loci with comparatively low MAF are more likely to reflect new polymorphisms (since most mutations are neutral or deleterious, and it takes time for the minor allele to drift towards high frequencies), and are more likely to be under selection (since selection against the minor allele keeps it rare).

To address these questions graphically, we here plot the effects on male and female early-life fitness of each allele, that was assigned to the female-specific, male-specific, or sexually antagonistic mixture component by mashr, with an assignment probability in the highest 0.1% across all the loci analysed. Since there were 1,602,873 loci, each column plots the male and female effect sizes for 1,602 loci.

Among loci with female-specific effects on fitness (first column), the minor allele was usually associated with reduced female fitness, especially among loci where the minor allele frequency was below 0.2, as expected if alleles that harm female fitness are removed by selection. There was no similar finding for loci with male-specific effects (second column): the minor allele was equally likely to increase or reduce fitness, regardless of MAF. For loci with sexually-antagonistic effects on fitness (third column), the minor allele tended to be the one that reduced female fitness and elevated male fitness; this pattern was clearest among loci with a MAF < 0.2. Finally, among loci inferred to be under sexually concordant selection, the effect sizes were strongest among loci with MAF < 0.2, as expected if selection prevents alleles with strong, detrimental effects on the fitness of both sexes from becoming common.

histo_data <- tbl(db, "univariate_lmm_results") %>% 
  inner_join(tbl(db, "variants") %>% 
               select(-chr, -position), 
             by = "SNP") %>%
  collect(n=Inf) %>%
    mutate(class = 
             case_when(
               P_sex_antag > quantile(P_sex_antag, probs=0.999)  ~ "Top 0.1%\nsexually antagonistic loci",
               P_equal_effects > quantile(P_equal_effects, probs=0.999)  ~ "Top 0.1%\nsexually concordant loci",
               P_female_specific > quantile(P_female_specific, probs=0.999)  ~ "Top 0.1%\nfemale-specific loci",
               P_male_specific > quantile(P_male_specific, probs=0.999)  ~ "Top 0.1%\nmale-specific loci",
             )) %>%
    mutate(MAF = ifelse(MAF < 0.2, "0.05 < MAF < 0.2", "0.2 < MAF < 0.5")) %>%
    filter(!is.na(class)) %>%
  select(beta_female_early_mashr_ED, beta_male_early_mashr_ED, class, MAF)

eff_size_histos <- ggplot(histo_data, aes(beta_female_early_mashr_ED)) + 
  geom_vline(xintercept = 0, linetype = 2) +
  geom_histogram(bins=30,fill="tomato",colour=NA, alpha = 0.5) + 
  geom_histogram(aes(beta_male_early_mashr_ED), bins=30,fill="steelblue",colour=NA, alpha = 0.5) + 
  geom_histogram(bins=30,fill=NA,colour="black") + 
  geom_histogram(aes(beta_male_early_mashr_ED), bins=30,fill=NA,colour="black") + 
  facet_grid(MAF ~ class, scales = "free_y") +
  xlab("Effect of the minor allele on early-life fitness\n(females:red, males:blue)") + 
  theme_bw() +
  theme(legend.position = "none",
        strip.background = element_blank(), 
        text = element_text(family = "Raleway", size = 12)) +
  ylab("Number of loci")

ggsave("figures/eff_size_histos.pdf", width = 8, height = 5)
eff_size_histos

Manhattan plot

manhattan_data <- tbl(db, "univariate_lmm_results") %>% 
  select(SNP, starts_with("P_"),
         beta_female_early_mashr_ED, beta_male_early_mashr_ED) %>%
  distinct() %>%
  collect(n=Inf) %>%
  mutate(position = str_split(SNP, "_"),
         chr = map_chr(position, ~ .x[1]),
         position = as.numeric(map_chr(position, ~ .x[2])),
         P_pleiotropy = 1 - P_null - P_female_specific - P_male_specific,
         SA_not_concord = P_sex_antag > P_equal_effects,
         top_SA = P_sex_antag >= quantile(P_sex_antag, probs=0.9999)) %>%
  mutate(beta_female_early_mashr_ED = abs(beta_female_early_mashr_ED),
         beta_male_early_mashr_ED = abs(beta_male_early_mashr_ED)) %>%
  # mutate(beta_male_early_mashr_ED = replace(
  #   beta_male_early_mashr_ED, 
  #   beta_female_early_mashr_ED < 0, 
  #   beta_male_early_mashr_ED[beta_female_early_mashr_ED < 0] * -1),
  #   beta_female_early_mashr_ED = replace(
  #   beta_female_early_mashr_ED, 
  #   beta_female_early_mashr_ED < 0, 
  #   beta_female_early_mashr_ED[beta_female_early_mashr_ED < 0] * -1),
  #   ) %>%
  filter(chr != "4") 

max_pos <- manhattan_data %>%
  group_by(chr) %>%
  summarise(max_pos = max(position), .groups = "drop") %>%
  as.data.frame()
max_pos$max_pos <- c(0, cumsum(max_pos$max_pos[1:4]))

manhattan_data <- manhattan_data %>%
    left_join(max_pos, by = "chr") %>%
    mutate(position = position + max_pos) 

# cols <- c("blue", "skyblue", "red", "pink", "orange", "yellow",
#           "purple", "steelblue", "green", "tomato")
# manhattan_data %>%
#   filter(P_pleiotropy > 0.7485376) %>%
#   sample_n(100000) %>%
#   ggplot(aes(position, -1 * log10(1 - P_pleiotropy), colour = chr, stroke = 0.2)) +
#   geom_point(size = 0.5, pch = 1) +
#  # scale_color_manual(values = cols) +
#   # scale_fill_brewer(palette = "Paired", name = "Chromosome") + 
#   scale_colour_brewer(palette = "Paired", name = "Chromosome") + 
#   ylab("-log10(Posterior probability of null effect)") +
#   theme_bw() +
#   theme(axis.text.x = element_blank(),
#         axis.ticks.x = element_blank())


p1 <- manhattan_data %>%
  ggplot(aes(position, beta_female_early_mashr_ED, group = chr, colour = chr, stroke = 0.2)) + 
  geom_point(size = 0.5) +
  geom_point(data = manhattan_data %>% filter(top_SA),
             aes(y = 0.43), size = 3, pch = 6) +
  scale_colour_brewer(palette = "Paired", name = "Chromosome") + 
  ylab("Effect on female fitness") + 
  xlab("") +
  theme_bw() + 
  theme(axis.text.x = element_blank(),
        panel.border = element_blank(),
        axis.ticks.x = element_blank())

p2 <- manhattan_data %>%
  ggplot(aes(position, beta_male_early_mashr_ED, group = chr, colour = chr, stroke = 0.2)) + 
  geom_point(size = 0.5) +
  scale_colour_brewer(palette = "Paired", name = "Chromosome") + 
  ylab("Effect on male fitness") + 
  xlab("Position") +
  scale_y_reverse() +
  theme_bw() + 
  theme(axis.text.x = element_blank(),
        panel.border = element_blank(),
        axis.ticks.x = element_blank())

grid.arrange(p1, p2)

Antagonism ratios figure

make_evidence_ratio_plot <- function(dat, ymax, ylab){
  # Argument needs to be dataframe of loci or transcripts, which are identified by col called "identifier"
  # The FE, FL,ME, ML effect sizes need to be called pheno1-pheno4
  # The corresponding LFSR columns need to be called LFSR1-LFSR4
  
  antagonism_evidence_ratios <- dat %>%
    
    # Convert the LFSR to probability that beta is positive
    # Here, we use the ED results because they are agnostic to our expectations for the results
    # However, running this with the Canonocial results instead gives an essentially identical graph
    mutate(pp_female_early = ifelse(pheno1 > 0, LFSR1, 1 - LFSR1),
           pp_female_late  = ifelse(pheno2 > 0, LFSR2, 1 - LFSR2),
           pp_male_early   = ifelse(pheno3 > 0, LFSR3, 1 - LFSR3),
           pp_male_late    = ifelse(pheno4 > 0, LFSR4, 1 - LFSR4)) %>%
    
    # Calculate the probabilities that beta_i and beta_j have the same/opposite signs
    mutate(p_sex_concord_early = pp_female_early * pp_male_early + 
             (1 - pp_female_early) * (1 - pp_male_early),
           p_sex_antag_early = pp_female_early * (1 - pp_male_early) + 
             (1 - pp_female_early) * pp_male_early,
           p_sex_concord_late  = pp_female_late * pp_male_late + 
             (1 - pp_female_late) * (1 - pp_male_late),
           p_sex_antag_late = pp_female_late * (1 - pp_male_late) + 
             (1 - pp_female_late) * pp_male_late,
           p_age_concord_females = pp_female_early * pp_female_late + 
             (1 - pp_female_early) * (1 - pp_female_late),
           p_age_antag_females = pp_female_early * (1 - pp_female_late) + 
             (1 - pp_female_early) * pp_female_late,
           p_age_concord_males = pp_male_early * pp_male_late + (1 - pp_male_early) * (1 - pp_male_late),
           p_age_antag_males = pp_male_early * (1 - pp_male_late) + (1 - pp_male_early) * pp_male_late) %>%
    
    # Find the ratios of some of these probabilities (i.e. "evidence ratios")
    mutate(`Inter-sex (early life)` = p_sex_concord_early / p_sex_antag_early,
           `Inter-sex (late life)` = p_sex_concord_late / p_sex_antag_late,
           `Inter-age (females)` = p_age_concord_females / p_age_antag_females,
           `Inter-age (males)` = p_age_concord_males / p_age_antag_males) %>%
    select(identifier,  starts_with("Inter")) %>%
    gather(trait, evidence_ratio, -identifier) %>%
    mutate(trait = factor(trait, c("Inter-sex (early life)",
                                   "Inter-sex (late life)",
                                   "Inter-age (females)",
                                   "Inter-age (males)")))
  
  antagonism_evidence_ratios %>%
    ggplot(aes(log2(evidence_ratio))) + 
    geom_histogram(data=subset(antagonism_evidence_ratios, evidence_ratio < 1), 
                   bins = 500, fill = "#FF635C") + 
    geom_histogram(data=subset(antagonism_evidence_ratios, evidence_ratio > 1), 
                   bins = 500, fill = "#5B8AFD") +
    coord_cartesian(xlim = c(-10, 10), ylim = c(0, ymax)) + 
    scale_x_continuous(breaks = c(-10,  -6,  -2,   2,   6,  10), 
                       labels = c(paste("1/",2 ^ c(10,  6,  2), sep = ""), 2 ^ c(2,6,10))) + 
    facet_wrap(~ trait) + 
    xlab("Evidence ratio (log2 scale)") + ylab(ylab) + 
    theme_bw() + 
    theme(panel.border = element_rect(size = 0.8),
         # text = element_text(family = "Raleway", size = 12),
          axis.text.x = element_text(angle = 45, vjust = 1, hjust=1),
          strip.background = element_blank())
}

GWAS_ratios_plot <- SNP_clumps %>%
  rename(pheno1 = beta_female_early_mashr_ED, 
         pheno2 = beta_female_late_mashr_ED, 
         pheno3 = beta_male_early_mashr_ED, 
         pheno4 = beta_male_late_mashr_ED,
         LFSR1 = LFSR_female_early_mashr_canonical, 
         LFSR2 = LFSR_female_late_mashr_canonical, 
         LFSR3 = LFSR_male_early_mashr_canonical, 
         LFSR4 = LFSR_male_late_mashr_canonical,
         identifier = SNP_clump) %>%
  make_evidence_ratio_plot(ymax = 10000, ylab = "Number of loci")

TWAS_ED <- readRDS("data/derived/TWAS/TWAS_ED.rds")
TWAS_canonical <- readRDS("data/derived/TWAS/TWAS_canonical.rds")

TWAS_mashr_results <- 
  data.frame(
    FBID = read_delim("data/input/huang_transcriptome/dgrp.array.exp.female.txt", delim = " ") %>% pull(gene),
    as.data.frame(get_pm(TWAS_canonical)) %>% rename_all(~str_c("canonical_", .)),
    as.data.frame(get_lfsr(TWAS_canonical)) %>% 
      rename_all(~str_replace_all(., "beta", "LFSR")) %>% rename_all(~str_c("canonical_", .))) %>% 
  as_tibble() %>% 
  bind_cols(
    data.frame(
      as.data.frame(get_pm(TWAS_ED)) %>% rename_all(~str_c("ED_", .)),
      as.data.frame(get_lfsr(TWAS_ED)) %>% 
        rename_all(~str_replace_all(., "beta", "LFSR")) %>% rename_all(~str_c("ED_", .))) %>% 
      as_tibble())

TWAS_ratios_plot <- TWAS_mashr_results %>%
  rename(pheno1 = ED_beta_FE, 
         pheno2 = ED_beta_FL, 
         pheno3 = ED_beta_ME, 
         pheno4 = ED_beta_ML,
         LFSR1 = ED_LFSR_FE, 
         LFSR2 = ED_LFSR_FL, 
         LFSR3 = ED_LFSR_ME, 
         LFSR4 = ED_LFSR_ML,
         identifier = FBID) %>%
  make_evidence_ratio_plot(ymax = 500, ylab = "Number of transcripts")


pp <- plot_grid(GWAS_ratios_plot, TWAS_ratios_plot, 
                   labels = c('A', 'B'), label_size = 12)
# ggdraw(add_sub(pp, "Evidence ratio (log scale)", 
#                vpadding=grid::unit(0,"lines"), 
#                y=6.3, x=0.5, vjust=6))
ggsave("figures/antagonism_ratios.pdf", pp, width = 8.5, height = 4.9)

pp


sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lmtest_0.9-37    zoo_1.8-8        MuMIn_1.43.17    betareg_3.1-4   
 [5] cowplot_1.0.0    kableExtra_1.1.0 mashr_0.2.38     ashr_2.2-47     
 [9] showtext_0.9-1   showtextdb_3.0   sysfonts_0.8.2   Hmisc_4.4-0     
[13] Formula_1.2-3    survival_3.2-7   lattice_0.20-41  ggbeeswarm_0.6.0
[17] qqman_0.1.4      gridExtra_2.3    forcats_0.5.0    stringr_1.4.0   
[21] dplyr_1.0.0      purrr_0.3.4      readr_1.3.1      tidyr_1.1.0     
[25] tibble_3.0.1     ggplot2_3.3.2    tidyverse_1.3.0  workflowr_1.6.2 

loaded via a namespace (and not attached):
  [1] colorspace_1.4-1    modeltools_0.2-23   ellipsis_0.3.1     
  [4] rprojroot_1.3-2     htmlTable_1.13.3    base64enc_0.1-3    
  [7] fs_1.4.1            rstudioapi_0.11     hexbin_1.28.1      
 [10] farver_2.0.3        flexmix_2.3-15      bit64_0.9-7        
 [13] fansi_0.4.1         mvtnorm_1.1-0       lubridate_1.7.8    
 [16] xml2_1.3.2          splines_4.0.3       knitr_1.30         
 [19] jsonlite_1.7.0      broom_0.5.6         cluster_2.1.0      
 [22] dbplyr_1.4.4        png_0.1-7           compiler_4.0.3     
 [25] httr_1.4.1          backports_1.1.7     assertthat_0.2.1   
 [28] Matrix_1.2-18       cli_2.0.2           later_1.0.0        
 [31] acepack_1.4.1       htmltools_0.5.0     tools_4.0.3        
 [34] gtable_0.3.0        glue_1.4.2          Rcpp_1.0.4.6       
 [37] cellranger_1.1.0    vctrs_0.3.0         nlme_3.1-149       
 [40] xfun_0.19           rvest_0.3.5         lifecycle_0.2.0    
 [43] irlba_2.3.3         MASS_7.3-53         scales_1.1.1       
 [46] hms_0.5.3           promises_1.1.0      sandwich_2.5-1     
 [49] RColorBrewer_1.1-2  yaml_2.2.1          curl_4.3           
 [52] memoise_1.1.0       rpart_4.1-15        latticeExtra_0.6-29
 [55] calibrate_1.7.7     stringi_1.5.3       RSQLite_2.2.0      
 [58] SQUAREM_2020.2      rmeta_3.0           highr_0.8          
 [61] checkmate_2.0.0     truncnorm_1.0-8     rlang_0.4.6        
 [64] pkgconfig_2.0.3     evaluate_0.14       invgamma_1.1       
 [67] labeling_0.3        htmlwidgets_1.5.1   bit_1.1-15.2       
 [70] tidyselect_1.1.0    plyr_1.8.6          magrittr_2.0.1     
 [73] R6_2.4.1            generics_0.0.2      DBI_1.1.0          
 [76] pillar_1.4.4        haven_2.3.1         whisker_0.4        
 [79] foreign_0.8-80      withr_2.2.0         abind_1.4-5        
 [82] mixsqp_0.3-43       nnet_7.3-14         modelr_0.1.8       
 [85] crayon_1.3.4        rmarkdown_2.5       jpeg_0.1-8.1       
 [88] isoband_0.2.1       grid_4.0.3          readxl_1.3.1       
 [91] data.table_1.12.8   blob_1.2.1          git2r_0.27.1       
 [94] reprex_0.3.0        digest_0.6.25       webshot_0.5.2      
 [97] httpuv_1.5.3.1      stats4_4.0.3        munsell_0.5.0      
[100] beeswarm_0.2.3      viridisLite_0.3.0   vipor_0.4.5