Last updated: 2023-08-22

Checks: 6 1

Knit directory: m6A_in_disease_genetics/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20230331) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
~/projects/m6A_in_disease_genetics/code/ctwas/ctwas_config_b37.R code/ctwas/ctwas_config_b37.R
~/projects/m6A_in_disease_genetics/code/ctwas/qiansheng/locus_plot.R code/ctwas/qiansheng/locus_plot.R

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version fb910fa. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .ipynb_checkpoints/
    Ignored:    analysis/m6A_switch_to_disease_h2g.nb.html
    Ignored:    data/plots/

Untracked files:
    Untracked:  HMGCR_locus_gene_tracks.pdf
    Untracked:  Rplots.pdf
    Untracked:  analysis/.ipynb_checkpoints/
    Untracked:  analysis/CD_m6A_output_hg19.Rmd
    Untracked:  analysis/IBD_E_S_m6A.Rmd
    Untracked:  analysis/IBD_E_S_m6A_output.Rmd
    Untracked:  analysis/LDL_E_S_m6A.Rmd
    Untracked:  analysis/LDL_m6A_output.Rmd
    Untracked:  analysis/RA_m6A_output.Rmd
    Untracked:  analysis/WhiteBlood_WholeBlood_E_M.Rmd
    Untracked:  analysis/asthma_m6A_output_hg19.Rmd
    Untracked:  analysis/identify_m6A_mechanisms_with_finemapping.Rmd
    Untracked:  analysis/lymph_m6A_output.Rmd
    Untracked:  analysis/pre_weights_m6AQTL.txt
    Untracked:  analysis/rbc_E_S_m6A_output.Rmd
    Untracked:  analysis/rbc_m6A_output.Rmd
    Untracked:  analysis/summarize_ctwas_m6A_results.Rmd
    Untracked:  analysis/wbc_E_S_m6A_output.Rmd
    Untracked:  code/.ipynb_checkpoints/
    Untracked:  code/all_m6a_sites_with_paired_cisNATs_summary.csv
    Untracked:  code/annotating_fine-mapped_m6A_QTLs.Rmd
    Untracked:  code/check_double_strand.ipynb
    Untracked:  code/check_double_strand_v2.ipynb
    Untracked:  code/ctwas/
    Untracked:  code/figure/
    Untracked:  code/learn_gviz.Rmd
    Untracked:  code/learn_gviz.html
    Untracked:  code/learn_gviz.nb.html
    Untracked:  code/m6AQTL_finemapping.Rmd
    Untracked:  code/plot_genomic_tracks_gviz.ipynb
    Untracked:  code/summary_TWAS_coloc_m6A_2023.Rmd
    Untracked:  code/test_gviz.ipynb
    Untracked:  code/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
    Untracked:  data/.ipynb_checkpoints/
    Untracked:  data/ADCY7_gwas_input.tsv
    Untracked:  data/ADCY7_qtl_input.tsv
    Untracked:  data/Allergy_full_coloc.txt
    Untracked:  data/Asthma_full_coloc.txt
    Untracked:  data/CAD_full_coloc.txt
    Untracked:  data/Eosinophil_count_full_coloc.txt
    Untracked:  data/GSE125377_jointPeakReadCount.txt
    Untracked:  data/G_list.Rd
    Untracked:  data/HMGCR_ctwas_dat.Rd
    Untracked:  data/IBD_full_coloc.txt
    Untracked:  data/JointPeaks.bed
    Untracked:  data/Li2022_dsRNAs.xlsx
    Untracked:  data/Lupus_full_coloc.txt
    Untracked:  data/RA_full_coloc.txt
    Untracked:  data/TABLE1_hg19.txt
    Untracked:  data/TABLE1_hg19.txt.zip
    Untracked:  data/__MACOSX/
    Untracked:  data/coloc_blood_traits.csv
    Untracked:  data/crohns_disease_full_coloc.txt
    Untracked:  data/ctwas_m6a_joint_top_PIP.txt
    Untracked:  data/edit_sites_and_GE_neg_correlated.txt
    Untracked:  data/edit_sites_and_GE_pos_correlated.txt
    Untracked:  data/features
    Untracked:  data/human_EERs.csv
    Untracked:  data/human_EERs.txt
    Untracked:  data/lymph_full_coloc.txt
    Untracked:  data/m6A_TWAS_results.csv
    Untracked:  data/m6a_TWAS_genes.txt
    Untracked:  data/m6a_joint_calling_peaks.csv
    Untracked:  data/nasser_2021_ABC_IBD_genes.txt
    Untracked:  data/nat_sense_pairs.csv
    Untracked:  data/plt_full_coloc.txt
    Untracked:  data/rbc_full_coloc.txt
    Untracked:  data/rdw_full_coloc.txt
    Untracked:  data/reported_AS_targets_S1.txt
    Untracked:  data/reported_AS_wanowska.txt
    Untracked:  data/sig_coloc_results/
    Untracked:  data/test_locuscomparer.pdf
    Untracked:  data/ulcerative_colitis_full_coloc.txt
    Untracked:  data/wbc_full_coloc.txt
    Untracked:  data/zhao_silver_genes.csv
    Untracked:  output/.ipynb_checkpoints/
    Untracked:  output/HMGCR_gene_track_plot.pdf
    Untracked:  output/HMGCR_locus_plot.pdf
    Untracked:  output/IBD_DHX38_plot.pdf
    Untracked:  output/IBD_DHX38_plot_genetrack.pdf
    Untracked:  output/all_m6a_sites_with_cisNATs.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary_PP40.3.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary_PP40.5.csv
    Untracked:  output/all_m6a_sites_with_paired_cis_NATs.csv
    Untracked:  output/fine_mapped_m6AQTLs_TWAS_genes_highPP4.rds
    Untracked:  output/gene_summary.csv
    Untracked:  output/immune_related_m6A_targets.csv
    Untracked:  output/lupus_MIR210HG_plot.pdf
    Untracked:  output/lupus_MIR210HG_plot_genetrack.pdf
    Untracked:  output/m6aQTL_dsRNAs_PPP2R3C_PRORP.pdf
    Untracked:  output/m6a_QTL_genes.csv
    Untracked:  output/m6a_genes_PIP_0.6_blood_immune.csv
    Untracked:  output/m6a_genes_PIP_0.6_blood_immune.txt
    Untracked:  output/m6a_peaks_nearby_dsRNAs.csv
    Untracked:  output/m6a_sites_near_all_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_near_dsRNAs_coloc.csv
    Untracked:  output/m6a_sites_near_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_near_dsRNAs_twas_summary.csv
    Untracked:  output/m6a_sites_overlapping_NAT_twas.csv
    Untracked:  output/m6a_sites_overlapping_dsRNAs_coloc.csv
    Untracked:  output/m6a_sites_overlapping_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_overlapping_dsRegions.csv
    Untracked:  output/m6a_sites_overlapping_dsRegions_coloc.csv
    Untracked:  output/negatively_correlated_genes.txt
    Untracked:  output/postively_correlated_genes.txt
    Untracked:  output/rs1806261_RABEP1-NUP88_focused_locusview.pdf
    Untracked:  output/rs1806261_RABEP1-NUP88_locusview.pdf
    Untracked:  output/rs3177647_MAPKAPK5-AS1-MAPKAPK5_locusview.pdf
    Untracked:  output/rs3204541_DDX55-EIF2B1_locusview.pdf
    Untracked:  output/rs7184802_ADCY7-BRD7_locusview.pdf
    Untracked:  output/rs7184802_ADCY7_locuscompare.pdf
    Untracked:  output/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
    Untracked:  output/twas_genes_PP4_0.5_blood_traits_trackplots.pdf
    Untracked:  output/twas_m6a_sites_with_all_cisNATs.RDS
    Untracked:  output/twas_m6a_sites_with_cisNATs_range.RDS
    Untracked:  output/twas_m6a_sites_with_the_nearest_cisNAT.RDS
    Untracked:  twas_genes_PP4_0.3_immune_traits_trackplots.pdf

Unstaged changes:
    Deleted:    analysis/learn_ctwas.Rmd
    Modified:   analysis/lymph_m6A_output_hg19.Rmd
    Modified:   analysis/m6A_switch_to_disease_h2g.Rmd
    Modified:   analysis/rbc_m6A_output_hg19.Rmd
    Modified:   analysis/wbc_m6A_output.Rmd
    Modified:   analysis/wbc_m6A_output_hg19.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/eosinoph_m6A_output_hg19.Rmd) and HTML (docs/eosinoph_m6A_output_hg19.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd fb910fa Jing Gu 2023-08-22 blood traits

Load ctwas results

# top 1 method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
                         method = NULL, outputdir = outputdir, outname = outname.e,
                         harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
                         strand_ambig_action_z = "recover", 
                         recover_strand_ambig_wgt = T
# lasso/elastic-net method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
                         method = NULL, outputdir = outputdir, outname = outname.e,
                         harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
                         strand_ambig_action_z = "none", 
                         recover_strand_ambig_wgt = F

GWAS: UK Biobank GWAS summary statistics - European individuals

Weights: FUSION weights using top1, lasso, or elastic-net models were converted into PredictDB format and were not needed to do scaling when running ctwas.

Check convergence of parameters

cTWAS analysis on m6A alone

Joint analysis of expression, splicing and m6A

[1] "Check convergence for the lasso model when jointly analyzing expression, splicing and m6A:"
[1] "Table of group size before/after matching with UKBB SNPs:"
                          SNP      eQTL      sQTL   m6AQTL
prior_group_size    9.324e+06 2005.0000 2191.0000 918.0000
group_size          8.604e+06 1998.0000 2173.0000 912.0000
percent_of_overlaps 9.228e-01    0.9965    0.9918   0.9935
                                SNP     eQTL     sQTL   m6AQTL
estimated_group_prior     9.284e-05  0.01161 0.017250 0.027690
estimated_group_prior_var 2.043e+01 21.82962 7.252449 8.885392
estimated_group_pve       9.474e-02  0.00294 0.001578 0.001302
attributable_group_pve    9.421e-01  0.02923 0.015692 0.012952
$lasso

cTWAS results for individual analysis with m6A

top1 model

Summing up PIPs for m6A peaks located in the same gene

Top m6A PIPs by genes

cTWAS results for joint analysis using a lasso model

Top m6A modification pip

Top expression/splicing/m6A units

For m6A or splicing QTLs, they are assigned to the nearest genes (m6A needs to be confirmed with Kevin).

Top SNPs or genes with PIP > 0.6

$eQTL
            genename susie_pip group region_tag
1934          EFEMP2    0.9973  eQTL      11_36
1907 ENSG00000180139    0.9243  eQTL      10_57
1948           RPS26    0.8111  eQTL      12_36
1864           CDPF1    0.7878  eQTL      22_21
1378          POLR2C    0.7516  eQTL      16_31
1459 ENSG00000264538    0.7162  eQTL      17_18
35             FUCA1    0.6637  eQTL       1_17
1367            TBX6    0.6361  eQTL      16_24
56              PHC2    0.6203  eQTL       1_21
635             MDN1    0.6106  eQTL       6_61

$m6AQTL
     genename susie_pip  group region_tag
5050     ZER1    0.9781 m6AQTL       9_66
5042    PTK2B    0.9591 m6AQTL       8_27
5059  SLC43A3    0.9219 m6AQTL      11_33
5044   SH2D3C    0.7874 m6AQTL       9_66
4487    MEPCE    0.7451 m6AQTL       7_61
4207   TRMT13    0.6013 m6AQTL       1_62

$sQTL
     genename susie_pip group region_tag
4059    STAT1    0.9202  sQTL      2_113
4144      BAX    0.8734  sQTL      19_34
2641     IRF1    0.7121  sQTL       5_79
2538     RHOH    0.6456  sQTL       4_33
3868 C19orf54    0.6387  sQTL      19_28
3599    NLRC5    0.6058  sQTL      16_31

Top m6A modification pip

   genename region_tag susie_pip      z
1      ZER1       9_66    0.9781 -4.523
2     PTK2B       8_27    0.9591  5.080
3   SLC43A3      11_33    0.9219 -5.139
4    SH2D3C       9_66    0.7874  3.962
5     MEPCE       7_61    0.7451  3.425
6    TRMT13       1_62    0.6013 -3.538
7     SRPK1       6_29    0.5901 -5.543
8    FBXO34      14_25    0.5776 -5.179
9     RRP1B      21_23    0.5604 -3.859
10  WAC-AS1      10_20    0.5411  6.687

Summing up PIPs for m6A peaks located in the same gene

Top 10 m6A PIPs by genes

# A tibble: 819 × 2
   genename total_susie_pip
   <chr>              <dbl>
 1 ZER1               0.978
 2 SH2D3C             0.975
 3 PTK2B              0.959
 4 SLC43A3            0.922
 5 MEPCE              0.856
 6 RRP1B              0.751
 7 WAC-AS1            0.619
 8 TRMT13             0.601
 9 SRPK1              0.590
10 FBXO34             0.578
# ℹ 809 more rows

Top splicing PIPs

                    peak_id genename       pos region_tag susie_pip      z
1  chr2:191851794-191855954    STAT1 191770613      2_113    0.9202 -5.123
2   chr19:49458856-49459455      BAX  49442933      19_34    0.8734 -6.336
3  chr5:131823717-131825084     IRF1 131741498       5_79    0.7121  6.240
4    chr4:40198920-40236797     RHOH  40224043       4_33    0.6456  3.582
5   chr19:41247344-41249820 C19orf54  41168196      19_28    0.6387 -3.815
6   chr16:57079404-57081457    NLRC5  57069650      16_31    0.6058  3.446
7   chr17:46211178-46214591    SKAP1  46123642      17_28    0.5769 -4.738
8   chr19:42626728-42628331   POU2F2  42542466      19_29    0.5392 -4.750
9   chr11:65770541-65770706    BANF1  65714925      11_36    0.4871  7.791
10   chr7:75625917-75633076   STYXL1  75588366       7_48    0.4407  4.938

Summing up PIPs for spliced introns located in the same gene

Top 10 splicing PIPs by genes

# A tibble: 10 × 2
   genename total_susie_pip
   <chr>              <dbl>
 1 SP140              1.12 
 2 STAT1              1.04 
 3 STYXL1             0.930
 4 BAX                0.901
 5 WARS1              0.897
 6 IFI44L             0.880
 7 IRF1               0.875
 8 MCOLN2             0.866
 9 POU2AF1            0.851
10 BCL2A1             0.841

Top genes by combined PIP

            genename combined_pip expression_pip splicing_pip m6A_pip
2799           SP140        1.120      0.000e+00       1.1200  0.0000
2381           PTK2B        1.102      0.000e+00       0.1434  0.9591
2850           STAT1        1.040      0.000e+00       1.0397  0.0000
801           EFEMP2        1.016      9.973e-01       0.0186  0.0000
1552          IFI44L        1.007      0.000e+00       0.8801  0.1265
3301            ZER1        0.978      0.000e+00       0.0000  0.9781
2670          SH2D3C        0.975      0.000e+00       0.0000  0.9751
2870          STYXL1        0.930      1.401e-05       0.9304  0.0000
1790          MCOLN2        0.928      6.141e-02       0.8661  0.0000
874  ENSG00000180139        0.924      9.243e-01       0.0000  0.0000
2729         SLC43A3        0.922      0.000e+00       0.0000  0.9219
3239           WARS1        0.922      2.449e-02       0.8975  0.0000
283              BAX        0.901      0.000e+00       0.9013  0.0000
2294         POU2AF1        0.883      3.193e-02       0.8508  0.0000
1593            IRF1        0.875      1.091e-05       0.8750  0.0000
1813           MEPCE        0.856      0.000e+00       0.0000  0.8555
2295          POU2F2        0.849      3.097e-01       0.5392  0.0000
291           BCL2A1        0.841      0.000e+00       0.8411  0.0000
1518          HNRNPK        0.841      0.000e+00       0.7180  0.1230
3001         TMEM156        0.839      2.107e-01       0.6282  0.0000
2105            OAS1        0.838      0.000e+00       0.6796  0.1588
2577           RPS26        0.811      8.111e-01       0.0000  0.0000
     region_tag
2799      2_135
2381       8_27
2850      2_113
801       11_36
1552       1_48
3301       9_66
2670       9_66
2870       7_48
1790       1_52
874       10_57
2729      11_33
3239      14_52
283       19_34
2294      11_66
1593       5_79
1813       7_61
2295      19_29
291       15_37
1518       9_41
3001       4_32
2105      12_68
2577      12_36
Loading required package: grid
Warning: replacing previous import 'utils::download.file' by
'restfulr::download.file' when loading 'rtracklayer'

Locus plots for specific examples


R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] biomaRt_2.52.0       Gviz_1.40.1          cowplot_1.1.1       
 [4] ggplot2_3.4.3        GenomicRanges_1.48.0 GenomeInfoDb_1.32.2 
 [7] IRanges_2.30.1       S4Vectors_0.34.0     BiocGenerics_0.42.0 
[10] ctwas_0.1.38         dplyr_1.1.2          workflowr_1.7.0     

loaded via a namespace (and not attached):
  [1] colorspace_2.1-0            deldir_1.0-6               
  [3] rjson_0.2.21                rprojroot_2.0.3            
  [5] biovizBase_1.44.0           htmlTable_2.4.0            
  [7] XVector_0.36.0              base64enc_0.1-3            
  [9] fs_1.6.3                    dichromat_2.0-0.1          
 [11] rstudioapi_0.15.0           farver_2.1.1               
 [13] bit64_4.0.5                 AnnotationDbi_1.58.0       
 [15] fansi_1.0.4                 xml2_1.3.3                 
 [17] codetools_0.2-18            logging_0.10-108           
 [19] cachem_1.0.8                knitr_1.39                 
 [21] Formula_1.2-4               jsonlite_1.8.7             
 [23] Rsamtools_2.12.0            cluster_2.1.3              
 [25] dbplyr_2.3.3                png_0.1-7                  
 [27] compiler_4.2.0              httr_1.4.7                 
 [29] backports_1.4.1             lazyeval_0.2.2             
 [31] Matrix_1.6-1                fastmap_1.1.1              
 [33] cli_3.6.1                   later_1.3.0                
 [35] htmltools_0.5.2             prettyunits_1.1.1          
 [37] tools_4.2.0                 gtable_0.3.3               
 [39] glue_1.6.2                  GenomeInfoDbData_1.2.8     
 [41] rappdirs_0.3.3              Rcpp_1.0.11                
 [43] Biobase_2.56.0              jquerylib_0.1.4            
 [45] vctrs_0.6.3                 Biostrings_2.64.0          
 [47] rtracklayer_1.56.0          iterators_1.0.14           
 [49] xfun_0.30                   stringr_1.5.0              
 [51] ps_1.7.0                    lifecycle_1.0.3            
 [53] ensembldb_2.20.2            restfulr_0.0.14            
 [55] XML_3.99-0.14               getPass_0.2-2              
 [57] zlibbioc_1.42.0             scales_1.2.1               
 [59] BSgenome_1.64.0             VariantAnnotation_1.42.1   
 [61] ProtGenerics_1.28.0         hms_1.1.3                  
 [63] promises_1.2.0.1            MatrixGenerics_1.8.0       
 [65] parallel_4.2.0              SummarizedExperiment_1.26.1
 [67] AnnotationFilter_1.20.0     RColorBrewer_1.1-3         
 [69] yaml_2.3.5                  curl_5.0.2                 
 [71] memoise_2.0.1               gridExtra_2.3              
 [73] sass_0.4.1                  rpart_4.1.16               
 [75] latticeExtra_0.6-30         stringi_1.7.12             
 [77] RSQLite_2.3.1               highr_0.9                  
 [79] BiocIO_1.6.0                foreach_1.5.2              
 [81] checkmate_2.1.0             GenomicFeatures_1.48.4     
 [83] filelock_1.0.2              BiocParallel_1.30.3        
 [85] rlang_1.1.1                 pkgconfig_2.0.3            
 [87] matrixStats_0.62.0          bitops_1.0-7               
 [89] evaluate_0.15               lattice_0.20-45            
 [91] htmlwidgets_1.5.4           GenomicAlignments_1.32.0   
 [93] labeling_0.4.2              bit_4.0.5                  
 [95] processx_3.8.0              tidyselect_1.2.0           
 [97] magrittr_2.0.3              R6_2.5.1                   
 [99] generics_0.1.3              Hmisc_5.1-0                
[101] DelayedArray_0.22.0         DBI_1.1.3                  
[103] pgenlibr_0.3.6              pillar_1.9.0               
[105] whisker_0.4                 foreign_0.8-82             
[107] withr_2.5.0                 KEGGREST_1.36.2            
[109] RCurl_1.98-1.7              nnet_7.3-17                
[111] tibble_3.2.1                crayon_1.5.2               
[113] interp_1.1-4                utf8_1.2.3                 
[115] BiocFileCache_2.4.0         rmarkdown_2.14             
[117] jpeg_0.1-10                 progress_1.2.2             
[119] data.table_1.14.8           blob_1.2.4                 
[121] callr_3.7.3                 git2r_0.30.1               
[123] digest_0.6.33               httpuv_1.6.5               
[125] munsell_0.5.0               bslib_0.3.1