Last updated: 2023-08-22

Checks: 6 1

Knit directory: m6A_in_disease_genetics/

This reproducible R Markdown analysis was created with workflowr (version 1.7.0). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproduciblity it’s best to always run the code in an empty environment.

The command set.seed(20230331) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Using absolute paths to the files within your workflowr project makes it difficult for you and others to run your code on a different machine. Change the absolute path(s) below to the suggested relative path(s) to make your code more reproducible.

absolute relative
~/projects/m6A_in_disease_genetics/code/ctwas/ctwas_config_b37.R code/ctwas/ctwas_config_b37.R
~/projects/m6A_in_disease_genetics/code/ctwas/qiansheng/locus_plot.R code/ctwas/qiansheng/locus_plot.R

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version daca415. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .ipynb_checkpoints/
    Ignored:    analysis/m6A_switch_to_disease_h2g.nb.html
    Ignored:    data/plots/

Untracked files:
    Untracked:  HMGCR_locus_gene_tracks.pdf
    Untracked:  Rplots.pdf
    Untracked:  analysis/.ipynb_checkpoints/
    Untracked:  analysis/IBD_E_S_m6A.Rmd
    Untracked:  analysis/IBD_E_S_m6A_output.Rmd
    Untracked:  analysis/LDL_E_S_m6A.Rmd
    Untracked:  analysis/LDL_m6A_output.Rmd
    Untracked:  analysis/RA_m6A_output.Rmd
    Untracked:  analysis/WhiteBlood_WholeBlood_E_M.Rmd
    Untracked:  analysis/identify_m6A_mechanisms_with_finemapping.Rmd
    Untracked:  analysis/lymph_m6A_output.Rmd
    Untracked:  analysis/pre_weights_m6AQTL.txt
    Untracked:  analysis/rbc_E_S_m6A_output.Rmd
    Untracked:  analysis/rbc_m6A_output.Rmd
    Untracked:  analysis/summarize_ctwas_m6A_results.Rmd
    Untracked:  analysis/wbc_E_S_m6A_output.Rmd
    Untracked:  code/.ipynb_checkpoints/
    Untracked:  code/all_m6a_sites_with_paired_cisNATs_summary.csv
    Untracked:  code/annotating_fine-mapped_m6A_QTLs.Rmd
    Untracked:  code/check_double_strand.ipynb
    Untracked:  code/check_double_strand_v2.ipynb
    Untracked:  code/ctwas/
    Untracked:  code/figure/
    Untracked:  code/learn_gviz.Rmd
    Untracked:  code/learn_gviz.html
    Untracked:  code/learn_gviz.nb.html
    Untracked:  code/m6AQTL_finemapping.Rmd
    Untracked:  code/plot_genomic_tracks_gviz.ipynb
    Untracked:  code/summary_TWAS_coloc_m6A_2023.Rmd
    Untracked:  code/test_gviz.ipynb
    Untracked:  code/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
    Untracked:  data/.ipynb_checkpoints/
    Untracked:  data/ADCY7_gwas_input.tsv
    Untracked:  data/ADCY7_qtl_input.tsv
    Untracked:  data/Allergy_full_coloc.txt
    Untracked:  data/Asthma_full_coloc.txt
    Untracked:  data/CAD_full_coloc.txt
    Untracked:  data/Eosinophil_count_full_coloc.txt
    Untracked:  data/GSE125377_jointPeakReadCount.txt
    Untracked:  data/G_list.Rd
    Untracked:  data/HMGCR_ctwas_dat.Rd
    Untracked:  data/IBD_full_coloc.txt
    Untracked:  data/JointPeaks.bed
    Untracked:  data/Li2022_dsRNAs.xlsx
    Untracked:  data/Lupus_full_coloc.txt
    Untracked:  data/RA_full_coloc.txt
    Untracked:  data/TABLE1_hg19.txt
    Untracked:  data/TABLE1_hg19.txt.zip
    Untracked:  data/__MACOSX/
    Untracked:  data/coloc_blood_traits.csv
    Untracked:  data/crohns_disease_full_coloc.txt
    Untracked:  data/ctwas_m6a_joint_top_PIP.txt
    Untracked:  data/edit_sites_and_GE_neg_correlated.txt
    Untracked:  data/edit_sites_and_GE_pos_correlated.txt
    Untracked:  data/features
    Untracked:  data/human_EERs.csv
    Untracked:  data/human_EERs.txt
    Untracked:  data/lymph_full_coloc.txt
    Untracked:  data/m6A_TWAS_results.csv
    Untracked:  data/m6a_TWAS_genes.txt
    Untracked:  data/m6a_joint_calling_peaks.csv
    Untracked:  data/nasser_2021_ABC_IBD_genes.txt
    Untracked:  data/nat_sense_pairs.csv
    Untracked:  data/plt_full_coloc.txt
    Untracked:  data/rbc_full_coloc.txt
    Untracked:  data/rdw_full_coloc.txt
    Untracked:  data/reported_AS_targets_S1.txt
    Untracked:  data/reported_AS_wanowska.txt
    Untracked:  data/sig_coloc_results/
    Untracked:  data/test_locuscomparer.pdf
    Untracked:  data/ulcerative_colitis_full_coloc.txt
    Untracked:  data/wbc_full_coloc.txt
    Untracked:  data/zhao_silver_genes.csv
    Untracked:  output/.ipynb_checkpoints/
    Untracked:  output/HMGCR_gene_track_plot.pdf
    Untracked:  output/HMGCR_locus_plot.pdf
    Untracked:  output/IBD_DHX38_plot.pdf
    Untracked:  output/IBD_DHX38_plot_genetrack.pdf
    Untracked:  output/all_m6a_sites_with_cisNATs.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary_PP40.3.csv
    Untracked:  output/all_m6a_sites_with_paired_cisNATs_summary_PP40.5.csv
    Untracked:  output/all_m6a_sites_with_paired_cis_NATs.csv
    Untracked:  output/ctwas_m6a_joint_pip_above_0.6.RData
    Untracked:  output/fine_mapped_m6AQTLs_TWAS_genes_highPP4.rds
    Untracked:  output/gene_summary.csv
    Untracked:  output/immune_related_m6A_targets.csv
    Untracked:  output/lupus_MIR210HG_plot.pdf
    Untracked:  output/lupus_MIR210HG_plot_genetrack.pdf
    Untracked:  output/m6aQTL_dsRNAs_PPP2R3C_PRORP.pdf
    Untracked:  output/m6a_QTL_genes.csv
    Untracked:  output/m6a_genes_PIP_0.6_blood_immune.csv
    Untracked:  output/m6a_genes_PIP_0.6_blood_immune.txt
    Untracked:  output/m6a_peaks_nearby_dsRNAs.csv
    Untracked:  output/m6a_sites_near_all_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_near_dsRNAs_coloc.csv
    Untracked:  output/m6a_sites_near_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_near_dsRNAs_twas_summary.csv
    Untracked:  output/m6a_sites_overlapping_NAT_twas.csv
    Untracked:  output/m6a_sites_overlapping_dsRNAs_coloc.csv
    Untracked:  output/m6a_sites_overlapping_dsRNAs_twas.csv
    Untracked:  output/m6a_sites_overlapping_dsRegions.csv
    Untracked:  output/m6a_sites_overlapping_dsRegions_coloc.csv
    Untracked:  output/negatively_correlated_genes.txt
    Untracked:  output/postively_correlated_genes.txt
    Untracked:  output/rs1806261_RABEP1-NUP88_focused_locusview.pdf
    Untracked:  output/rs1806261_RABEP1-NUP88_locusview.pdf
    Untracked:  output/rs3177647_MAPKAPK5-AS1-MAPKAPK5_locusview.pdf
    Untracked:  output/rs3204541_DDX55-EIF2B1_locusview.pdf
    Untracked:  output/rs7184802_ADCY7-BRD7_locusview.pdf
    Untracked:  output/rs7184802_ADCY7_locuscompare.pdf
    Untracked:  output/twas_genes_PP4_0.3_immune_traits_trackplots.pdf
    Untracked:  output/twas_genes_PP4_0.5_blood_traits_trackplots.pdf
    Untracked:  output/twas_m6a_sites_with_all_cisNATs.RDS
    Untracked:  output/twas_m6a_sites_with_cisNATs_range.RDS
    Untracked:  output/twas_m6a_sites_with_the_nearest_cisNAT.RDS
    Untracked:  twas_genes_PP4_0.3_immune_traits_trackplots.pdf

Unstaged changes:
    Deleted:    analysis/learn_ctwas.Rmd
    Modified:   analysis/lymph_m6A_output_hg19.Rmd
    Modified:   analysis/m6A_switch_to_disease_h2g.Rmd
    Modified:   analysis/rbc_m6A_output_hg19.Rmd
    Modified:   analysis/wbc_m6A_output.Rmd
    Modified:   analysis/wbc_m6A_output_hg19.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/asthma_m6A_output_hg19.Rmd) and HTML (docs/asthma_m6A_output_hg19.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd daca415 Jing Gu 2023-08-22 ctwas on asthma
html 866865b Jing Gu 2023-08-22 Build site.
Rmd 688d15c Jing Gu 2023-08-22 immune traits

Load ctwas results

# top 1 method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
                         method = NULL, outputdir = outputdir, outname = outname.e,
                         harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
                         strand_ambig_action_z = "recover", 
                         recover_strand_ambig_wgt = T
# lasso/elastic-net method
res <- impute_expr_z(z_snp, weight = weight, ld_R_dir = ld_R_dir,
                         method = NULL, outputdir = outputdir, outname = outname.e,
                         harmonize_z = T, harmonize_wgt = T, scale_by_ld_variance=F,
                         strand_ambig_action_z = "none", 
                         recover_strand_ambig_wgt = F

GWAS: UK Biobank GWAS summary statistics - European individuals

Weights: FUSION weights using top1, lasso, or elastic-net models were converted into PredictDB format and were not needed to do scaling when running ctwas.

Check convergence of parameters

cTWAS analysis on m6A alone

Joint analysis of expression, splicing and m6A

[1] "Check convergence for the lasso model when jointly analyzing expression, splicing and m6A:"
[1] "Table of group size before/after matching with UKBB SNPs:"
                          SNP      eQTL     sQTL   m6AQTL
prior_group_size    9.324e+06 2005.0000 2191.000 918.0000
group_size          7.547e+06 1998.0000 2180.000 911.0000
percent_of_overlaps 8.094e-01    0.9965    0.995   0.9924
                                SNP      eQTL      sQTL    m6AQTL
estimated_group_prior     0.0002031  0.005515 4.130e-04 5.500e-03
estimated_group_prior_var 7.0019346 46.466053 4.664e+00 6.196e+01
estimated_group_pve       0.0318723  0.001520 1.247e-05 9.219e-04
attributable_group_pve    0.9284930  0.044288 3.632e-04 2.686e-02
$lasso

Version Author Date
866865b Jing Gu 2023-08-22

cTWAS results for individual analysis with m6A

top1 model

Summing up PIPs for m6A peaks located in the same gene

Top m6A PIPs by genes

cTWAS results for joint analysis using a lasso model

Top m6A modification pip

Top expression/splicing/m6A units

For m6A or splicing QTLs, they are assigned to the nearest genes (m6A needs to be confirmed with Kevin).

Top SNPs or genes with PIP > 0.6

$eQTL
     genename susie_pip group region_tag
1982   ORMDL3    0.9960  eQTL      17_23
1973    AP5B1    0.9119  eQTL      11_36
1992 SLC25A19    0.8696  eQTL      17_42

$m6AQTL
     genename susie_pip  group region_tag
5082   GPR183         1 m6AQTL      13_50
5071    ZMAT2         1 m6AQTL       5_83

$sQTL
[1] genename   susie_pip  group      region_tag
<0 rows> (or 0-length row.names)

Top m6A modification pip

   genename region_tag susie_pip      z
1    GPR183      13_50   1.00000 -7.249
2     ZMAT2       5_83   1.00000  3.616
3     TOMM5       9_28   0.17931  3.404
4     TLR10       4_31   0.13819  7.394
5    LETMD1      12_31   0.10604  3.440
6     P2RX5       17_3   0.08888 -2.719
7   PIP4K2A      10_17   0.08783  3.211
8    DNAJB1      19_12   0.08778  2.806
9     TMUB1       7_94   0.07304 -2.706
10     EPC1      10_24   0.07202  3.462

Summing up PIPs for m6A peaks located in the same gene

Top 10 m6A PIPs by genes

# A tibble: 818 × 2
   genename total_susie_pip
   <chr>              <dbl>
 1 GPR183            1     
 2 ZMAT2             1.00  
 3 TOMM5             0.179 
 4 TLR10             0.138 
 5 LETMD1            0.106 
 6 P2RX5             0.0889
 7 PIP4K2A           0.0878
 8 DNAJB1            0.0878
 9 EPC1              0.0758
10 TMUB1             0.0730
# ℹ 808 more rows

Top splicing PIPs

                    peak_id genename       pos region_tag susie_pip      z
1      chr1:8021853-8022823    PARK7   7961294        1_6   0.03740 -3.725
2      chr1:8021795-8022823    PARK7   8009763        1_6   0.03180  3.684
3      chr7:6624891-6628248   ZDHHC4   6527973        7_9   0.02119  3.326
4   chr18:33747185-33751010     ELP2  33686857      18_19   0.02088  4.034
5  chr3:156271570-156271782     SSR3 156239308       3_97   0.02022  3.507
6  chr8:103842446-103845284    AZIN1 103775175       8_69   0.01772  2.928
7   chr10:74823089-74828604    P4HA1  74741181      10_49   0.01708  3.272
8      chr7:6624891-6628027   ZDHHC4   6527973        7_9   0.01627 -3.193
9    chr7:74197382-74197868     NCF1  74097488       7_48   0.01618  2.995
10   chr7:74197404-74197868     NCF1  74097488       7_48   0.01616 -2.981

Summing up PIPs for spliced introns located in the same gene

Top 10 splicing PIPs by genes

# A tibble: 10 × 2
   genename total_susie_pip
   <chr>              <dbl>
 1 PARK7             0.0707
 2 MRPL43            0.0455
 3 ZDHHC4            0.0390
 4 IMMP1L            0.0325
 5 NCF1              0.0323
 6 EZR               0.0315
 7 WARS1             0.0313
 8 SEMA4G            0.0305
 9 MCOLN2            0.0291
10 ALDH3A2           0.0278

Top genes by combined PIP

     genename combined_pip expression_pip splicing_pip  m6A_pip region_tag
1424   GPR183        1.000        0.00000     0.000000 1.000000      13_50
3317    ZMAT2        1.000        0.00000     0.000000 1.000000       5_83
2122   ORMDL3        0.996        0.99605     0.000000 0.000000      17_23
182     AP5B1        0.912        0.91186     0.000000 0.000000      11_36
2705 SLC25A19        0.870        0.86962     0.000000 0.000000      17_42
2579    RPS26        0.587        0.58660     0.000000 0.000000      12_36
735      DMPK        0.547        0.54692     0.000000 0.000000      19_34
1563   IL10RB        0.252        0.24554     0.000000 0.006712      21_15
2290   POLR3H        0.234        0.23371     0.000000 0.000000      22_17
595     COQ8B        0.205        0.20532     0.000000 0.000000      19_28
3053    TOMM5        0.179        0.00000     0.000000 0.179314       9_28
1899   MRPL43        0.168        0.07241     0.045547 0.049718      10_64
2509  RNASET2        0.157        0.15712     0.000000 0.000000      6_109
184   APBB1IP        0.152        0.15233     0.000000 0.000000      10_19
2979    TLR10        0.147        0.00000     0.008448 0.138192       4_31
2657 SERPINH1        0.138        0.13765     0.000000 0.000000      11_42
740    DNAJB1        0.135        0.04699     0.000000 0.087775      19_12
1771  MARCHF3        0.124        0.12445     0.000000 0.000000       5_77
1626   KDELR2        0.121        0.12122     0.000000 0.000000        7_9
360  C19orf54        0.114        0.07918     0.007764 0.026720      19_28
Loading required package: grid
Warning: replacing previous import 'utils::download.file' by
'restfulr::download.file' when loading 'rtracklayer'

Locus plots for specific examples


R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] grid      stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] biomaRt_2.52.0       Gviz_1.40.1          cowplot_1.1.1       
 [4] ggplot2_3.4.3        GenomicRanges_1.48.0 GenomeInfoDb_1.32.2 
 [7] IRanges_2.30.1       S4Vectors_0.34.0     BiocGenerics_0.42.0 
[10] ctwas_0.1.38         dplyr_1.1.2          workflowr_1.7.0     

loaded via a namespace (and not attached):
  [1] colorspace_2.1-0            deldir_1.0-6               
  [3] rjson_0.2.21                rprojroot_2.0.3            
  [5] biovizBase_1.44.0           htmlTable_2.4.0            
  [7] XVector_0.36.0              base64enc_0.1-3            
  [9] fs_1.6.3                    dichromat_2.0-0.1          
 [11] rstudioapi_0.15.0           farver_2.1.1               
 [13] bit64_4.0.5                 AnnotationDbi_1.58.0       
 [15] fansi_1.0.4                 xml2_1.3.3                 
 [17] codetools_0.2-18            logging_0.10-108           
 [19] cachem_1.0.8                knitr_1.39                 
 [21] Formula_1.2-4               jsonlite_1.8.7             
 [23] Rsamtools_2.12.0            cluster_2.1.3              
 [25] dbplyr_2.3.3                png_0.1-7                  
 [27] compiler_4.2.0              httr_1.4.7                 
 [29] backports_1.4.1             lazyeval_0.2.2             
 [31] Matrix_1.6-1                fastmap_1.1.1              
 [33] cli_3.6.1                   later_1.3.0                
 [35] htmltools_0.5.2             prettyunits_1.1.1          
 [37] tools_4.2.0                 gtable_0.3.3               
 [39] glue_1.6.2                  GenomeInfoDbData_1.2.8     
 [41] rappdirs_0.3.3              Rcpp_1.0.11                
 [43] Biobase_2.56.0              jquerylib_0.1.4            
 [45] vctrs_0.6.3                 Biostrings_2.64.0          
 [47] rtracklayer_1.56.0          iterators_1.0.14           
 [49] xfun_0.30                   stringr_1.5.0              
 [51] ps_1.7.0                    lifecycle_1.0.3            
 [53] ensembldb_2.20.2            restfulr_0.0.14            
 [55] XML_3.99-0.14               getPass_0.2-2              
 [57] zlibbioc_1.42.0             scales_1.2.1               
 [59] BSgenome_1.64.0             VariantAnnotation_1.42.1   
 [61] ProtGenerics_1.28.0         hms_1.1.3                  
 [63] promises_1.2.0.1            MatrixGenerics_1.8.0       
 [65] parallel_4.2.0              SummarizedExperiment_1.26.1
 [67] AnnotationFilter_1.20.0     RColorBrewer_1.1-3         
 [69] yaml_2.3.5                  curl_5.0.2                 
 [71] memoise_2.0.1               gridExtra_2.3              
 [73] sass_0.4.1                  rpart_4.1.16               
 [75] latticeExtra_0.6-30         stringi_1.7.12             
 [77] RSQLite_2.3.1               highr_0.9                  
 [79] BiocIO_1.6.0                foreach_1.5.2              
 [81] checkmate_2.1.0             GenomicFeatures_1.48.4     
 [83] filelock_1.0.2              BiocParallel_1.30.3        
 [85] rlang_1.1.1                 pkgconfig_2.0.3            
 [87] matrixStats_0.62.0          bitops_1.0-7               
 [89] evaluate_0.15               lattice_0.20-45            
 [91] htmlwidgets_1.5.4           GenomicAlignments_1.32.0   
 [93] labeling_0.4.2              bit_4.0.5                  
 [95] processx_3.8.0              tidyselect_1.2.0           
 [97] magrittr_2.0.3              R6_2.5.1                   
 [99] generics_0.1.3              Hmisc_5.1-0                
[101] DelayedArray_0.22.0         DBI_1.1.3                  
[103] pgenlibr_0.3.6              pillar_1.9.0               
[105] whisker_0.4                 foreign_0.8-82             
[107] withr_2.5.0                 KEGGREST_1.36.2            
[109] RCurl_1.98-1.7              nnet_7.3-17                
[111] tibble_3.2.1                crayon_1.5.2               
[113] interp_1.1-4                utf8_1.2.3                 
[115] BiocFileCache_2.4.0         rmarkdown_2.14             
[117] jpeg_0.1-10                 progress_1.2.2             
[119] data.table_1.14.8           blob_1.2.4                 
[121] callr_3.7.3                 git2r_0.30.1               
[123] digest_0.6.33               httpuv_1.6.5               
[125] munsell_0.5.0               bslib_0.3.1