Last updated: 2021-10-04

Overall Goal

Identify disease-relevant cell types and genes for asthma

Enrichment estimates for individual annotations

Adult-onset asthma

  • GWAS: Zhu et al. 2019 (case N=22296, control N=347481)
  • Annotations: Zhang et al. 2021 - single-cell ATAC-Seq data for adult lungs
    • A union set of cCREs, with their accessibility across cell types, was provided in the following format: chr:start-end cell_type1 cell_type2...
    • An entropy-based method was used to identify cell-type-specific peaks
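To make the input format above concrete, here is a minimal parsing sketch. It assumes whitespace-separated fields and that each line lists the cell types in which the cCRE is accessible; the function names parse_ccre_line and peaks_by_cell_type are hypothetical and not part of the original data release.

```python
import re

# Assumption: coordinates look like "chr1:100-600" and fields are whitespace-separated.
_COORD = re.compile(r"^(\w+):(\d+)-(\d+)$")


def parse_ccre_line(line):
    """Parse 'chr:start-end cell_type1 cell_type2 ...' into (chrom, start, end, cell_types)."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    match = _COORD.match(fields[0])
    if match is None:
        raise ValueError(f"bad coordinate field: {fields[0]!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    return chrom, start, end, fields[1:]


def peaks_by_cell_type(lines):
    """Map each cell type to the set of cCREs (chrom, start, end) listed for it."""
    out = {}
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, start, end, cell_types = parse_ccre_line(line)
        for cell_type in cell_types:
            out.setdefault(cell_type, set()).add((chrom, start, end))
    return out
```

For example, peaks_by_cell_type(["chr1:100-600 Msc Fib.1"]) returns one peak each for Msc and Fib.1.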

Because neither the number of peaks per cell type nor the list of cell-type-specific peaks was available, I reproduced their procedure for identifying cell-type-restricted peaks. First, I subset the accessibility scores to lung cell types only. I then modeled the null distribution of the log2 fold change of each cell type relative to the average level across cell types as a normal distribution with mean 0 and a standard deviation estimated from the 50% least variable peaks. Sampling from this null distribution gives a p-value for each peak in each cell type. Peaks passing a 0.1% FDR cutoff were considered cell-type specific.
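The procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code actually used: the pseudocount, the number of null draws, the one-sided (upper-tail) test, and the use of Benjamini-Hochberg for FDR control are all assumptions, and call_specific_peaks is a hypothetical name.

```python
import bisect
import math
import random
import statistics


def call_specific_peaks(scores, fdr=0.001, n_null=200_000, pseudo=1.0, seed=0):
    """scores: list of rows (one per peak), one non-negative value per cell type.
    Returns a same-shaped list of booleans (True = called cell-type specific)."""
    # log2 fold change of each cell type relative to the peak's average level
    lfc = []
    for row in scores:
        avg = sum(row) / len(row)
        lfc.append([math.log2((x + pseudo) / (avg + pseudo)) for x in row])

    # Null SD: pooled over the 50% least variable peaks
    ranked = sorted(lfc, key=statistics.pstdev)
    pooled = [v for row in ranked[: max(1, len(ranked) // 2)] for v in row]
    null_sd = statistics.pstdev(pooled)

    # Empirical upper-tail p-values from draws of N(0, null_sd).
    # The smallest attainable p-value is 1 / (n_null + 1).
    rng = random.Random(seed)
    null = sorted(rng.gauss(0.0, null_sd) for _ in range(n_null))
    flat = [v for row in lfc for v in row]
    pvals = [(n_null - bisect.bisect_left(null, v) + 1) / (n_null + 1) for v in flat]

    # Benjamini-Hochberg adjusted p-values (assumed FDR procedure)
    m = len(pvals)
    order = sorted(range(m), key=pvals.__getitem__)
    qvals = [0.0] * m
    running = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running = min(running, pvals[i] * m / rank)
        qvals[i] = running

    n_ct = len(scores[0])
    return [[qvals[r * n_ct + c] <= fdr for c in range(n_ct)]
            for r in range(len(scores))]
```

With many peaks, n_null has to grow accordingly, since empirical p-values cannot fall below 1 / (n_null + 1); computing the normal tail probability analytically would avoid that limit.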

Cell compositions for adult lung tissues

Peak summary for adult lung tissues

Check overlaps in peaks across cell types
We want to know to what extent the cell-type-restricted peaks of different cell types overlap. Table 1 shows that most pairs of cell types overlap by less than 15%; even the largest overlap in the table is 18% (Fib.2 with Fib.1). Because Msc has the largest number of peaks, it is expected to be the most overlapped cell type for almost all other cell types. We also see relatively high overlap between subclusters of the same cell type, such as fibroblasts (Fib) and endothelial cells (End). Lastly, B cells (Bly) are the second most overlapped cell type for T-lymphocyte subcluster 1 (Tly.1), at 9%.

Table 1. Summary of peak overlaps between pairs of cell types (overlaps are given as fractions, e.g. 0.12 = 12%)
cell_type  most_overlapped_per  most_overlapped_celltype  second_most_overlapped_per  second_most_overlapped_celltype
Agb        0.12                 Msc                       0.08                        Mac.2
Bly        0.10                 Msc                       0.07                        Mac.1
End.1      0.16                 End.2                     0.14                        Msc
End.2      0.15                 Msc                       0.12                        Mac.2
Fib.1      0.15                 Msc                       0.14                        Fib.2
Fib.2      0.18                 Fib.1                     0.15                        Fib.4
Fib.3      0.16                 Msc                       0.12                        Fib.1
Fib.4      0.13                 Msc                       0.13                        Fib.1
Gbl.2      0.12                 Msc                       0.07                        Mac.2
Grn        0.09                 Fib.2                     0.09                        Fib.1
Mac.1      0.12                 Msc                       0.07                        Fib.4
Mac.2      0.16                 Msc                       0.06                        End.2
Mes        0.15                 Msc                       0.09                        Mac.2
Mfb.2      0.16                 Msc                       0.10                        Mac.2
Msc        0.06                 Mac.2                     0.03                        Fib.3
Mst        0.10                 Msc                       0.09                        Mac.1
Pal        0.15                 Msc                       0.10                        Mac.2
Smm.1      0.14                 Msc                       0.12                        Fib.1
Tly.1      0.09                 Msc                       0.09                        Bly
Tly.2      0.12                 Msc                       0.12                        Mac.2
Vsm.2      0.15                 Msc                       0.13                        Fib.1
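
A summary like Table 1 could be computed along the following lines. This sketch assumes that overlap means sharing the same cCRE from the union set and that each fraction is taken relative to the peak count of the row cell type; neither choice is stated in the notes. top_overlaps is a hypothetical helper name.

```python
def top_overlaps(peaks, k=2):
    """peaks: dict mapping cell type -> set of peak identifiers.
    Returns dict mapping cell type -> list of (fraction, other_cell_type),
    sorted from most to least overlapped and truncated to k entries."""
    summary = {}
    for cell_type, own in peaks.items():
        fractions = []
        for other, theirs in peaks.items():
            if other == cell_type or not own:
                continue  # skip self-comparison and empty peak sets
            # Assumption: fraction of this cell type's peaks also found in `other`
            fractions.append((len(own & theirs) / len(own), other))
        # Highest fraction first; ties broken alphabetically by cell type name
        fractions.sort(key=lambda item: (-item[0], item[1]))
        summary[cell_type] = fractions[:k]
    return summary
```

Normalizing by the row cell type would be consistent with Msc, the cell type with the most peaks, having the smallest fractions in its own row, but this remains an assumption.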

Enrichment results

  • Tly.1 (T_lymphocyte1) and Tly.2 (T_lymphocyte2) cannot be distinguished based on known immune marker genes alone. Both show differential expression of genes such as CD3D, GZMA, and IFNG.

Log2-scaled enrichment estimates for each cell type, each obtained from a separate Torus run. Highlighted bars indicate significant enrichment; grey bars do not. The x-axis tick labels give the annotation name and the percentage of all SNPs carrying that annotation. Asthma-associated variants are significantly enriched in some subclusters of macrophages and fibroblasts but not in others. The cell types whose chromatin accessibility peaks show the highest enrichment are immune-related.
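
The log2 scaling and the significance call used in the figure can be illustrated with a short sketch. It assumes the enrichment estimate and its confidence bounds are on the natural-log scale, and that "significant enrichment" means the whole confidence interval lies above zero; both are assumptions, and to_log2 is a hypothetical helper.

```python
import math


def to_log2(estimate, ci_low, ci_high):
    """Convert a natural-log enrichment estimate and its confidence interval to log2.
    Returns (estimate, ci_low, ci_high, significant)."""
    scale = 1.0 / math.log(2.0)  # log2(x) = ln(x) / ln(2)
    # Assumption: enrichment is significant when the interval excludes zero from above
    significant = ci_low > 0.0
    return estimate * scale, ci_low * scale, ci_high * scale, significant
```

For instance, a natural-log estimate of ln(8) with interval (ln(2), ln(32)) maps to about 3 on the log2 scale with interval (1, 5), and is flagged as significant.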

Childhood-onset asthma

  • GWAS: Zhu et al. 2019
  • Cell-type-specific annotations: Domcke et al. 2020 - single-cell ATAC-Seq data for fetal lungs

