Last updated: 2020-05-24

Install LD Score Regression (LDSC) software

Please see the detailed instructions: LD Score Regression (LDSC)

Download ldsc software

git clone https://github.com/bulik/ldsc.git
cd ldsc

Create a conda environment with LDSC’s dependencies

You might need to update numpy (and other packages) to a newer version

conda env create --file environment.yml

Activate the conda environment with LDSC’s dependencies

conda activate ldsc

Test installation

If these commands fail with an error, then something has gone wrong during the installation process.

cd ldsc

python ./ldsc.py -h
python ./munge_sumstats.py -h
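
Before running the help commands, it can be useful to confirm that the checkout actually contains the two entry-point scripts. The sketch below is only an illustration: `check_ldsc_scripts` is a hypothetical helper, and it assumes the scripts are named `ldsc.py` and `munge_sumstats.py` and sit at the top level of the cloned directory. It checks that the files exist; it does not run them.

```shell
# Hypothetical helper (not part of ldsc): report which of the two assumed
# entry-point scripts are present in a checkout directory.
check_ldsc_scripts() {
  local dir="$1" missing=0 script
  for script in ldsc.py munge_sumstats.py; do
    if [ -f "${dir}/${script}" ]; then
      echo "found ${script}"
    else
      echo "missing ${script}"
      missing=1
    fi
  done
  # Non-zero exit status if at least one script is missing.
  return "${missing}"
}
```

For example, `check_ldsc_scripts .` run from inside the ldsc directory should report both scripts as found.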

ldsc Wiki

The wiki has tutorials on estimating LD scores, heritability, genetic correlation, the LD Score regression intercept, and partitioned heritability.

Tutorials for partitioning heritability using S-LDSC (stratified LD score regression)

ldsc FAQ

Common issues are described in the FAQ


  • Partitioned heritability: Finucane, HK, et al. Partitioning heritability by functional annotation using genome-wide association summary statistics. Nature Genetics, 2015.

  • Stratified heritability using continuous annotation: Gazal, S, et al. Linkage disequilibrium–dependent architecture of human complex traits shows action of negative selection. Nature Genetics, 2017.

Download LDSC reference data

You may need to download some of the following datasets:

Most of the data can be downloaded from the Price lab LDSCORE website

Download the baseline model LD scores

  • Readme of different versions of baseline models

  • 1000G Phase3 baseline model v1.1

mkdir 1000G_Phase3_baseline_v1.1_ldscores
tar -xvzf 1000G_Phase3_baseline_v1.1_ldscores.tgz -C 1000G_Phase3_baseline_v1.1_ldscores

  • 1000G Phase3 baselineLD model v1.1

mkdir 1000G_Phase3_baselineLD_v1.1_ldscores
tar -xvzf 1000G_Phase3_baselineLD_v1.1_ldscores.tgz -C 1000G_Phase3_baselineLD_v1.1_ldscores

  • 1000G Phase3 baselineLD model v2.2

mkdir 1000G_Phase3_baselineLD_v2.2_ldscores
tar -xvzf 1000G_Phase3_baselineLD_v2.2_ldscores.tgz -C 1000G_Phase3_baselineLD_v2.2_ldscores
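
The mkdir/tar pairs above all follow the same pattern: create a directory named after the archive, then unpack into it with -C. A minimal sketch of that pattern as a reusable function follows; `extract_tgz` is a hypothetical helper name, and it assumes the archive has already been downloaded and ends in `.tgz`.

```shell
# Hypothetical helper: create a directory named after the archive (without
# the .tgz suffix) in the current directory and unpack the archive into it.
extract_tgz() {
  local archive="$1" target
  target="$(basename "${archive}" .tgz)"
  mkdir -p "${target}"
  tar -xzf "${archive}" -C "${target}"
  # Print the directory name so callers can use it.
  echo "${target}"
}
```

With this in place, `extract_tgz 1000G_Phase3_baselineLD_v2.2_ldscores.tgz` would do the same as the last mkdir/tar pair above.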

Download regression weights

# wget
# tar -xvzf weights_hm3_no_hla.tgz

# European samples from Phase 3 of 1000 Genomes
tar -xvzf 1000G_Phase3_weights_hm3_no_MHC.tgz

Download allele frequencies (European samples from Phase 3 of 1000 Genomes)

tar -xvzf 1000G_Phase3_frq.tgz

The authors recommend only keeping HapMap3 SNPs. You can download the HapMap3 related files:

Download 1000 genomes reference genotypes at HapMap3 loci

tar -xvzf 1000G_Phase3_plinkfiles.tgz

Download HapMap3 SNPs

tar -xvzf hapmap3_snps.tgz

Download a concatenated list of HapMap3 SNPs

bzip2 -d w_hm3.snplist.bz2

# Extract the list of HapMap3 SNP rsIDs (skip the header line)
awk '{if ($1!="SNP") {print $1} }' w_hm3.snplist > listHM3.txt
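
To see what the awk command does, the sketch below runs it on a tiny stand-in file. The toy file and its three-column layout (a header line starting with SNP, then one SNP per row) are assumptions made for illustration; only the awk expression is taken from the step above.

```shell
# Build a toy snplist in a temporary directory (assumed layout: SNP, A1, A2).
toy_dir="$(mktemp -d)"
printf 'SNP\tA1\tA2\nrs1\tA\tG\nrs2\tC\tT\n' > "${toy_dir}/toy.snplist"

# Same awk expression as above: drop the header row, keep the first column.
awk '{if ($1!="SNP") {print $1} }' "${toy_dir}/toy.snplist" > "${toy_dir}/listHM3.txt"

cat "${toy_dir}/listHM3.txt"
# prints:
# rs1
# rs2
```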

Compute LD score from functional annotations

Compute LD scores for binary annotations

  • Step 1: prepare annotation in UCSC BED format
  • Step 2: compute LD scores using annotation BED files

  • Example scripts: thin-annot annotation format using make_annot.py

# Assumes make_annot.py and ldsc.py from the ldsc repository are in the
# current directory, and that ANNOT holds the annotation name (BED file prefix).
for chrom in {1..22}
do
  echo ${chrom}

  ## Step 1: Creating an annot file
  echo "Make ldsc-friendly annotation files for ${ANNOT}.bed"
  python make_annot.py \
    --bed-file ${ANNOT}.bed \
    --bimfile 1000G_EUR_Phase3_plink/1000G.EUR.QC.${chrom}.bim \
    --annot-file ${ANNOT}.${chrom}.annot.gz

  ## Step 2: Computing LD scores with an annot file
  echo "Computing LD scores with the annot file ${ANNOT}.${chrom}.annot.gz"
  python ldsc.py \
    --l2 \
    --bfile 1000G_EUR_Phase3_plink/1000G.EUR.QC.${chrom} \
    --print-snps listHM3.txt \
    --ld-wind-cm 1 \
    --annot ${ANNOT}.${chrom}.annot.gz \
    --thin-annot \
    --out ${ANNOT}.${chrom}
done

  • Example scripts: full annotation or thin-annot format using the R script code/make_ldsc_binary_annot.R

# Assumes ldsc.py from the ldsc repository is in the current directory,
# and that ANNOT holds the annotation name (BED file prefix).
for chrom in {1..22}
do
  echo ${chrom}

  ## Step 1: Creating an annot file
  echo "Make ldsc-friendly annotation files for ${ANNOT}.bed"
  Rscript code/make_ldsc_binary_annot.R \
    ${ANNOT}.bed \
    1000G_EUR_Phase3_plink/1000G.EUR.QC.${chrom}.bim \
    ${ANNOT}.${chrom}.annot.gz "full-annot"

  ## Step 2: Computing LD scores with an annot file
  echo "Computing LD scores with the annot file ${ANNOT}.${chrom}.annot.gz"
  python ldsc.py \
    --l2 \
    --bfile 1000G_EUR_Phase3_plink/1000G.EUR.QC.${chrom} \
    --print-snps listHM3.txt \
    --ld-wind-cm 1 \
    --annot ${ANNOT}.${chrom}.annot.gz \
    --out ${ANNOT}.${chrom}
done
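
Both examples above start from ${ANNOT}.bed, so a quick sanity check of that file before Step 1 can save a failed run. The following is a minimal sketch, not part of ldsc: `bed_bad_lines` is a hypothetical helper, and it assumes a tab-separated BED file with the chromosome, start, and end in the first three columns.

```shell
# Hypothetical helper: print the line numbers of BED records that have fewer
# than three columns, non-integer coordinates, or start >= end.
# Lines starting with "track", "browser", or "#" are skipped.
bed_bad_lines() {
  awk -F'\t' '
    /^(track|browser|#)/ { next }
    NF < 3 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ || ($2 + 0) >= ($3 + 0) { print NR }
  ' "$1"
}
```

An empty result from `bed_bad_lines ${ANNOT}.bed` means no record failed these basic checks; it does not validate chromosome names or genome build.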

Compute LD scores for continuous annotations

  • Step 1: prepare annotation in UCSC BED format
  • Step 2: compute LD scores using annotation BED files

  • Example script (full annotation format using the R script code/make_ldsc_continuous_annot.R)

# Assumes ldsc.py from the ldsc repository is in the current directory,
# and that ANNOT holds the annotation name (BED file prefix).
for chrom in {1..22}
do
  echo ${chrom}

  ## Step 1: Creating an annot file (using make_ldsc_continuous_annot.R)
  echo "Make ldsc-friendly annotation files for ${ANNOT}.bed"
  Rscript code/make_ldsc_continuous_annot.R \
    ${ANNOT}.bed \
    1000G_EUR_Phase3_plink/1000G.EUR.QC.${chrom}.bim \
    ${ANNOT}.${chrom}.annot.gz "full-annot"

  ## Step 2: Computing LD scores with an annot file
  echo "Computing LD scores with the annot file ${ANNOT}.${chrom}.annot.gz"
  python ldsc.py \
    --l2 \
    --bfile 1000G_EUR_Phase3_plink/1000G.EUR.QC.${chrom} \
    --print-snps listHM3.txt \
    --ld-wind-cm 1 \
    --annot ${ANNOT}.${chrom}.annot.gz \
    --out ${ANNOT}.${chrom}
done
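
After the per-chromosome loops above finish, it is worth confirming that an LD score file exists for every autosome before moving on. The sketch below assumes that each run writes its LD scores to `<prefix>.<chrom>.l2.ldscore.gz`, where the prefix is the value passed to --out without the chromosome number; `missing_ldscore_chroms` is a hypothetical helper, not an ldsc command.

```shell
# Hypothetical helper: list the chromosomes (1-22) for which the expected
# LD score file <prefix>.<chrom>.l2.ldscore.gz does not exist.
missing_ldscore_chroms() {
  local prefix="$1" chrom
  for chrom in $(seq 1 22); do
    if [ ! -f "${prefix}.${chrom}.l2.ldscore.gz" ]; then
      echo "${chrom}"
    fi
  done
}
```

For example, `missing_ldscore_chroms "${ANNOT}"` prints nothing when all 22 files are present, and otherwise prints one chromosome number per line.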


Partitioned heritability (binary annotation)

Prepare GWAS summary stats in LDSC .sumstats format

  • Convert GWAS summary stats to the LDSC .sumstats format using munge_sumstats.py
  • ldsc wiki “Summary-Statistics-File-Format”

  • Some of the processed GWAS summary stats (.sumstats format) can be found on RCC: /project2/xinhe/kevinluo/GWAS/GWAS_summary_stats/GWAS_collection/ldsc_format/
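
Before running the regression, a quick header check can catch a file that was not converted properly. The sketch below is an illustration under stated assumptions: it treats SNP, A1, A2, Z, and N as the expected column names (check the wiki page for the authoritative list), assumes a gzipped, tab-separated file, and uses a hypothetical helper name, `sumstats_missing_cols`.

```shell
# Hypothetical helper: print the assumed .sumstats column names that are
# absent from the header line of a gzipped file. Column order is ignored.
sumstats_missing_cols() {
  local file="$1" header col
  # Read the header and turn tabs into spaces for simple word matching.
  header=" $(gzip -dc "${file}" | head -n 1 | tr '\t' ' ') "
  for col in SNP A1 A2 Z N; do
    case "${header}" in
      *" ${col} "*) ;;
      *) echo "${col}" ;;
    esac
  done
}
```

An empty result from `sumstats_missing_cols ${TRAIT}.sumstats.gz` means all five assumed columns are present.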

Run ldsc on your GWAS summary statistics using baseline-LD model annotations and your new annotation

  • Condition on the baselineLD model: to control for the annotation categories of the full baselineLD model, pass a comma-separated list of annotation file prefixes to --ref-ld-chr

# Assumes ldsc.py from the ldsc repository is in the current directory.
python ldsc.py \
  --h2 ${TRAIT}.sumstats.gz \
  --ref-ld-chr baselineLD.,${ANNOT}. \
  --frqfile-chr 1000G_Phase3_frq/1000G.EUR.QC. \
  --w-ld-chr 1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC. \
  --overlap-annot --print-cov --print-coefficients --print-delete-vals \
  --out ${TRAIT}_${ANNOT}_baselineLD

  • Joint model: you can include multiple annotation file prefixes to fit several annotations jointly

python ldsc.py \
  --h2 ${TRAIT}.sumstats.gz \
  --ref-ld-chr baselineLD.,${ANNOT_1}.,${ANNOT_2}. \
  --frqfile-chr 1000G_Phase3_frq/1000G.EUR.QC. \
  --w-ld-chr 1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC. \
  --overlap-annot --print-cov --print-coefficients --print-delete-vals \
  --out ${TRAIT}_joint_baselineLD
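
When the joint model includes more than a couple of annotations, typing the --ref-ld-chr value by hand gets error-prone. A minimal sketch of building it from a list of prefixes follows; `join_ld_prefixes` is a hypothetical helper, and the annotation names in the example call are placeholders.

```shell
# Hypothetical helper: append "." to each prefix and join the results with
# commas, matching the form used for --ref-ld-chr above.
join_ld_prefixes() {
  local result="" prefix
  for prefix in "$@"; do
    if [ -z "${result}" ]; then
      result="${prefix}."
    else
      result="${result},${prefix}."
    fi
  done
  echo "${result}"
}

join_ld_prefixes baselineLD annot_A annot_B
# prints: baselineLD.,annot_A.,annot_B.
```

The output can then be passed as `--ref-ld-chr "$(join_ld_prefixes baselineLD "${ANNOT_1}" "${ANNOT_2}")"`.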

Partitioned heritability (continuous annotation)

  • ldsc accepts continuous annotations as inputs for both the --l2 and --h2 options. The pipeline is similar to the one for binary annotations. However, some outputs of the --h2 option are no longer meaningful. Computing the proportion of heritability explained by each quantile of a continuous annotation gives a more intuitive measure of the magnitude of the annotation's effect. You can use the ldsc authors' R script quantile_h2g.r and follow their wiki tutorial to compute the proportion of heritability explained by each quintile.

  • Please follow the ldsc wiki “Partitioned Heritability from Continuous Annotations”

