# Libraries
library(DT)

In this section, we aimed to annotate the vitamin D GWAS results (obtained in the GWAS tab). Briefly, we:

  • Uploaded the GWAS summary statistics in FUMA
  • Conducted partitioned SNP-based heritability analyses by:
    • functional category
    • cell type


Methods

FUMA

FUMA is an online platform that can be used to annotate, prioritize, visualize and interpret GWAS results. It has 2 main functions:

  • SNP2GENE: takes GWAS summary statistics as input and provides extensive functional annotation for all SNPs in genomic areas identified by lead SNPs
  • GENE2FUNC: takes a list of gene IDs (identified with SNP2GENE or manually curated) and annotates them in biological context

Data upload details
We uploaded the GWAS results to FUMA’s in the SNP2GENE section. With the upload, we used the following parameters:

    1. Parameters for lead SNPs and candidate SNPs identification - Sample size (N): column N for fastGWA
    1. MAGMA analysis - Gene window: 35,10, i.e. set window sizes to 35kb upstream and 10kb downstream of the genes.

All other parameters were left as default.

Partitioned SNP-based heritibility

Using the LDSC software, we conducted partitioned SNP-based heritability analyses by functional category. A total of 53 functional categories (described in Finucane et al. 2018) were used.

In addition, we conducted cell-type-specific SNP-based heritability analyses (as implemented in the LDSC software). These analyses were performed with the “Multi_tissue_gene_expr” gene set, which includes both GTEx data and Franke lab data, as described in Finucane et al. 2018.

# ############################
# Prepare data for partitioned heritability analyses
# ############################

# ----- Download files for partitioned heritability by functional category
  cd $bin/ldsc/
  ## Baseline model LD scores 
  wget https://data.broadinstitute.org/alkesgroup/LDSCORE/1000G_Phase3_baseline_ldscores.tgz
  tar -xzf 1000G_Phase3_baseline_ldscores.tgz
  ## Regression weights
  wget https://data.broadinstitute.org/alkesgroup/LDSCORE/weights_hm3_no_hla.tgz
  tar zxf weights_hm3_no_hla.tgz
  ## Allele frequencies
  wget https://data.broadinstitute.org/alkesgroup/LDSCORE/1000G_Phase3_frq.tgz
  tar zxf 1000G_Phase3_frq.tgz
  
# ----- Download files for cell type specific analyses
## https://github.com/bulik/ldsc/wiki/Cell-type-specific-analyses
## Available gene sets: Multi_tissue_gene_expr, Multi_tissue_chromatin, GTEx_brain, Cahoy, ImmGen, or Corces_ATAC
  ## LD scores
  cts_name=Multi_tissue_gene_expr
  wget https://data.broadinstitute.org/alkesgroup/LDSCORE/LDSC_SEG_ldscores/${cts_name}_1000Gv3_ldscores.tgz
  tar -xzf ${cts_name}_1000Gv3_ldscores.tar.gz


# ############################
# Partitioned Heritability by functional category
# ############################
# Directories
ldscDir=$bin/ldsc
results=$WD/results/fastGWA

# Files/settings
prefix=vitD_fastGWA
ldsc_py=$ldscDir/ldsc.py
sumstats=$results/ldscFormat/$prefix.sumstats.gz
baseline=$ldscDir/1000G_EUR_Phase3_baseline/baseline.
weight=$ldscDir/weights_hm3_no_hla/weights.
freq=$ldscDir/1000G_Phase3_frq/1000G.EUR.QC.
output=$results/ldsc/functAnnot.$prefix # Output file

cd $results
module load ldsc
$ldsc_py --h2 $sumstats \
           --ref-ld-chr $baseline \
           --w-ld-chr $weight\
           --overlap-annot \
           --frqfile-chr $freq \
           --out $output

# ############################
# Cell type specific analyses
# ############################

# Directories
ldscDir=$bin/ldsc
results=$WD/results/fastGWA

# Files/settings
prefix=vitD_fastGWA
ldsc_py=$ldscDir/ldsc.py
baseline=$ldscDir/1000G_EUR_Phase3_baseline/baseline.
weight=$ldscDir/weights_hm3_no_hla/weights.
cts_name=Multi_tissue_gene_expr
sumstats=$results/ldscFormat/$prefix.sumstats.gz
output=$results/ldsc/$prefix.$cts_name # Output file

cd $results
tmp_command="module load ldsc;
             $ldsc_py --h2-cts $sumstats \ 
                      --ref-ld-chr $baseline \
                      --ref-ld-chr-cts $ldscDir/$cts_name.ldcts \
                      --w-ld-chr $weight \
                      --out $output"
qsubshcom "$tmp_command" 1 10G cts.`basename $output` 24:00:00 ""

# ############################
# Copy results to local computer
# ############################
scp $deltaID:$WD/results/fastGWA/ldsc/vitD_fastGWA.Multi_tissue_gene_expr.cell_type_results.txt $local/results/fastGWA/ldsc/

scp $deltaID:$WD/results/fastGWA/ldsc/functAnnot.vitD_fastGWA.results $local/results/fastGWA/ldsc/

Results

FUMA

To see the results, log in to FUMA and follow these steps:

  • Click on SNP2GENE at the top
  • Click on My Jobs on the left panel
  • On the line corresponding to the results of interest, click on Go to results to see the SNP2GENE results, OR on GENE2FUNC to see the GENE2FUNC results

Partitioned SNP-based heritability

# Load and format results
df=read.table("results/fastGWA/ldsc/functAnnot.vitD_fastGWA.results", h=T, stringsAsFactors=F)
df=df[order(df$Enrichment_p),]
df[,-1]=apply(df[,-1],2,function(x){format(x,digits = 3)})

# Present table
datatable(df, 
          rownames=FALSE, 
          extensions=c('Buttons','FixedColumns'),
          options=list(pageLength=5,
                       dom='frtipB',
                       buttons=c('csv', 'excel'),
                       scrollX=TRUE),
          caption="Partitioned SNP-based heritability by functional annotation")

Columns are: Category, functional annotation; Prop._SNPs, Proportion of SNPs; Prop._h2, proportion of heritability explained by the functional annotation; Prop._h2_std_error, standard error of the heritability; Enrichment, enrichment, calculated as (Prop. heritability) / (Prop. SNPs); Enrichment_std_error, standard error of the enrichment; Enrichment_p, P-value of the enrichment.

# ############################
# Cell type specific SNP h2
# ############################

# Load and format results
cts=read.table("results/fastGWA/ldsc/vitD_fastGWA.Multi_tissue_gene_expr.cell_type_results.txt", h=T, stringsAsFactors=F)
cts=cts[order(cts$Coefficient_P_value),]
cts[,-1]=apply(cts[,-1],2,function(x){format(x,digits = 3)})

# Present table
datatable(cts, 
          rownames=FALSE, 
          extensions=c('Buttons','FixedColumns'),
          options=list(pageLength=5,
                       dom='frtipB',
                       buttons=c('csv', 'excel'),
                       scrollX=TRUE),
          caption="Cell-type-specific SNP-based heritability")