Once we have discovered our variants and evaluated our dataset, we may want to ANNOTATE those variants, that is, to add biological and functional information to them.

There are several programs to do so:

  • ANNOVAR - it runs local scripts on local data downloaded from the UCSC Genome Browser and can provide information on exonic splicing, HGVS format, distance to the nearest gene, and indels. Note that ANNOVAR is freely available for personal, academic, and non-profit use only.
  • SeattleSeq - it uses an external server and can provide information on conservation, HapMap frequencies, PolyPhen predictions, clinical association, and a limited number of indels.
  • snpEff - it is downloadable software that installs locally and uses Java and local data files. It provides information on the effects of every SNP and indel. Its most important features are its integration with GATK and Galaxy and its ability to read and write VCF files.

We'll use snpEff, following the instructions in its manual.
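To give a rough idea of what "adding information to variants" looks like in practice, here is a minimal Python sketch that reads the annotation out of a single VCF record. It assumes a recent snpEff version that writes its results to an `ANN` entry in the INFO column, with annotations separated by commas and sub-fields separated by `|` (older versions used an `EFF` entry instead). The record, the gene names, and the helper `parse_ann` are made up for illustration, and only the first four sub-fields are read.

```python
# Sketch only: the VCF record below is invented, not real data.
# Assumption: snpEff wrote an ANN entry whose first four sub-fields are
# allele, annotation (effect), putative impact, and gene name.
ANN_KEYS = ["allele", "annotation", "impact", "gene_name"]


def parse_ann(info):
    """Return one dict per annotation found in a VCF INFO string."""
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            annotations = entry[len("ANN="):].split(",")
            # zip() stops at the shorter sequence, so extra sub-fields are ignored.
            return [dict(zip(ANN_KEYS, a.split("|"))) for a in annotations]
    return []  # no ANN entry: the record has not been annotated


# Hypothetical annotated record (tab-separated VCF columns).
record = (
    "1\t1000\t.\tC\tT\t50\tPASS\t"
    "DP=20;ANN=T|missense_variant|MODERATE|GENE_A,"
    "T|upstream_gene_variant|MODIFIER|GENE_B"
)

info = record.split("\t")[7]  # INFO is the eighth VCF column
for ann in parse_ann(info):
    print(ann["gene_name"], ann["annotation"], ann["impact"])
```

A single variant can carry several annotations, one per affected gene or transcript, which is why the helper returns a list rather than a single dictionary.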

