3

If I'm looking for a specific SNP in my SNP-Chip data and it isn't there, are there any tools that let me quickly impute that SNP from surrounding SNPs rather than running a lengthy 'whole chromosome' imputation job?

If so, roughly how many upstream and downstream SNPs are generally required to predict the haploblock 'accurately'?

user438383
Dan Bolser

1 Answer

3

I would take something like 1.5 Mb either side of the SNP of interest (so a 3 Mb chunk), subset that region out of both the target and the reference panel, and then perform imputation on that chunk alone. I think that size will give you more than enough LD, since LD in most human populations has largely decayed within about 500 kb.
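As a rough sketch of how that window could be worked out, the snippet below builds a `CHROM:START-END` string from a SNP position and a flank size. The helper name `region_window`, the example position, and the file names in the comments are all made up for illustration; only the 1.5 Mb flank comes from the suggestion above.

```shell
# Hypothetical helper: print a CHROM:START-END window around a SNP position.
# The flank defaults to 1.5 Mb; the start is clamped at 1 near chromosome ends.
# The body runs in a subshell ( ... ) so its variables do not leak out.
region_window() (
    chrom=$1
    pos=$2
    flank=${3:-1500000}
    start=$(( pos - flank ))
    end=$(( pos + flank ))
    if [ "$start" -lt 1 ]; then
        start=1
    fi
    printf '%s:%d-%d\n' "$chrom" "$start" "$end"
)

# Example SNP position (made up for illustration).
region=$(region_window chr21 30000000)
echo "$region"

# The same string could then be used to subset both files, e.g. with bcftools
# (assumes indexed BCFs; file names are placeholders):
#   bcftools view -r "$region" -Ob -o target.chunk.bcf unphased.bcf
#   bcftools view -r "$region" -Ob -o ref.chunk.bcf reference.bcf
```

The resulting string can also be passed straight to a `--region` option, so the subsetting step is optional if the tool does its own region filtering. Note that the chromosome name has to match the naming used in your files (`chr21` versus `21`).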

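shapeit4 has an option to process just a single region (note that shapeit4 itself does the phasing step, so the phased chunk would then go into your imputation tool):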

## change --region to your window, and use the genetic map for the same chromosome
shapeit4 \
    --input unphased.bcf \
    --map chr21.b37.gmap.gz \
    --region chr21:1231213-1413143 \
    --reference reference.bcf \
    --output phased.bcf
user438383