What Is Imputation?

Imputation is a process that predicts variants that were not genotyped (not directly measured) from the variants that were. Our cutting-edge imputation uses AI and machine learning to predict each missing SNP genotype from many other variants. See this video for more information about imputation, or keep reading to learn more!

The One-Minute Explanation

Differences in DNA (genetic variants) are the reason every one of us is one of a kind. A commercial DNA kit can measure these differences. These kits test for ~500 thousand genetic variants. That sounds like a lot, but they still miss lots of variants that can impact your health. 

SelfDecode uses cutting-edge AI and machine learning to “fill in the blanks.” First, we take the ~500 thousand variants measured by your kit and use them to predict ~83 million variants. Then we figure out which of these variants affect your health. Next, we translate this into a genetic risk score that tells you how your genetic risk compares to others. All this information is in our Health Reports.

Finally, each report provides you with personalized lifestyle and health recommendations to counteract your risk. Our expert team puts all this together by going through hundreds to thousands of studies. And our algorithm predicts the best recommendations for you based on your DNA.

The Long Explanation

What exactly is imputation, and how does it work? In order to understand this, we first have to quickly review some basic biology.

Everyone gets half of their DNA from each of their parents. In this sense, every person’s genome is a “blend” of their parents’ DNA. While this mix is always new and unique, the billions of individual base pairs in your genome can (theoretically) be directly traced back to either your mom or your dad.

However, when a person gets their DNA from their parents, they rarely inherit just one single base pair at a time. Instead, most parts of the DNA are passed down in larger “chunks,” which scientists call haplotypes.

A useful metaphor is to think of the human genome as a book. When DNA is passed down between generations, nature doesn’t write a new book by copying individual letters from the parents, one at a time — instead, it copies whole words, which are specific combinations of many different letters together, in a specific order.

In essence, haplotypes are the “words” in your genome — and this is also the key that allows imputation to work.

To illustrate this with an example, see if you can guess the missing letter in the following sequence:

T E C H N O _ O G Y

If you correctly guessed that the missing letter is an “L”, then you just understood the essence of how genetic imputation works! Basically, you use the pieces of information you do have — some letters — to figure out the word they are meant to spell out. Once you know the word, you can go back and “fill in” any letters that were originally missing.

The basic idea is the same when it comes to your DNA: the individual base pairs are the letters, and the haplotypes are the words. Once we figure out the haplotype, we can fill in the base pairs that we were “missing” before.
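
For readers who like to see ideas as code, here is a toy Python sketch of the word analogy. It fills in a blank by comparing the known letters against a small list of reference words. The word list and the function name `fill_in_blanks` are made up for this illustration; real imputation software works on DNA haplotypes and uses statistical models rather than a simple lookup.

```python
# Toy reference "dictionary": stands in for a panel of known haplotypes.
# These words are invented for illustration only.
REFERENCE_WORDS = ["TECHNOLOGY", "TECHNIQUES", "TERMINOLOGY", "BIOLOGY"]


def fill_in_blanks(pattern, reference=REFERENCE_WORDS):
    """Return every reference word consistent with the known letters.

    `pattern` uses "_" for a missing letter, e.g. "TECHNO_OGY".
    """
    matches = []
    for word in reference:
        # A candidate must have the same length as the pattern.
        if len(word) != len(pattern):
            continue
        # Every known letter must agree; blanks match anything.
        if all(p == "_" or p == w for p, w in zip(pattern, word)):
            matches.append(word)
    return matches


print(fill_in_blanks("TECHNO_OGY"))  # ['TECHNOLOGY']
```

Only one reference word fits the known letters, so the blank can be filled in with confidence.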

Of course, when it comes to the human genome, this process is often significantly more complicated than this simple example makes it look! Doing this with genetic data would be more like trying to guess the same word as before, but starting with fewer letters to work with:

_ E _ _ N _ _ _ _ Y

As you can see, it would be very hard for a human to correctly guess the word from just these few clues! Luckily, computers can figure it out — if you know how to program them the right way.
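
To give a rough feel for how a computer might approach the harder puzzle, here is a toy Python sketch. With so few letters, several words can fit, so the sketch ranks the candidates by how common each one is in a reference list and reports a probability for each. The reference list, its counts, and the function name `rank_candidates` are invented for this illustration (the pattern is written without spaces); this is a simplified analogy, not a description of the actual algorithm used for DNA.

```python
from collections import Counter

# Hypothetical reference panel: word -> how often it was "observed".
# The counts are invented for illustration only.
REFERENCE_PANEL = Counter({
    "TECHNOLOGY": 90,
    "DEMONOLOGY": 8,
    "RESONANTLY": 2,
    "TECHNIQUES": 50,
})


def rank_candidates(pattern, panel=REFERENCE_PANEL):
    """Rank reference words that fit `pattern` ("_" = unknown letter).

    Returns (word, probability) pairs, most likely first, where the
    probability is the word's share among all words that fit.
    """
    fits = {
        word: count
        for word, count in panel.items()
        if len(word) == len(pattern)
        and all(p == "_" or p == w for p, w in zip(pattern, word))
    }
    total = sum(fits.values())
    ranked = sorted(fits.items(), key=lambda item: item[1], reverse=True)
    return [(word, count / total) for word, count in ranked]


for word, prob in rank_candidates("_E__N____Y"):
    print(f"{word}: {prob:.2f}")
```

Three words fit the sparse clues here, but one is far more common than the others, so it becomes the best guess. The same idea, weighing candidates by how often they appear in a reference, is why a prediction can come with a confidence level rather than a simple yes or no.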
