AncestryDNA(R) White Papers

Monday, October 29th, 2018

Here, we augment these DNA and pedigree-based insights even further with our new Genetic Communities feature (Figure 1.1). Instead of considering the IBD connection between each pair of customers in isolation, we simultaneously analyze more than 20 billion connections identified among over 2 million AncestryDNA customers as a large genetic network (described below in Section 3). Intuitively, because the estimated IBD connections between individuals are likely due to recent shared ancestry (within the past 10 generations), broader patterns in this large network likely represent recent shared history. The result is that we can identify clusters of living individuals who share large amounts of DNA due to specific, recent shared history. For example, we identify groups of customers that likely descend from immigrants participating in a particular wave of migration (e.g., Irish fleeing the Great Famine).

Ethnicity estimates are not an exact science. The percentage AncestryDNA reports to a customer is the most likely percentage within a range of percentages. In this section, we discuss how we calculate this range. It is important to keep in mind that here at AncestryDNA we continue to build upon our previous work to offer ever more accurate results to our customers.

So, for example, we might report someone as 40% England, Wales and Northwestern Europe with a confidence range of 30-60%. This means that they are most likely 40% England, Wales and Northwestern Europe but they could be anywhere between 30% and 60% England, Wales and Northwestern Europe.

As illustrated in Figure 4.1, our updated ethnicity estimation process, or algorithm, performs significantly better than our previous process for nine European regions. Since we are analyzing single-origin people, a perfect algorithm would report back 100% for all of these cases. While not quite perfect, in each case, the updated algorithm is closer to 100% compared to the previous method. The trend is similar for the majority of the other regions (data not shown). …

Transition probabilities are simply the probabilities that the assigned ethnicity will change from one window to the next.
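
To make the role of the transition probability concrete, here is a minimal sketch of Viterbi decoding over genomic windows. It is a toy illustration, not AncestryDNA's actual model: the function name `viterbi`, the single `p_switch` parameter, the uniform prior on the first window, and the per-window log-likelihood inputs are all assumptions made for this example.

```python
import math

def viterbi(emission_logp, labels, p_switch):
    """Return the most likely label per window (toy model, hypothetical inputs).

    emission_logp[w][k] is the log-likelihood of window w under labels[k].
    p_switch is the assumed probability that the label changes between
    adjacent windows; it must lie strictly between 0 and 1, and there
    must be at least two labels.
    """
    n = len(labels)
    log_stay = math.log(1.0 - p_switch)
    log_move = math.log(p_switch / (n - 1))  # switch mass spread evenly

    def step(j, k):
        return log_stay if j == k else log_move

    # Uniform prior over labels for the first window (an assumption).
    score = [math.log(1.0 / n) + e for e in emission_logp[0]]
    back = []
    for row in emission_logp[1:]:
        new_score, pointers = [], []
        for k in range(n):
            best_j = max(range(n), key=lambda j: score[j] + step(j, k))
            new_score.append(score[best_j] + step(best_j, k) + row[k])
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the last window.
    k = max(range(n), key=lambda j: score[j])
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return [labels[i] for i in path]
```

With a small `p_switch`, a single window that weakly favors a different label is smoothed over; with a large `p_switch`, the path follows each window's own best label more closely.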

The final ethnicity estimates customers receive are calculated by counting the proportion of the Viterbi path (weighted by recombination distance) that is assigned to a particular population in the reference panel.
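
The weighted count described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the production calculation: the function name `ethnicity_proportions` is hypothetical, each window is assumed to carry a single population label, and recombination distance is assumed to be given as a per-window genetic length (for instance in centimorgans).

```python
from collections import defaultdict

def ethnicity_proportions(path, window_lengths):
    """Share of the path assigned to each population, weighted by length.

    path: one population label per window (e.g. a decoded Viterbi path).
    window_lengths: assumed recombination distance covered by each window.
    """
    if len(path) != len(window_lengths):
        raise ValueError("path and window_lengths must have equal length")
    totals = defaultdict(float)
    for label, length in zip(path, window_lengths):
        totals[label] += length
    genome_length = sum(window_lengths)
    return {label: total / genome_length for label, total in totals.items()}

# Example: four windows of unequal length, two populations.
proportions = ethnicity_proportions(["A", "A", "B", "A"], [1.0, 1.0, 2.0, 4.0])
```

Weighting by length rather than counting windows means a long window contributes more to the estimate than a short one, which matters when windows span very different recombination distances.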

23andMe ancestry composition white paper

