Haplotype Mapping Strategy

Introduction

Our Beatty DNA Project provides more opportunity than simply waiting for DNA results to come in and comparing one with another.  It also offers researchers a powerful tool for proactively targeting selected lineages to more clearly describe them and identify relationships with other lineages. This is particularly useful when studying a lineage in a geographic area and period of time where numerous other lineages of the same surname were found.  Available records such as deeds, wills, tombstones, and censuses seldom identify more than two generations.  It is often impossible to determine which record pertains to which lineage.

This problem arose several years ago, when some Beatty researchers joined forces to document BP-2000 Lineage 20 (called the Francis Beaty Research Project). The earliest ancestor of this lineage was Francis Beaty who was born in about 1710 and died in North Carolina in 1773.  A major obstacle to documenting that lineage stems from at least 29 Beatty lineages that occurred in North Carolina in the 1700s and early 1800s.  Nine of those lineages are thought to be related to some of the other 20 lineages. Numerous records from that period exist, but many are not sufficient to distinguish among these 29 overlapping lineages.

Although the strategy described below was developed for use in the Francis Beaty Research Project, it should be useful to other researchers who are dealing with similarly overlapping and conflicting lineages in other geographic areas (such as Tennessee and Northern Ireland).

First Steps in the Strategy

The strategy begins with identifying the various presumed unrelated lineages in the geographic area in question. Because of the extensive documentation of lineages in BP-2000, the identification of lineages has already been largely completed.  What remains is to search these lineages to identify those that occurred in the area in question and during the relevant period of time. In other words, if the focus of the project is Tennessee in the 1700s, a lineage which moved through Tennessee in 1850 would not be relevant.
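
The screening step described above can be sketched in a few lines of Python. The record layout, lineage labels, and years below are hypothetical placeholders and are not taken from BP-2000. The sketch only illustrates keeping those lineages that were present in the target area during the target period.

```python
from dataclasses import dataclass


@dataclass
class Residence:
    lineage: str      # hypothetical lineage label
    area: str         # geographic area where the lineage is recorded
    first_year: int   # earliest year the lineage is recorded there
    last_year: int    # latest year the lineage is recorded there


def relevant_lineages(residences, area, start, end):
    """Return lineage labels recorded in `area` at any time within [start, end]."""
    found = set()
    for r in residences:
        overlaps = r.first_year <= end and r.last_year >= start
        if r.area == area and overlaps:
            found.add(r.lineage)
    return sorted(found)


# Made-up sample records, for illustration only.
SAMPLE = [
    Residence("Lineage A", "Tennessee", 1780, 1820),
    Residence("Lineage B", "Tennessee", 1850, 1870),
    Residence("Lineage C", "North Carolina", 1740, 1790),
]

print(relevant_lineages(SAMPLE, "Tennessee", 1700, 1799))
```

With these made-up records, only the lineage recorded in Tennessee before 1800 is kept. The one that arrived in 1850 is screened out, as in the example above.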

Once the relevant lineages are identified, the really hard work begins.  That involves finding living Beatty male descendants in each of the identified lineages. That is a time-consuming process.

The best hope of shedding light on the identity of earlier ancestors is often through collaboration with other families with unknown, but related, ancestors.  Each of these families may have information valuable to the other families.  But, since these families might not know of each other, such collaboration often does not occur.

When faced with these potential benefits, a question that quickly surfaces is which males should be tested.  The 25-marker DNA tests are expensive.  It is not economically feasible, or necessary, to test ALL males. The testing should be selective to yield the greatest value for the least cost.  Using the selective testing approach, several members of a lineage can share the cost of a test that would then yield information valuable to them all.

Establishing the Earliest Ancestor's Haplotype

Elsewhere in this website we have discussed the haplotype and how it changes slowly over time.  If you missed that discussion, see the Table II Discussion.  For purposes of this strategy, we use the 25-marker test.

Each of the lineages mentioned above begins with a named ancestor. If the haplotype of each of those ancestors were known, a comparison of those haplotypes could point to relationships among the various lineages.  They would indicate which are closely related and which might not be related at all.
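
A comparison of two haplotypes can be sketched as follows. This is a minimal illustration that simply lists the markers at which two haplotypes differ. The marker names (M1 to M5) and repeat counts are invented, and only five markers are shown rather than 25. Counting differing markers is one simple measure, assumed here for illustration. It is not necessarily the statistical method used in the project.

```python
def marker_differences(hap_a, hap_b):
    """Return the names of shared markers at which two haplotypes differ.

    Each haplotype is a dict mapping marker name -> repeat count.
    """
    shared = hap_a.keys() & hap_b.keys()
    return sorted(m for m in shared if hap_a[m] != hap_b[m])


def mismatch_count(hap_a, hap_b):
    """Number of shared markers with different repeat counts."""
    return len(marker_differences(hap_a, hap_b))


# Invented haplotypes for three hypothetical earliest ancestors.
ANCESTOR_1 = {"M1": 13, "M2": 24, "M3": 14, "M4": 11, "M5": 12}
ANCESTOR_2 = {"M1": 13, "M2": 24, "M3": 15, "M4": 11, "M5": 12}
ANCESTOR_3 = {"M1": 14, "M2": 22, "M3": 15, "M4": 10, "M5": 13}

print(marker_differences(ANCESTOR_1, ANCESTOR_2))
print(mismatch_count(ANCESTOR_1, ANCESTOR_3))
```

In this invented data, the first two haplotypes differ at a single marker, which would suggest a close relationship. The third differs from the first at every marker, which would suggest no recent relationship.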

Of course, we cannot take a cheek swab from those ancestors for laboratory analysis. Instead, we must locate living male descendants from whom to obtain samples. Those samples can then be viewed as surrogates for the earliest ancestors.

The goal of this strategy is to either validate or represent the haplotype of each earliest ancestor in each lineage.  There is a difference between 'validation' and 'representation.'

A Hypothetical Example

The earliest ancestor of Lineage XX died in 1860.  He left two sons, and both sons produced male offspring. This production of male offspring continued, so that today there are known male Beatty descendants of each son.  So, there are two lines of descent comprising Lineage XX. If a Y-chromosome test is conducted on one living Beatty male in each of these two lines and the results are identical, there would be an extremely high probability that the shared haplotype was also the haplotype of the earliest ancestor who died in 1860.  This would be a validated haplotype.  Validation occurs when an exact match is obtained from two separate lines of descent.

The above example might be a somewhat ideal circumstance.  Owing to the natural changes that occur in haplotypes from one generation to another, two samples may not yield a match, even though the two males are descended from the common ancestor. A difference between the two haplotypes could indicate that a mutation has occurred in one lineage. Of course, the test results would not reveal when or in which lineage the change occurred.  So, a reasonable next step would be to seek a third sample from a living Beatty male descendant elsewhere in the family tree, preferably from as high up (or low down) in the tree as possible.  The idea here is to try to "get around" the mutation and secure a match.

Statistically, it is fairly unlikely that a mutation would occur within five or six generations as described above. But, it can happen.

A real problem with validation arises if that earliest ancestor only had one son (or only living male Beatty descendants of a single son can be located). Since it is not possible to test males descended from separate sons, a validation of the earliest ancestor is not possible. However, if the haplotype of that son can be validated (as described above), that haplotype could be said to represent the earliest ancestor with 95% certainty. Statistically speaking, that's a pretty safe bet.
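
The difference between a validated and a represented haplotype can be written as a small decision rule. The sketch below is one reading of the rule described above, not an official project procedure. The function name, the status labels, and the three-marker haplotypes are all hypothetical.

```python
from collections import defaultdict


def classify_ancestor_haplotype(samples):
    """Classify what a set of test results says about the earliest ancestor.

    samples: list of (son, haplotype) pairs, one per tested living male,
    where `son` names the ancestor's son from whom that male descends.
    Returns a (status, haplotype) tuple.
    """
    if not samples:
        return ("untested", None)

    sons_by_haplotype = defaultdict(set)
    for son, haplotype in samples:
        sons_by_haplotype[tuple(haplotype)].add(son)

    # Validated: the same haplotype turns up in lines from two different sons.
    for haplotype, sons in sons_by_haplotype.items():
        if len(sons) >= 2:
            return ("validated", haplotype)

    # Represented: only one son's line could be tested and its results agree.
    all_sons = {son for son, _ in samples}
    if len(all_sons) == 1 and len(sons_by_haplotype) == 1:
        return ("represented", next(iter(sons_by_haplotype)))

    # Otherwise no match yet: another sample should be sought.
    return ("unresolved", None)


print(classify_ancestor_haplotype([("son 1", (13, 24, 14)), ("son 2", (13, 24, 14))]))
```

An "unresolved" result corresponds to the case where two samples do not match and a third sample is needed from elsewhere in the family tree.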

Persons in each lineage would need to sit down with their lineage and think about the best way to go about Y-chromosome testing.  The idea is to test male descendants as remote from each other in the lineage tree as possible. Testing a father and son would not be useful in this strategy since there is a 95% chance they would be identical.
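
One way to think about "as remote from each other as possible" is to count the father-to-son steps that separate two candidates through their most recent common paternal ancestor. The sketch below assumes the lineage is recorded as a simple son-to-father lookup. The names in the tree are hypothetical.

```python
from itertools import combinations


def path_to_root(person, father_of):
    """List a man and his paternal ancestors, nearest first."""
    path = [person]
    while person in father_of:
        person = father_of[person]
        path.append(person)
    return path


def transmissions_between(a, b, father_of):
    """Father-to-son steps joining a and b via their nearest common ancestor."""
    depth_a = {p: i for i, p in enumerate(path_to_root(a, father_of))}
    for j, p in enumerate(path_to_root(b, father_of)):
        if p in depth_a:
            return depth_a[p] + j
    return None  # no common paternal ancestor in the recorded tree


def most_remote_pair(candidates, father_of):
    """Return (steps, a, b) for the pair of candidates farthest apart."""
    scored = [(transmissions_between(a, b, father_of), a, b)
              for a, b in combinations(candidates, 2)]
    scored = [s for s in scored if s[0] is not None]
    return max(scored) if scored else None


# Hypothetical tree: the ancestor had two sons; T1..T4 are living males.
FATHER_OF = {
    "Son1": "Ancestor", "Son2": "Ancestor",
    "G1": "Son1", "G2": "Son1", "G3": "Son2",
    "T1": "G1", "T2": "G1", "T3": "G2", "T4": "G3",
}

print(transmissions_between("G1", "T1", FATHER_OF))   # father and son
print(most_remote_pair(["T1", "T2", "T3", "T4"], FATHER_OF))
```

A father and son are separated by a single step, so testing both adds little. Candidates who descend from different sons of the earliest ancestor are separated by the most steps.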

Another Example

At the risk of being redundant, here is another example to illustrate how this validation might work.  Let's say that John Beatty knows (has evidence) that his great-great-grandfather was Robert J. Beatty, who was born in Illinois in 1835 and died in Arkansas in 1895. But, he has no clue who Robert's father was.

John had his Y-chromosome tested, so he knows his haplotype. But, establishing the haplotype for Robert is more complicated than determining only John's haplotype. With only one test, there is no way of knowing that the single haplotype was passed down unchanged from Robert to John. Mutations of the Y-chromosome do occur.  However, such mutations are random and unpredictable.  The only way to validate the haplotype of Robert is by additional testing.

John knows that Robert had three sons, James, William, and Alfred. John is descended from son James. To validate Robert's haplotype, John must locate a male Beatty descendant of either son William or son Alfred.  Let's say that he finds Samuel Beatty, who is descended from Alfred, and Samuel agrees to have his Y-chromosome tested. The results produce a haplotype identical to John's. Statistically, it can be assumed that the haplotype of John and Samuel is also the haplotype of their great-great-grandfather, Robert J. Beatty. Why? Because the chance of a mutation occurring in any one of the 25 markers within five generations is relatively small. Furthermore, if a mutation had occurred in John's lineage from Robert to him, it is extremely unlikely that an identical mutation would have occurred in Samuel's lineage from Robert to Samuel.
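
The statistical reasoning can be illustrated with a simple calculation. The sketch below assumes that each marker mutates independently, with the same constant probability at every father-to-son transmission. The rate of 0.002 per marker per transmission is an assumption chosen purely for illustration. It is not a figure from this project, and real rates vary from marker to marker.

```python
# Assumed per-marker, per-transmission mutation rate (illustrative only).
ASSUMED_RATE = 0.002


def prob_unchanged(markers, transmissions, rate):
    """Probability that no marker mutates over the given transmissions.

    Assumes independent markers and a constant mutation rate.
    """
    return (1.0 - rate) ** (markers * transmissions)


def prob_at_least_one_mutation(markers, transmissions, rate):
    """Complement of prob_unchanged."""
    return 1.0 - prob_unchanged(markers, transmissions, rate)


# 25 markers, one father-to-son transmission.
print(round(prob_unchanged(25, 1, ASSUMED_RATE), 3))

# 25 markers over several transmissions in a single line of descent.
for steps in (2, 4, 6):
    print(steps, round(prob_unchanged(25, steps, ASSUMED_RATE), 3))
```

Under this assumed rate, a single father-to-son transmission leaves all 25 markers unchanged with a probability of roughly 0.95. The probability falls as more transmissions are added. Other plausible rates give different figures, so the numbers should be read as illustrative only.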

In the above example, John has validated the haplotype of his great-great-grandfather using only two DNA samples. 

However, if the haplotypes of John and Samuel do not match, then John must seek a sample elsewhere in Robert's descendency, possibly from a male Beatty descendant of son William.  What he is looking for is a match.

Again, the illustration above is somewhat idealistic.  What if Robert had only one son, who had only one son, and so on down to John?  In that case, John's haplotype could be considered to represent the haplotype of his great-great-grandfather.  And, although it cannot be validated, it can still be useful for comparison with other lineages.

Haplotype Mapping and Analysis

Using the approach described above, the objective of this strategy is to identify (and validate, if possible) haplotypes for each of the lineages in question.  From that information, a map, or phylogenetic chart, can be constructed which illustrates the variances among these haplotypes.  Based on experience already gained with such haplotype analysis in the Beatty DNA Project, those haplotypes of related lineages will tend to group together.
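
The tendency of related lineages to group together can be illustrated with a very simple grouping rule. The sketch below is not a phylogenetic method. It is a plain single-linkage grouping, substituted here for illustration, which places two lineages in the same group when their haplotypes differ at no more than a chosen number of markers, directly or through a chain of intermediate lineages. The lineage labels, four-marker haplotypes, and threshold are all hypothetical.

```python
def mismatches(hap_a, hap_b):
    """Count positions at which two equal-length haplotypes differ."""
    return sum(1 for x, y in zip(hap_a, hap_b) if x != y)


def group_haplotypes(haplotypes, max_mismatches):
    """Group lineage labels whose haplotypes are linked by near matches.

    haplotypes: dict of lineage label -> tuple of repeat counts, with
    markers listed in the same order for every lineage.
    """
    unvisited = set(haplotypes)
    groups = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = [n for n in unvisited
                    if mismatches(haplotypes[current], haplotypes[n]) <= max_mismatches]
            for n in near:
                unvisited.remove(n)
                group.append(n)
                frontier.append(n)
        groups.append(sorted(group))
    return groups


# Invented haplotypes for four hypothetical lineages.
HAPS = {
    "A": (13, 24, 14, 11),
    "B": (13, 24, 15, 11),
    "C": (14, 22, 15, 10),
    "D": (14, 22, 15, 10),
}

print(group_haplotypes(HAPS, max_mismatches=1))
```

With this invented data and a threshold of one mismatch, lineages A and B fall into one group and lineages C and D into another.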

The analysis of variations among haplotypes is primarily statistical and requires an understanding of the nature of genetic mutation of the Y-chromosome. Fortunately, we have Beattys in our midst who are experienced in both areas.

The Dreaded NPE

Of course, the 'monkey wrench' in this strategy is the NPE, or non-paternity event.  These are the infrequent cases where a son receives his Y-chromosome from a male other than his documented or assumed father.  For example, there could be an adoption somewhere in the paternal ancestry about which there is no documentation and no present day knowledge.  These events become a possible explanation when a DNA sample yields results that are totally unexpected.

 
