Surname Frequency and YDNA Studies


This is one of a series of articles on Genealogical Methods, prepared in association with The Tapestry. See Index for a list of related articles.
__________________________


Documentation

Background

This is an analysis of the frequency with which certain surnames appear in the 1810 census, and the number of participants in the corresponding FTDNA YDNA projects. Ancestry's US Census data were used to estimate the frequency, in the 1810 census, of five surnames that have been examined on the Tapestry. Data were extracted for a "Standard Match" of "All" variants of the surname (as picked up by Ancestry's search engine), and for "Exact matches" only. In two cases, the "Houston/Huston" and "Cowan/Cowen" surnames, where there are known common variants in use in 1810, exact matches for both variants were recorded. (Note that the 1850 hits for the standard match for the Snoddy surname, compared to the 14 hits for the exact match, indicate that Ancestry's "Standard match" algorithm is weak for variants of this surname. Direct inspection of this search showed that the results included numerous hits for names (such as "Smith") which would not normally be considered variants of "Snoddy".)

In addition to surname frequency, data were extracted from the FTDNA site on the number of persons shown for each surname's YDNA project. It is assumed that these data include all reasonable variants of the surname.

Data

Surname     1810 Census: Total Hits     1810 Census: Exact Hits     Number of Tests
Cowan       1850                        109                         151
Cowen                                   101
Houston     941                         125                         86
Huston                                  167
Kilgore     75                          40                          31
Snoddy      12,356                      14                          10
Walker      2007                        1777                        789
Image: 1810 Surname Frequency and YDNA Tests.jpg
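
The comparison between "Standard Match" and "Exact match" hits described in the Background can be sketched in a few lines of Python. This is only an illustration: the counts are those read from the table above (the split of the columns is an assumption), the exact hits for known spellings are simply added together, and the cutoff used to flag a surname is a hypothetical value chosen for the example, not something taken from Ancestry.

```python
# Counts read from the 1810 census table above; the column split is an assumption.
# surname -> (standard-match hits, exact hits for each spelling searched)
CENSUS_1810 = {
    "Cowan": (1850, {"Cowan": 109, "Cowen": 101}),
    "Houston": (941, {"Houston": 125, "Huston": 167}),
    "Kilgore": (75, {"Kilgore": 40}),
    "Snoddy": (12356, {"Snoddy": 14}),
    "Walker": (2007, {"Walker": 1777}),
}

# Hypothetical cutoff: flag a surname when the standard-match hits
# exceed the combined exact hits by more than this factor.
INFLATION_CUTOFF = 50.0


def summarize(census):
    """Return combined exact hits and the standard/exact ratio per surname."""
    rows = {}
    for surname, (standard_hits, exact_by_spelling) in census.items():
        exact_total = sum(exact_by_spelling.values())
        rows[surname] = {
            "exact_total": exact_total,
            "inflation": standard_hits / exact_total,
        }
    return rows


summary = summarize(CENSUS_1810)
suspect = [name for name, row in summary.items()
           if row["inflation"] > INFLATION_CUTOFF]

for name, row in summary.items():
    print(f"{name:8s} exact={row['exact_total']:5d} "
          f"standard/exact={row['inflation']:.1f}")
print("Possible over-matching:", suspect)
```

With these counts only Snoddy exceeds the cutoff, which is consistent with the observation that the standard match for that surname picks up unrelated names.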

Discussion

The above data represent a small sample of the universe of surname and YDNA data. The Walker surname, for example, is among the most common surnames in the United States, while the others are all relatively rare. This study would benefit from a larger sample that includes additional surnames in the intermediate and high ranges of surname frequency. There may also be a need to expand the number of "surname equivalents" (such as "Huston" vs "Houston") used for data capture in the 1810 census. Doing so would probably improve the precision of the results; for present purposes the data appear sufficiently consistent that this is not required.

These data show a surprising degree of consistency between the frequency of a given surname in the 1810 population and the frequency with which descendants in today's population take the FTDNA YDNA test. Thus, uncommon names seem to receive no greater or lesser attention on the part of descendants than very common names. Whether this conclusion would hold up with more data, particularly for surnames of intermediate and high frequency, is something that might be further explored.
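
One way to put a number on this consistency is to compute, for each surname, the number of YDNA tests per exact census hit, and the correlation between the two counts on a logarithmic scale. The sketch below is a minimal illustration under stated assumptions: the counts are those read from the Data table, exact hits for the "Cowan/Cowen" and "Houston/Huston" spellings are added together, and the Pearson correlation on log values is a choice made for this example rather than the method used to prepare the original chart.

```python
import math

# surname -> (combined exact hits in the 1810 census, FTDNA YDNA project tests)
# Values are read from the Data table; combining spellings is an assumption.
COUNTS = {
    "Cowan": (109 + 101, 151),
    "Houston": (125 + 167, 86),
    "Kilgore": (40, 31),
    "Snoddy": (14, 10),
    "Walker": (1777, 789),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Tests taken today per exact hit in 1810, for each surname.
tests_per_hit = {name: tests / hits for name, (hits, tests) in COUNTS.items()}

# Correlate on a log scale so that Walker does not dominate the result.
log_hits = [math.log10(hits) for hits, _ in COUNTS.values()]
log_tests = [math.log10(tests) for _, tests in COUNTS.values()]
r = pearson(log_hits, log_tests)

for name, ratio in tests_per_hit.items():
    print(f"{name:8s} tests per exact hit = {ratio:.2f}")
print(f"log-log correlation r = {r:.2f}")
```

With only five surnames, a high correlation here is suggestive rather than conclusive, which is why a larger sample across the intermediate frequency range would be useful.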