YDNA Marker Mutation Rate vs Modes

Source

Based on:

FTDNA R1b Haplogroup Project
Source:Chandler, 2006

Related

R1b Marker distribution of values, non-aligned
R1b Modal Marker Values for 4060 Kits extracted November 2012
YDNA R1b Haplogroup Marker Modes
YDNA Marker Mutation Rate vs Modes
Zipf's Law is Everywhere
Source:Wentian Li, 2002
Wikipedia:Zipf's Law

Background

The following is based on 4060 kits extracted from the FTDNA R1b Project database in November 2012. See YDNA R1b Haplogroup Marker Modes for details. The data show the marker mutation rates given by Source:Chandler, 2006, plotted against the mode of each marker considered by Chandler. The four-value multicopy marker DYS 464 has been excluded from these data, though all two-value multicopy markers have been retained. The data were plotted and curve-fitted using Excel's graphing utility.
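As a rough illustration of how a per-marker mode can be obtained from a set of kits, the following Python sketch tallies the values reported for each marker and keeps the most frequent one. The function name `marker_modes`, the dictionary layout, and the three sample kits are assumptions made for this example only; they are not the actual extraction procedure or data used for the 4060 kits.

```python
from collections import Counter

def marker_modes(kits):
    """Return {marker: most frequent value} for a list of kit dictionaries.

    Each kit is assumed to be a dict mapping a marker name to its repeat
    count (a hypothetical layout chosen for this sketch).
    """
    counts = {}
    for kit in kits:
        for marker, value in kit.items():
            counts.setdefault(marker, Counter())[value] += 1
    # most_common(1) gives the single most frequent value for each marker;
    # when two values tie, the one encountered first is returned.
    return {marker: c.most_common(1)[0][0] for marker, c in counts.items()}

# Made-up kits for illustration only; not taken from the FTDNA database.
kits = [
    {"DYS393": 13, "DYS390": 24},
    {"DYS393": 13, "DYS390": 23},
    {"DYS393": 12, "DYS390": 24},
]
modes = marker_modes(kits)
```

With real data, ties between two equally common values would need an explicit rule; this sketch simply keeps whichever value was seen first.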

Discussion

Comparison of marker mutation rates (µ) with the modes of the respective marker distributions shows a non-random pattern. That pattern appears to be approximated by a log function, as shown in the graph. Several alternative functions (e.g., power functions) might also fit the data, but are not shown. There is no intent to say that any one of these functions best fits the data, only that it seems likely that one of them could be used to predict the mutation rate of an individual marker from the mode of the distribution of values for that marker.

Chandler, 2006, considers only the 37 markers in the first three panels of FTDNA's testing regime. Excluding the DYS 464 multicopies, only 33 markers are represented. This leaves 70 markers for which Chandler makes no estimates. It may be possible to approximate the mutation rates for these markers using the relationship shown here for the 33 markers of the first three panels.
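The idea of fitting a log function and then using it to estimate rates for other markers can be sketched in Python as an ordinary least-squares fit of µ = a·ln(mode) + b. This is only one possible way to do it and is not necessarily what Excel's trendline computes internally. The function names `fit_log` and `predict_rate` are hypothetical, and the numbers below are placeholders chosen for the example; they are not Chandler's rates or the R1b modes, and they imply nothing about the direction or size of the real relationship.

```python
import math

def fit_log(modes, rates):
    """Least-squares fit of rate = a * ln(mode) + b; returns (a, b)."""
    xs = [math.log(m) for m in modes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(rates) / n
    # Standard simple-regression formulas applied to x = ln(mode)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rates))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def predict_rate(mode, a, b):
    """Estimate a mutation rate from a marker's mode using the fitted curve."""
    return a * math.log(mode) + b

# Placeholder values for illustration only (not measured data).
example_modes = [10, 12, 14, 16, 20, 30]
example_rates = [0.001, 0.0015, 0.002, 0.0025, 0.003, 0.005]

a, b = fit_log(example_modes, example_rates)
estimate = predict_rate(18, a, b)  # estimate for a marker with mode 18
```

A power function could be fitted the same way by regressing ln(µ) on ln(mode) instead.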


Image:YDNA Marker Mutration Rate vs Modes2.tiff


Note that while the relationship shown above seems consistent with a log function (for instance), there is considerable variance apparent in the data. This might be due to sample size or perhaps to the precision of the Chandler approximations. It may also be that a portion of the data identified as R1b in the FTDNA database are not in fact R1b.
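One common way to put a number on the scatter around a fitted curve is the coefficient of determination (R²), which compares the residuals of the fit with the overall spread of the observed rates. The sketch below is a generic Python illustration; the function name `r_squared` and the sample values are assumptions for this example and are not results from the data discussed here.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Illustrative values only: predictions close to, but not equal to, the observations.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
```

A value near 1 indicates that the curve accounts for most of the variation in the rates, while a value near 0 indicates that it does little better than using the average rate for every marker.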