y-Haplogroup I1 STR "Cluster" Analysis

With data based on a large sample of I1 y-Haplogroup people tested at FTDNA (see http://www.familytreedna.com/public/yDNA_I1/), and also using some other data, I have done a mathematical "cluster analysis" to determine any clusters within the I1 y-Haplogroup based solely on STR marker values. The clusters that are found are then linked to the geographic origin of the most distant male-line ancestor reported by each cluster member.

Here are the results, which are presented in the form of a "Decision Tree":

The clusters that are identified do, in most cases, have a well defined geographic origin.

The relatively high mutation rate of some STR markers (compared to the slower mutation rate of SNPs) would in theory make clusters harder to identify, but luckily y-Haplogroup I1 is a young haplogroup, so it is possible to find clusters based solely on STR marker values. And, as shown, some of those clusters correlate nicely with a geographic region.

The word "cluster" is used here in the mathematical sense, meaning a clustering of points in the 67-dimensional space of the STR marker values. That is not necessarily the same as finding a geographic cluster, but the two sometimes coincide, which makes the approach useful for people wanting to know where their male-line ancestors might have come from.
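As a rough illustration of what "clustering of points" in STR marker space can mean, here is a minimal sketch in Python. It is not the method used for the results on this page: the distance measure (sum of absolute repeat-count differences), the single-linkage grouping rule, the threshold, and the five-marker sample haplotypes are all assumptions made up for the example.

```python
from itertools import combinations

def step_distance(a, b):
    """Sum of absolute differences in repeat counts across markers."""
    return sum(abs(x - y) for x, y in zip(a, b))

def threshold_clusters(haplotypes, max_dist):
    """Group samples so that two samples share a cluster whenever a chain
    of pairwise distances <= max_dist links them (single linkage)."""
    names = list(haplotypes)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the representative of n's group.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if step_distance(haplotypes[a], haplotypes[b]) <= max_dist:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())

# Invented haplotypes with only 5 markers each (real ones here have 67).
samples = {
    "kit1": [13, 22, 14, 10, 13],
    "kit2": [13, 22, 14, 10, 14],
    "kit3": [13, 23, 14, 10, 14],
    "kit4": [14, 24, 15, 11, 11],
    "kit5": [14, 24, 15, 11, 12],
}
clusters = threshold_clusters(samples, max_dist=1)
print(clusters)
```

With real 67-marker data the same idea applies, only each haplotype is a list of 67 repeat counts and the threshold would need to be chosen with care.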

To emphasise the inherited nature of the clusters, one could perhaps call them "clans" or some similar word rather than clusters. So, for example, "I1 STR Clan-BBA" might be a better name, with members of that "clan" tending to come from Norway/Sweden, as suggested in the computed plots.

Anyway, I hope you find the approach useful. The main idea here is to use a simple decision tree to quickly get an idea of one's geographic origin based solely on STR marker values for I1 people. The histograms provide better information for anyone who might have obtained an independent STR mutation that would otherwise confuse the decision tree.
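To show how a decision tree on STR marker values works in principle, here is a small Python sketch. The marker names are real STR markers, but the split values, the tree shape, and the two-letter branch labels are invented for illustration and do not reproduce the actual decision tree.

```python
# Each internal node is (marker, threshold, low_branch, high_branch);
# each leaf is a branch label. All splits here are hypothetical.
TREE = (
    "DYS390", 23,
    ("DYS19", 14, "AA", "AB"),
    ("DYS385b", 14, "BA", "BB"),
)

def classify(haplotype, node=TREE):
    """Walk the tree: go to the low branch if the marker value is
    <= threshold, otherwise to the high branch, until a leaf is reached."""
    if isinstance(node, str):
        return node
    marker, threshold, low_branch, high_branch = node
    branch = low_branch if haplotype[marker] <= threshold else high_branch
    return classify(haplotype, branch)

# Hypothetical person: only the markers tested by the tree are needed.
person = {"DYS390": 22, "DYS19": 15, "DYS385b": 14}
label = classify(person)
print(label)
```

A single independent mutation on one of the tested markers can send a person down the wrong branch of such a tree, which is why comparing against the full histograms is the safer check.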

Terry, December 2010


UPDATE1: y-Haplogroups I1 and R1b in European Countries, plus Ancient Migrations

Here are some new results:

Terry, February 2011


UPDATE3: European y-Haplogroup Locations circa 5,000 BC

Below is a link to a simple map showing some guesswork as to where various y-Haplogroups in Europe might have been located at around 5,000 BC. That date, and the areas indicated in the map, are not to be taken too literally. There probably are other sub-Haplogroups whose male descendants didn't survive to the present day, and those extinct y-Haplogroups could fill in additional areas of the map.
Terry, June 2011


UPDATE4: y-Haplogroups I1 and R1b Dispersal/Expansion

Here are some new results:

One should keep in mind that the range and distribution of all haplogroups in Europe have been complicated by the comparatively recent Migration of "Barbarians" (before about 500 AD) and the Migration of "Vikings" (around 800 AD to 1100 AD). The "Barbarians" were mainly Germanic tribes from east of the Rhine and north of the Danube, comprising the Goths (Visigoths and Ostrogoths), Vandals, Lombards, Burgundians, Franks, Suebi, and others, as well as the Angles, Saxons, and Jutes, plus the non-Germanic Huns from Central Asia.

Here is a medium resolution version of the Possible I1 Dispersal/Expansion Map:

But open the document y-Haplogroups I1 and R1b Dispersal/Expansion to get the full resolution graphic.

Terry, August 2011


UPDATE6: y-Haplogroup I1 and Ancient European Migrations

To understand how y-Haplogroup I1 and its various clusters/clans were dispersed around Europe, one needs to understand the movements, dispersals/expansions, and migrations of people back many thousands of years. The link below summarises, in map form, the movements that influenced European people from 13,000 BC onwards to 1000 AD.
Terry, September 2011


UPDATE7: y-Haplogroups I1 and I2 Tree (Preliminary Results)

Here is a tree, based on real STR haplotypes from over four thousand people in y-Haplogroups I1 and I2:


[See the next update, which shows an improved tree layout after some corrections to the code. More data is also included in the next version.]

Terry, November 2011


UPDATE8: y-Haplogroups I1 and I2 Tree Branch Codes

Enter an FTDNA Kit Number, or a Ysearch ID:

See y-Haplogroups I1 and I2 STR Branches for what you can do with that "Branch Code".

Terry, February 2012


UPDATE10: TMRCA of Y-Haplogroups - based on 1000 Genomes Project data (Preliminary Results)

By simply counting the nucleotide differences between Y-chromosomes, the following tree can be constructed from the 1000 Genomes Project data:



With more data, people may be found whose Y-Chromosome branches off at an earlier time in the various haplogroups. Also be aware that only the count of nucleotide differences is used to construct the tree. If an inferred ordering of nucleotide differences were used instead, the tree would exactly represent the order in which branches occurred, but at the cost of not being able to easily compute a timeline. So the above tree uses the count method which, subject to the error bars associated with the branch splits, may place a branch in slightly the wrong order.
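The count method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two short sequences are invented, the function names are hypothetical, and the mutation rate passed in is a placeholder rather than the rate used for the tree above. The time estimate assumes that differences accumulate at a constant rate along both lineages since their common ancestor.

```python
def count_differences(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)

def tmrca_years(differences, sites_compared, rate_per_site_per_year):
    """Rough time to the most recent common ancestor.
    The factor 2 reflects mutations accumulating on both lineages."""
    return differences / (2 * sites_compared * rate_per_site_per_year)

# Invented toy sequences (real comparisons cover millions of sites).
a = "ACGTACGTAC"
b = "ACGTTCGTAA"
diffs = count_differences(a, b)
print(diffs)

# Placeholder numbers: 20 differences over 10 million compared sites,
# with an assumed rate of 1e-9 mutations per site per year.
print(tmrca_years(20, 10_000_000, 1e-9))
```

Because each count is a random quantity, two branch splits whose counts are close can come out in the wrong order, which is the ordering caveat described above.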

Terry, April 2012


