Tuesday, December 30, 2008

Could Y-DNA J1-M267 possibly have an African Origin? Taking a look at the DYS458 .2 Locus

This was the subject of an interesting board discussion that the present author thought would be nice to add to the collection of postings here.

Molecular characterisation and population genetics of the DYS458 .2 allelic variant

G. Ferri, C. Robino, M. Alu, D. Luiselli, S. Tofanelli, L. Caciagli, V. Onofri, S. Pelotti, C. Di Gaetano, F. Crobu, G. Beduschi, C. Capelli


We recently found a number of intermediate DYS458 alleles, indicated as .2. This allelic variant is distributed in several populations, but currently no information is available regarding the molecular structure and the genealogical correlation of chromosomes with this variant. The molecular characterisation of such allele, its worldwide distribution and the correlated evolutionary history are the subject of the present paper. Molecular and genealogical data are suggestive of a single origin for the .2 variant. Phylogeographic analysis points to either a Middle East or East African origin, but additional data is necessary to clarify this point. Our results suggest that the .2 variants is a stable polymorphism and that it could be used for population studies.

Copyright 2008 Elsevier Ireland Ltd. All rights reserved.
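
For readers unfamiliar with the ".2" notation in the abstract: under the usual STR naming convention, the number before the point counts complete repeats, and the digit after it counts the leftover bases of an incomplete repeat. Below is a minimal sketch of that reading. It assumes a four-base repeat unit for DYS458, which is an assumption of this sketch and not something stated in the excerpts; the function name is made up for illustration.

```python
def parse_str_allele(designation, repeat_unit_len=4):
    """Split an STR allele label such as '18.2' into its parts.

    repeat_unit_len=4 is an assumption for DYS458 (a tetranucleotide
    repeat); it is not taken from the quoted paper.
    """
    whole, _, partial = designation.partition(".")
    complete_repeats = int(whole)
    extra_bases = int(partial) if partial else 0
    # A partial repeat must be shorter than one full repeat unit.
    if not 0 <= extra_bases < repeat_unit_len:
        raise ValueError(f"invalid partial repeat in {designation!r}")
    return {
        "complete_repeats": complete_repeats,
        "extra_bases": extra_bases,
        "repeat_region_bp": complete_repeats * repeat_unit_len + extra_bases,
    }


for label in ("18.2", "20.2", "17"):
    print(label, parse_str_allele(label))
```

Read this way, an 18.2 allele is 18 complete repeats plus two extra bases, which is what either a two-base insertion into an 18-repeat allele or a two-base deletion from a 19-repeat allele would produce.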

From the full text:

"The molecular organisation of allelic variant was investigated by sequencing a number of different DYS458 .2 alleles from individuals having different geographic origin (Table 1). These variant alleles show an incomplete repeat caused either by a AA insertion or GA deletion in front the third repeat form the last. Initial SNPs analysis identified these chromosomes as derived at the M267 markers, placing them on the J1* cluster. J1 sub-lineages were additionally tested (J1a–e) and in all cases the .2 chromosomes resulted ancestral at these additional markers. The DYS458 .2 Y chromosomes were then consequently identified as part of the J1 branch (Fig. 1). The shared molecular structure and the inclusion in the same Y chromosome genealogy branch were considered as supportive to a common origin for the .2 allelic variants."

"Network analysis was conducted as described in Section 2. Fig. 1 shows that two main clusters can be identified: one composed by individuals from the Caucasus (having DYS458*20.2 as modal allele) and a more heterogeneous one containing a well defined North-African clade (DYS458*18.2 modal allele) and other minor clades with European or Ethiopian origin. Only the North-African clade shows a star-like structure, signature of an associated demic expansion. Notably, within each meta population no haplotype structure can be identified except for the Caucasus, due to the rigid apportionment of these populations in groups with different patrilineal descent (data not shown). Some controversies exist about J1 coalescent times [8,9]. However, there is general agreement in recognizing a recent phase of expansion to North Africa that well fit our data: the star-like pattern in the network with Galilee and Palestinian Modal Haplotypes [15] as central nodes."

"The .2 variant shows its frequency peaks in Africa (North and East) and Caucasus. Data from the middle East is scanty and we are currently investigating various populations from this region to gather more information on the distribution in this area (data not shown). The presence in Europe is limited and the occurrences in both US and Asia (India and Malaysia) can be considered as the result of a recent introgression of African and/ or European haplotypes. Given the current set of data it is difficult to establish the ultimate place of origin of such mutation. However, the limited genetic diversity shown by either the Caucasus and North Africa suggest a combination of drift and founder effect (followed by rapid population expansion) in these areas."

So, in summation, the authors are essentially basing their preliminary reckoning at this time...

1) on the frequency peak of the DYS458 .2 paragroup in East Africa; North Africa and the Caucasus show peaks as well, but [see point #2 below about North Africa and the Caucasus]...

2) on ruling out a European origin on the one hand, because both the DYS458 .2 microsatellite variants and the haplogroup they belong to [J1-M267] are relatively rare there, and on the other hand ruling out North Africa and the Caucasus, because their microsatellite haplotypes fall into more rigid clusters within their respective paraphyletic units than is the case for the other population samples. In the North African network, the main recurring allele is DYS458*18.2, while that of the Caucasus is reported to be DYS458*20.2. This goes hand in hand with the relatively 'lower diversity' of both the North African and Caucasus paraphyletic units compared with the other samples.

3) on observations 1 & 2 leaving the so-called Middle East as the only alternative origin to East Africa. Since the data available to the authors from that region was limited, this looks like an a priori extrapolation on their part, likely due to the region's ("Middle East") "reputation" as a host to frequency peaks for this haplogroup.
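
To make "modal allele" and "lower diversity" in the points above a little more concrete, here is a rough sketch of how the two quantities could be computed. It uses Nei's unbiased diversity estimator, h = n/(n-1) * (1 - sum of squared frequencies); the excerpts do not say which statistic the authors used, so treat that choice as this blog's assumption. The allele lists are invented for illustration and are not data from the paper.

```python
from collections import Counter


def modal_allele(alleles):
    """Return the most frequent allele label and its share of the sample.

    Ties are broken by first appearance in the list.
    """
    allele, count = Counter(alleles).most_common(1)[0]
    return allele, count / len(alleles)


def haplotype_diversity(haplotypes):
    """Nei's unbiased diversity: n/(n-1) * (1 - sum(p_i^2)).

    Accepts single-locus allele labels or whole haplotypes given as tuples.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    freqs = [c / n for c in Counter(haplotypes).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


# Invented DYS458 calls (hypothetical, NOT from the paper).
tight = ["18.2"] * 8 + ["17.2", "19.2"]            # one dominant allele
loose = ["15.2", "16.2", "17.2", "17.2", "18.2",
         "18.2", "19.2", "19.2", "20.2", "21.2"]   # no dominant allele

for name, sample in (("tight", tight), ("loose", loose)):
    allele, share = modal_allele(sample)
    print(name, allele, round(share, 2), round(haplotype_diversity(sample), 3))
```

A sample dominated by a single modal allele scores low on this measure, which is the kind of pattern one would expect after a founder effect followed by expansion; a sample with many alleles at similar frequencies scores close to 1.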

In the course of the discussion, it was noted:

What we do know, which has already been confirmed numerous times, is that J1-M267 in North Africa represents a more recent introgression from so-called Arabs, but maybe a small one, because founder effect and drift could have elevated its frequency, which would also explain why it's far less diverse in North Africa. - Charlie Bass

To this, it was emphasized [by the present author of the blog]: This is why the authors have ruled out an origin in both the North African and the Caucasus populations, even though the lineages in question are part of a paraphyletic ensemble [each region with its own]. The difference between North Africa and the Caucasus is that the North African paraphyletic unit displays a star-like arrangement when phylogenetically reconstructed, while the Caucasus paraphyletic unit, as it appears from the authors' language, displays discernible within-paragroup monophyletic relationships that stem not necessarily from a single node but from a few discrete nodes. It appears that the paraphyletic units of the other sampled populations were relatively more phylogenetically scattered at the sub-clade level than the aforementioned two, i.e. they showed less discernible within-paragroup monophyletic clustering between the chromosomes.
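
As a rough illustration of what "star-like" means here: in a star-like cluster, most chromosomes either match one central haplotype or sit a single mutational step away from it. The sketch below is a crude proxy only, and is not the network method the authors used. It takes the most frequent haplotype as the centre and counts single-repeat steps under a simple stepwise assumption. The haplotypes are invented three-locus tuples of complete repeat counts, and the function names are made up.

```python
from collections import Counter


def step_distance(h1, h2):
    """Sum of absolute repeat-count differences across loci.

    Assumes a simple stepwise mutation model (one repeat per step).
    """
    return sum(abs(a - b) for a, b in zip(h1, h2))


def star_profile(haplotypes):
    """Return the most frequent haplotype and a tally of how many
    chromosomes lie 0, 1, 2, ... steps away from it."""
    center, _ = Counter(haplotypes).most_common(1)[0]
    profile = Counter(step_distance(center, h) for h in haplotypes)
    return center, dict(sorted(profile.items()))


# Invented haplotypes (hypothetical, NOT from the paper).
star_like = [(18, 14, 23)] * 4 + [(19, 14, 23), (18, 13, 23), (18, 14, 24)]
scattered = [(18, 14, 23), (18, 14, 23), (20, 12, 23), (16, 15, 25), (21, 14, 21)]

for name, sample in (("star_like", star_like), ("scattered", scattered)):
    print(name, star_profile(sample))
```

In the star-like sample every chromosome is at distance 0 or 1 from the centre, while in the scattered one several chromosomes sit three or more steps out, with no obvious single node tying them together.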

While the paraphyletic family of DYS458 .2 chromosomes showed frequency peaks in the North African, East African and Caucasus samples in this study, the three differ in their within-paragroup phylogenetic arrangement. The East African pattern resembles the scattered pattern just mentioned, while the North African and Caucasus arrangements are as described above, each showing relatively more discernible within-paragroup [sub-clade] monophyletic relationships. The point of inquiry now is to see whether any potential "Middle Eastern" paraphyletic family of this haplogroup compares with the East African family, both in terms of frequency peak and in terms of loose within-paragroup monophyletic relationships between the chromosomes.