May 4, 2016

Large Paleoeuropean DNA survey

An unprecedented survey of ancient DNA from Paleolithic Europe has just been published:

Qiaomei Fu et al., The genetic history of Ice Age Europe. Nature 2016. Pay per view → LINK [doi:10.1038/nature17993]

The supplemental materials (PDF) are freely accessible, as are the figures and tables (HTML). 

Quick highlights:
  1. Oldest Y-DNA R1b1 (and therefore R1b and R1) ever documented (Villabruna, Veneto, 14 Ka ago, Epigravettian cultural context). Also more C1, related to that of Japan and La Braña!
  2. The oldest mitochondrial DNA H (H7) may be in Gravettian Moravia; likewise the oldest U6 may be not in Iberia or North Africa but in Gravettian Romania.
  3. Very important insights into autosomal DNA: a distinct Paleoeuropean population since the Gravettian, and two different late UP/Epipaleolithic populations.
  4. Still very important gaps, notably SW France (the core of Paleolithic Europe) and most of Iberia. West Asian sequences are also still missing altogether, except for the rather anomalous Caucasus population and whatever may be inferred from Early European Farmers, whose ancestry was mostly (approx. 3/4) West Asian.

A good synthesis of the scope and some of the findings of this study is in fig. 1:


The survey confirms (supp. materials 4) that haplogroup I used to be the most common patrilineage in Paleolithic Europe. But it was not the only one:

The oldest ones (pre-Villabruna, c. 14 Ka BP) were largely C1:
  • Kostenki 14 (Russia, Gravettian): C1b
  • Goyet Q116-1 (Belgium, Aurignacian): C1a
  • Vestonice 16 (Moravia, Gravettian): C1a2
Also in this oldest group (arbitrarily defined as pre-Villabruna), there was some I* or maybe pre-I (some markers are missing in many individuals), including: Pavlov 1 (Gravettian, Moravia), Paglicci 133 (Gravettian, South Italy), Hohle Fels 49 (Magdalenian, Swabia), Goyet Q2 (Magdalenian, Belgium) and Burkhardtshöhle (Magdalenian, Swabia). Notice that its prevalence and clarity as "I proper" increase after the LGM; the Gravettian ones seem to be pre-I rather than true I.

Other lineages in the oldest group are BT* (Vestonice 15), CT* (Cioclovina 1, Kostenki 12, Vestonice 13) and F* (Vestonice 43). Notice that in most cases not all the ideal SNP testing was performed, so it is still possible and even probable, I'd think, that BT* and CT* are actually F*.

In the more recent "post-Villabruna" group:

The revelation of the group is of course Villabruna, which carried R1b1.

There are also two I* (Cuiry-lès-Chaudardes 1 and Berry-au-Bac), one I2 (Rochedane) and one F* (Falkenstein).

I must also mention that previous studies found mostly I2 in Epipaleolithic samples, except for La Braña, which carried C* (maybe some sort of C1 but unconfirmed). R1a1* was found in Karelia as well.

Synthesis: I and R1b1, the most common lineages of Europe west of the Elbe, only show up after the Last Glacial Maximum, at least as far as we know. I probably coalesced in the subcontinent. The issue of where R1b, the most common modern patrilineage of Western Europe, coalesced and how it expanded remains open, but the Villabruna data point defines a terminus ante quem for this haplogroup, which MUST be older than 14,000 years, discarding some of the most outrageous recentist chronologies altogether. The great initial diversity of CT-derived lineages suffered bottlenecks with the LGM and probably also later, pruning most of them (although rare instances of some of those lines, such as F* or C1, are still found among modern Europeans).

Mitochondrial DNA

There is lots of interesting stuff on this issue of the matrilineages, but also some strange issues in the data that do raise eyebrows quite a bit. The full dataset is in the supplemental materials section 2.

However, they do not provide clear data on how the tests were performed, just a generic listing. This is very problematic, notably when they state that El Mirón is U5b, when Hervella (with a clearer methodology) classified her as H just a year ago. Another similar issue is the apparent H7 (H7a1?) in Vestonice 14, which is first classified as "damaged" (based apparently on X-chr contamination; the CI for H7 is 0.9-1) and then listed as "U" in the extended table 1, with no reasoning whatsoever for the change.

Rumor is already around about a mysterious H-hater "black hand" being at play here. I can neither confirm nor reject it, but I do think that the authors should explain themselves more clearly on this most important matter, which is beginning to be more than just annoying, fueling conspiracy theories and what-not.

Another interesting issue is a possible U6 in Muierii (Gravettian Romania, CI 0.88-0.97), labeled as "damaged" again and refurbished as mere amorphous "U". This is a very important issue and is directly related to the presence of mtDNA H in Paleolithic Europe and the origin of these lineages in North Africa.

Northwestern Africa (not counting Cyrenaica) did not experience any sort of Upper Paleolithic (UP) until c. 22 Ka BP, when a new culture of very likely Iberian Solutrean affinity, the Iberomaurusian or Oranian, expanded from Taforalt (Rif, North Morocco). In my understanding this is the most likely origin of mtDNA H (H*, H1, H3, H4 and H7) in North Africa and maybe also of mtDNA V. It should also be related to the bicontinental distribution of mtDNA U6 (in North Africa but also, and quite diversely, in Iberia) and the surely related distribution of Y-DNA E1b-M81.

While it's easy to imagine mtDNA H (and maybe also V) migrating from Europe to North Africa in this context, the issue of U6 origins has so far been less clear: as a U-derived lineage it must ultimately derive from the early UP populations of West Asia, but then again the first UP in the region must have arrived from SW Europe in the Last Glacial Maximum (LGM) period. So something I've been wondering all this time, particularly since the crucial, rare and basal U6c lineage was discovered to exist not just in Morocco but also in Andalusia, is whether U6 actually arrived in NW Africa from Europe and not, as is often assumed, vice-versa.

So you will understand how this issue of properly identifying ancient mtDNA H and U6 lineages is important not only for the understanding of the roots of Europeans but also for those of North Africans. There are interests at play here because many geneticists have made a personal issue of "molecular clock" age estimates (whose actual scientific, empirical, value is often close to zero but are "sold" as "scientific" instead) and also of exaggerating the West Asian Neolithic influence in Europe beyond reason, leading to true quasi-ideological "DNA wars" that are totally out of place. 

Please, let's be serious: there is no room for childish games on these matters, you guys and gals are grown ups with a PhD!

Otherwise there is a lot of U (as usual: U*, U5, U2). Notable is U8c (CI 0.91-1 but declared "damaged" in spite of extremely low X-chr contamination), which, if confirmed, could offer clues about the origins of the rare Italo-Jordanian U8c (and indirectly about Basque U8a and the quite common but surely Neolithic haplogroup K). Also discarded are several samples that initially produced lineages under macro-haplogroup M; however, Goyet Q116-1 was labeled as "pass" with this lineage. So there is Paleoeuropean M, or at least there was once upon a time, this one beyond any doubt.

Autosomal DNA

This last part is most interesting as well. As you can see in figure 1 above, the authors described three Paleoeuropean clusters: blue (aka Vestonice), green (aka El Mirón, although El Mirón itself is actually green-red admixed) and red (aka Villabruna, equivalent to the WHG grouping seen in some recent studies). Black-marked samples are out of any group and the Siberian (Mal'ta) and Caucasus (Satsurblia) clusters are not too relevant here.

Annotated by me: in green approx. dates for reference, in gray approx. reconstruction of the ancestry of late Paleoeuropeans

First of all it is clear that all or most Paleoeuropeans form a unique macro-cluster (orange shaded) to the exclusion of the Mal'ta and Satsurblia clusters and also of Early Neolithic Stuttgart (~3/4 West Asian). This macro-cluster is comparable in affinity to that of Han-Dai-Karitiana, so even the word "race" can be used. Some people have argued that "there was no Europe" back then, because the Bosporus was an isthmus, but from the genetic data it seems clear that Europe was more distinctive then than it is now, after the massive Neolithic admixture event that spanned from Europe to India with West Asian centrality.

Then we see an older "Gravettian" or blue or Vestonice cluster, which is clearly pre-LGM but does not include peripheral Gravettians such as Mal'ta, Kostenki or Goyet Q53-1.

But the most interesting feature is that two different populations existed at the end of the Paleolithic period. The green one (El Mirón) is strictly Magdalenian and vanishes with the Epipaleolithic (at least for this sample, which has major gaps). The red one (Villabruna or WHG) was initially less common in the Magdalenian and spans beyond its cultural borders into Epigravettian Italy too; however, it becomes the only thing around in the Epipaleolithic. This suggests the expansion of a single population in that late period, maybe with the geometric microlithism which in most areas precedes the arrival of the Neolithic and may well have expanded from France.

Looking at the orange range of less obvious affinities, I tried to pinpoint tentative origins for those two populations. The green one relates best to Goyet Q116-1 (Aurignacian), while the red one does to Goyet Q53-1 (Gravettian). This is also somewhat apparent in the PCA and I tried to indicate it with the annotated arrows.

Special thanks to Jean Lohizun for his insights.

Back to work

My apologies to readers for being in "lazy mode" for so long. Actually I got interrupted largely by a request to provide a quality article on Basque, Sardinian and European origins for a soon-to-be-published collective book in the Basque language. This took a lot of my time and energy in late March and early April, so basically I put everything else on hold. In the last few weeks I've been resting indeed, which may be aggravated by a declining health that makes me sleep irregularly and often for much longer than most of you do. Being fed up with Internet information feeds and a quite active political reality also drains my energy away from other endeavors, not to mention paperwork.

In this sense I want to announce that I have recently begun a new multi-purpose blog in Spanish: Bagauda. Most of it is politics, I warn you, but I have also included the unedited raw article for the book I mention in the previous paragraph (prior to translation into Basque and corrections). I'm reasonably sure that those of you who have Spanish as a primary or even secondary language will be interested in having a look (→ here).

Another relevant entry was the announcement of the upcoming congress on Iruña-Veleia to be held on May 7 in Vitoria-Gasteiz. You can still register but hurry up.

I will now proceed to comment in a separate entry on the news of the week, the Fu et al. study of a large array of Paleoeuropean ancient DNA. But, before I get to that, I must mention some interesting studies that I have not been able to get time to even properly read, let alone discuss:

  • K. Voskarides, S. Mazières et al., Y-chromosome phylogeographic analysis of the Greek-Cypriot population reveals elements consistent with Neolithic and Bronze Age settlements. Investigative Genetics 2016. Open access → LINK [doi:10.1186/s13323-016-0032-8]
  • B. Vernot et al., Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 2016. Freely accessible (with registration?) → LINK [doi:10.1126/science.aad9416]
  • Y.Y. Waldman, A. Biddanda et al., The Genetics of Bene Israel from India Reveals Both Substantial Jewish and Indian Ancestry. PLoS ONE 2016. Open access → LINK [doi:10.1371/journal.pone.0152056]

Another intriguing new independent paper that I have not yet read, by Olympus Mons, a regular visitor and commenter on this blog, is:

→ R1b from Sulaweri-Shomu to Bell Beaker, available as PDF or in blog format.

He seems to argue for a Caucasus origin of both the lineage and the Bell Beaker phenomenon. I have no opinion as of yet, because, simply put, I have not been able to read it in full.

Another regular visitor here to have put an independent paper online, also on the issue of R1b origins, is Paul Conroy:

→ Anatole A. Klyosov and Paul M. Conroy, Origins of the Irish, Scottish, Welsh and English R1b-M222 population. Available at Paul's account.

Again I have not yet had the opportunity to read it, so no opinion.

Feel free to use this entry to comment on any of the aforementioned studies or articles or to provide info about stuff I may have missed.