paper in press: “Genomic basis and evolutionary potential for extreme drought adaptation in Arabidopsis thaliana”

This post is linked from the “Behind the Paper” section of the Nature Ecology and Evolution community:


By Moises Exposito-Alonso

Our Nature Ecology and Evolution paper is here:

This study is based on the first experiment of my PhD, which aimed at identifying genetic variants (i.e. old mutations) related to survival under climate change scenarios. I looked at many different individuals of thale cress, Arabidopsis thaliana, and discovered some hundred genetic variants that, when present in a plant, increased its survival under drought conditions. We also found that such variants are more common in Mediterranean and Scandinavian populations, which live at the edge of the species' distribution range and have probably already experienced more extreme environments than those at the center of Europe.

Growing up in a semi-arid area of Spain (Alicante), I was amazed to observe that some plants still miraculously survived months-long droughts; their scientific names became engraved in my brain as part of my undergraduate training in biology. In those undergraduate classes we were taught the general strategies plants use to deal with drought, long described by ecologists, but whose genetic underpinnings were mostly still a mystery.

Three years ago, I embarked on my PhD at the Max Planck Institute of Developmental Biology with the goal of identifying genes that could help plants to survive under future climate change, which will almost certainly see extreme drought situations much more often than today. While I had experience with field experiments, I had not performed any drought experiments. However, two postdocs, François Vasseur and George Wang, were very knowledgeable in this area of ecology, and in image acquisition and processing. Their help, along with that of my two supervisors, Detlef Weigel and Hernán Burbano, was invaluable in getting off to a fast start with my PhD.

To have a good representation of all known genotypes of A. thaliana, I searched the databases, which contain genetic information for one thousand A. thaliana strains, and chose a set of over 200 individuals that were broadly distributed across the geographic range where A. thaliana can be found, including ones from extreme environments. After extensively designing the experiment and the image monitoring, I took the seeds of ‘my’ populations and planted them in the greenhouse.

The results quickly became obvious: after more than two weeks without watering, the soil was completely dry, but some plants looked astonishingly healthy (see Figure 1). “These little things are tougher than people can imagine”, I thought.


Figure 1. Arabidopsis thaliana individuals after several weeks of drought. A few plants were still green, such as those from central Spain (left) and mid to north Sweden (center), but most were completely dry (right; example from south-west Sweden). Notice also how dry and brittle the soil is: it even separated from the walls of the pots. Credit: Moises Exposito-Alonso

As some Swedish populations were, alongside the Spanish ones, among those that coped best with my drought treatment, I became extremely curious about these Scandinavian Arabidopsis. Thanks to Magnus Nordborg, a long-time collaborator of the Weigel lab, I could visit some of these coastal populations in south-east Sweden (see Figure 2). My jaw almost dropped when I saw them growing in the sand.


Figure 2. Natural populations of A. thaliana along the Swedish coast. Sand beaches are a tough environment for plants, as sand does not retain water. Credit: Moises Exposito-Alonso

Based on all previous observations, the next obvious question was: if climate change will increase droughts, as the IPCC and others strongly predict, what is the consequence of populations apparently being more or less adapted to such conditions? Can we predict their fate? The premise was that if different individuals of a species are genetically adapted to different environments, i.e., they vary in their sensitivity to environmental stresses as we saw, they might respond differently to future climates and might even be partially pre-adapted to future environments. That is, they could escape extinction through evolution by natural selection of advantageous genetic variants.


Figure 3. Map of the presence of important genetic variants related to higher drought survival (left) and predictions of areas where populations might be genetically maladapted by 2070 and thus at risk of local extinction (right). Credit: Figure 3 in our paper

I used a powerful machine learning algorithm (Random Forests) to predict the potential geographic distributions of genetic variants (Fig. 3). These models make use of the current match between the distribution of genetic variants, as inferred from the locations in the database, and different climate variables, such as minimum temperature in winter and precipitation in summer. This technique is typically used with the presence or absence of a species in different geographic regions, but I adapted it to the presence and absence of multiple genetic variants, to account for the heterogeneity within a species. Using our models, we transformed maps of projections of the climate in 2070 into predictions of what genetic variants must be present in 2070 for local A. thaliana populations to survive. Doing this we discovered that, because Europe will get drier, plants in Central Europe will need more of these ‘survival’ variants than they currently have.
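To make the idea concrete, here is a minimal sketch in R of fitting a Random Forest to the presence or absence of a single genetic variant as a function of climate. This is not the analysis from the paper: the data are simulated, the variable names (`min_temp_winter`, `precip_summer`) and the rule linking the variant to dry summers are made up for illustration, and it assumes the `randomForest` package is installed.

```r
library(randomForest)

set.seed(1)
n <- 200

# Simulated (made-up) climate at the collection sites of n strains
climate <- data.frame(
  min_temp_winter = rnorm(n, mean = 0, sd = 5),      # degrees C
  precip_summer   = runif(n, min = 10, max = 300)    # mm
)

# Made-up rule: the variant is more common where summers are dry
p_present <- plogis((150 - climate$precip_summer) / 40)
variant <- factor(rbinom(n, size = 1, prob = p_present),
                  levels = c(0, 1), labels = c("absent", "present"))

# Classification forest: variant presence ~ climate variables
rf <- randomForest(x = climate, y = variant, ntree = 500)

# Hypothetical future climate at two locations: one wet, one dry
future <- data.frame(
  min_temp_winter = c(2, 2),
  precip_summer   = c(250, 40)
)

# Predicted probability of each class at the future locations
pred <- predict(rf, newdata = future, type = "prob")
print(pred)
```

For many variants, the same model would be fitted once per variant (for example inside `lapply`) and the predictions stacked into one map per variant.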


What if one day we could use evolutionary theory to reliably tell where to find the genotypes that might save a threatened species? Or what if we could demarcate geographic areas that require immediate action because they are “genetically poor”?



This work was funded by the Max Planck Institute and an ERC grant to Detlef Weigel. I also want to thank my supervisors Detlef Weigel and Hernán Burbano for advice, and my coworkers and friends for their support.



code mnemonics

A compilation of useful programming tips that I sometimes need but always forget about.

Stage all removed files in git

git rm $(git ls-files --deleted)

Roxygen skeleton

To add the roxygen2 skeleton to document a function:
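In RStudio this is, as far as I recall, Code > Insert Roxygen Skeleton with the cursor inside the function. A minimal sketch of what a filled-in skeleton looks like, using a made-up function `survival_rate()` (hypothetical, not from any package):

```r
#' Fraction of plants surviving a drought treatment
#'
#' @param alive Logical vector, TRUE for each plant that survived.
#' @return A single number between 0 and 1.
#' @examples
#' survival_rate(c(TRUE, FALSE, TRUE, TRUE))
#' @export
survival_rate <- function(alive) {
  # The mean of a logical vector is the proportion of TRUE values
  mean(alive)
}
```

The `#'` lines are plain comments to R itself; roxygen2 turns them into the help page when the package documentation is built.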




 Correspondence between reshape2::melt and tidyr::gather, reshape2::dcast and tidyr::spread


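library(reshape2)  # melt, dcast
library(tidyr)     # gather, spread
library(magrittr)  # %>%

mini_iris <- iris[c(1, 51, 101), ]

# melt: wide to long
melted1 <- mini_iris %>% melt(id.vars = "Species", variable.name = "trait", value.name = "dimension")
melted2 <- mini_iris %>% gather(key = 'trait', value = 'dimension', -Species)

# cast: long back to wide
melted1 %>% dcast(Species ~ trait, value.var = "dimension")
melted2 %>% spread(key = 'trait', value = 'dimension')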
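
In more recent versions of tidyr (1.0.0 and later, as far as I know), `gather` and `spread` are superseded by `pivot_longer` and `pivot_wider`. A minimal sketch of the same round trip, assuming such a tidyr version is installed:

```r
library(tidyr)

mini_iris <- iris[c(1, 51, 101), ]

# Wide to long: every column except Species becomes a (trait, dimension) pair
long <- pivot_longer(mini_iris, cols = -Species,
                     names_to = "trait", values_to = "dimension")

# Long back to wide: one column per trait again
wide <- pivot_wider(long, names_from = trait, values_from = dimension)
```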



Insert/overwrite mode shortcut

Not really a command, but switching into overwrite mode by accident is super annoying on a Linux computer with a Mac keyboard.

Shortcut: fn + return

my reference papers

A list of great papers on the disciplines I am most interested in: evolution, ecology, quantitative & population genetics, statistics, bioinformatics.


Is adaptation possible in self-fertilizing species?

Who will adapt to climate change?

What is a mutation accumulation line?

What is the animal model? (LMM)

Under what circumstances can SNPs not be identified in GWA?

What is heritability?

What is missing heritability?

How to do GWA in plants?

What is population stratification correction in GWA?

What is a meta-GWA?

Population structure? F statistics?

What is spatial autocorrelation in the data?

What is an environmental niche model?

What is (really) a selection coefficient?

What is phenotypic selection?

How do linkage, inheritance, and sex interfere with evolution at multiple loci?

What are the footprints of selection on the genome?

How does multiple testing correction work?

What is principal component analysis?

SNP imputation in association studies

What is a hidden Markov model?

What is a support vector machine?

What is the expectation maximization algorithm?

What are DNA sequence motifs?

What are decision trees?

What is dynamic programming?

What are artificial neural networks?

How to map billions of short reads onto genomes?

Some of these papers come from a blog that I came across and could not find again. Thanks to that unknown blog.