Happier scientific writing: Putting a plug in for Paperpile (+Google Docs)

A couple of years ago, I discovered the Paperpile reference manager and thought it was awesome. Although there were decent managers like Mendeley (which I actually liked) or Endnote (which I didn’t), nothing was really effective. My other concern was that I wanted to use Google Docs, as it would allow me to work with colleagues and to move my writing seamlessly between my Mac and my Linux box (or any other computer).

I shared the document below with my colleagues at the Max Planck Institute for Developmental Biology, and Google Docs + Paperpile ended up being the default writing combo for our department (and I think it is spreading to the whole institute).


[Written in 2016, with minor edits. Some features might have improved since.]

I had to share this awesome discovery with all you guys! If you are too often annoyed with your regular reference manager … keep reading.

So far I have worked with Mendeley (+2 years) and Endnote (6 months), and I wasn’t fully satisfied until, one month ago, I found Paperpile (https://paperpile.com/). It is a commercial reference manager that works online and is fully integrated with Google Drive, Google Docs, Google Scholar, etc.

Below I explain the pros and cons and attach some screenshots showing how it works.


Pros

  • No need to install anything; it runs in the Google Chrome browser as an extension. This means no incompatibility with your operating system! Mac, Windows and Linux can use it.
  • Since it is integrated with the Google search engine, you can add a paper to your reference library (including the pdf) with just one click (see screenshots below).
  • Super-accurate metadata recognition. It gets the details of the paper with basically no errors, because you import it directly from the journal web page or from a reference database and search engine (Mendeley tried to do this but most of the time I had to edit it manually).
  • With your Gmail account (and the free 15 GB of space in Google Drive), you can synchronize all the pdfs of your papers and have them on your desktop (to fill it up you would probably need over 10,000 papers).
  • Online high quality integrated pdf editor. You can highlight, add notes, etc.
  • The annotations are also synchronized. That means that online annotations appear in your pdfs, and edits made to your pdfs (for instance with the Preview software on a Mac) also appear in the online pdf editor.
  • Perfectly renamed pdfs.
  • It is INTEGRATED WITH GOOGLE DOCS! Now, we can write entire papers in Google Docs.
  • [UPDATE 2017] I just learned that you can download a Google Doc (with citations inserted with Paperpile) as a Microsoft Word document, modify the text, and re-upload it to Google Docs without any loss of information (the Paperpile citations are kept even after the two conversions, as long as you do not remove the hyperlinks in the text).
  • Easy import/export Paperpile<->Endnote. This covers not only the reference database: you can also convert a Google Doc with Paperpile citations into a Word document with Endnote citations.
  • You can edit citations collaboratively in Google Docs (that is, two people can add citations to the Google Doc, even if they do not have a Paperpile library; it is done via a Google Scholar search in the Google Doc screen).
    • Until now Google Docs could be used to write papers collaboratively, with comments, suggestions and edits, but the citations were the last step and had to be handled by a single individual. Now several people with their own Paperpile accounts (or maybe a shared lab account) can independently add citations to the Google Doc.
    • If you click on a citation from another person, you get the citation information and, if that person had the PDF, you can also view it.
  • Paperpile has shared folders. This might be particularly useful for labs, where the supervisor can share his/her library and all students can search for important references there.
    • You can also share a single paper you like with somebody in just a few clicks: >click share >type email >click ok (instead of finding the pdf or copying the journal url and attaching it to an email).
  • NO LEARNING CURVE. I think Mendeley and Endnote are more difficult to use (particularly the second). With Paperpile, I basically didn’t have to learn anything; I just uploaded my library file (from Mendeley or Endnote) and started using it. Now I mainly add papers with one click when I see them on the internet.
  • It will soon support Microsoft Word Online.
  • [UPDATE 2017] Until recently, Google Docs lacked line numbering. This is 100% required for submitting to a journal, so we had to download the paper written in Google Docs and import it into MS Word to add the line numbers. Now there is a Google Chrome extension that adds line numbers in Google Docs. Make sure you scroll through the whole document and that you export it with “print->save as pdf”.


Cons

  • I can only think of one big one: Paperpile works exclusively ONLINE.
    • Still, Google Docs does work offline, so when writing a paper on a plane or in the countryside, you just need to leave a comment reminding you to add a reference later.
    • Because your papers are stored as pdfs in your Google Drive (which can be synced to your computer), you can actually keep reading and annotating them offline with another editor (Preview, Adobe …).
  • Commercial software. You get a 1-month free trial, then you have to pay. However, it is only 2.9 $ / month = 32 euros per year. I think that is more affordable than other options (Endnote X7: $249.95). I tried it for 1 month and it was awesome, so I paid for it. Of course, they offer shared licences for universities and institutions (some affiliated institutions are Berkeley, Max Planck, MIT, Cambridge …).




The Paperpile interface. It is just a Google Chrome tab: folders on the left, the papers in the middle, and some statistics about your whole library on the right.

When you click on one of these papers, it opens in another tab with this pdf editor. You can search, annotate in 4 colors, jump between annotations, see the section outline…

All the pdfs are renamed and sorted in your Google Drive automatically, so you can edit outside Paperpile, or search directly in a folder …

There are many ways to import to your library. My favourite is to just click the button in the upper right of your browser when you are on the page of a journal, like this paper in Science. When you click add to your Paperpile, it gets all the metadata and also downloads the pdf for you.

When you are searching in Google Scholar or even just Google, a small “Paperpile” button appears whenever there is a paper or a book. If you click on it, the item is added to your library; if it is already there, the button appears in green, and if you also have the pdf, there is a little blue icon at the side.

[Update 2018] —


You can add papers to Paperpile directly from Google Scholar profiles!

— [end update]

You can also import references from within the Paperpile platform. As I mentioned, different upload formats are available: PDFs, BibTeX, or RIS (one of the formats of your Endnote library) …

… or you can just search reference databases from inside Paperpile, and add papers with the + button.

Add the Google Docs plugin (2 clicks). Adding a citation is as easy as searching in the Paperpile bar on the right, finding the paper and clicking on it.


The citation appears in the format you want, and a bibliography is appended at the end of the document when you click update bibliography. Each citation is actually a link to the paper, so if somebody else (e.g. someone working collaboratively on the paper) clicks on the link, it opens a tab with the full citation and the pdf! The opposite also holds: if somebody reviewing your paper does not have Paperpile, that is OK. If a reference is missing, they can also add it by searching in the right bar. This time the search goes to Google Scholar, and when they pick a reference, it is cited in the same way. The first author can then click on the link and add it to his/her personal library (that is pretty cool … I tested it with a different account).


In the same bar there is a button to export the whole document into a word.docx file and an endnote.RIS file. I tried it and it works.





Preprints & morale of PhD students

The validity and usefulness of preprint servers such as https://biorxiv.org or https://arxiv.org is a topic with lovers and haters within the life sciences community. I should disclose that I am among the lovers, but here I am not going to develop the many arguments in favor, as these have been discussed elsewhere (for example see this, this, this, or the Bonus section at the end of this post). In this brief post I highlight an argument I think has been overlooked:

“preprints positively impact the morale of PhD students”

Competition in research is very high (and it has increased, or so I have been told by multiple senior scientists). While putting a number on competition is difficult, what is certain is that many more new PhDs graduate every year than academic jobs open (https://www.nature.com/news/2011/110420/full/472276a.html; although there are other paths outside academia after a PhD). This means that early career researchers are not only competing for jobs but also competing to get their papers into the limited number of “slots” available at the “fancy scientific journals”. A consequence of this is that publishing can be a very (VERY) slow process, where your manuscripts get rejected multiple times at different journals and you need to submit again and again. My slowest personal experience was with my first paper, which took almost 650 days from submission to publication. I was lucky that my supervisors were very supportive (in many ways) and were happy to post the manuscript on bioRxiv, so I did not feel my work was completely useless and ignored ;-). As a result of these and other idiosyncrasies of the science world, demoralized PhD students are all over the place in academia. In fact, given the high rates of anxiety and depression, some authors say there is a “mental health crisis” in grad school (https://www.nature.com/articles/nbt.4089). There are so many hard-working, intelligent students who struggle for years to get their papers out in journals, even when the papers are well written and the science is solid!

Avoiding this demoralizing experience might require structural changes in the publication process or in our culture of super-production. However, I believe that just accepting preprints as one step in the current publication dogma can soften the consequences of the system for the morale of PhD students. So for the supervisors out there reading: consider posting on bioRxiv (see some benefits in the BONUS section below) and, please, don’t tell your students that “this paper could be published in X months in Y journal”. You know that is very unlikely, and it will put pressure on the naïve, idealistic student. The reality is that, although many scientists are optimistic by nature, timely publication depends on many factors external to the student, such as journal processing times or the mood of editors and reviewers [when has a predicted publication date for a manuscript in preparation ever come true?]. Instead, when a manuscript is becoming a solid and polished piece, say that “it can be bioRxived”. That is totally feasible and mostly depends on the hard work of the student and the supervisor. Putting all the hopes of a student on seeing his/her research published on a journal’s website teaches the wrong lesson about the ultimate aim of science, promotes vanity, and will most likely erode the self-confidence of the student, which also lowers productivity!

Going one step further: some (I think many) universities require published manuscript(s) in order to graduate. Again, because of external factors during publication, this can lead at least to frustration and sometimes to problematic life consequences such as non-renewed working contracts, visa problems, etc. I understand that universities need some kind of accessible scientific output to award a doctorate title. To reduce unnecessary complication, the university could establish that to graduate you need x bioRxived manuscripts that are undergoing the publication process or that have been publicly peer-reviewed (e.g. https://f1000.com, https://peercommunityin.org).

With the hope this will change some minds and help some fellow graduate students,




BONUS

For those interested, I enumerate the most common arguments for preprints, seasoned with my own opinions:

  • Science advances faster. The day after you submit your preprint, it goes online and can be read (it only gets screened automatically and by “scrolling check”).
  • When it is first archived, a unique digital object identifier (DOI) is generated with an associated date of publishing. This makes it possible to clearly establish chronological priority.
  • Preprints typically have a Creative Commons licence. You must give credit (cite the paper), but basically the research is universally available.
  • You get feedback from a large community on social media platforms or directly in the dedicated section of the preprint website. From experience and from chatting with colleagues, private feedback is typically more abundant than public feedback.
  • As the work is publicly available, you can put preprints in your CV and now in many international grant applications preprints “count”.
  • Probably the strongest (rightful) critique of preprints is that research should undergo a peer-review process. For this I refer to a very nice initiative to peer-review preprints: https://evolbiol.peercommunityin.org. Now there is no excuse!
  • And to conclude with a mischievous smile: after all, a preprint and a published article might differ only in that two more people have read the latter.




Evolution Montpellier — symposium & poster

Very excited about the upcoming international evolution meeting in Montpellier (evolutionmontpellier2018.org). This year I am extra happy to host a symposium on “Rapid Evolutionary Responses to Global Change” (https://www.evolutionmontpellier2018.org/symposia) with Dmitri Petrov as invited speaker.

I will also be contributing a poster to another symposium, “Evolution on the edge”, based on our latest bioRxiv preprint: “A map of climate change-driven selection in Arabidopsis thaliana”. Sunday August 19, 5.30pm–7.30pm.




Paper in press at Evolution: “Spatio‐temporal variation in fitness responses to contrasting environments in Arabidopsis thaliana”

Cross-posted press release

Exposito-Alonso et al. (2018) Evolution https://doi.org/10.1111/evo.13508


Global climate change can have catastrophic consequences for many species, but a few might have the ability to survive or even thrive. Plants (and all organisms) have two options for responding to climate change: move to cooler environments (higher latitudes or altitudes), or stay and adapt to warmer temperatures. If a population stays, does it have the genetic variation needed to adapt, or does it have to wait for new mutations, thereby risking extinction in the meantime? This question was examined in a study published today in Evolution.

Biologists Moises Exposito-Alonso at the Max Planck Institute for Developmental Biology, Xavier Picó at the Doñana Biological Station, and collaborators tested the ability of the plant mustard cress (Arabidopsis thaliana) to adapt to hotter conditions using existing genetic variation. They found that offspring from some populations from the northwestern Iberian Peninsula performed well at two more southern, naturally warm sites. This suggests that for A. thaliana, and perhaps other plants, existing genetic variation can allow organisms to withstand new environmental conditions predicted by climate change projections.

The researchers planted 174,000 seeds of plants from the northwestern Iberian Peninsula in two locations in southern Spain, one at low altitude with higher temperature and higher precipitation, and one at high altitude with mild temperatures and low precipitation. In each of nine experiments, the researchers measured the proportion of seeds that sprouted, the time of flowering, and the number of seeds that each plant produced. They found that while many northern plants survived, those that came from warmer source locations flowered fastest, and as a result performed best overall.

“It is a global trend that plants are flowering earlier in spring as a response to climate change. Here we present evidence that flowering earlier has a fitness advantage, and that it has to do with the plants’ genetics,” said Exposito-Alonso.

In rainy years, however, this trend reversed—plants that flowered later were more successful. “This suggests that populations of genetically diverse plants might be favored in the long run if climate variability increases, and that responses to global warming are more complex than we thought,” added Exposito-Alonso.

These findings can help researchers better predict how species will respond to climate change. Genetic diversity within a species may be key in facilitating evolutionary adaptation to changing conditions. Knowing which genotypes do best in each environment can help inform conservation strategies such as assisted gene flow, where adapted seeds are imported to improve the local gene pool and help local populations adapt.


Mustard cress, Arabidopsis thaliana, genotypes flowering at different rates in the low altitude experiment. Credit: Moises Exposito-Alonso, Max Planck Institute for Developmental Biology.

Cover at PLOS Genetics: “Measuring the rate of evolution in hitchhiking Arabidopsis thaliana”

cross-post from PLOS website:





Despite its modest appearance and small size, Arabidopsis thaliana has proven to be a successful colonizer. It is found today over much of the continental US since its first arrival there only a few hundred years ago. The species thrives in wild, rural as well as urban settings, and the photo shows an A. thaliana plant thriving in the cracks of a sidewalk. Our study suggests that about four hundred years ago, A. thaliana seeds from a single plant were unknowingly being carried by Europeans to the Eastern US. Who would have guessed that several centuries later, scientists would take advantage of its North American exile and of pressed plants that botanists have collected over the past couple of centuries to calculate its genomic mutation rate, that is, the speed at which it evolved in the New World?

Photo credit: Moises Exposito-Alonso


Paper in press at PLOS Genetics: “The rate and potential relevance of new mutations in a colonizing plant lineage”


Scientists create ‘Evolutionwatch’ for plants


Scientists are giving plant collections from museums a new lease of life with ‘Evolutionwatch’ – a new way to study evolution in action.

Using a hitchhiking weed, scientists from the Max Planck Institute for Developmental Biology reveal for the first time the mutation rate of a plant growing in the wild.

They compared 100 historic and modern genomes of the tiny plant Arabidopsis to measure precisely the rate at which it evolves in nature. The oldest plant, preserved in a herbarium, was from 1863. At this time, the scientists estimate the species had already more than 200 years in the New World behind it. Two different methods gave the same result, that Arabidopsis had been introduced by Europeans who arrived on the US East Coast around the year 1600. It was almost certainly introduced there by chance, perhaps carried on the boots of Europeans, or mixed in with the seeds of edible plants.

The team focused on samples from North America, because they knew that one particular genetic family of Arabidopsis was very widespread, presenting an opportunity to observe newly-acquired mutations. The comparison of 100 complete genomes revealed 5000 new mutations, some of which could have given the plant an adaptive advantage as it colonised its new environment. The plant moved inland alongside human settlers, gradually diverging from the European ancestor from which it originated. Samples of the species along the same path today reveal increasingly deep and fast-growing roots, perhaps evidence that it adapted during its hitchhiking trip.

“Collections of invasive populations sampled from different times in history enable us to observe the ‘live’ process of evolution in action,” says Moises Exposito-Alonso, first author of the paper published in PLOS Genetics.

They sequenced the genomes of 100 plants collected by botanists between 1863 and 2006. All samples from before 1990 came from museum collections of dried plants. The oldest dried plants, preserved in time 150 years ago, show how much they had evolved by that time. The youngest plants continued to change and evolve. By comparing genomes of plants that had diverged from a common ancestor for different amounts of time, the scientists calculated how many mutations the plant acquires per year.

This in turn enabled the team to deduce that the last common ancestor of the lineage must have lived at the end of the 16th or beginning of the 17th century, coinciding with the time that many people were arriving by boat from Europe, particularly the southern UK, west coast of France and the Netherlands. This was very surprising, since a previous estimate, which had not made use of genetic information from dried herbarium samples, suggested that the colonizing Arabidopsis plants had only arrived in the 19th century.
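One simple way to picture this kind of dating (not necessarily the method used in the paper) is a straight-line fit of mutation counts against collection year: the slope is the number of mutations per year, and the year at which the line reaches zero mutations estimates when the common ancestor lived. The sketch below is only an illustration; the sample values are synthetic and noise-free, and the names `samples` and `fit_line` are made up for this example.

```python
# Toy herbarium dataset: (collection year, mutations relative to the inferred ancestor).
# The counts are synthetic and noise-free (2 mutations per year since 1600), NOT the study's data.
samples = [(1863, 526), (1900, 600), (1950, 700), (1990, 780), (2006, 812)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rate_per_year, intercept = fit_line(samples)
# The fitted line crosses zero mutations at the estimated date of the common ancestor.
ancestor_year = -intercept / rate_per_year
print(f"{rate_per_year:.2f} mutations per year; common ancestor around {ancestor_year:.0f}")
```

With real data the points scatter around the line, so both the rate and the ancestor date come with uncertainty, which is why older herbarium samples that stretch the time axis are so valuable.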

Arabidopsis is not a harmful weed, but the findings help reveal some of the fundamental evolutionary processes behind the ability of invasive species to colonise new environments. In particular, they unlock some of the secrets of the “genetic paradox of invasion”. This occurs when a colonizer with low genetic diversity is nevertheless surprisingly successful in a new environment.

To determine the effect of new mutations, the scientists grew some of the plants in the lab to identify any differences in growth. The fact that such differences were found suggests that some of the mutations that appeared during the past 400 years conferred an advantage during colonisation.

“We were very surprised, since scientific dogma suggests that evolution normally proceeds at a much slower pace,” said Hernán Burbano, one of the supervisors of this study.

“Accurate evolutionary rates for plants and animals will be fundamental to reconstruct their past history and to predict the opportunity of novel advantageous traits to arise. Our results show that herbarium and animal specimens can be the source of a great new branch of genetics in future,” Exposito says.


This press release was written by Zoe Dunford on behalf of Max Planck Institute for Developmental Biology.


Citation: Exposito-Alonso M, Becker C, Schuenemann VJ, Reiter E, Setzer C, Slovak R, et al. (2018) The rate and potential relevance of new mutations in a colonizing plant lineage. PLoS Genet 14(2): e1007155. https://doi.org/10.1371/journal.pgen.1007155

Image Credit: Moises Exposito-Alonso, Claude Becker and colleagues

Funding: This study was supported by the President’s Fund of the Max Planck Society (project “Darwin”) to HAB and by an ERC grant (AdG IMMUNEMESIS) and core funds of the Max Planck Society to DW. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing Interests: The authors have declared that no competing interests exist.


Paper in press at Nature Ecology & Evolution: “Genomic basis and evolutionary potential for extreme drought adaptation in Arabidopsis thaliana”

This is a linked post from “behind the paper” section of the Nature Ecology & Evolution community.


By Moises Exposito-Alonso

Our Nature Ecology & Evolution paper is here: dx.doi.org/10.1038/s41559-017-0423-0.

This study is based on the first experiment of my PhD that aimed at identifying genetic variants (i.e. old mutations) related to survival under climate change scenarios. I looked at many different individuals of thale cress, Arabidopsis thaliana, and discovered some hundred genetic variants that, when present in a plant, increased its survival under drought conditions. We also found that such variants are more common in Mediterranean and Scandinavian populations — populations that, by living at the edge of the distribution range of the species, have probably already experienced more extreme environments than those at the center of Europe.

Growing up in a semi-arid area of Spain (Alicante), I was amused to observe how during months-long droughts there were some plants that still miraculously survived — their scientific names became engraved in my brain as part of my undergraduate training in biology. In those undergraduate classes we were taught the general strategies of how plants deal with drought, long described by ecologists, but whose genetic underpinnings were mostly still a mystery.

Three years ago, I embarked on my PhD at the Max Planck Institute of Developmental Biology with the goal of identifying genes that could help plants to survive under future climate change, which will almost certainly see extreme drought situations much more often than today. While I had experience with field experiments, I had not performed any drought experiments. However, two postdocs, François Vasseur and George Wang, were very knowledgeable in this area of ecology, and in image acquisition and processing. Their help, along with that of my two supervisors, Detlef Weigel and Hernán Burbano, was invaluable in getting off to a fast start with my PhD.

To have a good representation of all known genotypes of A. thaliana, I searched the 1001genomes.org database, which contains genetic information for one thousand A. thaliana strains, and chose a set of over 200 individuals that were broadly distributed across the geographic range where A. thaliana can be found, including ones from extreme environments. After extensive design of the experiment and the image monitoring, I took the seeds of ‘my’ populations and planted them in the greenhouse.

The results quickly became obvious: After over two weeks without watering, all the soil was completely dry, but some plants looked astonishingly healthy (see Figure 1). “These little things are tougher than people can imagine”, I thought.


Figure 1. Arabidopsis thaliana individuals after several weeks of drought. A few plants were still green, such as those from central Spain (left) and mid to north Sweden (center), but most were completely dry (right; example from south-west Sweden). Notice also how dry and brittle the soil is: it even separated from the walls of the pots. Credit: Moises Exposito-Alonso

As some Swedish populations were, besides Spanish individuals, among those that coped best with my drought treatment, I became extremely curious about these Scandinavian Arabidopsis. Thanks to Magnus Nordborg, a long-time collaborator of the Weigel lab, I could visit some of these coastal populations in south-east Sweden (see Figure 2). My jaw almost dropped when I saw them growing in the sand.


Figure 2. Natural populations of A. thaliana along the Swedish coast. Sand beaches are a tough environment for plants, as sand does not retain water. Credit: Moises Exposito-Alonso

Based on all previous observations, the next obvious question was: if climate change will increase droughts, as the IPCC and others strongly predict, what is the consequence of populations apparently being more or less adapted to such conditions? Can we predict their fate? The premise was that if different individuals of a species are genetically adapted to different environments, i.e., they vary in their sensitivity to environmental stresses as we saw, they might respond differently to future climates and might even be partially pre-adapted to future environments. That is, they could escape extinction through evolution by natural selection of advantageous genetic variants.


Figure 3. Map of the presence of important genetic variants related to higher drought survival (left) and predictions of areas where populations might be genetically maladapted by 2070 and thus at risk of dying out locally (right). Credit: Figure 3 in our paper dx.doi.org/10.1038/s41559-017-0423-0.

I used a powerful machine learning algorithm (Random Forests) to predict the potential geographic distributions of genetic variants (Fig. 3). These models make use of the current match between the distribution of genetic variants, as inferred from the locations in the 1001genomes.org database, and different climate variables, such as minimum temperature in winter and precipitation in summer. This technique is typically used for predictions based on the presence or absence of a species in different geographic regions, but I adapted it to the presence and absence of multiple genetic variants, to account for the heterogeneity within a species. Using our models, we transformed maps of climate projections for 2070 into predictions of which genetic variants must be present in 2070 for local A. thaliana populations to survive. Doing this, we discovered that, because Europe will get drier, plants in Central Europe will need more of these ‘survival’ variants than they currently have.
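To make the idea concrete, here is a minimal sketch of the approach for a single variant, written as a toy random forest in plain Python: many one-split trees, each trained on a bootstrap sample of sites and one randomly chosen climate variable, vote on whether the variant is expected at a site. This is not the code or data from the paper. The sites, climate values and function names (`fit_stump`, `fit_forest`, `predict_presence`) are all invented for illustration, and a real analysis would more likely rely on an established library implementation with full-depth trees.

```python
import random

# Synthetic training sites: (minimum winter temperature in °C, summer precipitation in mm).
# In this toy data the variant is present (1) only at sites with dry summers.
sites = [(t, p) for t in (-10, -5, 0, 5) for p in (10, 20, 30, 40, 50, 60, 70, 80)]
present = [1 if p < 45 else 0 for _, p in sites]


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature: (feature, threshold, label_below)."""
    best = None
    values = sorted({row[feature] for row in rows})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        for label_below in (0, 1):
            errors = sum(
                (label_below if row[feature] < threshold else 1 - label_below) != label
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, label_below)
    return None if best is None else best[1:]


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Each stump sees a bootstrap sample of sites and one random climate variable."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        picks = [rng.randrange(len(rows)) for _ in rows]
        feature = rng.randrange(len(rows[0]))
        stump = fit_stump([rows[i] for i in picks], [labels[i] for i in picks], feature)
        if stump is not None:
            forest.append(stump)
    return forest


def predict_presence(forest, site):
    """Fraction of stumps voting that the variant is present at this site."""
    votes = [below if site[f] < thr else 1 - below for f, thr, below in forest]
    return sum(votes) / len(votes)


forest = fit_forest(sites, present)
today = predict_presence(forest, (0, 80))    # hypothetical wet-summer site today
in_2070 = predict_presence(forest, (0, 20))  # same site under a hypothetical drier 2070 projection
print(f"variant expected: today {today:.2f}, 2070 {in_2070:.2f}")
```

In this toy setting, a site where the predicted presence rises sharply under the 2070 climate but where the variant is currently absent would be flagged as potentially maladapted, which is the logic behind the right-hand map in Figure 3.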


What if one day we can use evolutionary theory to reliably tell where to find the genotypes that might save a threatened species? Or what if we could demarcate geographic areas that require immediate action because they are “genetically poor”?



This work was funded by the Max Planck Institute and an ERC grant to Detlef Weigel. I also want to thank my supervisors Detlef Weigel and Hernán Burbano for advice, and my coworkers and friends for their support.