Genome Resequencing for Mutant Identification

As most biologists will be aware, the cost of DNA sequencing has been falling faster than Moore’s law would predict (although, as Neil Hall argued a few years ago, this might not have been the best thing to happen, intellectually at least).

Rather than simply sequencing many genomes for the sake of it, researchers can also use this technology to ‘do science’ that might previously have been prohibitively laborious or expensive. One such area is the identification of novel mutations in plants, especially in Arabidopsis.

Classic approaches to identify the location of an EMS mutation involved mutant identification, backcrossing, selection, rough mapping by PCR or CAPS markers, probably more crossing and then a little guesswork toward the end... before using Sanger sequencing to identify what you hope is the causative mutation. Even with a strong following wind this process could take upwards of a year... many a 1990s PhD thesis was written off the back of mutant identification. In contrast, it is now relatively cheap to resequence the Arabidopsis genome, so a lot of time can be taken out of this process. In addition, resequencing can remove some of the difficulty involved in selecting mutants that have subtle phenotypes, where inaccurate selection of putative mutants would significantly set back the process.

Back in 2011, Anthony Hall’s group at Liverpool University used resequencing in parallel with classic genetics to identify the lesion in the novel early bird1 gene (ebi1), which has a defect in the function of the circadian clock. In this case ebi1, which was generated using EMS, was backcrossed four times to reduce the number of EMS-induced SNPs not associated with the phenotype, and then sequenced alongside the original wildtype plant (from the WS ecotype). The critical part of the protocol was the power of the software they used to detect homozygous SNPs in the ebi1 line. Indeed, the researchers ran into some difficulties due to the high number of SNPs they initially identified. However, when they combined adjusting the stringency of SNP-calling with classical rough mapping, they were left with approximately 30 SNPs to assess. Using a priori knowledge of proposed gene function, and by investigating expression changes in these candidates, they ultimately identified a novel mutant. Although this process was successful, it took some extra time due to the difficulty of mutant selection, optimization of the SNP-calling software and subsequent analysis of gene expression.
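
To make the filtering logic concrete, here is a minimal Python sketch of the kind of narrowing described above: keep SNPs that are homozygous in the mutant, pass a quality threshold, are absent from the wildtype, and fall inside the rough-mapping interval. This is only an illustration, not the software the Hall group used. The Snp record, the candidate_snps function, the quality scale and all of the example positions are made up for the purpose of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """A hypothetical, simplified SNP call (not a real file format)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    quality: float   # caller-reported score; the scale here is an assumption
    homozygous: bool


def candidate_snps(mutant, wildtype, min_quality, interval):
    """Return mutant SNPs that survive all four filters."""
    chrom, start, end = interval
    # Sites already present in the sequenced wildtype are background variation.
    wildtype_sites = {(s.chrom, s.pos, s.alt) for s in wildtype}
    return [
        s for s in mutant
        if s.homozygous
        and s.quality >= min_quality
        and (s.chrom, s.pos, s.alt) not in wildtype_sites
        and s.chrom == chrom
        and start <= s.pos <= end
    ]


# Invented example calls, one per filter.
mutant = [
    Snp("Chr5", 1_200, "G", "A", 60.0, True),   # passes every filter
    Snp("Chr5", 2_500, "C", "T", 12.0, True),   # fails the quality threshold
    Snp("Chr5", 3_100, "G", "A", 55.0, False),  # heterozygous
    Snp("Chr5", 4_000, "T", "C", 70.0, True),   # also present in wildtype
    Snp("Chr2", 1_500, "G", "A", 80.0, True),   # outside the mapped interval
]
wildtype = [Snp("Chr5", 4_000, "T", "C", 65.0, True)]

candidates = candidate_snps(mutant, wildtype, 30.0, ("Chr5", 1_000, 5_000))
relaxed = candidate_snps(mutant, wildtype, 0.0, ("Chr5", 1_000, 5_000))
print([(s.chrom, s.pos) for s in candidates])
print([(s.chrom, s.pos) for s in relaxed])
```

Dropping min_quality to zero lets the low-scoring call back in, which is a toy version of the trade-off the researchers faced when tuning SNP-calling stringency: too loose and the candidate list balloons, too strict and the causative SNP may be thrown away.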

A recent paper from the lab of Lucia Strader at Washington University in St Louis shows how powerful resequencing can be if you are using a robust method of mutant selection. In their case they isolated mutants with a defect in the root growth response to ABA, which is an unequivocal phenotype to score. They backcrossed their initial mutants and selected for ABA resistance in the F2 generation before resequencing these resistant plants. Using this process, the authors report that they narrowed their search to between 3 and 10 candidate genes and that they have subsequently identified novel (unpublished) genes using this method. In addition, as an exemplar of their protocol, they used it to isolate novel alleles of known ABA-resistant mutants.

Schematic for mutant identification using NGS. Reproduced from Taylor and Francis PSB

In parallel they used a protocol similar to that of the Hall lab, in which they resequenced non-backcrossed plants and then selected only SNPs that lay within exons. Using this approach they identified between 100 and 200 homozygous SNPs, a potentially fifty-fold increase compared to their other method. Therefore, when you are working with a strong, robust phenotype, it is probably worth the extra time to obtain a backcrossed population in order to have greater confidence that you have isolated your mutant of interest.
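
The exon-only selection step can be sketched in a few lines of Python. This is an illustrative assumption about how such a filter might look, not the authors' pipeline: the exonic_hits function, the gene names and the coordinates are all invented, and a real analysis would take exon coordinates from a genome annotation.

```python
def exonic_hits(snps, exons):
    """Group SNP positions by the gene whose exon contains them.

    snps:  list of (chrom, pos) tuples
    exons: list of (gene, chrom, start, end) tuples, coordinates inclusive
    """
    hits = {}
    for chrom, pos in snps:
        for gene, exon_chrom, start, end in exons:
            if chrom == exon_chrom and start <= pos <= end:
                hits.setdefault(gene, []).append((chrom, pos))
    return hits


# Invented annotation: two genes, GENE_A with two exons and one intron.
exons = [
    ("GENE_A", "Chr1", 100, 200),
    ("GENE_A", "Chr1", 300, 400),
    ("GENE_B", "Chr3", 1_000, 1_500),
]
# Invented homozygous SNP positions.
snps = [
    ("Chr1", 150),    # in the first exon of GENE_A
    ("Chr1", 250),    # in the GENE_A intron, so discarded
    ("Chr3", 1_200),  # in GENE_B
    ("Chr4", 5_000),  # intergenic, so discarded
]

hits = exonic_hits(snps, exons)
for gene, positions in sorted(hits.items()):
    print(gene, positions)
```

Note that the intronic SNP at Chr1:250 is silently dropped, so a splice-site or other non-coding lesion would never reach the candidate list under a filter like this.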

Importantly, the authors note that one limitation of this protocol is that, by selecting only exonic mutations, they remove the possibility of identifying mutants with splicing or non-coding defects, which may in turn rule out a number of candidate genes.


For me the take-home message from this second study is that if you have a robust phenotype to select for and are confident that your mutation is novel, then ever-improving NGS is now a time- and cost-effective way of identifying mutants.

In fact this technology might inspire a return to the forward genetic screens of the 80s and 90s, with the aim of identifying novel genes involved in well characterised signaling pathways... except that PhD students might now have to characterise 10 novel genes prior to graduation...

