Opening up the Arabidopsis Proteome

In 2000, the sequencing of the Arabidopsis thaliana genome fired the starting gun for a new era in plant biology research. The remarkable community-facing tools developed around this model organism, including TAIR, the BAR and the 1001 Genomes project, have greatly benefited researchers around the world.


Numerous studies have investigated the Arabidopsis proteome in various tissues, cells or organelles taken from plants exposed to a range of biotic and abiotic challenges. However, a new publication builds upon this existing research by analysing the linkages between Arabidopsis transcriptomes, proteomes and phosphoproteomes, each of which was analysed in 30 different tissue types.

Mergner, J., Frejno, M., List, M. et al. (2020) Mass-spectrometry-based draft of the Arabidopsis proteome. Nature https://doi.org/10.1038/s41586-020-2094-2


Tissue map and multi-omics dataset

This German-led research team estimated that there are 18,000 translated proteins in Arabidopsis, which are phosphorylated at 43,000 sites, and that their absolute expression ranges over six orders of magnitude.

Importantly, they have provided community access to this entire resource through the ProteomicsDB and ATHENA databases.


Data exploration in ATHENA and ProteomicsDB

Compared to a whole-plant proteomic study that is now 12 years old, this work significantly increases proteome coverage, with protein evidence for 18,210 of the 27,655 protein-coding genes. No doubt this increase is due to technical improvements as well as improved Arabidopsis genome annotation (from TAIR7 to Araport11). They confirmed previous information held within the PhosPhAT database that 47% of the proteome is phosphorylated, although the authors think this is likely an underestimate.
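As a quick sanity check on those numbers, the coverage fraction can be worked out directly. This is only back-of-the-envelope arithmetic using the two gene counts quoted above; the variable names are my own and nothing here comes from the paper's analysis code.

```python
# Counts quoted in the post above
detected_proteins = 18_210      # genes with protein-level evidence
protein_coding_genes = 27_655   # annotated protein-coding genes

# Fraction of protein-coding genes with a detected protein
coverage = detected_proteins / protein_coding_genes

print(f"Proteome coverage: {coverage:.1%}")  # prints "Proteome coverage: 65.8%"
```

So roughly two thirds of the annotated protein-coding genes now have protein-level evidence, leaving about a third still to be detected.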

They identified fewer proteins arising from low-abundance transcripts, indicating that there are further discoveries to be made in the lower protein abundance range. These proteins may be functionally important, as many of them could be involved in key processes such as cell signaling and the regulation of gene expression.


Interestingly, they find that most proteins are not strictly tissue-specific; rather, plant morphological differences are most likely controlled through differences in the relative abundance of those proteins. The storage protein CRA1 and the photosynthetic RuBisCO complex were found to have the expected high abundance in seeds and green tissues respectively, whilst both the transcriptome and proteome from pollen were found to be the most tissue diverse.

Overall, the primary determinant of protein level was the level of transcript, yet many other molecular factors were involved. Perhaps most frustratingly, the authors showed that 48% of the variation in protein abundance was unexplained, highlighting the enormous amount of research that is still required even in this best studied of plants.
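For readers less familiar with what "unexplained variation" means, here is a minimal sketch. It assumes a simple one-predictor linear fit of protein abundance on transcript abundance, which is much simpler than the multi-factor model in the paper, and the numbers are made up purely for illustration. The function name `r_squared` and the example values are my own.

```python
def r_squared(x, y):
    """Squared Pearson correlation: the fraction of variance in y
    explained by a straight-line fit on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Made-up log10 abundances for six imaginary genes (NOT data from the paper)
log_transcript = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
log_protein = [3.1, 4.4, 3.9, 5.6, 5.0, 6.3]

explained = r_squared(log_transcript, log_protein)
unexplained = 1 - explained

print(f"Explained: {explained:.0%}, unexplained: {unexplained:.0%}")
```

In the paper's case the unexplained share was 48% even after accounting for transcript level and the other factors the authors considered.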

By investigating the abundance of proteins arising from paralogous genes, they showed that one copy was often more abundant than the others and might therefore be a better target for future mutant studies, although the paper says nothing about any compensatory effects that might occur in mutant plants.


Examining the Phosphoproteome

The amount of phosphorylation within individual proteins was extremely variable. At one end of the spectrum, the LEA protein family was phosphorylated on every available serine, threonine or tyrosine residue. The authors speculate that this level of phosphorylation in these seed proteins may be important for regulating conformational state or phase transition. The authors are careful to state that phosphorylation of a particular amino acid is not necessarily linked to function. However, they did generate phosphomimetic mutants in the abscisic acid (ABA) receptor RCAR10, demonstrating a previously uncharacterised function for these amino acids during ABA signaling.


Ascribing function to protein phosphorylation

Overall, this research provides evidence, if it were needed, that the co-analysis of multiple 'omic databases can provide experimental insights that will guide future research. The authors hope that their online database and analysis resources will be useful tools for the community. Of course, there is plenty to do and, as one eminent plant scientist suggested, this is a nice starting point toward a full 1001 Proteomes project 🙂



 © 2024 - Weeding the Gems