Last Monday the Arabidopsis community gathered for the Arabidopsis Information Portal workshop at PAG XXIII. The Arabidopsis Information Portal (AIP) was funded by NSF and BBSRC to move beyond the Arabidopsis genome resource provided by TAIR toward linking the genome to the epigenome, proteome, transcriptome and interactome.
The first talk was a short update from Eva Huala, formerly of TAIR and now of Phoenix Bioinformatics, the nonprofit company she started in order to keep TAIR going. Huala explained that after TAIR’s NSF funding ended, the pay-to-access model was chosen over the alternative pay-to-submit (open access) approach. This means TAIR is focussed on giving subscribers the best possible value for money through high-quality database curation, manual annotation and user experience. Most TAIR subscription fees are paid by libraries, as if it were a journal, but researchers from institutions whose libraries do not pay the fee will be able to access TAIR’s manual annotation after a year’s embargo.
Next, Sean May (NASC, University of Nottingham) explained that NASC is a module of AIP and is currently integrating with the ABRC. He is consulting the community about the development of NASC, so make sure you have your say in the NASC Strategy Survey:
Chia-yi Cheng (JCVI) gave an overview of Araport, the online home of the AIP. Araport federates diverse datasets from other places, for example TAIR, UniProt and BAR, and maintains the Col-0 ‘gold standard’ annotation. It uses JBrowse as the default genome browser and hosts datasets including the CoGe epigenomics resource, which I blogged about last week.
Araport will eventually host a number of analysis modules, or ‘apps’. Currently the only app is the BAR interaction module, a powerful data visualisation tool from the makers of the eFP Browser (which looks rather quaint in comparison!). Apps will be contributed by members of the community, and two new app developers, fresh from a recent AIP developers workshop, presented their apps to delegates at the workshop.
First, Marie Bolger presented some tools that she and the rest of the Usadel group at RWTH Aachen have developed using MapMan (Thimm et al., Plant J 2004). These include the graphical wizard RobiNA and Mercator, a tool to enable comparison of large FASTA files. Uploading these tools to Araport will give them a friendlier user interface and greater global exposure than their current online home provides.
Justin Preece explained that Pankaj Jaiswal’s Plant Ontology (PO) data already have a large user base, but the PO web services see limited community use. Putting them on Araport, and allowing other Araport apps to call them, should make them more likely to be used.
Both Bolger and Preece mentioned how easy the AIP developer platform is to use. There is a Python-based API template on GitHub, and straightforward instructions for the usual initial steps, including registering your namespace and choosing and creating your app scaffold.
Jason Miller (AIP) explained that the apps developed at the workshop, including those from Bolger and Preece, will be launched online soon. The AIP team are finishing up the security settings that will make sure user-developed apps are bug- and malware-free. He also explained that all AIP apps must be shared on GitHub or another online repository – this ensures transparency and encourages upgrades, and also guarantees that if AIP is discontinued the tool will still be accessible through the code shared elsewhere.
There will be another AIP developer workshop this year so if you’re interested in putting your hard-earned analysis tool online, keep your eyes peeled! And if you work with Arabidopsis data, I highly recommend you try and get to grips with Araport’s existing services.