GEO for plant scientists: Sharing data

Published on: February 13, 2014

There is currently no microarray service provider in the UK that uploads your plant science microarray data to GEO on your behalf, but publication requires your data to be shared. The most common request from journals is that it is shared on GEO.

GEO has this information page about data submission. While the high-throughput sequence submission guidelines are still a little complicated, microarray experiments have well-established (and enforced!) minimum information requirements, and the four main microarray chip providers have customized information pages. An email address is provided so that users can send enquiries and ask GEO’s curators for help.

The Affymetrix page is probably the most useful for UK plant sciences. Spreadsheet-based submission is recommended for Affymetrix deposits, so users should submit an Excel metadata worksheet, CEL files, and processed data (for example, for a Tiling Array). The page gives advice on finding GEO-specific information, and there are template and example spreadsheets.

Once submitted, your dataset becomes a GEO accession and can be identified with a unique accession number. The accession number should be used when you or anyone else references or links to your dataset, which seems like an easy means of tracking its usage within the community.

GEO for plant scientists: How to find Arabidopsis microarray data

Published on: February 13, 2014

Submission of gene expression data to the Gene Expression Omnibus is now a requirement of publication in most journals, so it is an extremely valuable resource. It is also extremely big, and full of data that isn’t relevant to your question or task at hand – but it is easy to find the right data using the search bar if you follow a few rules. There are example searches on the GEO homepage.

To find data relating to Arabidopsis thaliana, search: (Arabidopsis thaliana[organism])

To find Arabidopsis microarray data, search: (Arabidopsis thaliana[organism]) AND "expression profiling by array"

The easiest way to find other Arabidopsis datasets is to search: (Arabidopsis thaliana[organism]). On the left hand side of the window, there is a ‘Study type’ section. If you click on ‘More…’ a list of study types pops up from which you can select the data type you are looking for (see screen shot below).

You can add any search term you like to the search bar. For example, you could specify author, publication time, types of tissue or stress… or any combination of these. Just keep adding AND in between each term. For example: (Arabidopsis thaliana[organism]) AND "expression profiling by array" AND leaf
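If you run many similar searches, it can help to assemble the query string in a small script rather than typing it each time. The sketch below only builds the text you would paste into the search bar; the function name build_geo_query and its parameters are made up for illustration and are not part of GEO.

```python
def build_geo_query(organism, study_type=None, keywords=()):
    """Join GEO search terms with AND, in the style used in this post.

    build_geo_query is a hypothetical helper, not a GEO function.
    """
    # The organism term carries the [organism] field tag.
    terms = [f"({organism}[organism])"]
    # The study type is quoted so it is searched as a phrase.
    if study_type:
        terms.append(f'"{study_type}"')
    # Any further free-text terms (tissue, stress, author...) are appended as-is.
    terms.extend(keywords)
    return " AND ".join(terms)


query = build_geo_query(
    "Arabidopsis thaliana",
    study_type="expression profiling by array",
    keywords=["leaf"],
)
print(query)
# prints: (Arabidopsis thaliana[organism]) AND "expression profiling by array" AND leaf
```

The printed string can be pasted straight into the GEO search bar.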

GEO provides an informative guide to downloading original records or curated datasets, individually or in bulk. You can download data directly from Accession Viewer pages (e.g. this one) in SOFT, MINiML or TXT formats. Raw data is also available in TAR. You can also do bulk downloads via GEO’s FTP site. All files are compressed using gzip.
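For bulk downloads, it is useful to know where a series lives on the FTP site. The sketch below builds the supplementary-file directory address for a series accession. It assumes the usual layout, in which series are grouped into folders named after the accession with its last three digits replaced by "nnn"; check this against GEO's own download guide before relying on it. The accession GSE12345 is just a placeholder, and series_suppl_url is an illustrative name.

```python
GEO_FTP_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def series_suppl_url(accession):
    """Return the assumed supplementary-file directory for a GEO series.

    Assumes the layout <root>/GSExxnnn/GSExxxxx/suppl/ described above.
    """
    accession = accession.strip().upper()
    digits = accession[3:]
    if not accession.startswith("GSE") or not digits.isdigit():
        raise ValueError(f"not a GEO series accession: {accession!r}")
    # Replace the last three digits with 'nnn' to get the grouping folder,
    # e.g. GSE12345 -> GSE12nnn, GSE123 -> GSEnnn.
    group = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_FTP_ROOT}/{group}/{accession}/suppl/"


# Placeholder accession, used only to show the shape of the address.
example_url = series_suppl_url("GSE12345")
print(example_url)
```

Files in that directory are gzip-compressed, so they need unpacking after download.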

It’s also possible to access GEO programmatically in order to, for example, quickly retrieve CEL files from Arabidopsis stress experiments. Again, GEO provides a guide to this, although this is probably something better tackled with some pre-existing knowledge of programming.
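As a rough idea of what programmatic access looks like, the sketch below runs a GEO DataSets search through NCBI's E-utilities (the esearch endpoint with db=gds) and returns the matching record IDs. It is a minimal starting point rather than GEO's recommended workflow: the helper names are made up for illustration, and turning the returned IDs into downloaded CEL files takes further steps covered in GEO's guide.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, retmax=20):
    """Build an E-utilities esearch URL for the GEO DataSets database (gds)."""
    params = {"db": "gds", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH}?{urlencode(params)}"


def search_geo(term, retmax=20):
    """Run the search and return a list of record IDs (needs network access)."""
    with urlopen(esearch_url(term, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


term = '(Arabidopsis thaliana[organism]) AND "expression profiling by array" AND stress'
url = esearch_url(term, retmax=5)
print(url)
# To actually query NCBI, call: search_geo(term, retmax=5)
```

Keeping the URL builder separate from the network call makes it easy to check the query before sending it.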
