ArrayWiki: an enabling technology for sharing public microarray data repositories and meta-analyses

Abstract

Background: A survey of microarray databases reveals that most repository contents and data models are heterogeneous (i.e., data obtained from different chip manufacturers), and that the repositories provide only basic biological keywords linking to PubMed. As a result, it is difficult to find datasets using research context or analysis-parameter information beyond a few keywords. For example, to reduce the "curse-of-dimension" problem in microarray analysis, the number of samples is often increased by merging array data from different datasets. Because the data are heterogeneous, it is essential to know chip data parameters such as pre-processing steps (e.g., normalization, artefact removal, etc.) and any previous biological validation of the dataset. However, most microarray repositories do not have this meta-data in the first place, and do not have a mechanism to add or insert it. Thus, there is a critical need to create "intelligent" microarray repositories that (1) enable update of meta-data with the raw array data, and (2) provide standardized archiving protocols to minimize bias from the raw data sources.

Results: To address these problems, we have developed a community-maintained system called ArrayWiki that unites disparate meta-data of microarray meta-experiments from multiple primary sources with four key features. First, ArrayWiki provides a user-friendly knowledge management interface in addition to a programmable interface using standards developed by Wikipedia. Second, ArrayWiki includes automated quality control processes (caCORRECT) and novel visualization methods (BioPNG, Gel Plots), which provide extra information about data quality unavailable in other microarray repositories. Third, it provides a user-curation capability through the familiar Wiki interface. Fourth, ArrayWiki provides users with simple text-based searches across all experiment meta-data, and exposes data to search engine crawlers (Semantic Agents) such as Google to further enhance data discovery.

Conclusions: Microarray data and meta-information in ArrayWiki are distributed and visualized using a novel and compact data storage format, BioPNG. They are also open to the research community for curation, modification, and contribution. By making a small investment of time to learn the syntax and structure common to all sites running MediaWiki software, domain scientists and practitioners can all contribute to making better use of microarray technologies in research and medical practice. ArrayWiki is available at http://www.bio-miblab.org/arraywiki.
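As an illustration of the programmable interface and text-based search mentioned in the Results, the following is a minimal sketch of how a full-text search request could be built against a MediaWiki site using the standard MediaWiki Action API parameters (`action=query`, `list=search`, `srsearch`). The `api.php` path under the ArrayWiki address and the helper name `build_search_url` are assumptions made for this example; the abstract does not state the actual endpoint. The sketch only constructs the URL and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the abstract gives the wiki address only,
# so the "/api.php" suffix is an assumption for illustration.
API_URL = "http://www.bio-miblab.org/arraywiki/api.php"


def build_search_url(term, limit=10, api_url=API_URL):
    """Build a MediaWiki Action API full-text search URL for `term`.

    `build_search_url` is a hypothetical helper, not part of ArrayWiki.
    """
    params = {
        "action": "query",   # generic query module
        "list": "search",    # full-text search submodule
        "srsearch": term,    # the search string
        "srlimit": limit,    # maximum number of hits to return
        "format": "json",    # ask for a JSON response
    }
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("quantile normalization"))
```

A client could fetch the resulting URL with any HTTP library and read the page titles from the JSON response, provided the site exposes the API at that path.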

Springer Science and Business Media LLC

Todd H Stokes, JT Torrance, Henry Li, May D Wang

BMC Bioinformatics

2008
