
Biomappings: Community curation of mappings between biomedical entities

Many related biomedical resources propose their own identifiers for genes, proteins, chemicals, biological processes, and other entities of biological interest. The integration of data and knowledge fundamentally relies on mappings between equivalent entities across resources. While some resources maintain and distribute mappings to external namespaces (e.g., HGNC provides mappings to Entrez Gene identifiers), there are systematic gaps in the availability of mappings between widely used resources. Automated approaches, including lexical and structural alignment, have been used to find missing mappings, but most neither store important metadata such as mapping confidence, nor retain important curation artifacts such as proposed mappings curated to be (nontrivially) incorrect, nor provide interfaces for curating (reviewing, and confirming or rejecting) predicted mappings. We introduce Biomappings, an open repository that makes expert-curated mappings (currently 5,700) and predicted mappings (currently 28,000) available with their associated metadata in an intuitive tab-delimited format. It is licensed under the permissive CC0 license to encourage community contributions and restriction-free integration back into primary resources. Biomappings provides a web-based interface for curating predictions and adding manually constructed mappings, a Python package for interacting with the mappings, and several workflow examples for generating new mappings. We applied the Biomappings curation workflow to missing mappings between the Medical Subject Headings and several other ontologies, including the Disease Ontology, ChEBI, and the Gene Ontology. We also used Biomappings to curate an exhaustive set of proposed mappings (constructed automatically based on lexical overlap) between entries representing cancer cell lines across three previously unmapped resources: the Cancer Cell Line Encyclopedia, the Experimental Factor Ontology, and Cellosaurus.
All data and code are available at https://github.com/biopragmatics/biomappings.
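
To make the workflow described in the abstract concrete, here is a minimal sketch of lexical-overlap prediction with confidence metadata, a record of rejected mappings, and tab-delimited output. It does not use the Biomappings Python package or its file schema: the vocabularies, identifiers, column names, confidence value, and the functions `normalize`, `predict_mappings`, and `to_tsv` are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical toy vocabularies (identifier -> name). These entries are
# invented for illustration and do not come from any real resource.
SOURCE = {"src:1": "Alpha-7", "src:2": "BETA 12", "src:3": "Gamma-3"}
TARGET = {"tgt:10": "alpha7", "tgt:20": "Beta-12", "tgt:30": "GAMMA3", "tgt:40": "Delta9"}

# Hypothetical pairs that a curator already reviewed and rejected. Keeping
# them means the same incorrect prediction is not proposed again.
REJECTED = {("src:3", "tgt:30")}

# Assumed column layout for this sketch; not the Biomappings schema.
COLUMNS = ["source_id", "source_name", "target_id", "target_name", "confidence"]


def normalize(name: str) -> str:
    """Lowercase a name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def predict_mappings(source, target, rejected, confidence=0.9):
    """Propose source-to-target pairs whose normalized names are identical."""
    # Index target names by their normalized form for constant-time lookup.
    index = {}
    for target_id, target_name in target.items():
        index.setdefault(normalize(target_name), []).append(target_id)

    rows = []
    for source_id, source_name in source.items():
        for target_id in index.get(normalize(source_name), []):
            if (source_id, target_id) in rejected:
                continue  # skip pairs already curated as incorrect
            rows.append({
                "source_id": source_id,
                "source_name": source_name,
                "target_id": target_id,
                "target_name": target[target_id],
                "confidence": confidence,
            })
    return rows


def to_tsv(rows) -> str:
    """Serialize prediction rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_tsv(predict_mappings(SOURCE, TARGET, REJECTED)))
```

Exact matching on normalized names is the simplest form of lexical overlap; a real pipeline would likely also consult synonyms and assign confidence per matching strategy, which this sketch leaves out.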

Related Results

Prediction and Curation of Missing Biomedical Identifier Mappings with Biomappings
Abstract. Motivation: Biomedical identifier resources (ontologies, taxonomies, controlled vocabularies) commonly overlap in scope...
Prediction and curation of missing biomedical identifier mappings with Biomappings
Abstract. Motivation: Biomedical identifier resources (such as ontologies, taxonomies, and controlled vocabularies) commonly overlap in scope and contain equivalent entries under diffe...
Digital Curation and Doctoral Research
This article considers digital curation in doctoral study and the role of the doctoral supervisor and institution in facilitating students’ acquisition of digital curation skills...
Assembly and reasoning over semantic mappings at scale for biomedical data integration
Motivation: Hundreds of resources assign identifiers to biomedical concepts including genes, small molecules, biological processes, diseases, and cell types. Often, these resources...
Digital Curation and its Costs: A Study of Practices and Insights
Introduction - Regarding the concerns that prompted the emergence of digital curation, this study takes as an example the context of the production of large volumes of scientific i...
Evolution of Antimicrobial Resistance in Community vs. Hospital-Acquired Infections
Abstract. Introduction: Hospitals are high-risk environments for infections. Despite the global recognition of these pathogens, few studies compare microorganisms from community-acqu...
The BEL Information Extraction Workflow (BELIEF): updates and evaluation
Ever-increasing scientific literature enhances our understanding on how toxicants impact biological systems. In order to utilize this information in the growing field of systems bi...
