Chemical information presentation in the Crystallography Open Database
The Crystallography Open Database (COD, http://www.crystallography.net/) is the largest curated open-access collection to date of crystal structures with small to medium-sized unit cells [1,2]. Over 11 years of development, COD has accumulated more than a quarter of a million structures from the peer-reviewed press and personal communications. COD has an automated data submission Web site, performs routine automatic quality checks on all incoming structures, and is now recommended as a database for crystallographic deposition by several scientific journals. To facilitate automatic use and discoverability of COD data, and to make the database more useful for chemists, two steps were undertaken. COD has now been supplemented with software and data from the CrystalEye data aggregator. The new software permits extracting chemical data and presenting them as structural formulae, unique moieties, and chemically significant fragments. We have also implemented a search of crystal structures by the structural chemical formulae of the target compounds. The search is performed first among 70 000 hand-curated chemical structure descriptors and can be extended to automatically generated descriptors. To facilitate data curation, a new software platform for data review is being developed. All COD structures will be evaluated against statistical distributions of observed geometrical and chemical properties (bond lengths, angles, dihedrals, planarities). The most statistically unusual structures will be forwarded to a COD reviewer Internet forum, where qualified reviewers will be asked whether they find the evidence provided for a particular structure convincing. In this way, a set of human review indicators (convincing/unconvincing) will be available alongside the match against the bulk of the data (usual/unusual structure). Such indicators would be especially useful for deciding which COD records require special attention and which subsets of COD should be selected for reliable scientific inferences.
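The idea of scoring structures against a statistical distribution of observed geometry can be illustrated with a minimal sketch. It is not the COD review platform's method, which the abstract does not specify. It assumes a median/MAD-based modified z-score as a stand-in statistic; the function names, the threshold of 3.5, the record identifiers, and the bond lengths are all hypothetical.

```python
from statistics import median


def robust_z_scores(values):
    """Return modified z-scores based on the median and the
    median absolute deviation (MAD). Assumed statistic, chosen
    because it is not distorted by the outliers it looks for."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Over half of the values equal the median; this simple
        # sketch then reports no deviation for any value.
        return [0.0 for _ in values]
    # 0.6745 rescales the MAD to be comparable with a standard
    # deviation for normally distributed data.
    return [0.6745 * (v - med) / mad for v in values]


def flag_unusual(bond_lengths, threshold=3.5):
    """Map each record ID to True if its value is statistically
    unusual relative to the other records (hypothetical cutoff)."""
    ids = list(bond_lengths)
    scores = robust_z_scores([bond_lengths[i] for i in ids])
    return {i: abs(s) > threshold for i, s in zip(ids, scores)}


# Made-up C-C bond lengths in angstroms, keyed by made-up record IDs.
sample = {
    "rec-a": 1.53,
    "rec-b": 1.54,
    "rec-c": 1.52,
    "rec-d": 1.54,
    "rec-e": 1.95,
}
flags = flag_unusual(sample)
unusual = [rec for rec, is_unusual in flags.items() if is_unusual]
print(unusual)
```

In a workflow like the one described, records flagged this way would be the candidates forwarded to human reviewers, so the usual/unusual flag and the convincing/unconvincing verdict would remain two separate indicators.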
International Union of Crystallography (IUCr)

