AI-Driven Solutions for Optimising Drawer Expansion and Storage Management of Entomology Collections
The digitisation of natural history collections at scale raises a range of logistical and curatorial challenges. One concern is the physical expansion of storage infrastructure. As part of the Distributed System of Scientific Collections UK (Smith et al. 2022), the Natural History Museum London (NHM) will rehouse a significant proportion of its 36 million pinned entomological specimens (~130,000 drawers). Digitisation involves imaging every specimen and attaching a physical, unique, human- and machine-readable identifier (a small card label with a Data Matrix barcode). These barcodes must be readable from above, and their addition during digitisation can increase drawer occupancy if they exceed the existing footprint. Manual estimation of the footprint increase is impractical at this scale, given the range of circumstances associated with different specimens and drawers.
To address this, the NHM has developed an AI-driven approach to estimate specimen drawer expansion and support resource planning. The deep learning pipeline can automatically detect drawer objects, referred to here as classes, including specimens, labels, barcodes, notes, unit trays, and drawers, from high-resolution images. Our dataset includes 11,090 digitised pinned Coleoptera drawers from the Index Lot collection (Natural History Museum 2014), representative of typical entomological collections. The AI pipeline supports both "pre-" and "post-" digitisation analysis by calculating the specimen bounding areas, handling overlaps, and estimating the net footprint change.
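The overlap handling and net footprint change described above can be illustrated with a minimal sketch. It assumes axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max); the function names and the toy coordinates are hypothetical and are not taken from the NHM pipeline, which may work with polygon masks rather than rectangles.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in arbitrary units.
Box = Tuple[float, float, float, float]

def union_area(boxes: List[Box]) -> float:
    """Area covered by at least one box; overlapping regions count once."""
    xs = sorted({x for b in boxes for x in (b[0], b[2])})
    ys = sorted({y for b in boxes for y in (b[1], b[3])})
    total = 0.0
    # Walk the grid formed by all box edges and add each covered cell once.
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            covered = any(
                b[0] <= x0 and x1 <= b[2] and b[1] <= y0 and y1 <= b[3]
                for b in boxes
            )
            if covered:
                total += (x1 - x0) * (y1 - y0)
    return total

def net_footprint_change(pre: List[Box], post: List[Box]) -> float:
    """Footprint after digitisation minus footprint before it."""
    return union_area(post) - union_area(pre)

# Hypothetical specimen and overlapping label before digitisation.
pre = [(0, 0, 10, 10), (5, 5, 15, 15)]
# The same objects plus a hypothetical barcode label that partly overlaps.
post = pre + [(8, 0, 14, 4)]

print(union_area(pre))                  # 175.0
print(net_footprint_change(pre, post))  # 16.0
```

Only the part of the barcode label that extends beyond the existing footprint adds area, which is why the change here is 16 rather than the full 24 units of the label.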
Our object detection models achieved high performance under real-world conditions, with a mean average precision (mAP) of 85.45% across all classes. Barcode detection reached 86.63% mAP, while the standard unit tray detection and unit tray type classification model achieved 99.5% accuracy. An example of the detection outputs is shown in Fig. 1.
Three calculation methods were tested and evaluated for estimating the drawer expansion area, each giving an expansion rate for specimens with Data Matrix barcodes attached.
Rate 1 (Per-Drawer Area via Polygon Masks): calculates expansion rates per drawer by measuring the total area occupied by specimens and their labels, using polygon masks so that overlaps are counted only once, as shown in Fig. 2.
Rate 2 (Total-Based Expansion Rate): calculates expansion rates by averaging the occupancy areas of barcode-attached specimens across the entire dataset.
Rate 3 (Per-Specimen Expansion Rate): averages the expansion rate per specimen using only bounding box areas. Every specimen contributes equally, regardless of its size, to the final value.
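To make the difference between the three rates concrete, here is a minimal sketch on toy data. The abstract does not give the formulas, so several things are assumptions: an expansion rate is taken to be (post - pre) / pre, Rate 2 is read as the ratio of dataset-wide totals, and plain sums stand in for the overlap-aware mask areas of Rate 1. All names and numbers are hypothetical.

```python
from statistics import mean

# Hypothetical toy data: drawer id -> (pre_area, post_area) per specimen, in mm^2.
# Footprints are assumed not to overlap, so sums stand in for mask areas.
drawers = {
    "drawer_A": [(400.0, 400.0), (100.0, 130.0)],
    "drawer_B": [(50.0, 65.0), (50.0, 65.0)],
}

def rate(pre: float, post: float) -> float:
    """Assumed definition of an expansion rate: relative area increase."""
    return (post - pre) / pre

def per_drawer_rates(data):
    """Rate 1 analogue: one rate per drawer from that drawer's total areas."""
    return {
        drawer: rate(sum(p for p, _ in pairs), sum(q for _, q in pairs))
        for drawer, pairs in data.items()
    }

def total_based_rate(data):
    """Rate 2 analogue (one reading): a single rate from dataset-wide totals."""
    pairs = [pair for drawer_pairs in data.values() for pair in drawer_pairs]
    return rate(sum(p for p, _ in pairs), sum(q for _, q in pairs))

def per_specimen_rate(data):
    """Rate 3 analogue: unweighted mean of per-specimen rates."""
    return mean(rate(p, q) for pairs in data.values() for p, q in pairs)

print(per_drawer_rates(drawers))   # roughly 0.06 for drawer_A and 0.30 for drawer_B
print(total_based_rate(drawers))   # roughly 0.10
print(per_specimen_rate(drawers))  # roughly 0.225
```

In this toy case the per-specimen value is the highest because the large, unchanged specimen counts no more than each small one, whereas the total-based value is dominated by the large specimen's area.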
Table 1 compares the three expansion rates. The per-drawer Rate 1 and the per-specimen Rate 3 are reported with confidence intervals. Fig. 3 and Fig. 4 show their distributions: Rate 1 is right-skewed, indicating modest area increases from overlap, while Rate 3 is bimodal, with a small peak near 0 (large specimens unchanged) and another around 0.3 (barcode additions on smaller specimens). The global, total-based Rate 2 is the mean of the ratios and is reported without a confidence interval, providing a macro-level view.
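The abstract does not say how the confidence intervals were obtained. As one possible approach, the sketch below computes a percentile bootstrap interval for the mean of a set of per-specimen rates. The function name, the parameters, and the bimodal toy sample are all hypothetical.

```python
import random
from statistics import mean

def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean (an assumed method)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical bimodal sample: a few large specimens with no change (rate 0)
# and many small specimens whose footprint grows by about 30%.
rates = [0.0] * 5 + [0.3] * 15

lower, upper = bootstrap_ci(rates)
print(f"mean={mean(rates):.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

A single dataset-wide ratio such as Rate 2 yields one number rather than a sample of values, which is consistent with it being reported without an interval.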
In addition to estimating drawer expansion and supporting budget planning, this tool has been designed in a modular fashion, allowing individual components to be used independently at different stages of the digitisation and curation workflow. For example, specific models can be used to flag missing labels or barcodes during digitisation, assist in tracking specimen relocation, and support downstream re-curation decisions. Importantly, the pipeline is designed to be reusable across other entomological collections, with all code to be made openly available alongside a forthcoming publication, promoting scalability within the community.
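As an illustration of using one component on its own, the sketch below flags drawers where fewer barcodes than specimens were detected. The input format (a mapping from drawer id to a list of detected class names) and the function name are assumptions made for this example, not the pipeline's actual interface.

```python
from collections import Counter

def flag_missing_barcodes(detections):
    """Return {drawer_id: shortfall} for drawers with fewer barcodes than specimens.

    `detections` is assumed to map a drawer id to the list of class names
    detected in that drawer's image.
    """
    flagged = {}
    for drawer_id, classes in detections.items():
        counts = Counter(classes)
        shortfall = counts["specimen"] - counts["barcode"]
        if shortfall > 0:
            flagged[drawer_id] = shortfall
    return flagged

# Hypothetical detection output for two drawers.
detections = {
    "drawer_001": ["drawer", "specimen", "barcode", "specimen", "barcode"],
    "drawer_002": ["drawer", "specimen", "specimen", "barcode", "label"],
}

print(flag_missing_barcodes(detections))  # {'drawer_002': 1}
```

A count comparison like this cannot say which specimen lacks its barcode; matching each barcode to a specimen would need a spatial association step that is not sketched here.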
Our findings demonstrate that automated spatial analysis not only improves accuracy and speed in collection management but also lays the groundwork for predictive infrastructure modelling across large-scale digitisation efforts.

