Converging Digital and Extended Specimens: Towards a global specification for data integration
Over the past year, several exciting conversations have explored the possibilities of digital representations of the billions of specimens currently held in the world’s natural history collections. Two concepts, the Digital Specimen proposed by the Distributed System of Scientific Collections (DiSSCo) in Europe and the Extended Specimen emerging from the Biological Collections Network (BCoN) in the United States, are now aligning towards a shared vision that connects all information related to a specimen, creating in effect “digital twins” for the materials held in scientific collections.
These approaches have spurred discussions, most recently at the Biodiversity Information Standards (TDWG) conference, about how the global collections community can work together to build a fully integrated digital system. Although the Digital and Extended Specimen concepts differ slightly, it may be better to identify a single robust global solution than to pursue the two separately.
A new clustering algorithm developed by GBIF, the Global Biodiversity Information Facility, already gives us a taste of the possibilities of fully integrated specimen data. It brings together potentially related records by matching similar entries in individual fields across different datasets, applying standards like TDWG’s Darwin Core and the Biological Collections Access Service (BioCASe).
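To make the idea of field-by-field matching concrete, here is a minimal sketch in Python. It is not GBIF's algorithm, whose rules are not described here. It assumes records are plain dictionaries keyed by Darwin Core term names, that a hypothetical `datasetKey` entry identifies the source dataset, and that agreement on three of four compared fields is enough to flag a candidate link. The sample records are invented for illustration.

```python
from itertools import combinations

# Darwin Core term names; which fields to compare is an assumption of this sketch.
FIELDS = ("scientificName", "recordedBy", "eventDate", "catalogNumber")


def normalize(value):
    """Lowercase and collapse whitespace so trivially different entries compare equal."""
    return " ".join(str(value).lower().split())


def shared_fields(a, b):
    """Return the compared fields on which both records hold the same non-empty value."""
    return [
        f for f in FIELDS
        if a.get(f) and b.get(f) and normalize(a[f]) == normalize(b[f])
    ]


def candidate_links(records, min_shared=3):
    """Pair records from different datasets that agree on at least min_shared fields."""
    links = []
    for a, b in combinations(records, 2):
        if a["datasetKey"] == b["datasetKey"]:
            continue  # only link records across datasets
        matched = shared_fields(a, b)
        if len(matched) >= min_shared:
            links.append((a["id"], b["id"], matched))
    return links


# Invented sample records, for illustration only.
records = [
    {"id": "r1", "datasetKey": "collection-a", "scientificName": "Aus bus",
     "recordedBy": "A. Collector", "eventDate": "2020-01-15", "catalogNumber": "A-001"},
    {"id": "r2", "datasetKey": "collection-b", "scientificName": "aus  bus",
     "recordedBy": "a. collector", "eventDate": "2020-01-15", "catalogNumber": "B-777"},
    {"id": "r3", "datasetKey": "collection-c", "scientificName": "Aus bus",
     "recordedBy": "Someone Else", "eventDate": "2019-06-30", "catalogNumber": "C-042"},
]

for left, right, matched in candidate_links(records):
    print(left, right, matched)
```

A pairwise comparison like this is only practical for small samples. Anything at the scale of the world's collections would need some way to narrow the candidate pairs first, and stricter rules about which field combinations count as evidence.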
A growing number of organizations and individuals have signed a Letter of Intent signaling their interest in collaborating on a global specification that ensures interoperability for the Digital and Extended Specimens. These signatories are now preparing an online community consultation under the umbrella of the alliance for biodiversity knowledge. Mixing virtual meetings and online discussions, it builds on the example of the collection catalogue consultation held in early 2020.
Beginning in February, the consultation will seek to engage the wider community on a handful of topics that have technical, financial, social, governance and professional implications that require broader discussion and consensus. The consultation aims to expand participation in the process, build support for further collaboration, identify key use cases, and develop an initial roadmap for community adoption and implementation.
Topics of the consultation will include:
Digitizing/mobilizing FAIR data for specimens
Extending, enriching and integrating data
Annotating specimens and related data
Crediting and attributing tasks like data and material curation
Analysing/mining specimen data for novel applications
The alliance for biodiversity knowledge invites other parties to participate in this process, to review the draft consultation outline and to sign the Letter of Intent. More details about the virtual consultation will become available after the holiday. If you have ideas, want to ask questions, or would like to co-moderate a topic, please contact us at firstname.lastname@example.org.
Photo of Cortinarius lux-nymphae 2020 Torbjørn Borge via Danish Mycological Society, fungal records database.