Manage data about fossil specimens using Symbiota
This guide is intended to complement introductory documentation for Symbiota data providers, as well as the official Symbiota user documentation, Symbiota Docs. Symbiota Docs provides general guidance for working in Symbiota-based data portals and should be referenced for basic functions and workflows. This manual expands on this resource to provide discipline-specific information for fossil collections.
Introduction
There are two ways specimen records are typically entered into a Symbiota portal: 1) as a bulk data import or 2) directly using the Occurrence Editor interface. Additional methods exist, but these are the options most commonly used by collections that actively (“live”) manage their specimen data using Symbiota. In all cases, the end goal is to make your data more easily managed, discovered, and used for research; thus, data providers are strongly encouraged to follow the steps outlined in this how-to guide.
Regardless of data entry method, it is important that all data providers become familiar with the Darwin Core data standard, which forms the basis for the majority of Symbiota’s Data Fields.
A set of example records has been created to guide the capture of fossil specimen data using Symbiota.
Bulk data import
This section outlines actions you can take to prepare and import (ingest) existing digital catalog records from your fossil collection into a Symbiota portal.
1. Prepare your records for import:
- If you maintain existing catalog records to be imported into Symbiota, perform some data cleaning to align your records to Symbiota’s data fields and formatting requirements. The data formatting checklist and example records are intended to inform this process, and OpenRefine is free software that can be used for this purpose. Additional data cleaning can be performed once your records have been imported into Symbiota.
- If you’d like a template to follow, this spreadsheet is preformatted for use with Symbiota. Your spreadsheet must be converted to CSV format (use UTF-8 character encoding) prior to ingestion into the portal, which can be easily accomplished in a program like Microsoft Excel or Google Sheets. An expanded version of this spreadsheet can be provided upon request.
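As a rough illustration of the CSV and UTF-8 requirement, the Python sketch below serializes a few records to UTF-8 encoded CSV. The function name `rows_to_utf8_csv` and the sample locality value are made up for this example; only the catalog number and the basisOfRecord value echo this guide. Treat it as one possible approach rather than a required workflow.

```python
import csv
import io


def rows_to_utf8_csv(rows, fieldnames):
    """Serialize dict rows to CSV and return the result as UTF-8 bytes."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical record; the accented character exercises the encoding.
records = [
    {
        "catalogNumber": "USNMV18414",
        "basisOfRecord": "FossilSpecimen",
        "locality": "Cañon City",
    },
]
csv_bytes = rows_to_utf8_csv(records, ["catalogNumber", "basisOfRecord", "locality"])
```

Writing `csv_bytes` to a file opened in binary mode (`open(path, "wb")`) produces a UTF-8 CSV file without relying on a spreadsheet program's export settings.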
Related resources:
- Krimmel, Erica; Walker, Lindsay J. Poster originally presented at the annual meeting of the Society for the Preservation of Natural History Collections (SPNHC) in Edinburgh, Scotland, UK. Metadata record last updated on 2024-07-16.
- Krimmel, Erica. Materials for a three-part short course introducing participants with different levels of existing technical expertise to core data management concepts in Microsoft Excel/Google Sheets and OpenRefine. Metadata record last updated on 2025-04-09.
2. Import your data into Symbiota:
There are multiple ways to import new records into a Symbiota portal. This action can only be completed by users with Administrator permissions through the Administration Control Panel.
- To import a spreadsheet of specimen occurrence data, use the “Full Text File Import” option.
- To import a spreadsheet of extended specimen data, use the “Extended Data Import” option. See below for more information about how to extend your specimens using Symbiota.
Import one or a very small number of representative records prior to initiating a larger import, especially if you are new to this process. Doing so will allow you to assess how your records will look in the portal before committing to the full dataset. As with bulk data ingestion, only users with Administrator permissions can delete records. Deletion cannot be done in bulk: records can only be deleted one-by-one using the Admin tab interface on the Occurrence Editor, so mistakes in a large import are time-consuming to undo.
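One way to prepare such a trial import is to copy the header and the first few records of your full CSV into a separate file. The Python sketch below does this on CSV text held in memory; the function name `sample_csv` and the two-column sample data are hypothetical.

```python
import csv
import io


def sample_csv(csv_text, n=3):
    """Return CSV text holding the header row plus the first n data rows."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for index, row in enumerate(reader):
        if index > n:  # index 0 is the header row
            break
        writer.writerow(row)
    return out.getvalue()


# Hypothetical full dataset with three records.
full = (
    "catalogNumber,basisOfRecord\n"
    "USNMV8814,FossilSpecimen\n"
    "USNMV6721,FossilSpecimen\n"
    "USNMV22753,FossilSpecimen\n"
)
trial = sample_csv(full, n=2)
```

Using the `csv` module rather than splitting on line breaks keeps quoted fields that contain commas or newlines intact.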
3. Once your records are in Symbiota:
- Moving forward, make edits to your records and complete other management tasks, like managing loans, directly in Symbiota.
- Save your import spreadsheets somewhere safe, but you likely will not need them again once the records are ingested into your Symbiota portal.
- Run your portal’s built-in data cleaning tools to ingest new taxonomy and clean geographic location details.
- Further clean your data using tips in the Symbiota Data Quality Toolkit.
- Georeference your specimen records.
The last two steps can be delegated to users with Editor permissions, such as students or volunteers!
Direct data entry
The content in this section outlines recommendations for direct data entry using Symbiota’s Occurrence Editor interface, which allows users with Administrator and Editor user permissions to add and edit specimen records in Symbiota. As a reminder, the Darwin Core data standard forms the basis for the majority of Symbiota’s Data Fields, and a series of example records has been created to guide data entry.
Data maintenance
Once your records are imported or entered directly into Symbiota, some effort will be required to correct, maintain, or improve the quality of your specimen data. These activities are important to keeping your records easily managed, discoverable, and useful for research. The following recommendations are made to help you begin this process.
Prevent new errors
When training new staff or volunteers on data entry or management, it is highly recommended that you point them toward this Knowledge Hub, but more specifically, have them become familiar with the Symbiota Data Fields and the data formatting checklist.
Mitigate existing errors
Mistakes are likely to happen, even in carefully curated datasets. It is therefore recommended that you routinely assess your data using the Symbiota Data Quality Toolkit. This guide is designed to enable users with either Administrator or Editor permissions to your Collection Profile to “clean” your data–i.e. find and correct errors–using the portal’s built-in features wherever possible.
Crowdsource quality control
Symbiota maintains several built-in tools to facilitate collaborative data entry and data cleaning when enabled for your collection. For example, Administrators of a given collection can enable any portal user who is logged in with an account to suggest edits to your records in the portal. Suggestions must be reviewed by an Administrator before they become public. By default, this option is turned off, but it can be activated through your Administration Control Panel. Review Symbiota Docs for more information about this feature.
Set up a data import profile
If you intend to routinely import data using a standard import template (for example, if you intend to catalog using a spreadsheet method), you can set up a new data import profile based on your cleaned spreadsheet.
Appendices
Data formatting checklist
Before importing existing catalog records into Symbiota, complete this checklist to prepare your data for ingestion. The aim is to maximize interoperability between records in your dataset and other fossil collections, ultimately making your data more discoverable and useful for research. The checklist below has been compiled based on commonly encountered scenarios in fossil collections; however, it is not comprehensive and should only serve as a starting point.
When preparing your data, refer to Symbiota’s Data Import Fields for field definitions, as well as this page to learn what types of data can be imported, e.g., which fields can only contain numbers, dates, textual data, etc.
GBIF (Global Biodiversity Information Facility) maintains an additional list of data requirements and recommendations to improve data quality and completeness.
Checklist Item | Recommendation |
---|---|
Catalog numbers | Every occurrence (catalog record) to be imported must have a unique catalog number. Examples: USNMV18414; ASUCOB0003928 |
Basis of record | Every record corresponding to cataloged fossil material should receive the basisOfRecord value, “FossilSpecimen”. Example: basisOfRecord = FossilSpecimen |
Secondary identifiers | Parse secondary identifiers into a semicolon-delimited list of key: value pairs (i.e., tagName: identifier). Example: otherCatalogNumbers = Old Catalog Number: ASU 3541; Accession Number: WIS-L-001456 |
Delimiters | Use pipes (| ) or semicolons to separate values in lists and be consistent with formatting. Doing so will facilitate parsing of data, if ever needed, in the future. Generally avoid using commas as delimiters. Example: Element = stem | strobilus | root |
Dates | Fields containing dates should use the ISO 8601 format YYYY-MM-DD. An exception to this rule is verbatimEventDate; use this field when dates are incomplete or not ISO formatted. |
Identifications | For specimens identified above the species level, do not include sp., indet., or similar suffixes. Qualifiers like aff. and ? should be recorded in a separate field, identificationQualifier. Verbatim label identifications (e.g. Lepidophyllum [?]) can be captured in identificationRemarks. Leave blank for specimens/specimen lots without identifications. Refer to Symbiota-specific guidance for scientificName vs. sciName. Example: scientificName = Phylledestes vorax Cockerell, 1907 or sciName = Phylledestes vorax |
Localities | If any locality data should be obscured, include a locationSecurity column in your spreadsheet and give records with sensitive locality data a value of “1”. |
Geological time | Refer to Symbiota Docs for important data field definitions and related instructions. See important notice below regarding verbatim values in geological context data. Example: earlyInterval = Late Jurassic and lateInterval = Early Cretaceous |
Geological units | Refer to Symbiota Docs for important data field definitions and see notice below regarding verbatim values in geological context data. Example: formation = Wasatch and member = Niland Tongue |
Contextual descriptions | Refer to the field definitions for Description, Element, and Individual Count for important information about how to format data related to anatomy and the physical nature of cataloged fossil material. Example: see demo record for USNMPAL144776 |
Vocabularies | If your dataset contains anatomical elements that may benefit from the use of a controlled vocabulary, refer to these examples. |
Cataloging multi-specimen lots | When multiple individuals of a single taxon exist in a given lot (i.e. isolated in one physical container), they can be cataloged as a single occurrence record. See below for advice when a lot contains multiple taxa. |
Fields containing different kinds of information | When this is unavoidable, use key:value pairs to concatenate data that must be combined into one field. Example: Occurrence Remarks = ACQUISITION DETAILS: Gift of Arthur Lakes April 1890 | NOTES: Original specimen label misplaced. |
Type specimens | Include a value in typeStatus (ICZN and IAPT values preferred). See below for information about “extending” your specimens that are referenced in literature. |
File format | Save your finalized spreadsheet in comma-separated (CSV) format. Additionally, to ensure any special or accented characters import correctly, always save your data import files using UTF-8 character encoding. |
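A few of the checklist items above lend themselves to automated checks before import. The Python sketch below flags duplicate or missing catalog numbers, basisOfRecord values other than FossilSpecimen, and dates that are not complete YYYY-MM-DD values. The function names are hypothetical, and the use of an `eventDate` column is an assumption about your spreadsheet; the check is intentionally strict and is not a substitute for the portal's own validation.

```python
import re
from collections import Counter
from datetime import date

ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def is_iso_date(value):
    """Return True only for real calendar dates written as YYYY-MM-DD."""
    if not ISO_DATE.match(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def check_records(rows):
    """Return a list of human-readable problems found in a list of dict rows."""
    problems = []
    counts = Counter((row.get("catalogNumber") or "").strip() for row in rows)
    for number, count in counts.items():
        if not number:
            problems.append(f"{count} record(s) missing catalogNumber")
        elif count > 1:
            problems.append(f"duplicate catalogNumber: {number}")
    for row in rows:
        label = (row.get("catalogNumber") or "").strip() or "(no catalog number)"
        if row.get("basisOfRecord") != "FossilSpecimen":
            problems.append(f"{label}: basisOfRecord is not FossilSpecimen")
        event_date = (row.get("eventDate") or "").strip()
        if event_date and not is_iso_date(event_date):
            problems.append(f"{label}: eventDate is not YYYY-MM-DD")
    return problems


# Hypothetical rows, e.g. as produced by csv.DictReader.
rows = [
    {"catalogNumber": "USNMV18414", "basisOfRecord": "FossilSpecimen", "eventDate": "1907-06-15"},
    {"catalogNumber": "USNMV18414", "basisOfRecord": "PreservedSpecimen", "eventDate": "June 1907"},
]
problems = check_records(rows)
```

Dates that fail this check, such as "June 1907", are candidates for verbatimEventDate rather than errors to discard.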
A note on verbatim values in geological context data: Many fossil specimens are accompanied by labels, field notes, and other primary data sources containing values that are no longer accepted (e.g. “Tertiary”), informally used (e.g. “Precambrian”), or indicate uncertainty (e.g., “Upper Mio?”). This information is important and should be recorded; however, it cannot be captured using Symbiota’s earlyInterval and lateInterval fields, which map to a portal’s standardized geological time scale values (by default, these values are based on the ICS Time Scale). In the absence of an appropriate, standards-based term to record these data, this information should be captured in stratigraphicRemarks as a delimited key:value pair (example).
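The key:value convention used for remarks fields can be applied consistently with a small helper. The Python sketch below joins key and value pairs with a pipe delimiter and skips empty values. The function name `join_key_values` and the keys `VERBATIM AGE` and `VERBATIM UNIT` are hypothetical; choose keys that suit your collection and use them consistently.

```python
def join_key_values(pairs, delimiter=" | "):
    """Join (key, value) pairs as 'KEY: value', skipping empty values."""
    parts = []
    for key, value in pairs:
        value = (value or "").strip()
        if value:
            parts.append(f"{key}: {value}")
    return delimiter.join(parts)


# Mirrors the Occurrence Remarks example in the checklist.
occurrence_remarks = join_key_values([
    ("ACQUISITION DETAILS", "Gift of Arthur Lakes April 1890"),
    ("NOTES", "Original specimen label misplaced"),
])

# Hypothetical keys for verbatim geological context.
stratigraphic_remarks = join_key_values([
    ("VERBATIM AGE", "Upper Mio?"),
    ("VERBATIM UNIT", ""),
])
```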
Example records
Fossil specimens present many scenarios that can be challenging to translate into informative catalog records. The following example records are intended to illustrate how to represent fossil specimen data using Symbiota. These records are expected to evolve as best practices for managing and publishing fossil specimen data are formalized.
If you would like guidance on how to treat your fossil specimen data, you are strongly encouraged to ask questions in the PDWG Slack space or bring them to PDWG meetings. You are also welcome to contact the Paleo Data Portal’s Steering Committee for input.
Some of the following records include redacted data (images and locality details); please contact paleoinformatics@gmail.com if you require more information about these records. Additionally, for ease of reference, these examples are organized by collection subdiscipline (e.g., “IP” = Invert Paleo, “VP” = Vert Paleo, etc.); however, many of these examples will apply across collection categories.
Category | Example Record Description | Record |
---|---|---|
IP | one taxon, one individual | USNMMO647519 |
IP | one taxon, multiple individuals | USNMPAL665453 |
IP | slab: multiple taxa, multiple individuals | USNMPAL83927 & USNMPAL188127 |
IP | slab: multiple taxa, multiple individuals (all cataloged) | USNMPAL449450 & assoc. records |
IP+PB | slab: multiple taxa, multiple individuals (some uncataloged) | USNMPAL566311 |
PB | one lot, one individual, part-counterpart pair - one number | USNMP42726 |
PB | one lot, one individual, part-counterpart pair - multiple numbers | USNMP7427 & USNMP7428 |
VP | one taxon, one individual, isolated element | USNMV8814 |
VP | one taxon, whole articulated skeleton - composite | USNMV6721 |
VP | one taxon, whole articulated skeleton - composite | USNMV10304 & assoc. records |
VP | slab: single taxon, multiple pieces of one individual | USNMV22753 |
VP | slab: single taxon, multiple pieces of one individual | USNMV2395 |
VP | slab: bone bed, multiple individuals - multiple numbers | USNMV21375 & USNMPAL606789 |
VP | cast and fossil, one individual | USNMV6720 |
VP | cast and fossil, one individual | USNMPAL215070 |
VP | cast and fossil, one individual | USNMV6527 |
VP | cast of another institution specimen | USNMPAL299545 |
VP | ichnofossil (coprolite) | USNMPAL617525 |
Extending your specimens
Associations can be created between your records in Symbiota and external resources to “extend” your specimen data. Examples include creating links between your records and digitally available literature (e.g. for published specimens) and between your records and other cataloged specimens, both within and external to your Symbiota portal. Creating these associations, or “extended specimens”, can be accomplished in two ways:
1) Users with Editor or Administrator permissions can create these linkages one-by-one using the Linked Resources tab.
2) Users with Administrator permissions can additionally create these linkages in batch by uploading a CSV-formatted spreadsheet using the Extended Data Import tool. This option may contain several fields that are not available in the Linked Resources tab, such as accordingTo.
Example: Type and referred specimens
You can create linkages between occurrence records in your Symbiota portal and digitally available publications using the fields and parameters specified below.
Examples: 1) USNMV4735 (holotype of Ceratosaurus nasicornis); 2) USNMP34765 (specimen of Carya libbeyii that has been referenced in several publications)
- Association Type = Non-occurrence Resource
- Relationship Type = isReferencedBy
Option 1: Create links directly in your portal
To create a link to a digitally available non-occurrence resource external to your portal, such as a publication, use the Occurrence Editor form’s Linked Resources tab as shown below.
When creating associations with external resources, provide a stable URL, such as a DOI (e.g. https://doi.org/10.5962/p.304567) or a permalink, for the resourceURL whenever possible. Otherwise, your links may eventually break.
Option 2: Create links by uploading a spreadsheet
Here is an example of what your spreadsheet (CSV) should look like. You can ingest it into your portal using the Extended Data Import tool.
subjectCatalogNumber | basisOfRecord | objectID | resourceURL |
---|---|---|---|
USNMP34765 | Reference Citation | Knowlton; 1916; Proceedings of the National Museum | https://www.biodiversitylibrary.org/page/7764079 |
USNMV4735 | Reference Citation | Carrano & Choiniere; 2016; Journal of Vertebrate Paleontology | https://doi.org/10.1080/02724634.2015.1054497 |
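If you maintain these links programmatically, a spreadsheet with the four columns shown above can be generated with Python's `csv` module. This is a minimal sketch: the function name `build_links_csv` is hypothetical, the single row is copied from the table above, and it assumes the Extended Data Import tool needs no columns beyond the four shown.

```python
import csv
import io

# Column names taken from the example table above.
FIELDNAMES = ["subjectCatalogNumber", "basisOfRecord", "objectID", "resourceURL"]


def build_links_csv(links):
    """Return CSV text with one row per specimen-to-resource link."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(links)
    return buffer.getvalue()


links = [
    {
        "subjectCatalogNumber": "USNMP34765",
        "basisOfRecord": "Reference Citation",
        "objectID": "Knowlton; 1916; Proceedings of the National Museum",
        "resourceURL": "https://www.biodiversitylibrary.org/page/7764079",
    },
]
links_csv = build_links_csv(links)
```

Save the resulting text with UTF-8 encoding before uploading it.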