Draft: The content on this page has not been finalized. Contributors can mark a page as complete and remove this warning by adding status: published to the front matter in the Markdown source file.

This guide is intended to complement documentation for getting started in the Symbiota Paleo Data Portal, as well as the official Symbiota user documentation, Symbiota Docs. Symbiota Docs provides general guidance for working in Symbiota-based data portals and should be referenced for basic functions and workflows. This manual expands on this resource to provide discipline-specific information for fossil collections.

Introduction

There are two ways specimen records are typically entered into a Symbiota portal: 1) as a bulk data import or 2) directly using the Occurrence Editor interface. Additional methods exist, but these are the options most commonly used by collections that actively (“live”) manage their specimen data using Symbiota. In all cases, the end goal is to make your data more easily managed, discovered, and used for research; thus, data providers are strongly encouraged to follow the steps outlined in this how-to guide.

Regardless of data entry method, it is important that all data providers become familiar with the Darwin Core data standard, which forms the basis for the majority of Symbiota’s Data Fields.

📌 Exemplar catalog records have been created as a reference to guide data entry for fossil specimens in Symbiota portals. It may be helpful to bookmark this page for ease of access.

Bulk data import

Formatting data for import

This section outlines actions you can take to prepare and import existing digital catalog records from your fossil collection into a Symbiota portal.

Steps you can take to ready your records for ingestion

  1. If you maintain existing catalog records to be imported into Symbiota, perform some data cleaning to align your records to Symbiota’s data fields and formatting requirements. The data formatting checklist is intended to inform this process, and OpenRefine is free software that can be used for this purpose. Additional data cleaning can be performed once your records have been imported into Symbiota.
  2. If you’d like a template to follow, this spreadsheet is preformatted for use with Symbiota. Your spreadsheet must be converted to CSV format prior to ingestion into the portal, which can be easily accomplished in a program like Microsoft Excel or Google Sheets.
Using OpenRefine for natural history collections data

Poster originally presented at the annual meeting of the Society for the Preservation of Natural History Collections (SPNHC) in Edinburgh, Scotland, UK.

Data formatting checklist

Before importing existing catalog records into Symbiota, complete this checklist to prepare your data for ingestion. The aim is to maximize interoperability between records in your dataset and other fossil collections, ultimately making your data more discoverable and useful for research. The checklist below has been compiled based on commonly encountered scenarios in fossil collections; however, it is not comprehensive and should only serve as a starting point.

When preparing your data, refer to Symbiota’s Data Import Fields for field definitions, as well as this page to learn what types of data can be imported, e.g., which fields can only contain numbers, dates, textual data, etc.

Checklist Item Recommendation
Catalog Numbers Every occurrence (catalog record) to be imported must have a unique catalog number.
Example: USNMV18414
Basis of Record Every record corresponding to cataloged fossil material should receive the basisOfRecord value, “FossilSpecimen”.
Example: basisOfRecord = FossilSpecimen
Secondary identifiers Parse secondary identifiers into a semicolon-delimited list of key: value pairs (i.e., tagName: identifier).
Example: otherCatalogNumbers = Old Catalog Number: ASU 3541; Accession Number: WIS-L-001456
Delimiters Use pipes (|) or semicolons to separate values in lists and be consistent with formatting. Doing so will facilitate parsing of data, if ever needed, in the future. Avoid using commas as delimiters.
Example: Element = stem | strobilus | root
Dates Fields containing dates should be formatted in ISO format, e.g. YYYY-MM-DD. An exception to this rule is verbatimEventDate; use this field when dates are incomplete or not ISO formatted.
Identifications For specimens identified above the species level, do not include sp., indet., or similar suffixes. Qualifiers like aff. and ? should be recorded in a separate field, identificationQualifier. Verbatim label identifications (e.g. Lepidophyllum [?]) can be captured in identificationRemarks. Leave blank for specimens/specimen lots without identifications. Refer to Symbiota-specific guidance for scientificName vs. sciName.
Example: scientificName=Phylledestes vorax Cockerell, 1907 or sciName=Phylledestes vorax
Localities If any locality data should be obscured, include a locationSecurity column in your spreadsheet and give records with sensitive locality data a value of “1”.
Geological time Refer to Symbiota Docs for important data field definitions and related instructions. See important notice below regarding verbatim values in geological context data.
Example: earlyInterval = Late Jurassic and lateInterval = Early Cretaceous
Geological units Refer to Symbiota Docs for important data field definitions and see notice below regarding verbatim values in geological context data.
Example: formation = Wasatch and member = Niland Tongue
Contextual descriptions Refer to the field definitions for Description, Element, and Individual Count for important information about how to format data related to anatomy and the physical nature of cataloged fossil material.
Example: see demo record for USNMPAL144776
Vocabularies If your dataset contains anatomical elements that may benefit from the use of a controlled vocabulary, refer to these examples.
Cataloging multi-specimen lots When multiple individuals of a single taxon exist in a given lot (i.e. isolated in one physical container), they can be cataloged as a single occurrence record. See below for advice when a lot contains multiple taxa.
Fields containing different kinds of information When this is unavoidable, use key:value pairs to concatenate data that must be combined into one field.
Example: Occurrence Remarks = ACQUISITION DETAILS: Gift of Arthur Lakes April 1890 | NOTES: Original specimen label misplaced.
Type specimens Include a value in typeStatus (ICZN and IAPT values preferred). See below for information about “extending” your specimens that are referenced in literature.
File format Save your finalized spreadsheet in comma-separated (CSV) format. Additionally, to ensure any special or accented characters import correctly, always save your data import files using UTF-8 character encoding.
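
Several of the checklist items above can be checked programmatically before import. The following is a minimal sketch in Python using only the standard library. The function names and sample rows are hypothetical, and the column names (catalogNumber, basisOfRecord, eventDate) are assumed to match the headers in your spreadsheet; adjust them as needed.

```python
import csv
import io
import re
from collections import Counter
from datetime import date

# Hypothetical sample data; replace with the contents of your own CSV file.
SAMPLE_CSV = """catalogNumber,basisOfRecord,eventDate
USNMV18414,FossilSpecimen,1907-06-15
USNMV18414,FossilSpecimen,June 1907
USNMV18415,PreservedSpecimen,1890-04-01
"""

def is_iso_date(value):
    """Return True if value is a complete, valid ISO date (YYYY-MM-DD)."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True

def check_rows(rows):
    """Return (spreadsheet row number, problem) tuples for a few checklist items."""
    problems = []
    counts = Counter(row["catalogNumber"].strip() for row in rows)
    for number, row in enumerate(rows, start=2):  # row 1 is the header
        catalog = row["catalogNumber"].strip()
        if not catalog:
            problems.append((number, "missing catalogNumber"))
        elif counts[catalog] > 1:
            problems.append((number, "duplicate catalogNumber"))
        if row["basisOfRecord"] != "FossilSpecimen":
            problems.append((number, "basisOfRecord is not FossilSpecimen"))
        if row["eventDate"] and not is_iso_date(row["eventDate"]):
            problems.append((number, "eventDate is not YYYY-MM-DD"))
    return problems

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for number, problem in check_rows(rows):
    print(f"row {number}: {problem}")
```

Dates that fail the check are candidates for verbatimEventDate rather than errors in their own right. To run the same checks on a file, open it with encoding="utf-8" and pass the file object to csv.DictReader.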

A note on verbatim values in geological context data: Many fossil specimens are accompanied by labels, field notes, and other primary data sources containing values that are no longer accepted (e.g. “Tertiary”), informally used (e.g. “Precambrian”), or indicate uncertainty (e.g., “Upper Mio?”). This information is important and should be recorded; however, it should not be captured using Symbiota’s earlyInterval and lateInterval fields, which map to a portal’s standardized geological time scale values (by default, these values are based on the ICS Time Scale). In the absence of an appropriate, standard-based term to record these data, this information should be captured in stratigraphicRemarks as a key:value pair.
Example: VERBATIM CHRONOSTRATIGRAPHY: Permian?
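
The routing described in this note can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, the set of standardized intervals is a tiny placeholder for a portal's full geological time scale, and joining multiple key:value pairs with a pipe is assumed from the delimiter recommendation in the checklist.

```python
# Placeholder subset of standardized interval names. A real portal's time
# scale is much larger, so this set is an assumption for demonstration.
STANDARD_INTERVALS = {"Permian", "Late Jurassic", "Early Cretaceous", "Miocene"}

def route_interval(verbatim, remarks=""):
    """Split a label value into (earlyInterval, stratigraphicRemarks).

    Values found in STANDARD_INTERVALS go to earlyInterval. Anything else
    is preserved in stratigraphicRemarks as a key: value pair.
    """
    value = verbatim.strip()
    if value in STANDARD_INTERVALS:
        return value, remarks
    pair = f"VERBATIM CHRONOSTRATIGRAPHY: {value}"
    combined = f"{remarks} | {pair}" if remarks else pair
    return "", combined

for label in ["Permian", "Permian?", "Tertiary", "Upper Mio?"]:
    print(label, "->", route_interval(label))
```

Values routed to stratigraphicRemarks are kept exactly as written on the label, so no information is lost and a curator can still interpret them later.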

A note on commonly encountered scenarios: Suggested solutions to several commonly encountered cataloging scenarios–such as dealing with “part-counterpart” specimens and similar scenarios–are further detailed below.

How to import your data into Symbiota

There are multiple ways to import new records into a Symbiota portal. This action can only be completed by users with Administrator permissions through the Administration Control Panel.

  • To import a spreadsheet of specimen occurrence data, use the “Full Text File Import” option.
  • To import a spreadsheet of extended specimen data, use the “Extended Data Import” option. See below for more information about how to extend your specimens using Symbiota.

Recommendation: Import one or a very small number of representative records prior to initiating a larger import, especially if you are new to this process. Doing so will allow you to assess how your records will look in the portal. Similar to bulk data ingestion, only users with Administrator permissions can delete records, and this action cannot be done in bulk; records can only be deleted one-by-one using the Admin tab interface on the Occurrence Editor.
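
One way to prepare such a test import is to copy the header and the first few records of your full spreadsheet into a separate CSV file. The Python sketch below does this with the standard library; the function name, file names, and demonstration data are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

def write_sample(source, target, limit=5):
    """Copy the header and the first `limit` records of a CSV file to a new file."""
    written = 0
    with open(source, newline="", encoding="utf-8") as src, \
         open(target, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for index, row in enumerate(csv.reader(src)):
            if index > limit:  # index 0 is the header row
                break
            writer.writerow(row)
            written += 1
    return max(written - 1, 0)  # number of data records copied

# Demonstration with a throwaway file; point `full` at your own export instead.
folder = Path(tempfile.mkdtemp())
full = folder / "full_import.csv"
full.write_text(
    "catalogNumber,basisOfRecord\n"
    + "".join(f"DEMO{n},FossilSpecimen\n" for n in range(1, 21)),
    encoding="utf-8",
)
copied = write_sample(full, folder / "sample_import.csv", limit=3)
print(copied, "records written to sample_import.csv")
```

Both files are read and written as UTF-8, in line with the file format recommendation in the checklist.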

Steps you can take immediately after your records are in Symbiota

  • Moving forward, make edits to your records and complete other management tasks, like managing loans, directly in Symbiota.
  • Save your import spreadsheets somewhere safe, but you likely will not need them again once the records are ingested into your Symbiota portal.
  • Run your portal’s built-in data cleaning tools to ingest new taxonomy and clean geographic location details.
  • Further clean your data using tips in the Symbiota Data Quality Toolkit.
  • Georeference your specimen records.

💡 The last two steps can be delegated to users with Editor permissions, such as students or volunteers!

Direct data entry

The content in this section outlines recommendations for direct data entry using Symbiota’s Occurrence Editor interface, which allows users with Administrator or Editor permissions to add and edit specimen records in Symbiota. As a reminder, the Darwin Core data standard forms the basis for the majority of Symbiota’s Data Fields. This guide is designed to help make your data more easily managed, discovered, and used for research; data providers are thus strongly encouraged to conform with the recommendations outlined in this section.

How do I keep my records clean once they’re available in Symbiota?

Prevent new errors

When training new staff or volunteers on data entry or management, it is highly recommended that you point them toward this Knowledge Hub and, more specifically, have them become familiar with the Symbiota Data Fields and the data formatting checklist. The content on this page

Mitigate existing errors

Mistakes are likely to happen, even in carefully curated datasets. It is therefore recommended that you routinely assess your data using the Symbiota Data Quality Toolkit. This guide is designed to enable users with either Administrator or Editor permissions to your Collection Profile to “clean” your data–i.e. find and correct errors–using the portal’s built-in features wherever possible.

Crowdsource quality control

Symbiota maintains several built-in tools to facilitate collaborative data entry and data cleaning when enabled for your collection. For example, Administrators of a given collection can enable any portal user who is logged in with an account to suggest edits to your records in the portal. Suggestions must be reviewed by an Administrator before they become public. By default, this option is turned off, but it can be activated through your Administrator Control Panel. Review Symbiota Docs for more information about this feature.

Set up a data import profile

If you intend to repeatedly import data using a standard import template–for example, if you intend to catalog using a spreadsheet–you can set up a new data import profile based on your cleaned spreadsheet.

Extending your specimens

Once your occurrence records are available in Symbiota, associations can be created between your specimen data in Symbiota and external resources, including digitally available literature and other occurrence records (both in and external to your Symbiota portal). This can be accomplished using two methods. Users with Editor or Administrator permissions can create these linkages one-by-one using the Linked Resources tab; additionally, users with Administrator permissions can create these linkages in batch by uploading a CSV-formatted spreadsheet using the Extended Data Import tool. The latter option contains several fields that are not available in the Linked Resources tab, such as accordingTo.

Tip: When creating associations with external resources, provide a stable URL—like a DOI or a permalink—for the resourceURL whenever possible.
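
If your source data records DOIs in mixed forms (a bare DOI, a "doi:" prefix, or an older dx.doi.org link), they can be normalized to a single resolvable form before import. The Python sketch below is one possible approach; the function name and the pattern used to recognize DOIs are assumptions, and values that do not look like DOIs are returned unchanged.

```python
import re

# Loose pattern for DOI-like strings; an assumption, not an official validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")
PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

def to_stable_url(value):
    """Return an https://doi.org/ URL for DOI-like input, else the input unchanged."""
    text = value.strip()
    candidate = text
    for prefix in PREFIXES:
        if candidate.lower().startswith(prefix):
            candidate = candidate[len(prefix):].strip()
            break
    if DOI_PATTERN.match(candidate):
        return "https://doi.org/" + candidate
    return text

print(to_stable_url("doi:10.1080/02724634.2015.1054497"))
print(to_stable_url("https://www.biodiversitylibrary.org/page/7764079"))
```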

Examples of “Extended Specimens” in Symbiota are available in this dataset.

Type and referred specimens

You can create linkages between occurrence records in your Symbiota portal and digitally available publications using the fields and parameters specified below.

Examples: 1) USNMV4735 (holotype of Ceratosaurus nasicornis); 2) USNM P34765 (specimen of Carya libbeyii that has been referenced in several publications)

  • Association Type = Non-occurrence Resource
  • Relationship Type = isReferencedBy
subjectCatalogNumber | basisOfRecord | accordingTo | resourceURL
USNMP34765 | ReferenceCitation | Knowlton; 1916; Proceedings of the National Museum | https://www.biodiversitylibrary.org/page/7764079
USNMV4735 | ReferenceCitation | Carrano & Choiniere; 2016; Journal of Vertebrate Paleontology | https://doi.org/10.1080/02724634.2015.1054497
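
A spreadsheet like the one above can also be generated with a short script, which is convenient when many citations need to be linked at once. The following Python sketch writes rows using the four column names shown in the table. The function name is hypothetical, and rejecting rows with empty values is a choice made for this sketch, not a documented requirement of the Extended Data Import tool.

```python
import csv
import io

FIELDS = ["subjectCatalogNumber", "basisOfRecord", "accordingTo", "resourceURL"]

references = [
    {
        "subjectCatalogNumber": "USNMP34765",
        "basisOfRecord": "ReferenceCitation",
        "accordingTo": "Knowlton; 1916; Proceedings of the National Museum",
        "resourceURL": "https://www.biodiversitylibrary.org/page/7764079",
    },
]

def build_reference_csv(rows):
    """Return CSV text for the given rows, refusing rows with empty fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        empty = [name for name in FIELDS if not row.get(name, "").strip()]
        if empty:
            raise ValueError(f"row is missing values for: {', '.join(empty)}")
        writer.writerow(row)
    return buffer.getvalue()

print(build_reference_csv(references))
```

To produce a file for upload, write the returned text to disk with encoding="utf-8".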

Part-counterpart specimens and similar scenarios

Scenario A: One institution owns all pieces of a fossil specimen

You can create associations between one or more occurrence records cataloged in your Symbiota portal using the fields and parameters specified below.

Example: ANSP3472 (part) and ANSP3473 (counterpart) were cataloged as separate records within the same Symbiota portal and subsequently linked as associated records.

  • Association Type = Occurrence - Internal (this portal)
  • Relationship Type = part OR counterpart (describe the specimen being linked to)
subjectCatalogNumber | objectCatalogNumber | basisOfRecord
ANSP3472 | ANSP3473 | FossilSpecimen

Think of the “subject” as the “part” and the “object” as the “counterpart” when creating a part-counterpart pairing in Symbiota. Both records must already exist in the portal in order to create this type of relationship.
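
Because both records must already exist in the portal, it can help to compare planned pairings against a list of catalog numbers exported from your collection before building the import spreadsheet. The Python sketch below illustrates the idea; the function name and the catalog numbers ANSP3500 and ANSP3501 are hypothetical.

```python
def plan_links(pairs, portal_catalog_numbers):
    """Split (part, counterpart) pairs into link rows and pairs needing review."""
    known = {number.strip() for number in portal_catalog_numbers}
    rows, review = [], []
    for part, counterpart in pairs:
        part, counterpart = part.strip(), counterpart.strip()
        if part != counterpart and part in known and counterpart in known:
            rows.append({
                "subjectCatalogNumber": part,
                "objectCatalogNumber": counterpart,
                "basisOfRecord": "FossilSpecimen",
            })
        else:
            # At least one record is missing from the portal, or the pair
            # links a record to itself.
            review.append((part, counterpart))
    return rows, review

# Hypothetical list of catalog numbers already present in the portal.
in_portal = ["ANSP3472", "ANSP3473", "ANSP3500"]
pairs = [("ANSP3472", "ANSP3473"), ("ANSP3500", "ANSP3501")]

rows, review = plan_links(pairs, in_portal)
print(rows)
print("needs review:", review)
```

Pairs returned for review can be cataloged first and linked in a later import.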

Alternative method: If you prefer to catalog part-counterpart specimens as a single specimen record, this is also possible, as in this example.

Scenario B: Multiple institutions own different pieces of a fossil specimen

Similarly, associations can be created between specimen occurrences in your Symbiota portal and occurrences in other data portals—for example, if your collection maintains one half of a part-counterpart pair, one or more pieces of an individual cataloged by different institutions, or a specimen-cast pairing. In all of these cases, you can create linkages between your catalog records in Symbiota and records hosted in external portals.

Example: USNM PAL 603860 (cataloged in Symbiota) is a cast of YPM VP 058990 (cataloged in an external database). An association has been created between these records in Symbiota.

  • Association Type = Occurrence - External Link
  • Relationship Type = value varies depending on the association to be created
subjectCatalogNumber | objectID | basisOfRecord | verbatimSciname | resourceURL
USNMPAL603860 | YPMVP058990 | FossilSpecimen | Goleroconus alfi | https://collections.peabody.yale.edu/search/Record/YPM-VP-058990

Think of the “subject” as the piece of the specimen retained in your collection (cataloged in Symbiota) and the “object” as the part retained in an external collection. The verbatimSciname field refers to the identification of the occurrence maintained by the external collection.

Cataloging multi-taxon specimen lots

Content forthcoming

📬 Questions? Data providers are encouraged to contact paleoinformatics@gmail.com for assistance with questions related to importing and maintaining fossil specimen data using Symbiota. Include “Symbiota” in the subject of your email, e.g. “Help with preparing my data for the Symbiota Paleo Data Portal”.

External resources

  • Cretaceous Vertebrates of Madagascar: Symbiota portal specific to a collaborative digitization project between the Denver Museum of Nature and Science and University of Antananarivo. Includes a curated taxonomic thesaurus.
  • Pteridophyte Collections Consortium Taxonomic Dictionary: Relevant for pteridophytes, both extinct and extant. Symbiota portal and associated taxonomic dictionary were originally created for the Pteridophyte Thematic Collections Network.
  • Symbiota Docs: Documentation for users of Symbiota software.
  • Symbiota Digitization Workflows: Lesson developed as part of a conference workshop, Digitization Workflows Using Symbiota Portals, offered by the Symbiota Support Hub with the goal of introducing attendees to the elements of digitization workflows with real-world examples provided by various Symbiota portal communities.