[API-users] dataset inflated occurrence counts?

Stoner, Dan dstoner at acis.ufl.edu
Mon Feb 13 22:30:10 CET 2017

Regarding this dataset:


The count of occurrences in GBIF appears to be (at least) double what is expected.



The occurrence count endpoint in the API displays the same value as the GBIF portal:


That is, 146509 occurrences.
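
For anyone who wants to reproduce the number, here is a minimal sketch of querying the occurrence count endpoint from Python. The base URL and the `datasetKey` query parameter follow the public GBIF v1 API as I understand it. The helper names `count_url` and `occurrence_count` are my own, and the dataset key is deliberately left as a parameter.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public GBIF API base URL (v1).
GBIF_API = "https://api.gbif.org/v1"


def count_url(dataset_key):
    """Build the occurrence count URL for a single dataset."""
    query = urllib.parse.urlencode({"datasetKey": dataset_key})
    return f"{GBIF_API}/occurrence/count?{query}"


def occurrence_count(dataset_key):
    """Return the number of occurrences GBIF has indexed for the dataset.

    The endpoint is expected to answer with a bare number, which is
    also valid JSON, so json.loads handles it.
    """
    with urllib.request.urlopen(count_url(dataset_key)) as response:
        return int(json.loads(response.read().decode("utf-8")))
```

Calling `occurrence_count(...)` with the dataset's key should return the same figure the portal shows.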

However, the actual DwC-A (Darwin Core Archive) seems to contain only a little over 50k rows.
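
A rough way to check the row count of the archive is sketched below. It assumes the core file is named `occurrence.txt`, is tab-delimited, UTF-8 encoded, and has a single header line. Those are common defaults, but the archive's `meta.xml` is the authority and may say otherwise. The function name `count_core_rows` is hypothetical.

```python
import csv
import io
import zipfile


def count_core_rows(archive_path, core_name="occurrence.txt"):
    """Count data rows (header excluded) in the core file of a DwC-A zip.

    Assumes a tab-delimited, UTF-8 core file with one header line;
    check meta.xml if the archive uses different settings.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(core_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            next(reader, None)  # skip the header line
            return sum(1 for _ in reader)
```

`archive_path` can be a path to the downloaded zip or an open binary file object.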

The description for that dataset reads: "The vertebrate fossil collection at the Royal Ontario museum consists of over 72,000 catalogued and databased fossil specimens of all vertebrate classes".

Further digging indicates that ROM changed their occurrenceID format.

One sample record appears to have the same catalog number and to be the same physical specimen, yet it exists twice in GBIF, once for each occurrenceID format.
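
To see how widespread this is, one could group the indexed records by catalog number and list those that carry more than one occurrenceID. The sketch below assumes each record is a dict with `catalogNumber` and `occurrenceID` keys, as in results from the GBIF occurrence search API. The function name and the identifiers in the usage example are made up for illustration.

```python
from collections import defaultdict


def duplicated_catalog_numbers(records):
    """Map each catalogNumber to its occurrenceIDs when it has more than one.

    Records missing either field are skipped.
    """
    ids_by_catalog = defaultdict(set)
    for record in records:
        catalog = record.get("catalogNumber")
        occurrence_id = record.get("occurrenceID")
        if catalog and occurrence_id:
            ids_by_catalog[catalog].add(occurrence_id)
    return {
        catalog: sorted(ids)
        for catalog, ids in ids_by_catalog.items()
        if len(ids) > 1
    }


# Hypothetical records; the ID formats are invented, not ROM's real ones.
sample = [
    {"catalogNumber": "00001", "occurrenceID": "old-format:00001"},
    {"catalogNumber": "00001", "occurrenceID": "new-format:00001"},
    {"catalogNumber": "00002", "occurrenceID": "new-format:00002"},
]
duplicates = duplicated_catalog_numbers(sample)
```

If the occurrenceID change is the cause, most catalog numbers present under the old format should show up in this mapping with exactly two IDs.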



Dan Stoner
iDigBio / ACIS Laboratory
University of Florida
