[API-users] dataset inflated occurrence counts?

Stoner, Dan dstoner at acis.ufl.edu
Mon Feb 13 22:30:10 CET 2017


Regarding this dataset:

http://www.gbif.org/dataset/e0dbf705-cec4-4dce-a152-fc4ebe14674d

The count of occurrences in GBIF appears to be (at least) double what is expected.


Screenshot:

http://imgur.com/a/A9C8M


The occurrence count endpoint in the API displays the same value as the GBIF portal:

http://api.gbif.org/v1/occurrence/count?datasetKey=e0dbf705-cec4-4dce-a152-fc4ebe14674d

That value is 146509 occurrences.
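
For anyone who wants to repeat the check from a script, here is a minimal sketch in Python. It assumes the count endpoint returns a bare integer in the response body, which is what I see in a browser. The function names count_url and fetch_count are my own, not part of any GBIF client.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://api.gbif.org/v1"  # same base URL as the link above
DATASET_KEY = "e0dbf705-cec4-4dce-a152-fc4ebe14674d"


def count_url(dataset_key, base=API_BASE):
    """Build the occurrence count URL for a single dataset."""
    query = urllib.parse.urlencode({"datasetKey": dataset_key})
    return f"{base}/occurrence/count?{query}"


def fetch_count(dataset_key):
    """Fetch the count. Assumes the response body is a bare integer."""
    with urllib.request.urlopen(count_url(dataset_key), timeout=30) as resp:
        return int(json.loads(resp.read().decode("utf-8")))


# Requires network access, so it is left commented out here:
# print(fetch_count(DATASET_KEY))
```

The number returned will reflect whatever GBIF has indexed at the time of the call, so it may no longer match the figure quoted above.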


However, the actual DwC-A seems to have only a bit over 50k rows.
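
A rough way to count the rows, sketched in Python. This assumes the core file inside the archive is named occurrence.txt and is tab-delimited with a single header line; the authoritative file name and delimiter are declared in the archive's meta.xml, so adjust the (hypothetical) core_name and delimiter parameters if they differ.

```python
import csv
import io
import zipfile


def count_core_rows(archive, core_name="occurrence.txt", delimiter="\t"):
    """Count data rows (header excluded) in the core file of a DwC-A.

    core_name and delimiter are assumptions; check meta.xml for the
    values the archive actually declares.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(core_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter=delimiter, quoting=csv.QUOTE_NONE)
            next(reader, None)  # skip the header row
            # Ignore completely blank lines, e.g. a trailing newline.
            return sum(1 for row in reader if row)
```

Calling count_core_rows("path/to/archive.zip") on a downloaded copy of the archive gives a number to compare against the API count.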

The description for that dataset reads: "The vertebrate fossil collection at the Royal Ontario museum consists of over 72,000 catalogued and databased fossil specimens of all vertebrate classes."



Further digging indicates that ROM changed their occurrenceID format.

Here is a sample that appears to be the same catalog number and the same physical specimen, but exists twice in GBIF, once for each occurrenceID format:

http://www.gbif.org/occurrence/1211615291
http://www.gbif.org/occurrence/1234628259
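
To look for more cases like this pair, one option is to group records by catalogNumber and flag any catalog number that shows up under more than one occurrenceID. The sketch below assumes each record is a dict carrying the Darwin Core fields catalogNumber and occurrenceID (for example, rows read from the archive or results from an occurrence search). The function name and the sample values are made up for illustration and are not taken from the ROM data.

```python
from collections import defaultdict


def find_duplicated_specimens(records):
    """Return {catalogNumber: sorted occurrenceIDs} for every catalog
    number seen under more than one distinct occurrenceID."""
    ids_by_catalog = defaultdict(set)
    for rec in records:
        catalog = rec.get("catalogNumber")
        occ_id = rec.get("occurrenceID")
        if catalog and occ_id:  # skip records missing either field
            ids_by_catalog[catalog].add(occ_id)
    return {c: sorted(ids) for c, ids in ids_by_catalog.items() if len(ids) > 1}


# Made-up values for illustration only; not real ROM identifiers.
sample = [
    {"catalogNumber": "00001", "occurrenceID": "old-format-00001"},
    {"catalogNumber": "00001", "occurrenceID": "new-format-00001"},
    {"catalogNumber": "00002", "occurrenceID": "new-format-00002"},
]
duplicates = find_duplicated_specimens(sample)
```

This only catches duplicates where the catalog number is identical in both copies; if the catalog number formatting also changed, the grouping key would need to be normalized first.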



Thanks,

Dan Stoner
iDigBio / ACIS Laboratory
University of Florida

