Regarding this dataset:
http://www.gbif.org/dataset/e0dbf705-cec4-4dce-a152-fc4ebe14674d
The count of occurrences in GBIF appears to be (at least) double what is expected.
The occurrence count endpoint in the API displays the same value as the GBIF portal:
http://api.gbif.org/v1/occurrence/count?datasetKey=e0dbf705-cec4-4dce-a152-f...
That is, 146509 occurrences.
However, the actual DwC-A seems to have only a bit over 50k rows.
The description for that dataset reads: "The vertebrate fossil collection at the Royal Ontario museum consists of over 72,000 catalogued and databased fossil specimens of all vertebrate classes".
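For anyone who wants to reproduce the comparison, here is a minimal Python sketch of the two counts. It makes some assumptions: the DwC-A is a zip file whose core file is a tab-delimited `occurrence.txt` with one header row (the real core file name is declared in the archive's `meta.xml` and may differ), and the helper names `count_url` and `count_core_rows` are mine, not part of any GBIF tooling.

```python
import csv
import io
import zipfile
from urllib.parse import urlencode

# Count endpoint as it appears in this report.
API_COUNT = "http://api.gbif.org/v1/occurrence/count"


def count_url(dataset_key):
    """Build the occurrence count URL for a single dataset key."""
    return API_COUNT + "?" + urlencode({"datasetKey": dataset_key})


def count_core_rows(archive, core_name="occurrence.txt"):
    """Count data rows (header excluded) in the archive's core file.

    `archive` is a path or file-like object for the DwC-A zip.
    `core_name` is an assumption; check meta.xml for the actual name.
    """
    with zipfile.ZipFile(archive) as zf:
        with zf.open(core_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            next(reader, None)  # skip the header row
            # Ignore completely blank lines, e.g. a trailing newline.
            return sum(1 for row in reader if row)
```

Fetching the URL returned by `count_url(...)` with the dataset key above should give the GBIF-side number, which can then be compared with `count_core_rows("path/to/archive.zip")` for the published archive.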
Further digging indicates that ROM changed their occurrenceID format.
Here is an example of what appears to be the same specimen (same catalog number) existing twice in GBIF, once under each occurrenceID format:
http://www.gbif.org/occurrence/1211615291 http://www.gbif.org/occurrence/1234628259
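A rough way to check how widespread this is would be to group the dataset's GBIF records by catalog number and flag any catalog number that carries more than one occurrenceID. The sketch below assumes the records have already been downloaded into dicts with `catalogNumber` and `occurrenceID` keys; the function name and the sample values are hypothetical and are not the real identifiers from this dataset.

```python
from collections import defaultdict


def find_duplicated_specimens(records):
    """Map each catalog number to its occurrenceIDs when it has more than one."""
    ids_by_catalog = defaultdict(set)
    for rec in records:
        ids_by_catalog[rec["catalogNumber"]].add(rec["occurrenceID"])
    return {
        catalog: sorted(ids)
        for catalog, ids in ids_by_catalog.items()
        if len(ids) > 1
    }


# Hypothetical sample data, for illustration only.
sample = [
    {"catalogNumber": "00001", "occurrenceID": "old-format-00001"},
    {"catalogNumber": "00001", "occurrenceID": "new-format-00001"},
    {"catalogNumber": "00002", "occurrenceID": "new-format-00002"},
]
duplicates = find_duplicated_specimens(sample)
```

Catalog numbers can legitimately repeat within a collection, so any hits would be candidates to inspect rather than confirmed duplicates.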
Thanks,
Dan Stoner
iDigBio / ACIS Laboratory
University of Florida