Tim,
Thanks very much! Makes sense.
Scott
On Thu, Apr 9, 2015 at 12:21 AM Tim Robertson trobertson@gbif.org wrote:
Hi Scott,
2 things can cause this:
- Eventual consistency
The count service is an insanely high throughput service, while search is lower throughput. They have different backends, and a messaging bus keeps them in sync. Because of this there is often a short period during indexing runs (up to 1 hr, but normally < 5 mins) where they can differ. Issues can also creep in that make the two drift apart, so we occasionally rebuild the count service. The search service is always the correct one.
- Geospatial issues
The isGeoreferenced parameter only counts records with coordinates and no known geospatial issues - i.e. records we'd consider suitable for using the coordinates.
In this case it is the second cause, geospatial issues, that provides the difference, and the search request should also be using the &hasGeospatialIssue parameter.
http://api.gbif.org/v1/occurrence/search?taxonKey=7264332&hasCoordinate=...
http://api.gbif.org/v1/occurrence/count?taxonKey=7264332&isGeoreferenced...
Both report 4515 records.
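
To make the comparison above concrete, here is a minimal Python sketch that builds the two requests so they describe the same set of records. The two URLs in this message are truncated, so the parameter values used here (hasCoordinate=true, hasGeospatialIssue=false, isGeoreferenced=true) are assumptions rather than the exact queries, and the helper names are hypothetical. It also assumes the count endpoint still accepts isGeoreferenced as it did at the time of this thread, that the search response carries its total in a "count" field, and that the count endpoint returns a bare number.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "http://api.gbif.org/v1/occurrence"


def search_url(taxon_key):
    """Build a search request limited to usable coordinates."""
    params = {
        "taxonKey": taxon_key,
        "hasCoordinate": "true",        # assumed value: only records with coordinates
        "hasGeospatialIssue": "false",  # assumed value: drop known geospatial issues
    }
    return f"{API_ROOT}/search?{urlencode(params)}"


def count_url(taxon_key):
    """Build the matching count request."""
    params = {
        "taxonKey": taxon_key,
        "isGeoreferenced": "true",  # assumed value: coordinates and no known issues
    }
    return f"{API_ROOT}/count?{urlencode(params)}"


def fetch_counts(taxon_key):
    """Return (search total, count total); needs network access."""
    with urlopen(search_url(taxon_key)) as resp:
        search_total = json.load(resp)["count"]  # assumed response field
    with urlopen(count_url(taxon_key)) as resp:
        count_total = json.load(resp)  # assumed to be a bare number
    return search_total, count_total
```

Calling fetch_counts(7264332) should return two equal numbers outside of indexing runs; if they still differ, the eventual consistency window described above is the likely reason, and the search total is the one to trust.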
I hope this helps - please feel free to quote me verbatim on the issue.
Cheers, Tim
On 09 Apr 2015, at 07:40, Scott Chamberlain myrmecocystus@gmail.com wrote:
Hi,
A user of the R client we make for GBIF reports different numbers of occurrences from the /occurrence/search endpoint and the /occurrence/count endpoint with the same taxonKey, when limiting to georeferenced data only. See https://discuss.ropensci.org/t/rgbif-occ-count-and-occ-search-results-differ... for the discussion.
I imagine there's a good explanation for this, but I'm not sure what it is right now.
Thanks! Scott
API-users mailing list API-users@lists.gbif.org http://lists.gbif.org/mailman/listinfo/api-users