Tim, 

Thanks very much! Makes sense. 

Scott

On Thu, Apr 9, 2015 at 12:21 AM Tim Robertson <trobertson@gbif.org> wrote:
Hi Scott,

2 things can cause this:

1. Eventual consistency
The count service is an insanely high-throughput service, while search is lower throughput. They have different backends, and a messaging bus keeps them in sync. Because of this, there is often a short period (up to 1 hr, but normally < 5 mins) during indexing runs where they can differ. Issues can also creep in that cause the two to drift, so we occasionally rebuild the count service. When they disagree, the search service is always the correct one.

2. Geospatial issues
The isGeoreferenced parameter counts only records with coordinates and no known geospatial issues - i.e. records whose coordinates we’d consider suitable for use.

In this case it is 2. that accounts for the difference: to match the count, the search request should also set the hasGeospatialIssue=false parameter.

http://api.gbif.org/v1/occurrence/search?taxonKey=7264332&hasCoordinate=true&hasGeospatialIssue=false&limit=20
http://api.gbif.org/v1/occurrence/count?taxonKey=7264332&isGeoreferenced=true

Both report 4515 records.
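
For anyone who wants to script this comparison, here is a minimal Python sketch that builds the two URLs above and fetches both totals. The function names are made up for illustration, and it assumes the search response is a JSON object with a "count" field while the count endpoint returns a bare number; check both assumptions against the current API before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "http://api.gbif.org/v1/occurrence"


def search_url(taxon_key, limit=20):
    """Search URL restricted to records with coordinates and no geospatial issues."""
    params = {
        "taxonKey": taxon_key,
        "hasCoordinate": "true",
        "hasGeospatialIssue": "false",
        "limit": limit,
    }
    return f"{API}/search?{urlencode(params)}"


def count_url(taxon_key):
    """Count URL for georeferenced records of the same taxon."""
    params = {"taxonKey": taxon_key, "isGeoreferenced": "true"}
    return f"{API}/count?{urlencode(params)}"


def fetch_totals(taxon_key):
    """Return (search_total, count_total); performs two HTTP requests."""
    with urlopen(search_url(taxon_key)) as resp:
        # Assumption: the search response carries the total in a "count" field.
        search_total = json.load(resp)["count"]
    with urlopen(count_url(taxon_key)) as resp:
        # Assumption: the count endpoint returns a bare JSON number.
        count_total = json.load(resp)
    return search_total, count_total
```

Calling fetch_totals(7264332) should return two equal numbers, apart from the short windows during indexing described under 1.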

I hope this helps - please feel free to quote me verbatim on the issue.

Cheers,
Tim

On 09 Apr 2015, at 07:40, Scott Chamberlain <myrmecocystus@gmail.com> wrote:

Hi, 

A user of the R client we make for GBIF reports a different number of occurrences from the /occurrence/search endpoint and the /occurrence/count endpoint when using the same taxonKey and limiting to georeferenced data only. See https://discuss.ropensci.org/t/rgbif-occ-count-and-occ-search-results-differ/174 for the discussion. 

I imagine there's a good explanation for this, but I'm not sure what it is right now.

Thanks! Scott
_______________________________________________
API-users mailing list
API-users@lists.gbif.org
http://lists.gbif.org/mailman/listinfo/api-users