[API-users] What happens to previous data after dataset/crawl?

Tim Robertson trobertson at gbif.org
Sat Aug 27 08:12:14 CEST 2016


Hi Rod

It is not done automatically, because an identifier change normally happens through some mapping error rather than by design.

Today we trigger it manually, but we do want to automate it - probably only for cases where the change seems genuine.

Cheers,
Tim

On 27 Aug 2016, at 08:01, Roderic Page <Roderic.Page at glasgow.ac.uk> wrote:

Just wanted to check the consequences of the following dataset operation.

Say I have a dataset containing 10 occurrences with occurrence ids 1-10. In my local database I now assign those 10 occurrences new identifiers a-j. If I create a new DwCA file for my data and crawl the new archive, my expectation is:

1. Old data with ids 1-10 is deleted from the GBIF index
2. New data with ids a-j is indexed

So, the end result is that the dataset has 10 occurrences. I'm asking because I know that in the past some datasets have changed identifiers, and this has led to records with old and new identifiers coexisting in the GBIF index, resulting in duplicated data.
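
For anyone wanting to check whether a dataset has ended up in that state, here is a minimal sketch (assuming Python with the requests library) that pages through the public GBIF occurrence search API and tallies the occurrenceIDs it finds for a dataset. The dataset UUID is a placeholder, and the digit check at the end simply mirrors the hypothetical 1-10 vs a-j example above:

    import requests

    API = "https://api.gbif.org/v1/occurrence/search"
    DATASET_KEY = "00000000-0000-0000-0000-000000000000"  # placeholder: replace with your dataset UUID

    def fetch_occurrence_ids(dataset_key, page_size=300):
        # Page through everything GBIF currently holds for the dataset.
        offset = 0
        while True:
            resp = requests.get(API, params={
                "datasetKey": dataset_key, "limit": page_size, "offset": offset})
            resp.raise_for_status()
            data = resp.json()
            for rec in data["results"]:
                yield rec.get("occurrenceID")
            if data.get("endOfRecords", True):
                break
            offset += page_size

    ids = [i for i in fetch_occurrence_ids(DATASET_KEY) if i is not None]
    print("records indexed:", len(ids))
    # If this count is ~20 rather than 10, or both identifier styles appear below,
    # the old records were not removed by the crawl.
    old_style = [i for i in ids if i.isdigit()]      # the hypothetical 1-10 identifiers
    new_style = [i for i in ids if not i.isdigit()]  # the hypothetical a-j identifiers
    print("old-style ids still present:", len(old_style))
    print("new-style ids present:", len(new_style))

Paging like this is fine for a small dataset; for anything large, requesting a download from GBIF and inspecting the occurrenceID column would be the more practical route.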

Obviously it would be nice to have stable, unchanging identifiers for occurrences, but for the dataset I'm working with the creators have changed their minds between versions of the data :(

Regards,

Rod

