[API-users] Some questions from a beginner
Alex Thompson
godfoder at acis.ufl.edu
Wed Sep 9 19:28:41 CEST 2015
I'm kind of seconding Rod here.
It might make more sense, depending on your use case and local computer
resources, to just get a download of Plantae *AND* Brazil from GBIF
periodically, then process that to exclude existing Brazilian datasets.
You could then use something like Apache Hadoop or Spark to efficiently
split the file by dataset or by institution code.
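For smaller files, that filter-and-split step can be sketched in plain Python rather than Hadoop or Spark. This is only a minimal sketch: it assumes the download is a tab-delimited text file with a header row that includes a `datasetKey` column, and the function name, paths, and exclusion list are hypothetical.

```python
from contextlib import ExitStack
from pathlib import Path


def split_by_column(src, out_dir, column="datasetKey", exclude=()):
    """Split a tab-delimited occurrence file into one file per `column` value.

    Rows whose value is listed in `exclude` are dropped. Returns a dict
    mapping each kept value to the number of rows written for it.
    """
    excluded = set(exclude)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {}
    outputs = {}
    with ExitStack() as stack:
        # newline="\n" means only "\n" ends a row, so a stray carriage
        # return inside a field does not split the row in two.
        infile = stack.enter_context(open(src, encoding="utf-8", newline="\n"))
        header = infile.readline()
        index = header.rstrip("\r\n").split("\t").index(column)
        for line in infile:
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) <= index:
                continue  # skip truncated rows
            key = fields[index]
            if key in excluded:
                continue
            if key not in outputs:
                # One open handle per value: fine for a few hundred datasets,
                # but mind the OS limit on open files for larger splits.
                # Assumes the value is safe as a file name (UUIDs are).
                path = out_dir / f"{key}.tsv"
                outputs[key] = stack.enter_context(
                    open(path, "w", encoding="utf-8", newline="\n")
                )
                outputs[key].write(header)
                counts[key] = 0
            outputs[key].write(line)
            counts[key] += 1
    return counts
```

Something like `split_by_column("occurrence.txt", "by_dataset", exclude=already_harvested)` would then leave one file per remaining dataset; passing `column="institutionCode"` splits by institution instead, provided the codes are usable as file names.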
This would greatly simplify your interactions with GBIF (down to just
periodically generating a download programmatically) and you would have
an easy place to insert any additional data transformations you want.
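A rough sketch of generating such a download programmatically follows. It assumes GBIF's occurrence download service accepts a JSON request with a predicate, POSTed with HTTP basic authentication to `/v1/occurrence/download/request`, and replies with a download key. The field names and endpoint should be checked against the current API documentation; the helper names, user name, and address are placeholders.

```python
import base64
import json
import urllib.request

# Assumed endpoint; verify against the GBIF API documentation.
DOWNLOAD_URL = "https://api.gbif.org/v1/occurrence/download/request"


def build_download_request(creator, email, taxon_key=6, country="BR"):
    """Build the request body for all records of a taxon in a country.

    taxon_key=6 is Plantae and country="BR" is Brazil, as in this thread.
    """
    return {
        "creator": creator,
        "notificationAddresses": [email],
        "sendNotification": True,
        "format": "SIMPLE_CSV",
        "predicate": {
            "type": "and",
            "predicates": [
                {"type": "equals", "key": "TAXON_KEY", "value": str(taxon_key)},
                {"type": "equals", "key": "COUNTRY", "value": country},
            ],
        },
    }


def submit_download(payload, user, password, url=DOWNLOAD_URL):
    """POST the request and return the response body (assumed to be the key)."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8").strip()
```

Run from a scheduled job, `submit_download(build_download_request("someuser", "someuser@example.org"), "someuser", "secret")` would queue a fresh Plantae-in-Brazil download each time; fetching the finished file is left out here.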
This is the path I take for my work at least - the incremental cost of a
couple million more records is worth the reduction in complexity overall.
- Alex
On 09/09/2015 12:16 PM, Eduardo Dalcin wrote:
> Hi Rod,
>
> The real purpose is to have a list of UUIDs and the "source web page"
> for each dataset. Thus, one way to do it is to select those resources
> whose counts are <> 0 for PLANTAE *AND* Brazil.
>
> I don't want to do any stats analysis, just feed a local
> harvester / aggregator.
>
> The problem is, considering the reply from Jan Legind on Sep 3, we
> have to check them one by one (https://goo.gl/3wysaA) to see whether
> each is a Herbarium / Preserved Specimen (Plantae) dataset or not,
> from the request
> http://api.gbif.org/v1/occurrence/counts/datasets?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN.
>
> Does it make sense?
>
> Thanks for your curiosity! :)
>
> Cheers,
>
> Eduardo
>
>
> --------------------------------
> Eduardo Dalcin <http://eduardo.dalc.in>
> Instituto de Pesquisas Jardim Botânico do Rio de Janeiro - JBRJ
> e-mail: edalcin at jbrj.gov.br
> Trabalho / Work: +55 21 3204 2116
> --------------------------------
> e-mail alternativo / alternate email: edalcin at jbrj.org
> --------------------------------
> Agendar reunião / Schedule a meeting: http://agendar.dalc.in
>
> On Mon, Sep 7, 2015 at 12:33 PM, Roderic Page
> <Roderic.Page at glasgow.ac.uk> wrote:
>
> Hi Eduardo,
>
> I’m curious, is the purpose to get counts by dataset by country,
> or to get all the plant occurrences for Brazil? The latter can be
> obtained by downloading all plant occurrences in Brazil
> http://www.gbif.org/occurrence/search?TAXON_KEY=6&COUNTRY=BR (you
> could then compute the per-dataset stats locally). I realise that
> this isn’t as convenient as having GBIF slice the data for you in
> the API.
>
> Regards
>
> Rod
>
> ---------------------------------------------------------
> Roderic Page
> Professor of Taxonomy
> Institute of Biodiversity, Animal Health and Comparative Medicine
> College of Medical, Veterinary and Life Sciences
> Graham Kerr Building
> University of Glasgow
> Glasgow G12 8QQ, UK
>
> Email: Roderic.Page at glasgow.ac.uk
> Tel: +44 141 330 4778
> Skype: rdmpage
> Facebook: http://www.facebook.com/rdmpage
> LinkedIn: http://uk.linkedin.com/in/rdmpage
> Twitter: http://twitter.com/rdmpage
> Blog: http://iphylo.blogspot.com
> ORCID: http://orcid.org/0000-0002-7101-9767
> Citations:
> http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ
> ResearchGate: https://www.researchgate.net/profile/Roderic_Page
>
>
>> On 4 Sep 2015, at 10:39, Eduardo Dalcin <edalcin at jbrj.org> wrote:
>>
>> Hi Markus,
>>
>> Yes, it's a shame I can't have country and "nub" together.
>> Is there any hope for it?
>>
>> Eduardo
>>
>>
>>
>> On Thu, Sep 3, 2015 at 4:29 PM, Markus Döring <mdoering at gbif.org> wrote:
>>
>> Eduardo,
>>
>> as you might have seen from my issue comment, the webservice
>> uses a different parameter name for taxonKey, which is a bug
>> we need to fix at some point.
>> Please use nubKey for now to call the service, like this:
>>
>> http://api.gbif.org/v1/occurrence/counts/datasets?nubKey=6
>>
>> The real problem for you will be that we do not support the
>> combination of the country and the taxon filter, just one of
>> the two. So you cannot search for plants in Brazil I am
>> afraid, just for datasets about Brazil and datasets with
>> plant records.
>>
>> Markus
>>
>>
>>
>> > On 03 Sep 2015, at 14:12, Eduardo Dalcin <edalcin at jbrj.org> wrote:
>> >
>> > Thanks, Jan. I'll keep exploring and I'll be in touch if I
>> need to.
>> >
>> > Best,
>> >
>> > Eduardo
>> >
>> >
>> >
>> >
>> > On Thu, Sep 3, 2015 at 4:51 AM, Jan Legind [GBIF]
>> <jlegind at gbif.org> wrote:
>> > Dear Eduardo,
>> >
>> >
>> >
>> > Thanks for getting in touch with us about these issues.
>> >
>> >
>> >
>> > The first request
>> http://api.gbif.org/v1/occurrence/count?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> returns the number of records located in Brazil for the
>> facets in the request.
>> >
>> > The second query
>> http://api.gbif.org/v1/occurrence/counts/datasets?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> uses the Occurrence Inventories web service
>> http://www.gbif.org/developer/occurrence#inventories which
>> does not support the basis-of-record facet in the /datasets
>> request. I understand that it would be better if the API
>> response yielded an error message in this instance.
>> >
>> >
>> >
>> > Concerning the other issues – you are indeed right that the
>> counts do not make sense in the context of taxon key 6 which
>> is Plantae. Actually the API does not handle the taxonKey
>> search at all, contrary to what the documentation states:
>> >
>> >
>> >
>> > Resource: /occurrence/counts/datasets
>> > Method: GET
>> > Response: Counts
>> > Description: Lists occurrence counts for datasets that cover a
>> given taxon or country.
>> > Parameters: country, taxonKey
>> >
>> >
>> >
>> > As you can see here,
>> http://api.gbif.org/v1/occurrence/counts/datasets?taxonKey=6
>> , this request doesn’t return anything.
>> >
>> >
>> >
>> > The GBIF developers will handle this issue in due time.
>> >
>> > You can follow the issue in our bug tracking service here:
>> http://dev.gbif.org/issues/browse/POR-2828
>> >
>> >
>> >
>> >
>> >
>> > With best regards,
>> >
>> >
>> >
>> > Jan K. Legind
>> >
>> > Data manager, GBIF Secretariat
>> >
>> >
>> >
>> >
>> >
>> > From: API-users [mailto:api-users-bounces at lists.gbif.org] On Behalf Of
>> Eduardo Dalcin
>> > Sent: 2. september 2015 20:06
>> > To: api-users at lists.gbif.org; dev at gbif.org
>> > Cc: João Monnerat Lanna; Natália Queiroz; Diogo Silva;
>> Laura; Ricardo Avancini
>> > Subject: [API-users] Some questions from a beginner
>> >
>> >
>> >
>> > Hi folks,
>> >
>> >
>> >
>> > This is my first message to the list. So, please, be nice :)
>> >
>> >
>> >
>> > I'm working here at Rio de Janeiro Botanical Garden,
>> together with the guys at the National Center for Flora
>> Conservation. We are doing the risk assessment of the
>> Brazilian flora for the government. We have assessed, so far,
>> the risk of ca. 6,000 species, but we still have to assess ca.
>> 35,000. Access to occurrence records for Brazil is crucial, and
>> every occurrence is important.
>> >
>> >
>> >
>> > That means that we have to put together occurrence data
>> from different sources and, after the first batch of the risk
>> assessment, we realized that we need to build our own
>> aggregator. We are planning to do this with the
>> Lontra-harvester, with the help of the guys at the Brazilian
>> GBIF Node.
>> >
>> >
>> >
>> > So, one of the first steps was to list the available
>> resources to understand the dimension of the task, and that
>> brings me to my questions.
>> >
>> >
>> >
>> > First:
>> >
>> >
>> >
>> > The request:
>> >
>> >
>> >
>> >
>> http://api.gbif.org/v1/occurrence/count?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> >
>> >
>> >
>> > returns 4,982,689 records
>> >
>> >
>> >
>> > And the request:
>> >
>> >
>> >
>> >
>> http://api.gbif.org/v1/occurrence/counts/datasets?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> >
>> >
>> >
>> > returns (here) 7,406,310 records
>> >
>> >
>> >
>> > Comments?
>> >
>> >
>> >
>> > Second:
>> >
>> >
>> >
>> > The request:
>> >
>> >
>> >
>> >
>> http://api.gbif.org/v1/occurrence/count?country=BR&taxonKey=6&basisOfRecord=PRESERVED_SPECIMEN
>> >
>> >
>> >
>> > returns things like this:
>> >
>> >
>> >
>> > "197908d0-5565-11d8-b290-b8a03c50a862":27629
>> >
>> >
>> > But a query on the same dataset:
>> >
>> >
>> >
>> >
>> http://www.gbif.org/occurrence/search?TAXON_KEY=6&DATASET_KEY=197908d0-5565-11d8-b290-b8a03c50a862
>> >
>> >
>> >
>> > Returns "null" (of course, it is FishBase!)
>> >
>> >
>> >
>> > I have plenty of examples like this, highlighted in yellow here
>> (not finished!):
>> >
>> >
>> >
>> >
>> https://docs.google.com/spreadsheets/d/1msUjwMLoKwnXxJFzF20SeN_C65RIkGLbwaYyj459VTc/edit?usp=sharing
>> >
>> >
>> >
>> > Comments?
>> >
>> >
>> >
>> > I think those two questions are a good start. Please let me
>> know if I'm doing something wrong.
>> >
>> >
>> >
>> > Cheers,
>> >
>> >
>> >
>> > Eduardo
>> >
>> >
>> >
>> >
>> >
>>
>>
>> _______________________________________________
>> API-users mailing list
>> API-users at lists.gbif.org
>> http://lists.gbif.org/mailman/listinfo/api-users
>
>
>
>