Hi Eduardo,
I thought that the feedback about data improvement should be sent directly to the data provider but, please, if there is something else let me know.
In an ideal world, yes, the feedback should go to the data provider, things get fixed, and then GBIF gets updated. However, data providers don’t always have the resources to fix things. I’m also interested in how many of the data issues that come up are things that GBIF itself can detect and flag. In my experience, there are issues that the provider was unaware of but that become apparent once the data is exposed by GBIF.
For example, here’s a case of a data set supplied to GBIF with a serious error (https://github.com/ttu-vertnet/ttu-mammals/issues/12). This was obvious in GBIF simply by looking at the map, but apparently not to the data provider (this error has now been fixed).
The more we know about the sort of errors that can happen, the better placed we are to develop tools to catch them.
Regards
Rod
--------------------------------------------------------- Roderic Page Professor of Taxonomy Institute of Biodiversity, Animal Health and Comparative Medicine College of Medical, Veterinary and Life Sciences Graham Kerr Building University of Glasgow Glasgow G12 8QQ, UK
Email: Roderic.Page@glasgow.ac.uk Tel: +44 141 330 4778 Skype: rdmpage Facebook: http://www.facebook.com/rdmpage LinkedIn: http://uk.linkedin.com/in/rdmpage Twitter: http://twitter.com/rdmpage Blog: http://iphylo.blogspot.com ORCID: http://orcid.org/0000-0002-7101-9767 Citations: http://scholar.google.co.uk/citations?hl=en&user=4Z5WABAAAAAJ ResearchGate: https://www.researchgate.net/profile/Roderic_Page
On 15 Sep 2015, at 14:53, Eduardo Dalcin <edalcin@jbrj.org> wrote:
Hi Rod,
As you saw in the other message, the main problem we have now is that the same voucher is represented twice, because NYBG had a DIGIR source and now has an IPT source. People at NYBG said that they asked GBIF to remove the DIGIR source, but it is still there. Maybe this occurs with other sources as well.
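As a rough illustration of how such double-published vouchers might be spotted on the user side, here is a minimal sketch. It assumes that records have already been downloaded as dictionaries, that the Darwin Core terms institutionCode and catalogNumber identify a voucher well enough, and that datasetKey distinguishes the old source from the new one. The function name, sample values and dataset keys are made up for the example, and matching on these two fields alone may miss or over-match real records.

```python
from collections import defaultdict


def find_cross_dataset_duplicates(records):
    """Return voucher groups that appear in more than one dataset.

    Assumption: institutionCode + catalogNumber identify a voucher.
    Records without a catalogNumber are skipped.
    """
    groups = defaultdict(list)
    for rec in records:
        institution = (rec.get("institutionCode") or "").strip().upper()
        catalog = (rec.get("catalogNumber") or "").strip().upper()
        if not catalog:
            continue  # cannot match a voucher without a catalogue number
        groups[(institution, catalog)].append(rec)

    # Keep only vouchers that were served by at least two different datasets
    return {
        key: recs
        for key, recs in groups.items()
        if len({r.get("datasetKey") for r in recs}) > 1
    }


# Invented sample records; the datasetKey values are placeholders, not real keys
records = [
    {"institutionCode": "NY", "catalogNumber": "12345", "datasetKey": "old-digir-dataset"},
    {"institutionCode": "NY", "catalogNumber": "12345", "datasetKey": "new-ipt-dataset"},
    {"institutionCode": "NY", "catalogNumber": "67890", "datasetKey": "new-ipt-dataset"},
]

duplicates = find_cross_dataset_duplicates(records)
for key, recs in duplicates.items():
    print(key, sorted(r["datasetKey"] for r in recs))
```

A check like this only flags candidates; deciding which copy to keep would still need the provider or GBIF to confirm which dataset is current.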
Regarding feedback from the data cleaning process, I am indeed interested in this discussion, but I'm not sure this list is the best forum for it.
Here at the National Center for Flora Conservation (CNCFlora), our risk assessments use only occurrences that experts have validated, taxonomically and spatially. This information may be useful, especially if the expert made some correction or comment on the occurrence. I can see that this is related to annotation initiatives, such as AnnoSys and FilteredPush. In my ideal and fantastic world, we would have an annotation feature on GBIF occurrences, where experts could interact with the material. In our Virtual Herbarium of Repatriated Plants (http://www.herbariovirtualreflora.jbrj.gov.br/jabot/herbarioVirtual/ConsultaPublicoHVUC/ResultadoDaConsultaNovaConsulta.do?lingua=en), experts can suggest new names if they have a login.
However, duplication of effort in georeferencing legacy occurrences is common. For example, different efforts, methodologies and uncertainty levels have been applied to different duplicates of the same occurrence held by different herbaria.
I thought that the feedback about data improvement should be sent directly to the data provider but, please, if there is something else let me know.
Cheers,
Eduardo
-------------------------------- Eduardo Dalcin (http://eduardo.dalc.in) Instituto de Pesquisas Jardim Botânico do Rio de Janeiro - JBRJ e-mail: edalcin@jbrj.gov.br Trabalho / Work: +55 21 3204 2116 -------------------------------- e-mail alternativo / alternate email: edalcin@jbrj.org -------------------------------- Agendar reunião / Schedule a meeting: http://agendar.dalc.in
On Mon, Sep 14, 2015 at 1:50 PM, Roderic Page <Roderic.Page@glasgow.ac.uk> wrote:
Hi Eduardo,
It would be interesting to have examples of the kinds of problems you encounter with GBIF data, so that we can look at ways to fix them. It would also be interesting to know whether you would be able to provide GBIF with the corrections you make to GBIF data. It seems clear that lots of people are cleaning data in their own projects, but that doesn’t filter back to GBIF.
Regards
Rod
On 14 Sep 2015, at 17:34, Eduardo Dalcin <edalcin@jbrj.org> wrote:
The problem with these tools (LontraHarvest, OpenRefine, etc.) is that they are just data *retrieval* tools, not providing for data analytical and representation functionalities
Mauro, for me this is a blessing! :)
In the CNCFlora workflow, the data from GBIF is useless as it is, because it has to be validated first, taxonomically and spatially. Only after cleaning, georeferencing and validation with the expert is the data analyzed as part of the risk assessment.
Cheers
Eduardo
_______________________________________________ API-users mailing list API-users@lists.gbif.org http://lists.gbif.org/mailman/listinfo/api-users