CLB API related data
Dear list,
I'm developing a national "datawarehouse"/repository for natural history observation/occurrence data integrating several sources including GBIF and iNaturalist. For this purpose I'm generating lists of names and taxa related to the different ingested observation sources. In order to offer unified taxon search and filter functionality, I need equivalence mappings between ID-namespaces of different checklists (GBIF, CoL, internal, ornitho.lu, etc). I noticed that the CLB API offers the call `/dataset/{key}/nameusage/{id}/related` which for key=3LR and datasetKey=139831 [iNat] returns a correspondence.
Where does the data for this mapping come from? Examining both the CoL and iNat checklists NameRelation.tsv (from the CoLDP downloads) didn't give a match. Is this data imported into CLB through a different checklist or is it supplemented through some more hidden datasource? The ID-namespace translation does not seem to be universally implemented and seems somehow tied to the dataset (translation from iNat to CoL didn't seem to work). Is this correct? Do you plan on supporting my use-case more explicitly in the future?
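For reference, the `related` call described above can be written out as a URL. This is a minimal sketch, not an official client: the API root `https://api.checklistbank.org` is an assumption (the thread does not state it), `datasetKey` is assumed to be a query parameter as the message suggests, and `EXAMPLE-ID` is a placeholder rather than a real usage ID.

```python
from urllib.parse import quote, urlencode

# Assumption: public ChecklistBank API root; it is not given in this thread.
CLB_API = "https://api.checklistbank.org"


def related_url(dataset_key: str, usage_id: str, target_dataset_key: int) -> str:
    """Build the URL of the `related` call for one name usage.

    dataset_key: dataset the usage ID belongs to (e.g. "3LR").
    usage_id: name usage ID within that dataset.
    target_dataset_key: dataset in which related usages are looked up
        (e.g. 139831), sent as the `datasetKey` query parameter (assumed).
    """
    # Percent-encode both path segments so IDs containing "/" stay intact.
    path = "/dataset/{}/nameusage/{}/related".format(
        quote(str(dataset_key), safe=""),
        quote(str(usage_id), safe=""),
    )
    query = urlencode({"datasetKey": target_dataset_key})
    return f"{CLB_API}{path}?{query}"


if __name__ == "__main__":
    # "EXAMPLE-ID" is a placeholder, not a real usage ID.
    print(related_url("3LR", "EXAMPLE-ID", 139831))
```

The function only builds the request URL; fetching it and reading the response is left out because the response layout is not described in this thread.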
I would prefer not to use the CLB API and instead import the relevant checklists into our PostgreSQL database to allow efficient and flexible operation.
Thanks for the great work that went into CLB and related projects!
Best regards,
--
Raffael Mancini
IT administrator and developer
Service d'information digital sur le patrimoine naturel (SIDPNAT)
Musée National d'Histoire Naturelle Luxembourg
T: +352 247 66667 - https://mnhn.lu
Dear Raffael,
ChecklistBank (CLB) contains a NamesIndex which tracks unique names and automatically provides a mapping between all names of all datasets in CLB. I have added a section to the API docs here that might be valuable for you: https://github.com/CatalogueOfLife/backend/blob/master/API.md#names-index
In particular there is an ID mapping export that seems to exactly suit your needs: https://github.com/CatalogueOfLife/backend/blob/master/API.md#names-index-id...
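To illustrate how such per-dataset ID mappings could be combined locally, here is a minimal sketch. The two-column TSV layout (usage ID, names index ID) is an assumption made for the example; the real export format is described in the linked API docs and may differ. All IDs in the sample data are invented.

```python
import csv
import io
from collections import defaultdict


def read_mapping(tsv_text: str) -> dict[str, str]:
    """Parse a two-column TSV (usage ID, names index ID) into a dict.

    The column layout is an assumption made for this sketch.
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    # Skip short rows and rows without a names index ID.
    return {row[0]: row[1] for row in reader if len(row) >= 2 and row[1]}


def join_on_index(left: dict[str, str], right: dict[str, str]) -> dict[str, list[str]]:
    """Map each usage ID in `left` to the `right` usage IDs sharing its index ID."""
    by_index = defaultdict(list)
    for usage_id, nidx in right.items():
        by_index[nidx].append(usage_id)
    return {
        usage_id: sorted(by_index[nidx])
        for usage_id, nidx in left.items()
        if nidx in by_index
    }


# Invented sample rows: usage ID <TAB> names index ID.
COL_SAMPLE = "col-1\t101\ncol-2\t102\ncol-3\t103\n"
INAT_SAMPLE = "inat-9\t101\ninat-7\t103\ninat-8\t103\ninat-5\t999\n"

if __name__ == "__main__":
    col = read_mapping(COL_SAMPLE)
    inat = read_mapping(INAT_SAMPLE)
    for col_id, inat_ids in join_on_index(col, inat).items():
        print(col_id, "->", ", ".join(inat_ids))
```

The result is deliberately one-to-many, since more than one usage in a dataset can carry the same name. The same join translates directly into a SQL join on the names index ID once both mappings are loaded into database tables.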
Best wishes, Markus
On 9. Aug 2023, at 14:09, MANCINI Raffael via COL-Users col-users@lists.gbif.org wrote:
COL-Users mailing list COL-Users@lists.gbif.org https://lists.gbif.org/mailman/listinfo/col-users
Dear Markus,
many thanks for the quick reply and the added documentation! This is exactly what I was looking for.
Did you ever consider having a mechanism with which checklist maintainers could feed in mapping data, or do you want this to be a purely functional dependency of the checklists? A stateful mapping would allow for an easy mapping between legacy (often unmaintained) checklists/taxonomies and CoL. Could the same effect be achieved by adding synonyms to those legacy checklists?
Best regards!
--
Raffael Mancini
IT administrator and developer
Service d'information digital sur le patrimoine naturel (SIDPNAT)
Musée National d'Histoire Naturelle Luxembourg
T: +352 247 66667 - https://mnhn.lu
From: Markus Döring mdoering@gbif.org Sent: 14 August 2023 14:23:00 To: MANCINI Raffael; Catalogue of Life user announcements and discussion Subject: Re: [COL-Users] CLB API related data
We have discussed this for taxon concept mappings, but the ideas are not yet mature enough to implement. We are also working with Jonathan Rees, Nico Franz and Beckett Sterner on adding a concept mapping feature to CLB, initially on demand and for requested groups only. A lot of the data in CLB keeps changing, so relations can change at any time, which makes storing them difficult or even useless.
In GBIF we mapped all names from all datasets (more than 45,000!) to the GBIF Backbone and allowed users to traverse names from one list to another via the Backbone. But that had some serious problems, the main one being that names not (yet) included in the Backbone were not mapped, so you could not find related names in other lists. ChecklistBank therefore has the central and kinda neutral NamesIndex component, to which everything is mapped and which is automatically extended with new names as needed.
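The difference between the two approaches can be shown with a toy sketch. It is not how the GBIF Backbone or the CLB NamesIndex is implemented: both classes are hypothetical, names are compared as exact strings for simplicity, and the names themselves are placeholders.

```python
from typing import Optional, Protocol


class Matcher(Protocol):
    def match(self, name: str) -> Optional[int]: ...


class FixedBackbone:
    """Lookup against a closed list of names: unknown names stay unmapped."""

    def __init__(self, names: list[str]) -> None:
        self._ids = {name: i for i, name in enumerate(names, start=1)}

    def match(self, name: str) -> Optional[int]:
        return self._ids.get(name)


class GrowingIndex:
    """Lookup that registers unseen names, so every name receives an ID."""

    def __init__(self) -> None:
        self._ids: dict[str, int] = {}

    def match(self, name: str) -> int:
        if name not in self._ids:
            self._ids[name] = len(self._ids) + 1
        return self._ids[name]


def linked(matcher: Matcher, list_a: list[str], list_b: list[str]) -> set[str]:
    """Return the names of list_b that resolve to an ID also used by list_a."""
    ids_a = {matcher.match(name) for name in list_a}
    ids_a.discard(None)  # unmapped names cannot link anything
    return {name for name in list_b if matcher.match(name) in ids_a}


if __name__ == "__main__":
    list_a = ["Aus bus", "Cus dus"]
    list_b = ["Cus dus", "Aus bus", "Eus fus"]
    # "Cus dus" is in both lists but missing from the fixed backbone.
    print(sorted(linked(FixedBackbone(["Aus bus"]), list_a, list_b)))
    print(sorted(linked(GrowingIndex(), list_a, list_b)))
```

With the fixed backbone, "Cus dus" cannot be linked between the two lists although both contain it. The growing index assigns it an ID on first sight, so the link is found.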
Markus
On 14. Aug 2023, at 15:17, MANCINI Raffael Raffael.MANCINI@mnhn.lu wrote:
participants (2)
- MANCINI Raffael
- Markus Döring