[IPT] Darwin Core Star Schema

Tim Robertson trobertson at gbif.org
Wed Nov 11 14:15:02 CET 2015


Thanks Quentin

> So if I understand correctly, the event file can be used as a Core, just as Taxon and Occurrence can be Core files.

Yes, that’s correct

> Though as there can only be one Core ID I will still need to keep my taxon information in the Occurrence file.
> Although I don't think this is a problem, it can get a little confusing in the documentation due to the crossover of terms between taxon and occurrence files.

I’m afraid that is the kind of limitation I was alluding to about star schemas… You have to denormalise things into a format which flattens what you might otherwise model as two tables.
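
To make the flattening concrete, here is a minimal sketch in Python. It assumes two small in-memory tables; the field names are Darwin Core terms, but the rows and the denormalise() helper are hypothetical and only for illustration, not part of the IPT or the validator.

```python
# Hypothetical normalised tables; the term names follow Darwin Core,
# the values are made up for illustration.
taxa = {
    "t1": {"scientificName": "Bellis perennis", "family": "Asteraceae"},
    "t2": {"scientificName": "Poa annua", "family": "Poaceae"},
}
occurrences = [
    {"occurrenceID": "o1", "taxonID": "t1", "eventDate": "2015-06-01"},
    {"occurrenceID": "o2", "taxonID": "t2", "eventDate": "2015-06-01"},
    {"occurrenceID": "o3", "taxonID": "t1", "eventDate": "2015-06-08"},
]


def denormalise(occurrences, taxa):
    """Copy the taxon fields into every occurrence row that references them."""
    flat = []
    for occ in occurrences:
        row = dict(occ)                    # keep the occurrence fields
        row.update(taxa[occ["taxonID"]])   # repeat the taxon fields on each row
        flat.append(row)
    return flat


flat_rows = denormalise(occurrences, taxa)
for row in flat_rows:
    print(row["occurrenceID"], row["scientificName"], row["family"])
```

The cost is the repetition: the taxon fields for "t1" appear on both o1 and o3, which is exactly the crossover of terms you mention.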

Currently “Taxon” can’t be used as an extension, so you would need to use Occurrence.  Adding Taxon as an extension option would be technically possible, but the taxon rows would then be completely decoupled from the occurrences.  It would, however, allow you to have:

Core: Rows of sampling events, each documenting e.g. a square on the ground sampled over a specific period
  Extension taxon: List of species observed within the sampling event
  Extension occurrence: Documented evidence of specimens collected or observed

At the moment though, you would have to express species lists as occurrences, which might make some sense because they are effectively observations.
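
As a rough sketch of what that looks like today (event core, with the species list expressed as an occurrence extension), the Python below links each extension row to exactly one core row through the core id. The rows, the "plot-A"/"plot-B" identifiers and the attach_extension() helper are assumptions made up for illustration; this is not how the IPT or the validator is implemented.

```python
from collections import defaultdict

# Hypothetical event core: one row per visit to a sampling location.
events = [
    {"eventID": "e1", "eventDate": "2015-06-01", "locationID": "plot-A"},
    {"eventID": "e2", "eventDate": "2015-06-08", "locationID": "plot-B"},
]

# Hypothetical occurrence extension: the species list for each event,
# with the taxon terms carried on the occurrence rows themselves.
occurrence_ext = [
    {"eventID": "e1", "occurrenceID": "o1", "scientificName": "Bellis perennis"},
    {"eventID": "e1", "occurrenceID": "o2", "scientificName": "Poa annua"},
    {"eventID": "e2", "occurrenceID": "o3", "scientificName": "Bellis perennis"},
    {"eventID": "e9", "occurrenceID": "o4", "scientificName": "Poa annua"},
]


def attach_extension(core_rows, ext_rows, core_id="eventID"):
    """Group extension rows under their core row; collect rows with no core."""
    known = {row[core_id] for row in core_rows}
    grouped = defaultdict(list)
    orphans = []
    for row in ext_rows:
        if row[core_id] in known:
            grouped[row[core_id]].append(row)
        else:
            orphans.append(row)  # points at a core id that does not exist
    return dict(grouped), orphans


by_event, orphans = attach_extension(events, occurrence_ext)
for event in events:
    names = [r["scientificName"] for r in by_event.get(event["eventID"], [])]
    print(event["eventID"], event["eventDate"], names)
print("orphans:", [r["occurrenceID"] for r in orphans])
```

Every extension row carries a single link back to the core, which is why an occurrence cannot point at both an event and a separate taxon table.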

> I'm happy to be a Guinea pig. I'll experiment with the validator if you think this should work and let you know how I get on.

Thanks for this,

All the best,
Tim

> 
> Regards
> Quentin
> 
> 
> 
> Dr. Quentin Groom
> (Botany and Information Technology)
> 
> Botanic Garden Meise
> Domein van Bouchout
> B-1860 Meise
> Belgium
> 
> ORCID: 0000-0002-0596-5376
> 
> Landline: +32 (0) 226 009 20 ext. 364
> FAX:      +32 (0) 226 009 45
> 
> E-mail:     quentin.groom at plantentuinmeise.be
> Skype name: qgroom
> Website:    www.botanicgarden.be
> 
> 
> On 11 November 2015 at 11:58, Hannu Saarenmaa <hannu.saarenmaa at helsinki.fi> wrote:
> Quentin & Co
> 
> It depends what you mean by "survey".   I would put each visit to a sampling location (such as a plot) in the event core, and put all the taxa that are observed in a non-core table.   The properties of the entire survey (project) would go to the EML metadata.
> 
> Hannu
> 
> 
> On 2015-11-11 10:20, Quentin Groom wrote:
>> I'm rather confused how the Darwin Core Star Schema is meant to work for survey data.
>> 
>> Darwin Core can have one of two Core files, taxon or occurrence. The most appropriate for a survey would seem to be occurrence. So I imagine that in the star schema you could also have a related event file detailing the date and location of each survey and a non-core taxon file detailing the taxa that are observed.
>> 
>> However, this does not seem to be possible. The DwC-A validator (http://tools.gbif.org/dwca-validator/) assumes only one core id in the core file, so you can't link an occurrence both to a taxon and to an event. This is also true in the Darwin Core Archive Assistant (http://tools.gbif.org/dwca-assistant/). The solution seems to be to put all the information from the taxon core file into the occurrence file, but keep the separate event file linked with the core occurrence id.
>> 
>> Is this correct? It seems rather counterintuitive.
>> 
>> Regards
>> Quentin
>> 
>> _______________________________________________
>> IPT mailing list
>> IPT at lists.gbif.org
>> http://lists.gbif.org/mailman/listinfo/ipt
> 
> -- 
> 
> Hannu Saarenmaa, Research Director
> hannu.saarenmaa at uef.fi
> Mobile +358-50-4479668
> 
> University of Eastern Finland
> Digitarium, SIB Labs, Joensuu Science Park
> Länsikatu 15 (P.O. Box 111)
> FI-80101 Joensuu
> 
> www.digitarium.fi/en - Service Centre for High-Performance Digitisation
> www.eubon.eu - EU BON - GEO BON - Data Integration and Interoperability
> 
