[IPT] IPT v2.2 - release candidate

Kyle Braak kbraak at gbif.org
Fri Mar 13 17:54:13 CET 2015

Hi John, Paul,

Reporting back.

I did some investigation into what DataCite intends a list of rights to be used for. 

The DataCite Metadata Working Group has explained that a list of rights is intended to allow multiple licenses, each applying to the dataset as a whole. Here’s the answer provided to me on behalf of their chair:

> the Metadata Working Group has discussed your question and would like to say that the intention was (a) to allow for multiple licenses to be applied to a dataset. Moreover, we suggest that if different licenses apply to separable components of a dataset, those (various) components ought to have separate metadata records (and so also separate DOIs).
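
To illustrate that reading, here is a minimal sketch of a DataCite-style rights list in which every entry applies to the dataset as a whole. The element and attribute names (rightsList, rights, rightsURI) are taken from the DataCite kernel as I understand it; the helper name build_rights_list and the two example licenses are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET


def build_rights_list(licenses):
    """Build a rightsList element from (name, uri) pairs.

    Assumption: each pair is a license covering the whole dataset,
    which is how the Working Group says the list should be read.
    """
    rights_list = ET.Element("rightsList")
    for name, uri in licenses:
        # One <rights> child per license; the URI goes in rightsURI.
        rights = ET.SubElement(rights_list, "rights", {"rightsURI": uri})
        rights.text = name
    return rights_list


# Hypothetical example: two licenses, both applying to the full dataset.
example = build_rights_list([
    ("CC0 1.0", "http://creativecommons.org/publicdomain/zero/1.0/"),
    ("CC-BY 4.0", "http://creativecommons.org/licenses/by/4.0/"),
])
print(ET.tostring(example, encoding="unicode"))
```

Under the Working Group's advice, a dataset whose components carry different licenses would not use one list like this, but a separate metadata record (and DOI) per component.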

From the DataCite mailing list, I’m told datasets with multiple licenses are pretty common. For example, OpenAIRE sometimes applies multiple complementary licenses to its datasets [1].

As for EML, we investigated last year how it allows licenses to be expressed for datasets. We did discover the license [2] and licenseURL fields; however, they relate to software, not datasets. The EML mailing list was consulted for guidance on this topic, but unfortunately no answer was ever received [3]. Furthermore, as far as I can see, the EML documentation includes no guidance for applying multiple licenses to a dataset (or its components).

Nevertheless, EML does allow one free-text intellectualRights [4] element per dataset, and this is where GBIF expresses a license in its own metadata profile (based on EML). To make the license machine-readable, we use the ulink [5] element inside intellectualRights to store the license name and license URL separately. Since we need to enforce the GBIF licensing policy [6], we only allow a single license to be expressed, though.
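
As a rough sketch of how a consumer could read the license back out, the snippet below parses an intellectualRights fragment and returns the license name and URL from its single ulink. The exact shape of SAMPLE (a para wrapping a ulink with a url attribute and a citetitle child) is my assumption about what such a fragment looks like, and extract_license is a hypothetical helper, not part of the IPT.

```python
import xml.etree.ElementTree as ET

# Assumed shape of a GBIF-profile intellectualRights element.
SAMPLE = """
<intellectualRights>
  <para>This work is licensed under a
    <ulink url="http://creativecommons.org/licenses/by/4.0/legalcode">
      <citetitle>Creative Commons Attribution (CC-BY) 4.0 License</citetitle>
    </ulink>.
  </para>
</intellectualRights>
"""


def extract_license(xml_text):
    """Return (license_name, license_url) from an intellectualRights fragment.

    Raises ValueError unless exactly one ulink is present, mirroring the
    rule that only a single license may be expressed.
    """
    root = ET.fromstring(xml_text.strip())
    links = root.findall(".//ulink")
    if len(links) != 1:
        raise ValueError(f"expected exactly one ulink, found {len(links)}")
    link = links[0]
    title = link.findtext("citetitle", default="")
    # Collapse any whitespace introduced by pretty-printing.
    return " ".join(title.split()), link.get("url")


name, url = extract_license(SAMPLE)
print(name)
print(url)
```

Keeping the URL in an attribute and the name in a child element means the two can be read independently without parsing the free text around them.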

To sum up, it’s great that we now have a clear recommendation from DataCite on how to apply licenses to datasets. To integrate better with DataCite, we would benefit from adopting recommendations in line with theirs.

With kind regards,


[1] https://guidelines.openaire.eu/wiki/OpenAIRE_Guidelines:_For_Data_Archives#Access_rights_and_license_information
[2] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-software.html#license
[3] https://github.com/peterdesmet/awesome-metadata/issues/2#issuecomment-62885616
[4] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-resource.html#intellectualRights
[5] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-text.html#ulink
[6] http://www.gbif.org/terms/licences

On 04 Mar 2015, at 07:01, John Wieczorek <tuco at berkeley.edu> wrote:

> With the latter, I agree. Looking forward to the outcome of the first item. And thanks, Paul, for realizing this and bringing it forward to everyone's attention.
> On Tue, Mar 3, 2015 at 3:30 PM, Tim Robertson <trobertson at gbif.org> wrote:
>> On Tue, 3 Mar 2015 18:55:35 +0100
>> Tim Robertson <trobertson at gbif.org> wrote:
>>> Thanks Paul and Tuco for the feedback - useful food for thought and I
>>> note that the DataCite metadata kernel also uses a list for rights,
>>> not a single statement.  Allowing a collection of rights might be the
>>> most applicable solution here.
>> That still leaves uncertainty: does the list CC-BY, CC-BY-NC mean that
>> the data set in its entirety is licensed under both CC-BY and
>> CC-BY-NC, or that there are parts under one license and parts under the
>> other?  At the data set level, a collection of rights statements could
>> easily be interpreted differently than a statement that rights are
>> stated at the record level.
> Agreed.  We need to understand how this will relate to DataCite, EML etc and other networks we need to integrate with.
> I am not yet sure if DataCite intends to use it to indicate dual licensing (e.g. it can be used in either license) or if it indicates variance.  
> This needs further investigation and we’ll reply.
>>> A point of clarity though - an image extension allows you to provide
>>> metadata about an image that exists on a URL, but the image itself is
>>> not part of the DwC-A / dataset.  One field of the image metadata is
>>> the license applicable for the image but that should not be
>>> transferred to the dataset being put out by the IPT.  Or are we not
>>> in agreement on that?  E.g. the DwC-A can be available under CC0 but
>>> contain links to online images that could be behind some far more
>>> restrictive license.
>> This does seem fairly clear in an Audubon Core or other media extension,
>> where metadata about the rights associated with external media objects
>> are asserted alongside the retrieval locations of those media
>> objects.  It seems less clear when dwc:associatedMedia is present in a
>> flat Occurrence record: is an assertion about the rights made in the
>> dataset level metadata to be taken to extend to the digital object at
>> the other end of a link found in dwc:associatedMedia?
> That is one of the common questions - we always advise folk to use a more expressive model (i.e. extensions) where it is necessary to associate titles, rights statements etc.
> More tomorrow.
> Tim
>>> Thanks,
>>> Tim
>>> On 03 Mar 2015, at 17:13, John Wieczorek <tuco at berkeley.edu> wrote:
>>>> I agree. This is particularly problematic in a resource that
>>>> includes a media extension, where the rights of the core records
>>>> may well differ from that of the media, and where the rights on
>>>> individual media vary within the extension. I think this creates an
>>>> unacceptable barrier. Instead, could the IPT allow a set of rights
>>>> at the dataset level or validate for rights at the record level?
>>>> On Tue, Mar 3, 2015 at 12:12 PM, Paul J. Morris <mole at morris.net> wrote:
>>>> On Tue, 3 Mar 2015 13:18:35 +0100
>>>> Kyle Braak <kbraak at gbif.org> wrote:
>>>>> Best practice is that the license applied to the dataset should
>>>>> not contradict the license(s) applied at the record level.
>>>> I think this imposes a requirement that the dataset level metadata
>>>> can have a value which indicates that rights are described at the
>>>> record level rather than at the dataset level.  Otherwise, it
>>>> imposes a requirement on data providers that they create a unique
>>>> resource for each separate rights statement. This will be a problem
>>>> for any provider who has more than one rights assertion in their
>>>> data, and for intermediate aggregators who are combining data sets
>>>> from downstream providers and passing them on to other aggregators
>>>> upstream.
>>>> -Paul
>>>> --
>>>> Paul J. Morris
>>>> Biodiversity Informatics Manager
>>>> Harvard University Herbaria/Museum of Comparative Zoölogy
>>>> mole at morris.net  AA3SD  PGP public key available
>>>> _______________________________________________
>>>> IPT mailing list
>>>> IPT at lists.gbif.org
>>>> http://lists.gbif.org/mailman/listinfo/ipt
>> -- 
>> Paul J. Morris
>> Biodiversity Informatics Manager
>> Harvard University Herbaria/Museum of Comparative Zoölogy
>> mole at morris.net  AA3SD  PGP public key available
