I understand. The recommendation is clear and concise, which is good at least. I hope the number affected will be few.
On Fri, Mar 13, 2015 at 9:54 AM, Kyle Braak kbraak@gbif.org wrote:
Hi John, Paul,
Reporting back..
I did some investigation into what DataCite intends a list of rights to be used for.
The DataCite Metadata Working Group has explained that a list of rights is intended to support applying multiple licenses that apply to the dataset as a whole. Here’s the answer provided to me on behalf of their chair:
the Metadata Working Group has discussed your question and would like to say that the intention was (a) to allow for multiple licenses to be applied to a dataset. Moreover, we suggest that if different licenses apply to separable components of a dataset, those (various) components ought to have separate metadata records (and so also separate DOIs).
From the DataCite mailing list, I’m told datasets with multiple licenses are pretty common. For example, OpenAIRE applies multiple *complementary* licenses to their datasets sometimes [1].
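To make the DataCite reading concrete, here is a minimal sketch of a rights list in which every entry applies to the dataset as a whole. The element and attribute names (rightsList, rights, rightsURI) follow the DataCite metadata kernel; the helper name, the license names and the example.org URIs are hypothetical placeholders, and the schema namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET


def build_rights_list(licenses):
    """Build a DataCite-style <rightsList> from (name, uri) pairs.

    Following the Working Group's answer, each entry is taken to apply
    to the whole dataset, not to a separable component of it.
    """
    rights_list = ET.Element("rightsList")
    for name, uri in licenses:
        rights = ET.SubElement(rights_list, "rights", {"rightsURI": uri})
        rights.text = name
    return rights_list


# Hypothetical placeholder licenses, for illustration only.
example = build_rights_list([
    ("License A", "https://example.org/licenses/a"),
    ("License B", "https://example.org/licenses/b"),
])
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Under this reading, a dataset whose components carry different licenses would not get a longer list; each component would get its own metadata record and DOI.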
As for EML, we did an investigation last year into how it allows licenses to be expressed for datasets. We did discover the license [2] and licenseURL fields; however, they relate to software, not datasets. The EML mailing list was consulted for guidance on this topic, but unfortunately no answer was ever received [3]. Furthermore, EML documentation includes no guidance for applying multiple licenses to a dataset (or its components) as far as I can see.
Nevertheless, EML does allow one free-text intellectualRights [4] element per dataset and this is where GBIF expresses a license in its own metadata profile (based on EML). To make the license machine-readable, we use the ulink [5] element inside intellectualRights to store the license name and the license URL separately. Since we need to enforce the GBIF licensing policy [6], we only allow a single license to be expressed, though.
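As a rough illustration of how a consumer could read the license back out, here is a minimal sketch. It assumes the ulink element sits inside a para and carries the license name in a citetitle child and the license URL in a url attribute; that exact layout, the sample fragment and the function name are assumptions for illustration and should be checked against the GBIF metadata profile before relying on them.

```python
import xml.etree.ElementTree as ET

# Assumed layout of the fragment; verify against the GBIF metadata profile.
EML_FRAGMENT = """
<intellectualRights>
  <para>This work is licensed under a
    <ulink url="http://creativecommons.org/licenses/by/4.0/legalcode">
      <citetitle>Creative Commons Attribution (CC-BY) 4.0 License</citetitle>
    </ulink>.
  </para>
</intellectualRights>
"""


def extract_license(fragment):
    """Return (license_title, license_url) from an intellectualRights fragment.

    Raises ValueError unless exactly one ulink is present, mirroring the
    single-license rule described above.
    """
    root = ET.fromstring(fragment)
    links = root.findall(".//ulink")
    if len(links) != 1:
        raise ValueError(f"expected exactly one ulink, found {len(links)}")
    link = links[0]
    title = link.findtext("citetitle", default="").strip()
    return title, link.get("url")


title, url = extract_license(EML_FRAGMENT)
print(title)
print(url)
```

Keeping the name and URL in separate places lets software match on the URL while people read the title.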
To sum up, it’s great we now have a clear recommendation from DataCite on how to apply licenses to datasets. To better integrate with DataCite we will benefit from adopting recommendations in line with theirs.
With kind regards,
Kyle
[1] https://guidelines.openaire.eu/wiki/OpenAIRE_Guidelines:_For_Data_Archives#A...
[2] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-sof...
[3] https://github.com/peterdesmet/awesome-metadata/issues/2#issuecomment-628856...
[4] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-res...
[5] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-tex...
[6] http://www.gbif.org/terms/licences
On 04 Mar 2015, at 07:01, John Wieczorek tuco@berkeley.edu wrote:
With the latter, I agree. Looking forward to the outcome of the first item. And thanks, Paul, for realizing this and bringing it forward to everyone's attention.
On Tue, Mar 3, 2015 at 3:30 PM, Tim Robertson trobertson@gbif.org wrote:
On Tue, 3 Mar 2015 18:55:35 +0100 Tim Robertson trobertson@gbif.org wrote:
Thanks Paul and Tuco for the feedback - useful food for thought and I note that the DataCite metadata kernel also uses a list for rights, not a single statement. Allowing a collection of rights might be the most applicable solution here.
That still leaves uncertainty: does the list CC-BY, CC-BY-NC mean that the data set in its entirety is licensed under both CC-BY and CC-BY-NC, or that there are parts under one license and parts under the other? At the data set level, a collection of rights statements could easily be interpreted differently than a statement that rights are stated at the record level.
Agreed. We need to understand how this will relate to DataCite, EML and the other networks we need to integrate with. I am not yet sure if DataCite intends the list to indicate dual licensing (e.g. the data can be used under either license) or if it indicates variance within the dataset. This needs further investigation and we’ll reply.
A point of clarity though - an image extension allows you to provide metadata about an image that exists on a URL, but the image itself is not part of the DwC-A / dataset. One field of the image metadata is the license applicable for the image but that should not be transferred to the dataset being put out by the IPT. Or are we not in agreement on that? E.g. the DwC-A can be available under CC0 but contain links to online images that could be behind some far more restrictive license.
This does seem fairly clear in an Audubon Core or other media extension, where the rights associated with external media objects are asserted in the metadata alongside the retrieval locations of those media objects. It seems less clear when dwc:associatedMedia is present in a flat Occurrence record: is an assertion about rights made in the dataset-level metadata to be taken to extend to the digital object at the other end of a link found in dwc:associatedMedia?
That is one of the common questions - we always advise folk to use a more expressive model (i.e. extensions) where it is necessary to associate titles, rights statements etc.
More tomorrow. Tim
Thanks, Tim
On 03 Mar 2015, at 17:13, John Wieczorek tuco@berkeley.edu wrote:
I agree. This is particularly problematic in a resource that includes a media extension, where the rights of the core records may well differ from those of the media, and where the rights on individual media vary within the extension. I think this creates an unacceptable barrier. Instead, could the IPT allow a set of rights at the dataset level or validate for rights at the record level?
On Tue, Mar 3, 2015 at 12:12 PM, Paul J. Morris mole@morris.net wrote: On Tue, 3 Mar 2015 13:18:35 +0100 Kyle Braak kbraak@gbif.org wrote:
Best practice is that the license applied to the dataset should not contradict the license(s) applied at the record level.
I think this imposes a requirement that the dataset-level metadata can have a value which indicates that rights are described at the record level rather than at the dataset level. Otherwise, it imposes a requirement on data providers that they create a unique resource for each separate rights statement. This will be a problem for any provider who has more than one rights assertion in their data, and for intermediate aggregators who are combining data sets from downstream providers and passing them on to other aggregators upstream.
-Paul
Paul J. Morris Biodiversity Informatics Manager Harvard University Herbaria/Museum of Comparative Zoölogy mole@morris.net AA3SD PGP public key available _______________________________________________ IPT mailing list IPT@lists.gbif.org http://lists.gbif.org/mailman/listinfo/ipt