Dear IPT users,
A release candidate of IPT v2.2 is now available for download [1]. Please note this release is only intended for testing, and should not be installed in production. If you don’t have time to install it, you can always try it out at http://ipt.gbif-uat.org/ [2] (just email me if you want an account).
There have been a lot of changes in this version. Below I list some of the most exciting changes that we’d appreciate your help testing:
3 new major features added:
- It can be configured with a DataCite or EZID account, making it possible to assign DOIs to resources (if you don’t have a DataCite or EZID account, just email me for a test account)
- It can auto-generate a citation for a resource including the version number and DOI, which enables consistent and reliable citation
- It can validate that each occurrence record has a valid basisOfRecord

3 new minor features added:
- Source mappings can be previewed prior to publication
- The resource version pending publication can be previewed prior to publication
- It uses a new version of the GBIF Metadata Profile (v1.1) to publish metadata allowing: i) a user ID (e.g. ORCID) to be entered for each contact, ii) a machine readable license to be stored, iii) support for multiple collections, iv) support for multiple creators, contacts, metadata providers, project personnel

3 notable bugs fixed:
- Cells in Excel now get written as they are displayed in Excel; previously integers were being written as decimals, and formulas were written as unevaluated strings
- A reverse proxy (mod_proxy) can be used regardless of the URL; previously this wouldn’t work if the URL didn’t contain the application context (e.g. /ipt)
- IPT can be deployed using Wildfly8 (JBoss); previously this didn't work because of a dependency incompatibility

Also please be aware that starting with this version, publishers are required to publish their data under a CC0 1.0, CC-BY 4.0 or CC-BY-NC 4.0 declaration. Those who do not choose one of these licenses will not be able to register their resource with GBIF or make it globally discoverable through GBIF.org. For background on GBIF's decision, see here [3].
If no major problems are uncovered during testing, IPT v2.2 will be released in a couple of weeks. In the meantime, volunteer translators are also hard at work, and the user manual is being updated.
Thank you for all your help,
Kyle, on behalf of the IPT development team and the GBIF Secretariat
[1] https://dl.dropboxusercontent.com/u/4552753/ipt-2.2-rc1.war
[2] http://ipt.gbif-uat.org/
[3] http://www.gbif.org/terms/licences
On Thu, 26 Feb 2015 14:41:03 +0100 Kyle Braak kbraak@gbif.org wrote:
Also please be aware that starting with this version, publishers are required to publish their data under a CC0 1.0, CC-BY 4.0 or CC-BY-NC 4.0 declaration.
Kyle,
Will this version of IPT be capable of publishing a resource where different rows in the resource have different rights holders and license terms?
-Paul
Dear Paul,
Yes, you can still map to dcterms:license and dcterms:rightsHolder. What’s new in this version is that the publisher must apply a license to the dataset on the basic metadata page. Best practice is that the license applied to the dataset should not contradict the license(s) applied at the record level.
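As a rough illustration of the best practice above, a publisher could scan the mapped rows before publishing and list any whose dcterms:license differs from the license chosen for the dataset. This is only a sketch under stated assumptions: the function name, the one-dict-per-row layout keyed by term name, and the reading of "contradict" as "differs from" are all hypothetical and do not describe what the IPT itself checks.

```python
from typing import Iterable, List, Mapping, Tuple


def records_with_other_license(
    records: Iterable[Mapping[str, str]], dataset_license: str
) -> List[Tuple[int, str]]:
    """Hypothetical helper (not part of the IPT).

    Returns (row_number, license) pairs for rows whose dcterms:license is
    set and differs from the dataset-level license. Rows without a
    record-level license are assumed to inherit the dataset license.
    """
    mismatches = []
    for row_number, record in enumerate(records, start=1):
        record_license = (record.get("dcterms:license") or "").strip()
        if record_license and record_license != dataset_license:
            mismatches.append((row_number, record_license))
    return mismatches


# Illustrative license URLs and rows; the values are made up for the example.
CC0 = "http://creativecommons.org/publicdomain/zero/1.0/legalcode"
CC_BY = "http://creativecommons.org/licenses/by/4.0/legalcode"

rows = [
    {"occurrenceID": "a", "dcterms:license": CC0, "dcterms:rightsHolder": "Museum A"},
    {"occurrenceID": "b", "dcterms:license": CC_BY, "dcterms:rightsHolder": "Museum B"},
    {"occurrenceID": "c"},  # no record-level license
]

for row_number, record_license in records_with_other_license(rows, CC0):
    print(f"row {row_number}: {record_license} differs from the dataset license")
```

A stricter check might compare how restrictive the two licenses are instead of testing for equality, but that ordering is a policy question this sketch does not try to answer.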
Best regards,
Kyle
On 02 Mar 2015, at 19:12, Paul J. Morris mole@morris.net wrote:
On Thu, 26 Feb 2015 14:41:03 +0100 Kyle Braak kbraak@gbif.org wrote:
Also please be aware that starting with this version, publishers are required to publish their data under a CC0 1.0, CC-BY 4.0 or CC-BY-NC 4.0 declaration.
Kyle,
Will this version of IPT be capable of publishing a resource where different rows in the resource have different rights holders and license terms?
-Paul
Paul J. Morris Biodiversity Informatics Manager Harvard University Herbaria/Museum of Comparative Zoölogy mole@morris.net AA3SD PGP public key available
On Tue, 3 Mar 2015 13:18:35 +0100 Kyle Braak kbraak@gbif.org wrote:
Best practice is that the license applied to the dataset should not contradict the license(s) applied at the record level.
I think this imposes a requirement that the dataset-level metadata can have a value which indicates that rights are described at the record level rather than at the dataset level. Otherwise, it imposes a requirement on data providers that they create a unique resource for each separate rights statement. This will be a problem for any provider who has more than one rights assertion in their data, and for intermediate aggregators who are combining data sets from downstream providers and passing them on to other aggregators upstream.
-Paul
I agree. This is particularly problematic in a resource that includes a media extension, where the rights of the core records may well differ from those of the media, and where the rights on individual media vary within the extension. I think this creates an unacceptable barrier. Instead, could the IPT allow a set of rights at the dataset level or validate for rights at the record level?
_______________________________________________
IPT mailing list
IPT@lists.gbif.org
http://lists.gbif.org/mailman/listinfo/ipt
Thanks Paul and Tuco for the feedback - useful food for thought and I note that the DataCite metadata kernel also uses a list for rights, not a single statement. Allowing a collection of rights might be the most applicable solution here.
A point of clarity though - an image extension allows you to provide metadata about an image that exists on a URL, but the image itself is not part of the DwC-A / dataset. One field of the image metadata is the license applicable for the image but that should not be transferred to the dataset being put out by the IPT. Or are we not in agreement on that? E.g. the DwC-A can be available under CC0 but contain links to online images that could be behind some far more restrictive license.
Thanks, Tim
On Tue, 3 Mar 2015 18:55:35 +0100 Tim Robertson trobertson@gbif.org wrote:
Thanks Paul and Tuco for the feedback - useful food for thought and I note that the DataCite metadata kernel also uses a list for rights, not a single statement. Allowing a collection of rights might be the most applicable solution here.
That still leaves uncertainty: does the list CC-BY, CC-BY-NC mean that the data set in its entirety is licensed under both CC-BY and CC-BY-NC, or that there are parts under one license and parts under the other? At the data set level, a collection of rights statements could easily be interpreted differently than a statement that rights are stated at the record level.
A point of clarity though - an image extension allows you to provide metadata about an image that exists on a URL, but the image itself is not part of the DwC-A / dataset. One field of the image metadata is the license applicable for the image but that should not be transferred to the dataset being put out by the IPT. Or are we not in agreement on that? E.g. the DwC-A can be available under CC0 but contain links to online images that could be behind some far more restrictive license.
This does seem fairly clear in an Audubon Core or other media extension, where the rights associated with external media objects are asserted in the metadata alongside the retrieval locations of those media objects. It seems less clear if dwc:associatedMedia is present in a flat Occurrence record: is an assertion about rights made in the dataset-level metadata to be taken to extend to the digital object at the other end of a link found in dwc:associatedMedia?
-Paul
On Tue, 3 Mar 2015 18:55:35 +0100 Tim Robertson trobertson@gbif.org wrote:
Thanks Paul and Tuco for the feedback - useful food for thought and I note that the DataCite metadata kernel also uses a list for rights, not a single statement. Allowing a collection of rights might be the most applicable solution here.
That still leaves uncertainty: does the list CC-BY, CC-BY-NC mean that the data set in its entirety is licensed under both CC-BY and CC-BY-NC, or that there are parts under one license and parts under the other? At the data set level, a collection of rights statements could easily be interpreted differently than a statement that rights are stated at the record level.
Agreed. We need to understand how this will relate to DataCite, EML, etc., and other networks we need to integrate with. I am not yet sure if DataCite intends to use it to indicate dual licensing (i.e. the dataset can be used under either license) or if it indicates variance within the dataset. This needs further investigation and we’ll reply.
A point of clarity though - an image extension allows you to provide metadata about an image that exists on a URL, but the image itself is not part of the DwC-A / dataset. One field of the image metadata is the license applicable for the image but that should not be transferred to the dataset being put out by the IPT. Or are we not in agreement on that? E.g. the DwC-A can be available under CC0 but contain links to online images that could be behind some far more restrictive license.
This does seem fairly clear in an Audubon Core or other media extension, where the rights associated with external media objects are asserted in the metadata alongside the retrieval locations of those media objects. It seems less clear if dwc:associatedMedia is present in a flat Occurrence record: is an assertion about rights made in the dataset-level metadata to be taken to extend to the digital object at the other end of a link found in dwc:associatedMedia?
That is one of the common questions - we always advise folk to use a more expressive model (i.e. extensions) where it is necessary to associate titles, rights statements etc.
More tomorrow. Tim
With the latter, I agree. Looking forward to the outcome of the first item. And thanks, Paul, for realizing this and bringing it forward to everyone's attention.
Hi John, Paul,
Reporting back..
I did some investigation into what DataCite intends a list of rights to be used for.
The DataCite Metadata Working Group has explained that a list of rights is intended to support applying multiple licenses that apply to the dataset as a whole. Here’s the answer provided to me on behalf of their chair:
the Metadata Working Group has discussed your question and would like to say that the intention was (a) to allow for multiple licenses to be applied to a dataset. Moreover, we suggest that if different licenses apply to separable components of a dataset, those (various) components ought to have separate metadata records (and so also separate DOIs).
From the DataCite mailing list, I’m told datasets with multiple licenses are pretty common. For example, OpenAIRE applies multiple complementary licenses to their datasets sometimes [1].
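To make the DataCite reading concrete, here is a minimal sketch of pulling the rights list out of a DataCite record and treating every entry as a license on the whole dataset, never as a per-part license. The sample record, the function name and the kernel-3 namespace are assumptions chosen for illustration (adjust the namespace to whichever schema version you use), and the CC-BY / CC-BY-NC pairing simply echoes the example raised earlier in this thread.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Assumed namespace for the DataCite metadata kernel, version 3.
NS = {"d": "http://datacite.org/schema/kernel-3"}

# Made-up sample record containing only the rights list.
SAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-3">
  <rightsList>
    <rights rightsURI="http://creativecommons.org/licenses/by/4.0/">CC-BY 4.0</rights>
    <rights rightsURI="http://creativecommons.org/licenses/by-nc/4.0/">CC-BY-NC 4.0</rights>
  </rightsList>
</resource>"""


def dataset_rights(xml_text: str) -> List[Tuple[str, Optional[str]]]:
    """Hypothetical helper: return (label, URI) for every rights entry.

    Following the working group's answer, each entry is taken to apply
    to the dataset as a whole.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for rights in root.findall("d:rightsList/d:rights", NS):
        label = (rights.text or "").strip()
        entries.append((label, rights.get("rightsURI")))
    return entries


for label, uri in dataset_rights(SAMPLE):
    print(f"whole dataset is available under {label} ({uri})")
```

Under this reading, a dataset whose parts carry different licenses would be split into separate records with separate DOIs, as the working group suggests, so nothing in a single rights list needs to say which part a license covers.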
As for EML, we did an investigation last year into how it allows licenses to be expressed for datasets. We did discover the license [2] and licenseURL fields; however, they relate to software, not datasets. The EML mailing list was consulted for guidance on this topic, but unfortunately no answer was ever received [3]. Furthermore, EML documentation includes no guidance for applying multiple licenses to a dataset (or its components) as far as I can see.
Nevertheless, EML does allow one free-text intellectualRights [4] element per dataset, and this is where GBIF expresses a license in its own metadata profile (based on EML). To make the license machine readable/parseable, we use the ulink [5] element inside intellectualRights to store the license and license URL separately. Since we need to enforce the GBIF licensing policy [6], though, we only allow a single license to be expressed.
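For anyone wanting to read the license back out of the metadata, the following sketch shows one way a client might extract the license title and URL from an intellectualRights element that contains a single ulink. The sample XML (including the para and citetitle nesting and the license wording) and the function name are assumptions for illustration only; check a real eml.xml produced by the IPT before relying on this layout.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Assumed shape of the element; the exact nesting and wording may differ.
SAMPLE = """<intellectualRights>
  <para>This work is licensed under a
    <ulink url="http://creativecommons.org/licenses/by/4.0/legalcode"><citetitle>Creative Commons Attribution (CC-BY) 4.0 License</citetitle></ulink>.
  </para>
</intellectualRights>"""


def read_license(xml_text: str) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: return (license title, license URL).

    Raises ValueError unless exactly one ulink is present, mirroring the
    single-license rule described above.
    """
    root = ET.fromstring(xml_text)
    ulinks = root.findall(".//ulink")
    if len(ulinks) != 1:
        raise ValueError(f"expected exactly one ulink, found {len(ulinks)}")
    link = ulinks[0]
    # Join all text inside the ulink so the title is found whether or not
    # it is wrapped in a child element such as citetitle.
    title = "".join(link.itertext()).strip()
    return title, link.get("url")


title, url = read_license(SAMPLE)
print(title)
print(url)
```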
To sum up, it’s great we now have a clear recommendation from DataCite on how to apply licenses to datasets. To better integrate with DataCite we will benefit from adopting recommendations in line with theirs.
With kind regards,
Kyle
[1] https://guidelines.openaire.eu/wiki/OpenAIRE_Guidelines:_For_Data_Archives#A...
[2] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-sof...
[3] https://github.com/peterdesmet/awesome-metadata/issues/2#issuecomment-628856...
[4] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-res...
[5] https://knb.ecoinformatics.org/#external//emlparser/docs/eml-2.1.1/./eml-tex...
[6] http://www.gbif.org/terms/licences
I understand. The recommendation is clear and concise, which is good at least. I hope the number affected will be few.
Participants (4):
- John Wieczorek
- Kyle Braak
- Paul J. Morris
- Tim Robertson