Re: [IPT] GBIF Integrated Publishing Toolkit
[forwarding to list again as the message was rejected]
Retrieving a full record from the network is a basic, often-needed feature. But there might be scenarios where people do not want to run a server at all and prefer simply to publish their dataset somewhere else. In that case, "somewhere else" could be responsible for exposing the records. An IPT used for (national) hosting might be such a solution (given that record resolution is implemented). But one could also imagine GBIF serving the full, original record centrally for those cases. That would of course require GBIF to cache the full record as it came in, e.g. the full ABCD or a DwC record with extensions potentially unknown to us. We have had that on our wishlist for quite some time already.
Is it correct that both solutions, IPT record resolution and central GBIF resolution, would work for the BioCASe portals?
Markus
On Sep 15, 2010, at 10:48, Guentsch, Anton wrote:
Hi Markus,
I am not so familiar with the IPT, but maybe this helps as a clarification: some of the "special interest networks" using GBIF/BioCASE technology rely heavily on the ability of the provider software to return a full data record (e.g. ABCD and its extensions) for a given ID. This is particularly essential for the DNA-Bank-Network, which uses ABCD plus the DNA extension. A second example is the emerging GBIF-D soil zoology network. Personally, I think these networks have great potential for demonstrating the capabilities of the GBIF infrastructure, being based on high-quality data subsets and on interfaces which directly serve a certain user community.
For the IPT, I think that at the very least we would need some kind of interface that makes every record addressable. Preferably, the record should be a full record in the sense that DNA extensions etc. are somehow represented.
Best regards, Anton
-----Original Message----- From: "Markus Döring (GBIF)" [mailto:mdoering@gbif.org] Sent: Wednesday, 15 September 2010 10:24 To: Holetschek, Jörg Cc: Tim Robertson (trobertson@gbif.org); ipt@lists.gbif.org; Guentsch, Anton; Berendsohn, Walter G. Subject: Re: [IPT] GBIF Integrated Publishing Toolkit
Jörg, currently it is not planned to provide single-record access of any kind, as we removed all record storage (e.g. the previous H2 database) to increase performance and reduce resource requirements. But if that is indeed needed, we could of course add it back in. Would a simple link to a record in XML or RDF, based on its ID, be sufficient?
Markus
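Such a per-record link could stay very simple. A minimal sketch of what the discussion seems to point at; the URL pattern, function names, and example IDs below are illustrative assumptions, not an actual IPT API:

```python
from xml.etree import ElementTree as ET

DWC_NS = "http://rs.tdwg.org/dwc/terms/"


def record_url(base, resource, record_id, fmt="xml"):
    # Build a hypothetical per-record deep link; the pattern is made up
    # for illustration, not taken from the IPT.
    return f"{base}/resource/{resource}/record/{record_id}.{fmt}"


def record_to_xml(record):
    # Serialise a flat record dict into a minimal DwC-style XML fragment,
    # qualifying each field with the Darwin Core terms namespace.
    root = ET.Element("record")
    for term, value in record.items():
        ET.SubElement(root, f"{{{DWC_NS}}}{term}").text = value
    return ET.tostring(root, encoding="unicode")


url = record_url("http://ipt.example.org", "specimens", "B-10-123456")
xml = record_to_xml({"occurrenceID": "B-10-123456",
                     "scientificName": "Abies alba"})
```

The same resolver could serve RDF instead of XML by switching on the `fmt` suffix or on HTTP content negotiation.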
On Sep 15, 2010, at 10:11, Holetschek, Jörg wrote:
Hi Tim,
the removal of the TAPIR interface will cause problems for all of our
BioCASe portals: to show the complete record details, they issue a DiGIR/BioCASe/TAPIR request, which would then fail.
Will there be an alternative way to get the DwC XML document, for
example via a deep link?
Cheers from Berlin, Jörg
-----Original Message----- From: trobertson@gbif.org [mailto:trobertson@gbif.org] Sent: Monday, 13 September, 2010 16:11 To: Berendsohn, Walter G. Subject: GBIF Integrated Publishing Toolkit
[FOR INFORMATION]
Dear Node Managers and Data Publishers,
The objective of this communication is to update you on the current
situation of the GBIF Integrated Publishing Toolkit (IPT). The IPT is a tool providing publishing capabilities for primary biodiversity occurrence data, taxonomic checklists and the associated metadata for these resource types.
Since its first introduction in March 2009, the IPT has been used
with some success by a limited number of users to publish primary biodiversity data into the GBIF network, along with descriptive metadata at the dataset level.
The GBIF Secretariat (GBIFS) has solicited and received a great deal
of feedback during the first year of IPT use, and would like to thank all those who provided it. The feedback overwhelmingly confirms that the concept of the IPT is sound. However, it has also made clear that the tool is over-specified for the majority of user needs, is not yet sufficiently robust, is too slow in operation, and (due to the excessive functionality) has high server requirements, creating a barrier to adoption for many. In addition, the feedback has made clear that our communications surrounding the status of IPT releases have been too infrequent and unclear about which release candidate we were offering and which functionalities it would or would not include, relative to user needs and expectations.
Based upon the extensive feedback, GBIFS has revised our planned
development roadmap for the IPT and is currently performing a major refactoring effort to:

- Reduce the server requirements significantly
- Increase the data import performance by removing the embedded database
- Remove the dependency on heavy libraries and tools such as GeoServer
- Remove all data interfaces that are not necessary for data publication through GBIF
- Improve the robustness of the tool
When complete, the refactored IPT will offer, through an intuitive
interface:

- Authoring of metadata according to the GBIF metadata profile
- Import and mapping of checklist and occurrence data, either from file upload or by connection to a database
- Registration to and publication onto the GBIF network
- Import and mapping of data to DarwinCore and DarwinCore extensions
- Output formats of EML (2.1.0) metadata and DarwinCore Archive
- Improved customisation options for the "About this IPT"
- Improved GBIF Registry integration
- Improved DarwinCore extension and vocabulary organization
- Improved management of the organisations to which the IPT and its resources are related, to enable co-hosting capabilities
- Enhanced dataset metadata authoring
Whilst this is a reduction in existing functionality, it has been
deemed necessary to ensure the IPT meets the core requirements expressed by the user community. The following features will be removed and only reintroduced, if necessary, through consultation with the IPT community:

- TAPIR interface
- TCS output format
- Search and browse web interface
- OGC web services
All IPT development efforts are now focused on this revised roadmap,
and it is anticipated that early testing of the new codebase will commence in October 2010 with the involvement of willing Nodes. The release of a fully stable IPT is targeted for the end of 2010. However, this will only occur if the testing community is satisfied that the product is 'release-ready' together with the required user manual and technical documentation. Should that not be the case, as it may take more time to implement and test the improvements outlined above, you will be informed as soon as possible.
Further development of the IPT beyond this release will only be
undertaken after further scoping with the GBIF community. An area for further IPT discussion will be set up on the GBIF community website (http://community.gbif.org).
If you have any questions or would like to know more about the GBIF
IPT and/or revised process, feel free to contact:
Tim Robertson GBIF Information Systems Architect trobertson@gbif.org
With best regards,
Tim Robertson GBIF Secretariat
IPT mailing list IPT@lists.gbif.org http://lists.gbif.org/mailman/listinfo/ipt
Hi Markus,
sure - the portals don't care where the record details are retrieved from, as long as they are available, ideally through a standard protocol request or, less ideally, through a custom link. Both the protocol and the access point should be stored in the Index, but I guess that's not the big issue.
Cheers, Jörg
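The indexed protocol and access point would then be enough for a portal to choose how to fetch the details. A minimal sketch, assuming made-up protocol names and URL shapes rather than the actual BioCASe portal logic:

```python
def detail_link(protocol, access_point, record_id):
    # Choose a retrieval strategy from the indexed protocol; all protocol
    # names and URL shapes here are illustrative assumptions.
    if protocol in ("DIGIR", "BIOCASE", "TAPIR"):
        # Wrapper protocols take the record ID as a query filter.
        return f"{access_point}?op=search&recordId={record_id}"
    if protocol == "DEEPLINK":
        # A plain per-record link, as proposed for the refactored IPT;
        # the access point carries an {id} placeholder.
        return access_point.replace("{id}", record_id)
    raise ValueError(f"no retrieval strategy for protocol {protocol!r}")


link = detail_link("DEEPLINK",
                   "http://ipt.example.org/record/{id}.xml",
                   "B-10-123456")
```

Either way, the portal code stays the same: look up the protocol and access point in the Index, build the link, fetch the record.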