<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=iso-8859-1" http-equiv=Content-Type>
<META name=GENERATOR content="MSHTML 8.00.7600.16625"></HEAD>
<BODY
style="WORD-WRAP: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space">
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Verdana><SPAN
class=842375711-15092010>Tim,</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Verdana><SPAN
class=842375711-15092010></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Verdana><SPAN
class=842375711-15092010>How will the GBIF Indexer store the original DwC
record? Is it kept in a text field in the index database, or in the original
DwC archive?</SPAN></FONT></DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Verdana><SPAN
class=842375711-15092010></SPAN></FONT> </DIV>
<DIV dir=ltr align=left><FONT color=#0000ff size=2 face=Verdana><SPAN
class=842375711-15092010>Jörg</SPAN></FONT></DIV><BR>
<DIV dir=ltr lang=de class=OutlookMessageHeader align=left>
<HR tabIndex=-1>
<FONT size=2 face=Tahoma><B>From:</B> ipt-bounces@lists.gbif.org
[mailto:ipt-bounces@lists.gbif.org] <B>On Behalf Of </B>Tim Robertson
(GBIF)<BR><B>Sent:</B> Wednesday, 15 September 2010 13:29<BR><B>To:</B>
ipt@lists.gbif.org<BR><B>Subject:</B> Re: [IPT] Functionality request: ADMIN
checking data before GBIF registration<BR></FONT><BR></DIV>
<DIV></DIV>Thanks all for the comments, which I will try and collate here with
some resolutions:
<DIV><BR></DIV>
<DIV><B>Proposed as resolved:</B></DIV>
<DIV>a) The preference would be to allow the ADMIN to grant registration
privileges to individual MANAGERS.</DIV>
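<DIV><BR></DIV>
<DIV>A rough sketch of how (a) might behave, purely for discussion. This is not IPT code: the names <CODE>User</CODE>, <CODE>grant_registration</CODE> and <CODE>may_register_with_gbif</CODE> are made up for illustration, and it assumes that MANAGERS start without the privilege and that an ADMIN can always register.</DIV>
<PRE>
```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical IPT user; the role names follow this thread."""
    name: str
    role: str                   # "ADMIN" or "MANAGER"
    can_register: bool = False  # assumption: managers start without the privilege


class PermissionDenied(Exception):
    """Raised when a non-ADMIN tries to change registration privileges."""


def grant_registration(admin: User, manager: User, allowed: bool) -> None:
    """Let an ADMIN switch the registration privilege of one MANAGER on or off."""
    if admin.role != "ADMIN":
        raise PermissionDenied("only an ADMIN may change registration privileges")
    manager.can_register = allowed


def may_register_with_gbif(user: User) -> bool:
    """Assumption: an ADMIN may always register; a MANAGER only if granted."""
    return user.role == "ADMIN" or user.can_register
```
</PRE>
<DIV>With this shape, a "trusted" MANAGER can work autonomously while a test user in the same IPT instance cannot push a resource through to GBIF.</DIV>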
<DIV><BR></DIV>
<DIV><B>Outstanding issues:</B></DIV>
<DIV>b) Visualisations are important and will need discussion, and potentially
the development of further IPT modules or the deployment of external services. </DIV>
<DIV>I know of one group already considering the development of DwC-A
visualisations, and technologies like Google Fusion Tables make this kind of
thing trivial. </DIV>
<DIV><BR></DIV>
<DIV>
<DIV>c) Record resolution has been indicated as important. This can be
achieved in one of two ways:</DIV>
<DIV>&nbsp;&nbsp;&gt; Implementation in the IPT, which requires research into
technologies that perform satisfactorily.</DIV>
<DIV>&nbsp;&nbsp;&nbsp;&nbsp;Potential technologies might be:</DIV>
<DIV>&nbsp;&nbsp;&nbsp;&nbsp;- Berkeley DB</DIV>
<DIV>&nbsp;&nbsp;&nbsp;&nbsp;- A relational database, such as MySQL or H2</DIV>
<DIV>&nbsp;&nbsp;&nbsp;&nbsp;- Lucene indexing</DIV>
<DIV>&nbsp;&nbsp;&gt; Reliance on a "stable cache" for record serving</DIV>
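<DIV><BR></DIV>
<DIV>To make the first option concrete, here is a minimal sketch of record resolution backed by a relational database. It uses Python's built-in <CODE>sqlite3</CODE> module only as a stand-in for MySQL or H2; the table layout and the function names are assumptions for illustration, and each record is assumed to carry a unique <CODE>occurrenceID</CODE>.</DIV>
<PRE>
```python
import json
import sqlite3


def build_record_store(records):
    """Load DwC records into an in-memory table keyed by occurrenceID.

    Assumption: every record is a dict with a unique "occurrenceID".
    The whole record is stored as JSON text next to its identifier.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE record (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO record (id, body) VALUES (?, ?)",
        [(r["occurrenceID"], json.dumps(r)) for r in records],
    )
    conn.commit()
    return conn


def resolve_record(conn, record_id):
    """Return the stored record for record_id, or None if it is unknown."""
    row = conn.execute(
        "SELECT body FROM record WHERE id = ?", (record_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```
</PRE>
<DIV>The primary key gives an indexed lookup per identifier; whether that performs satisfactorily at the size of real datasets is exactly the research question above.</DIV>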
<DIV><BR></DIV>
<DIV>The first release (for test purposes) of the revised IPT software will not
have record-level serving. While this is being developed, I would like to
ask people to start discussing what kind of record-level serving is truly a
requirement in the IPT, as opposed to a "nice to have". We support DwC-A
in the GBIF portal, and the intention is simply to re-serve the record that came
from the DwC-A directly, unless the record indicates there is further
information at a URL (e.g. if the record identifier is an LSID). Would
this strategy not be suitable for the likes of the BioCASe portals, as often
there is no further information to redirect to? In the case of the IPT,
there is no extra information, and I propose that, should the source be a
DwC-A, individual records be cached in the harvesting portal. With this
approach, there would be no individual record serving needs in the IPT.</DIV>
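<DIV><BR></DIV>
<DIV>The serving rule described above could be sketched roughly as follows. This is an illustration only, not portal code: the prefix check is a simplistic assumption about what counts as a resolvable identifier, and <CODE>serve_record</CODE> and the dict-based cache are hypothetical.</DIV>
<PRE>
```python
# Assumption: an identifier is treated as resolvable if it looks like a URL or an LSID.
RESOLVABLE_PREFIXES = ("http://", "https://", "urn:lsid:")


def is_resolvable(identifier: str) -> bool:
    """True if the identifier suggests further information exists at the source."""
    return identifier.lower().startswith(RESOLVABLE_PREFIXES)


def serve_record(identifier: str, cache: dict):
    """Decide how a harvesting portal answers a request for one record.

    Returns a pair (action, payload):
      ("redirect", identifier)  point the user at the authoritative source
      ("cached", record)        re-serve the record harvested from the DwC-A
      ("not_found", None)       the record is not in the cache
    """
    if is_resolvable(identifier):
        return ("redirect", identifier)
    if identifier in cache:
        return ("cached", cache[identifier])
    return ("not_found", None)
```
</PRE>
<DIV>Under this rule the IPT itself never has to answer a request for a single record.</DIV>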
<DIV><BR></DIV>
<DIV>Ultimately, we might consider aiming for data owners offering single
records on a resolvable URL, and conforming to Linked/Open Data requirements,
along with a DwC-A effectively providing a single "index" view of the
dataset. The DwC-A records would reference the originals by resolvable ID
so any search system would always be able to point back to the authoritative
source. This would effectively be distributed indexing, and not dissimilar
to the sharing of sitemaps, but with extra information to enable better
discovery. </DIV>
<DIV><BR></DIV>
<DIV>Thank you all for this feedback, and please correct any misunderstandings
on my part.</DIV>
<DIV><BR></DIV>
<DIV>Tim</DIV>
<DIV><BR></DIV>
<DIV><BR>
<DIV>
<DIV>On Sep 15, 2010, at 12:03 PM, Mihail-Constantin Carausu wrote:</DIV><BR
class=Apple-interchange-newline>
<BLOCKQUOTE type="cite">
<DIV>Dear Tim,<BR><BR>I think the approach proposed by the development team is a
workable solution that covers both requirements at this stage.<BR>However, I
think Hannu (and I) had in mind a kind of "basket of approvals"
functionality in the administration section of the Admin (the owner of the
provider): when a Manager has published a dataset through the IPT, this
automatically triggers a request for approval, or submits a yes/no
event, in the Admin's administration section. The Admin must then actively
intervene and approve the dataset publication (e.g. by checking an
"Approved" check box in the basket-of-approvals list of events/datasets
in the administration section), at the very latest when GBIF
is about to start indexing it, or something like that. Without this
final approval the dataset will still be published and visible through the
IPT, but not visible/searchable on the GBIF data portal. This approach does
not necessarily contradict the Manager's ability to publish datasets
autonomously within the IPT; it only puts this ability under the
control of the central administration section whenever the dataset has to go
to the GBIF data portal.<BR>I think both approaches have obvious
advantages and disadvantages, and neither provides 100% protection
against a (test) user publishing something odd.<BR>I have a small
question regarding the development team's proposed solution: is it
not possible for the central Admin to enable the publishing ability for
some "trusted" managers and disable it for others inside the same
instance of the IPT?<BR><BR>I have just seen that Hannu's new message has
arrived; sorry if our messages have crossed, but I will send
this anyhow.<BR><BR>Best
regards,<BR>Mihail<BR><BR>---------------------------- <BR>Mihail
Carausu<BR>MSc.Eng., Informatics Manager<BR>Danish Biodiversity Information
Facility (DanBIF)<BR>--------------------------------------------
<BR><BR><BR>-----Original Message-----<BR>From: ipt-bounces@lists.gbif.org [<A
href="mailto:ipt-bounces@lists.gbif.org">mailto:ipt-bounces@lists.gbif.org</A>]
On<BR>Behalf Of Tim Robertson (GBIF)<BR>Sent: 15. september 2010 09:43<BR>To:
<A href="mailto:ipt@lists.gbif.org">ipt@lists.gbif.org</A><BR>Subject: [IPT]
Functionality request: ADMIN checking data before GBIF
registration<BR><BR>Hi all,<BR><BR>Hannu has raised a request
for the following to be satisfied by the IPT:<BR><SPAN
style="WHITE-SPACE: pre" class=Apple-tab-span></SPAN>"- Publishing a resource
must be accepted by the owner of the <BR>provider. It has happened
that a test user publishes something odd <BR>which goes all the way to
the data portal without nobody controlling <BR>it."<BR><BR>This is a
contradiction to the requests of others, and specifically <BR>those
wishing to promote basic "data hosting centers", who request <BR>that a
data MANAGER should be able to work autonomously.<BR><BR>After discussion with
the developers the proposal is to implement the <BR>following, which we
hope satisfies both requirements:<BR>In the Administration section, an ADMIN
can choose to enable or <BR>disable the ability for MANAGERS to register
resources with GBIF. By <BR>default MANAGERS can register a
resource, but an ADMIN can disable <BR>this through this check
box.<BR><BR>If anyone has any concerns or comments on this approach, please
can <BR>you raise them on this list?<BR><BR>Many
thanks,<BR>Tim<BR><BR></DIV></BLOCKQUOTE></DIV><BR></DIV></DIV></BODY></HTML>