[IPT] coreid (lowercase "i") vs coreId in meta.xml - schema validation

Stoner, Dan F dstoner at acis.ufl.edu
Thu Dec 12 16:53:24 UTC 2019

Fantastic, I see the updated schema and the meta.xml validation error has gone away.


Dan Stoner
iDigBio / ACIS Laboratory
University of Florida

From: IPT <ipt-bounces at lists.gbif.org> on behalf of Matthew Blissett <mblissett at gbif.org>
Sent: Thursday, December 12, 2019 11:08 AM
To: ipt at lists.gbif.org
Subject: Re: [IPT] coreid (lowercase "i") vs coreId in meta.xml - schema validation

Hi Dan,

On 12/12/2019 16:30, Stoner, Dan F wrote:
> I found some oddities and I am not exactly sure where to go next.
> We are noticing the following while processing meta.xml in darwin core archives produced by IPT (and other servers):
> Schema validation failed, continuing unvalidated
> XMLSyntaxError: Element '{http://rs.tdwg.org/dwc/text/}coreid': This element is not expected. Expected is ( {http://rs.tdwg.org/dwc/text/}coreId )

There's some background to this on this issue:

The schema itself and the documentation were conflicting, and this was
fixed (in my and Tim's opinion) the wrong way, by changing the schema.

*I've just pushed a commit to fix it the right way,* i.e. reflecting 99%
of actual usage and leaving the schema as it was for almost a decade.

Although we do accept either, we still see only 31 datasets registered
in GBIF with "coreId" rather than "coreid".
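
For consumers that want to accept either spelling when reading meta.xml,
here is a minimal sketch using only the Python standard library. It is an
illustration, not IPT or GBIF code: the function name and the sample
archive are hypothetical, and the namespace is the one shown in the error
message quoted above.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the validation error message in this thread.
DWC_TEXT_NS = "http://rs.tdwg.org/dwc/text/"


def find_core_id_elements(meta_xml: str):
    """Return (spelling, index) for each coreid/coreId element in an extension.

    Hypothetical helper: matches the element name case-insensitively so that
    both "coreid" and "coreId" are accepted.
    """
    root = ET.fromstring(meta_xml)
    found = []
    for ext in root.findall(f"{{{DWC_TEXT_NS}}}extension"):
        for child in ext:
            # Drop the "{namespace}" prefix, keeping only the local name.
            local = child.tag.rsplit("}", 1)[-1]
            if local.lower() == "coreid":
                found.append((local, child.get("index")))
    return found


# Hypothetical, heavily trimmed meta.xml for illustration only.
SAMPLE = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence">
    <files><location>occurrence.txt</location></files>
    <id index="0"/>
  </core>
  <extension rowType="http://rs.gbif.org/terms/1.0/Multimedia">
    <files><location>multimedia.txt</location></files>
    <coreid index="0"/>
  </extension>
</archive>"""

print(find_core_id_elements(SAMPLE))  # [('coreid', '0')]
```

Reporting the spelling that was actually found, rather than normalising it
away, makes it easy to count how many archives use each form.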

> It seems like most consumers are not actually validating meta.xml using the schema, and the producers are generating files out of compliance with the schema.
> Most of the Darwin Core archives I have manually inspected and tried to validate contain meta.xml with lowercase "i" in coreid despite the Standard indicating capital "I" in coreId.
> I poked at the GBIF Darwin Core Validator 3 code repo and found this:
> schema.meta=https://raw.githubusercontent.com/tdwg/dwc/master/standard/documents/text/tdwg_dwc_text.xsd,http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd
> The first link leads to 404, the second leads to an xsd that contains the proper coreId.  So maybe the Validator is not being "strict" about validation against the schema?

I suspect it has been running for so long that, when the validator
process was originally started, both URLs were valid, and the schemas
they served had either coreid in both or one of each spelling.


