[COL-Users] Question about Catalogue of Life (COL) Data Package file format
Cheng, Tiejun (NIH/NLM/NCBI) [E]
chengt2 at ncbi.nlm.nih.gov
Wed Sep 29 19:59:16 UTC 2021
Hi Markus,
Thanks again for the info. Sorry about asking the same question twice ;-)
Best,
Tiejun
-----Original Message-----
From: Markus Döring <mdoering at gbif.org>
Sent: Wednesday, September 29, 2021 3:50 PM
To: Catalogue of Life user announcements and discussion <col-users at lists.gbif.org>
Subject: Re: [COL-Users] Question about Catalogue of Life (COL) Data Package file format
Dear Tiejun,
thanks for the notification. I have fixed the download links in the portal now.
ColDP comes in two flavors. The simple version merges Taxon, Synonym and Name into a single NameUsage entity similar to how DwC does it: https://github.com/CatalogueOfLife/coldp/blob/master/README.md#nameusage
This is how we prepare the COL downloads.
The other one splits the name usages into 3 distinct entities.
Treatments are missing because we don't have them in COL currently. This will be added once we deal with Plazi articles over the next month.
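
As a rough illustration of the two flavors described above, the sketch below guesses which flavor an archive uses from its file listing. The entity names come from this thread; the function names (coldp_flavor, flavor_of_archive) and the detection rule itself are assumptions for illustration, not part of the ColDP specification or any COL tooling.

```python
import zipfile
from pathlib import PurePosixPath

# Entity names as mentioned in this thread; the rule below is an assumption.
SPLIT_ENTITIES = {"name", "taxon", "synonym"}
MERGED_ENTITY = "nameusage"


def coldp_flavor(filenames):
    """Guess the ColDP flavor from a list of archive member names."""
    # Compare on lower-cased file stems so "NameUsage.tsv" matches "nameusage".
    stems = {PurePosixPath(name).stem.lower() for name in filenames}
    if MERGED_ENTITY in stems:
        return "merged"   # Name, Taxon and Synonym folded into NameUsage
    if SPLIT_ENTITIES <= stems:
        return "split"    # three distinct entities
    return "unknown"


def flavor_of_archive(path):
    """Hypothetical helper: apply coldp_flavor to a ZIP file on disk."""
    with zipfile.ZipFile(path) as archive:
        return coldp_flavor(archive.namelist())


# File listing quoted later in this thread for latest_coldp.zip.
listing = [
    "NameUsage.tsv", "NameRelation.tsv", "TypeMaterial.tsv",
    "VernacularName.tsv", "Distribution.tsv", "Media.tsv",
    "SpeciesEstimate.tsv", "SpeciesInteraction.tsv",
    "TaxonConceptRelation.tsv", "Reference.tsv", "reference.json",
]
print(coldp_flavor(listing))
```

For a downloaded file, flavor_of_archive("latest_coldp.zip") would apply the same check to the archive itself.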
Best,
Markus
> On 29. Sep 2021, at 17:29, Cheng, Tiejun (NIH/NLM/NCBI) [E] <chengt2 at ncbi.nlm.nih.gov> wrote:
>
> Dear COL team,
>
> I’m trying to download the ColDP Archive of THE COL CHECKLIST VERSION 2021-09-21 on this page: https://www.catalogueoflife.org/data/download. However, the actual download link https://download.catalogueoflife.org/col/monthly/2021-09-21_coldp.zip is not working and returns 404. I am able to find another data file here: https://download.catalogueoflife.org/col/. The most recent one I believe is https://download.catalogueoflife.org/col/latest_coldp.zip with a timestamp of 2021-08-27 11:00.
>
> According to the description of the ColDP file format (https://www.catalogueoflife.org/about/colusage#data-formats), the ZIP archive is supposed to bundle the following delimited text files:
> • Name
> • NameRelation
> • Taxon
> • Synonym
> • NameUsage
> • Reference
> • TypeMaterial
> • Distribution
> • VernacularName
> • Media
> • SpeciesInteraction
> • TaxonConceptRelation
> • SpeciesEstimate
> • Treatments
>
> However, in the latest_coldp.zip I downloaded, I got something different.
>
> $ zipinfo -1 latest_coldp.zip
> • NameUsage.tsv
> • NameRelation.tsv
> • TypeMaterial.tsv
> • VernacularName.tsv
> • Distribution.tsv
> • Media.tsv
> • SpeciesEstimate.tsv
> • SpeciesInteraction.tsv
> • TaxonConceptRelation.tsv
> • Reference.tsv
> • reference.json (<- this is new)
>
> Some important files are missing.
> • Name
> • Taxon
> • Synonym
> • Treatments
>
> Could you please help me understand this? Thanks.
>
> Best regards,
>
> Tiejun CHENG, Ph.D.
> ---------------------------
> National Center for Biotechnology Information (NCBI)
> National Library of Medicine (NLM)
> National Institutes of Health (NIH)
>
> Bldg. 38A, Rm. 8S816A
> 8600 Rockville Pike
> Bethesda, MD 20894
>
> Phone: 301-402-9527
> Email: chengt2 at ncbi.nlm.nih.gov
>
> _______________________________________________
> COL-Users mailing list
> COL-Users at lists.gbif.org
> https://lists.gbif.org/mailman/listinfo/col-users