[COL-Users] Question about Catalogue of Life (COL) Data Package file format

Cheng, Tiejun (NIH/NLM/NCBI) [E] chengt2 at ncbi.nlm.nih.gov
Wed Sep 29 15:29:13 UTC 2021


Dear COL team,

I'm trying to download the ColDP archive of the COL Checklist version 2021-09-21 from this page: https://www.catalogueoflife.org/data/download. However, the actual download link https://download.catalogueoflife.org/col/monthly/2021-09-21_coldp.zip is not working and returns 404. I was able to find other data files here: https://download.catalogueoflife.org/col/. The most recent one, I believe, is https://download.catalogueoflife.org/col/latest_coldp.zip with a timestamp of 2021-08-27 11:00.
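
In case it helps to reproduce the 404, here is a minimal Python sketch of how the two links can be checked. The helper names monthly_coldp_url and http_status are my own, and the URL pattern is only inferred from the single monthly link above, so it may not hold for other releases.

```python
import urllib.error
import urllib.request

# Base URL taken from the links quoted in this message.
BASE = "https://download.catalogueoflife.org/col"


def monthly_coldp_url(release_date: str) -> str:
    """Build the monthly ColDP link for a release date such as '2021-09-21'.

    Assumption: the pattern is inferred from one observed link only.
    """
    return f"{BASE}/monthly/{release_date}_coldp.zip"


def http_status(url: str, timeout: float = 30.0) -> int:
    """Return the HTTP status code of a HEAD request to the given URL."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is carried on the error.
        return err.code
```

Calling http_status(monthly_coldp_url("2021-09-21")) should report 404 as long as the link is broken in the way described above.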

According to the description of the ColDP file format (https://www.catalogueoflife.org/about/colusage#data-formats), the ZIP archive is supposed to bundle the following delimited text files:

  *   Name<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#name>
  *   NameRelation<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#namerelation>
  *   Taxon<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#taxon>
  *   Synonym<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#synonym>
  *   NameUsage<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#nameusage>
  *   Reference<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#reference>
  *   TypeMaterial<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#typematerial>
  *   Distribution<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#distribution>
  *   VernacularName<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#vernacularname>
  *   Media<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#media>
  *   SpeciesInteraction<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#speciesinteraction>
  *   TaxonConceptRelation<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#taxonconceptrelation>
  *   SpeciesEstimate<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#speciesestimate>
  *   Treatments<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#treatment>

However, the latest_coldp.zip I downloaded contains a different set of files.

$ zipinfo -1 latest_coldp.zip

  *   NameUsage.tsv
  *   NameRelation.tsv
  *   TypeMaterial.tsv
  *   VernacularName.tsv
  *   Distribution.tsv
  *   Media.tsv
  *   SpeciesEstimate.tsv
  *   SpeciesInteraction.tsv
  *   TaxonConceptRelation.tsv
  *   Reference.tsv
  *   reference.json (<- this is new)

Some important files are missing:

  *   Name
  *   Taxon
  *   Synonym
  *   Treatments
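
For reference, the comparison above can be reproduced with a short Python sketch. The function names are my own, the expected names are copied as they are spelled in the documentation list quoted earlier, and I am assuming each entity is stored as a delimited text file named after it (for example Name.tsv).

```python
import zipfile
from pathlib import PurePosixPath

# Entity names as spelled in the documentation list quoted in this message.
EXPECTED = [
    "Name", "NameRelation", "Taxon", "Synonym", "NameUsage", "Reference",
    "TypeMaterial", "Distribution", "VernacularName", "Media",
    "SpeciesInteraction", "TaxonConceptRelation", "SpeciesEstimate",
    "Treatments",
]

# Assumption: delimited text files use one of these extensions.
TEXT_SUFFIXES = {".tsv", ".csv", ".txt"}


def missing_entities(member_names, expected=EXPECTED):
    """Return the expected entities with no matching delimited text file."""
    present = {
        PurePosixPath(name).stem.lower()
        for name in member_names
        if PurePosixPath(name).suffix.lower() in TEXT_SUFFIXES
    }
    return [entity for entity in expected if entity.lower() not in present]


def missing_from_archive(zip_path):
    """Open a ColDP ZIP archive and report which expected files are absent."""
    with zipfile.ZipFile(zip_path) as archive:
        return missing_entities(archive.namelist())
```

Running missing_from_archive("latest_coldp.zip") on a local copy of the archive gives the list of absent entities. The match is case-insensitive, and reference.json is ignored because it is not a delimited text file.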

Could you please help me understand this discrepancy? Thanks.

Best regards,

Tiejun CHENG, Ph.D.
---------------------------
National Center for Biotechnology Information (NCBI)
National Library of Medicine (NLM)
National Institutes of Health (NIH)

Bldg. 38A, Rm. 8S816A
8600 Rockville Pike
Bethesda, MD 20894

Phone: 301-402-9527
Email: chengt2 at ncbi.nlm.nih.gov
