[COL-Users] Question about Catalogue of Life (COL) Data Package file format
Cheng, Tiejun (NIH/NLM/NCBI) [E]
chengt2 at ncbi.nlm.nih.gov
Wed Sep 29 15:29:13 UTC 2021
Dear COL team,
I'm trying to download the ColDP Archive of THE COL CHECKLIST VERSION 2021-09-21 on this page: https://www.catalogueoflife.org/data/download. However, the actual download link https://download.catalogueoflife.org/col/monthly/2021-09-21_coldp.zip is not working and returns 404. I am able to find another data file here: https://download.catalogueoflife.org/col/. The most recent one I believe is https://download.catalogueoflife.org/col/latest_coldp.zip with a timestamp of 2021-08-27 11:00.
According to the description of the ColDP file format here: https://www.catalogueoflife.org/about/colusage#data-formats, the ZIP archive is supposed to bundle the following delimited text files:
* Name<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#name>
* NameRelation<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#namerelation>
* Taxon<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#taxon>
* Synonym<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#synonym>
* NameUsage<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#nameusage>
* Reference<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#reference>
* TypeMaterial<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#typematerial>
* Distribution<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#distribution>
* VernacularName<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#vernacularname>
* Media<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#media>
* SpeciesInteraction<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#speciesinteraction>
* TaxonConceptRelation<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#taxonconceptrelation>
* SpeciesEstimate<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#speciesestimate>
* Treatments<https://github.com/CatalogueOfLife/coldp/blob/master/README.md#treatment>
However, the latest_coldp.zip I downloaded contains a different set of files:
$ zipinfo -1 latest_coldp.zip
* NameUsage.tsv
* NameRelation.tsv
* TypeMaterial.tsv
* VernacularName.tsv
* Distribution.tsv
* Media.tsv
* SpeciesEstimate.tsv
* SpeciesInteraction.tsv
* TaxonConceptRelation.tsv
* Reference.tsv
* reference.json (<- this is new)
Some important files are missing.
* Name
* Taxon
* Synonym
* Treatments
Could you please help me understand this discrepancy? Thanks.
Best regards,
Tiejun CHENG, Ph.D.
---------------------------
National Center for Biotechnology Information (NCBI)
National Library of Medicine (NLM)
National Institutes of Health (NIH)
Bldg. 38A, Rm. 8S816A
8600 Rockville Pike
Bethesda, MD 20894
Phone: 301-402-9527
Email: chengt2 at ncbi.nlm.nih.gov