Dear Tiejun,
thanks for the notification. I have fixed the download links in the portal now.
ColDP comes in two flavors. The simple version merges Taxon, Synonym and Name into a single NameUsage entity similar to how DwC does it: https://github.com/CatalogueOfLife/coldp/blob/master/README.md#nameusage
This is how we prepare the COL downloads. The other one splits the name usages into 3 distinct entities.
Treatments are missing because we don't have them in COL currently. They will be added once we deal with Plazi articles over the next month.
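In case it helps to check which flavor a given archive uses, here is a minimal Python sketch. It is not an official COL tool: the function names `entity_names` and `coldp_flavor` are made up for illustration, and it assumes that each entity is stored as one file named after the entity (e.g. NameUsage.tsv), as in the file listing quoted below.

```python
import io
import zipfile
from pathlib import PurePosixPath


def entity_names(zip_file):
    """Return the set of file stems found in a ZIP archive.

    Assumption: each ColDP entity is one file named after the entity,
    e.g. "NameUsage.tsv" -> "NameUsage".
    """
    with zipfile.ZipFile(zip_file) as zf:
        return {
            PurePosixPath(name).stem
            for name in zf.namelist()
            if not name.endswith("/")  # skip directory entries
        }


def coldp_flavor(entities):
    """Classify an archive as 'combined', 'split' or 'unknown'.

    'combined' means a single NameUsage entity; 'split' means separate
    Name, Taxon and Synonym entities. Matching is case-insensitive.
    """
    lowered = {entity.lower() for entity in entities}
    if "nameusage" in lowered:
        return "combined"
    if {"name", "taxon", "synonym"} <= lowered:
        return "split"
    return "unknown"


if __name__ == "__main__":
    # Build a tiny in-memory archive so the example runs without a download.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in ("NameUsage.tsv", "Reference.tsv", "reference.json"):
            zf.writestr(name, "")
    buf.seek(0)
    print(coldp_flavor(entity_names(buf)))
```

For a real download, pass the path of the ZIP file to `entity_names` instead of the in-memory buffer.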
Best, Markus
On 29. Sep 2021, at 17:29, Cheng, Tiejun (NIH/NLM/NCBI) [E] chengt2@ncbi.nlm.nih.gov wrote:
Dear COL team,
I’m trying to download the ColDP Archive of THE COL CHECKLIST VERSION 2021-09-21 on this page: https://www.catalogueoflife.org/data/download. However, the actual download link https://download.catalogueoflife.org/col/monthly/2021-09-21_coldp.zip is not working and returns 404. I am able to find another data file here: https://download.catalogueoflife.org/col/. The most recent one, I believe, is https://download.catalogueoflife.org/col/latest_coldp.zip with a timestamp of 2021-08-27 11:00.
According to the description of the ColDP file format here: https://www.catalogueoflife.org/about/colusage#data-formats, the ZIP archive is supposed to bundle the following delimited text files:
• Name
• NameRelation
• Taxon
• Synonym
• NameUsage
• Reference
• TypeMaterial
• Distribution
• VernacularName
• Media
• SpeciesInteraction
• TaxonConceptRelation
• SpeciesEstimate
• Treatments
However, in the latest_coldp.zip I downloaded, I got something different.
$ zipinfo -1 latest_coldp.zip
• NameUsage.tsv
• NameRelation.tsv
• TypeMaterial.tsv
• VernacularName.tsv
• Distribution.tsv
• Media.tsv
• SpeciesEstimate.tsv
• SpeciesInteraction.tsv
• TaxonConceptRelation.tsv
• Reference.tsv
• reference.json (<- this is new)
Some important files are missing:
• Name
• Taxon
• Synonym
• Treatments
Could you please help me understand this? Thanks.
Best regards,
Tiejun CHENG, Ph.D.
National Center for Biotechnology Information (NCBI) National Library of Medicine (NLM) National Institutes of Health (NIH)
Bldg. 38A, Rm. 8S816A 8600 Rockville Pike Bethesda, MD 20894
Phone: 301-402-9527 Email: chengt2@ncbi.nlm.nih.gov
COL-Users mailing list COL-Users@lists.gbif.org https://lists.gbif.org/mailman/listinfo/col-users