Hello, I'm a student at Indiana University, working towards a Master's in Data Science. As part of a class project, we are working with the team from Global Biotic Interactions (GloBI) to measure how much of the known species list their data covers. To that end, we are using GBIF as the reference.
I have worked out the details of navigating your site and extracting information for us to compare against. However, I'm unable to reconcile counts between the download and what I see on the species page.
*Mismatch in species count* Using Kingdom = Archaea and Phylum = Crenarchaeota (http://www.gbif.org/species/79), I navigated to the Occurrences page and downloaded the 3,824 occurrences. When I extract the unique species from this download, I get 59, whereas the species page indicates there should be 68.
I see a similar issue with a) Kingdom = Archaea (the page shows 523 species, but I count 485) and b) Class = Pinopsida (the page shows 2,366 species, but I count 983).
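A minimal sketch of the counting step described above, assuming the download is a tab-delimited file with a `species` column. The column name, the helper `count_unique_species`, and the sample rows are assumptions for illustration, not taken from the actual download:

```python
import csv
import io


def count_unique_species(rows, column="species"):
    """Count distinct non-empty values in one column of the occurrence rows."""
    names = set()
    for row in rows:
        value = (row.get(column) or "").strip()
        if value:  # skip records that carry no species-level name
            names.add(value)
    return len(names)


# Hypothetical sample rows standing in for the downloaded occurrence file.
SAMPLE = (
    "gbifID\tphylum\tspecies\n"
    "1\tCrenarchaeota\tSulfolobus acidocaldarius\n"
    "2\tCrenarchaeota\tSulfolobus acidocaldarius\n"
    "3\tCrenarchaeota\tPyrobaculum aerophilum\n"
    "4\tCrenarchaeota\t\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")
print(count_unique_species(reader))
```

A count like this skips rows whose species field is empty, and it can only include species that have at least one occurrence record in the download.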
I am probably making a small and obvious mistake, but I am not sure how to identify it.
Any guidance would be much appreciated.
Regards, *Srini Anand* Master's in Data Science student srianand@indiana.edu