Re: [IPT] [tdwg-content] Reverting the process of DwC standardization

All,

Is part of the issue being expressed here that the raw ecological data sets we're discussing are small-ish matrices rather than occurrences, with site codes as columns, taxa as rows, and measures of density/abundance as cells (and similarly for environmental variables)? Such structures are often used as input for software that executes, e.g., ordinations, classification & regression trees, and species richness estimates.

The shortcoming of such a structure is the inherently idiosyncratic nature of "site codes": their number varies from one data set to the next, i.e. there is an arbitrary number of columns. I doubt it was ever designed for ease of dataset integration, but rather for ease of computation. Representing this structure as an Event core requires significant transposition, with potential for error if done manually.

Open Refine is one tool that could permit bi-directional transpositions (DwC -> matrix and then matrix -> DwC), but it is still clunky and its accommodation of extensions is virtually non-existent. But perhaps Open Refine recipes and guides get us one step closer to finding a balance between the need for standardized representation & efficient transport (DwC) and end-users who want matrices for ease of computation.

David P. Shorthouse

On Tue, Oct 27, 2015 at 7:36 AM, David Valentim Dias <dvdias@sibbr.gov.br> wrote:
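
For readers following the thread, the bi-directional transposition discussed in the message above (matrix -> DwC and DwC -> matrix) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an Open Refine recipe or anything the IPT provides: the function names are hypothetical, each site code is assumed to map directly to an eventID, the first matrix column is assumed to hold the taxon name, and the quantity type "individuals" is a placeholder.

```python
import csv
import io


def matrix_to_long(matrix_csv, quantity_type="individuals"):
    """Unpivot a taxa-by-site matrix into one record per (site, taxon) cell.

    Assumption: the first column holds the taxon name and every other
    column header is a site code that is reused as the eventID.
    """
    reader = csv.reader(io.StringIO(matrix_csv))
    header = next(reader)
    site_codes = header[1:]
    records = []
    for row in reader:
        taxon, cells = row[0], row[1:]
        for site, value in zip(site_codes, cells):
            if value == "":  # an empty cell means "nothing recorded"; skip it
                continue
            records.append({
                "eventID": site,
                "scientificName": taxon,
                "organismQuantity": value,
                "organismQuantityType": quantity_type,
            })
    return records


def long_to_matrix(records, fill=""):
    """Pivot the long records back into a taxa-by-site matrix (list of rows)."""
    sites = sorted({r["eventID"] for r in records})
    taxa = sorted({r["scientificName"] for r in records})
    cells = {(r["scientificName"], r["eventID"]): r["organismQuantity"]
             for r in records}
    rows = [["taxon"] + sites]
    for taxon in taxa:
        rows.append([taxon] + [cells.get((taxon, s), fill) for s in sites])
    return rows
```

The long form has a fixed set of columns no matter how many sites a data set contains, which is what makes it easier to integrate; the matrix form is what the analysis software expects. Note that the round trip sorts sites and taxa alphabetically, so the original column order is not preserved.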

Hello,

Please see my updated suggestion at https://github.com/gbif/ipt/issues/1165

IMHO Open Refine is not the right tool. One can simply use org.apache.poi in a Java application to read all the information from the different files inside the DwC archive and create an ODS file with the combined matrix, which also takes into account any parentEventID.

I'm sorry I don't have time to do it myself. I hope it's clear.

--
Menashè

2015-10-28 18:57 GMT+01:00 Shorthouse, David <david.shorthouse@umontreal.ca>:
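
The combining step suggested in the message above can be sketched independently of any spreadsheet library. The Python below is a rough illustration, not the proposal from the GitHub issue: it assumes the event and occurrence files have already been read into lists of dicts (for example with `csv.DictReader`), that "taking parentEventID into consideration" means rolling child events up to their top-level parent and summing quantities, and that `organismQuantity` is numeric. The function names are hypothetical, and writing the result out as an ODS file is deliberately left out.

```python
from collections import defaultdict


def top_level_event(event_id, parent_of):
    """Follow parentEventID links upward until an event without a parent."""
    seen = set()  # guards against accidental cycles in the data
    while parent_of.get(event_id) and event_id not in seen:
        seen.add(event_id)
        event_id = parent_of[event_id]
    return event_id


def combined_matrix(events, occurrences):
    """Build a taxa-by-event matrix, rolling child events up to their parent.

    Assumptions: quantities of child events are summed under the top-level
    event, and a taxon with no record for an event is filled with 0.0.
    """
    parent_of = {e["eventID"]: e.get("parentEventID") or None for e in events}
    totals = defaultdict(float)
    for occ in occurrences:
        column = top_level_event(occ["eventID"], parent_of)
        totals[(occ["scientificName"], column)] += float(occ["organismQuantity"])
    columns = sorted({c for _, c in totals})
    taxa = sorted({t for t, _ in totals})
    rows = [["taxon"] + columns]
    for taxon in taxa:
        rows.append([taxon] + [totals.get((taxon, c), 0.0) for c in columns])
    return rows
```

Whether summing is the right way to aggregate child events depends on the sampling protocol; for some measures (densities, for instance) a mean or no roll-up at all would be more appropriate.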
participants (2)
- Menashe' Eliezer
- Shorthouse, David