Dear all, I'm still stuck with my dataset of more than 20 million occurrences.
I figured out that the issue was due to the size of the zip file: it is too big for the ZipFile Java API. With a little trick, though, I was able to create the data resource: I ingested a DwC Archive whose occurrence and verbatim files contained just 15 occurrences, then replaced those files with the real ones, and it seems to work.
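In case it's useful, this is roughly the kind of in-place swap I did, as a Java sketch using the zip filesystem provider (the entry names occurrence.txt and verbatim.txt are assumptions based on the usual DwC-A layout, and dwca-small.zip is a placeholder path):

    import java.nio.file.*;

    public class SwapDwcaFiles {
        public static void main(String[] args) throws Exception {
            // The small 15-record archive registered as the data resource (placeholder name)
            Path archive = Paths.get("dwca-small.zip");
            // Open the zip as a virtual filesystem so entries can be replaced in place
            try (FileSystem zipFs = FileSystems.newFileSystem(archive, (ClassLoader) null)) {
                // Overwrite the tiny files with the full-size ones (assumed entry names)
                Files.copy(Paths.get("occurrence.txt"), zipFs.getPath("occurrence.txt"),
                           StandardCopyOption.REPLACE_EXISTING);
                Files.copy(Paths.get("verbatim.txt"), zipFs.getPath("verbatim.txt"),
                           StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }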
Now the problem is that when I try to load the zip file into Cassandra using the biocache load function, I get a Java out-of-memory heap error, because the code downloads, unzips and reads the file in RAM. Unfortunately, 4 GB (zip file) and 23 GB (unzipped) is too big for that.
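From what I can tell, the entries could instead be streamed to disk with a fixed-size buffer so memory use stays constant regardless of archive size; the sketch below shows the general idea (it is my own illustration, not the actual biocache loader code, and dwca.zip is a placeholder):

    import java.io.*;
    import java.util.zip.*;

    public class StreamingUnzip {
        public static void main(String[] args) throws IOException {
            File outDir = new File("dwca-extracted");
            outDir.mkdirs();
            // ZipInputStream reads one entry at a time; nothing forces the
            // whole 4 GB archive (or its 23 GB of contents) into the heap.
            try (ZipInputStream zis = new ZipInputStream(
                    new BufferedInputStream(new FileInputStream("dwca.zip")))) {
                byte[] buffer = new byte[64 * 1024];  // fixed-size buffer, constant memory
                ZipEntry entry;
                while ((entry = zis.getNextEntry()) != null) {
                    if (entry.isDirectory()) continue;
                    File out = new File(outDir, entry.getName());
                    out.getParentFile().mkdirs();
                    try (OutputStream os = new BufferedOutputStream(new FileOutputStream(out))) {
                        int n;
                        while ((n = zis.read(buffer)) > 0) {
                            os.write(buffer, 0, n);
                        }
                    }
                }
            }
        }
    }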
Do you know if there is another way to do it? Could I unzip the file first and run the loading afterwards? Or could I "manually" insert the data into Cassandra?
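By "manually" I mean something along the lines of the sketch below, using the DataStax Java driver; the contact point, keyspace, table and column names are pure guesses on my part, since I don't know the exact biocache schema:

    import com.datastax.driver.core.*;

    public class ManualInsert {
        public static void main(String[] args) {
            // Contact point, keyspace and table are placeholders, not real biocache config
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("occ")) {
                PreparedStatement ps = session.prepare(
                    "INSERT INTO occ (rowkey, dataresourceuid, scientificname) VALUES (?, ?, ?)");
                // One row per occurrence record; in practice these values would be
                // streamed from the unzipped occurrence file rather than hard-coded
                session.execute(ps.bind("dr123|occ-0001", "dr123", "Puffinus puffinus"));
            }
        }
    }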
Thanks in advance for your help.

Cheers,
Marie