[Hit] Multiple DAO

Julien Husson biology.info at gmail.com
Tue Nov 12 18:22:42 CET 2013


Dear Kyle,

Thank you for answering my question.

The project I'm working on needs two steps:

1) harvest data from several external databases to feed a central DB

I would like the external databases to use a custom IPT to map their data
to Darwin Core Archive files; the HIT would then load these files and
extract them into the central DB.

2) synchronize the databases' data: when there are changes (adding,
updating data, ...) made in the central database via our portal, but also
in the external databases

Last but not least, I'm very interested in adding the extraction code back
in. Please let me know how to implement this.

Regards,

Julien H.





On Thu, Nov 7, 2013 at 12:02 PM, Kyle Braak [GBIF] <kbraak at gbif.org> wrote:

> Dear Julien,
>
> The DAOs with .synchronise in their package name deal with synchronization
> (synchronizing raw occurrence records). The other set of DAOs deals with
> extraction (extracting raw occurrence records into occurrence records).
>
> Some background. Before the HIT, indexing at GBIF was done using this code<https://code.google.com/p/gbif-dataportal/source/browse/#svn/trunk/portal-index%3Fstate%3Dclosed>.
> When developing the HIT, the synchronization part was rewritten, and the
> extraction part remained the same. To not risk breaking extraction, it was
> safer to just wire up the old extraction DAOs unchanged, and then code a
> new set of DAOs for synchronization.
>
> After a couple of years, the extraction part was rewritten anyway,
> independently of the HIT, using Hadoop. Therefore, the synchroniser-gbif<https://code.google.com/p/gbif-indexingtoolkit/source/browse/#svn/trunk/synchroniser-gbif> was branched
> into synchroniser-gbif-refactor<https://code.google.com/p/gbif-indexingtoolkit/source/browse/#svn/branches/synchroniser-gbif-refactor>,
> with all the extraction code completely removed.
>
> Another thing you will likely notice is that the HIT can no longer
> communicate with the GBIF Registry.
>
> Over the course of many years, the HIT has constantly had to adapt to
> the evolving GBIF Registry. First, it talked with the original GBIF
> (UDDI-based) registry, then with the GBIF GBRDS Registry<http://gbif.blogspot.dk/2011/04/evolution-of-gbif-registry.html>,
> and lastly with the GBIF (dataset-aware) Registry<http://gbif.blogspot.dk/2012/10/the-gbif-registry-is-now-dataset-aware.html>.
>
> This month GBIF has discontinued its reliance on the HIT for indexing,
> relying instead on a new suite of tools and technologies such as HBase that
> will enable real-time indexing. This new indexing communicates with the
> latest version of the Registry: the GBIF (real-time) Registry<http://gbif.blogspot.dk/2013/10/the-new-real-time-gbif-registry-has.html>.
>
> Therefore, the HIT is now out of date. Furthermore, as mentioned above,
> the latest branch of the synchronizer project doesn't include any of the
> extraction code. As you are aware, extraction is needed to be able to
> populate the database serving an adapted GBIF Data Portal.
>
> The HIT will always remain a free and open-source project. Therefore, if
> you are interested in updating it to use the latest Registry, and adding
> the extraction code back in, I could try to guide you. There are a handful
> of existing HIT installations around the world that would also benefit from
> your volunteer efforts to update the project.
>
> With kind regards,
>
> Kyle Braak
> Developer
> Secretariat of the Global Biodiversity Information Facility (GBIF)
> Universitetsparken 15, DK-2100 Copenhagen Ø
> Denmark, Europe
> +45 353 21479 (work)
> Skype: kyle.braak
> www.gbif.org
>
> On Oct 31, 2013, at 1:18 PM, Julien Husson wrote:
>
> hi guys,
>
> I'm working on a custom synchronizer for my own portal with an Oracle
> Server.
>
> I don't understand why the DAOs are duplicated, for example:
>
> org.gbif.harvest.portal.synchronise.dao.ImageRecordDAO
> org.gbif.portal.dao.ImageRecordDAO
>
> It's the same for the DAO Impl and Model.
>
> Is there a reason for this? What is the motivation?
>
> Thank you in advance,
>
> Julien H.