<div dir="ltr"><div><div><div>Hi Tim, thanks for your prompt answer.<br><br></div>It&#39;s not really what I expected. To be more specific:<br><br></div>First, I have to harvest a lot of data from multiple databases (remote or not) with, of course, different structures/models and data formats ... sometimes there is no database, just flat CSV/XLS files...<br>
That&#39;s why I use an IPT to create the data mappings and to standardize the data stream into the DwC standard.<br>(<span lang="en"><span>I&#39;m talking about</span> <span>millions of</span> <span>specimens)</span></span><br>

<br></div><div>Secondly, I build specific indexes of the harvested data with a custom harvester using the canadensys-harvester lib (thanks to Christian). It&#39;s at this point that it becomes difficult to work with the denormalized view of the data from the IPT DwC-A.<br>

</div><div>&nbsp;This is because I need to transform this denormalized view, or raw model, into a normalized view that matches my big relational database model, which becomes the new repository.<br><br></div>That&#39;s why I thought that custom extensions could make my life easier.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Mar 14, 2014 at 5:12 PM, Julien Husson <span dir="ltr">&lt;<a href="mailto:biology.info@gmail.com" target="_blank">biology.info@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">


<div><div><div><br><br></div></div></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Mar 14, 2014 at 4:32 PM, Tim Robertson [GBIF] <span dir="ltr">&lt;<a href="mailto:trobertson@gbif.org" target="_blank">trobertson@gbif.org</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">Hi Julien,<div><br></div><div>Thanks for your question. &nbsp;It really depends on what you are trying to publish. &nbsp;We can add extensions of course, but without knowing the specifics it is difficult to comment.</div>


<div><br></div><div>However, a &ldquo;specimen, event, location&rdquo; DB model would typically map to an Occurrence core with no extensions required - this is the most common use case of Darwin Core and the IPT. &nbsp;An Occurrence core is basically a denormalized view of the data.</div>


<div><br></div><div>If I were the data manager, I would probably consider that I was publishing a &ldquo;DwC Occurrence view&rdquo; of my more complex model and as such would keep a view in the database along the lines of:</div><div>


<br></div><div>CREATE VIEW view_dwc_occurrence AS</div><div>SELECT</div><div>&nbsp; specimen.bar_code AS occurrenceID,</div><div>&nbsp; specimen.name AS scientificName,</div><div>

&nbsp; location.latitude AS decimalLatitude,</div>

<div>&nbsp; location.longitude AS decimalLongitude,</div><div>&nbsp; event.year AS year</div><div>FROM&nbsp;</div><div>&nbsp; specimen&nbsp;</div><div>&nbsp; JOIN event ON &hellip;&nbsp;</div><div>&nbsp; JOIN location ON &hellip;</div><div>WHERE</div><div>&nbsp; &lt;insert any conditions for record inclusion, such as non endangered etc&gt;</div>


<div><br></div><div>Then in my IPT I would simply do &ldquo;SELECT * FROM view_dwc_occurrence&rdquo;. &nbsp;Here you are flattening the normalised model into a denormalized DwC view of the data.</div><div><br></div><div>Maintaining a view in the database layer as opposed to a custom mapping in the IPT benefits you by:</div>


<div>&nbsp; i) catching issues early with database schema changes since the DB will likely stop you with an error</div><div>&nbsp; ii) offering an easy mapping of DB table field names, to DwC terms in a language I find very familiar (SQL)</div>


<div>&nbsp; iii) a super simple IPT mapping, as all columns will map automatically in the IPT since they are DwC recognised terms already</div><div><br></div><div>Does that help in any way? &nbsp;If not, could you please elaborate on the model and what you are trying to achieve and we&rsquo;ll do all we can.</div>
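<div><br></div><div>As a rough illustration of the view approach above, here is a minimal Python sketch using an in-memory SQLite database. The table columns other than those named in the view, the join keys (event_id, location_id), the sample rows and the helper names build_demo_db and read_occurrences are all assumptions made up for the example; they do not come from any real schema.</div>
<pre>
```python
import sqlite3


def build_demo_db():
    """Create a tiny specimen/event/location model plus the DwC occurrence view.

    The schema and the join keys are hypothetical, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE location (id INTEGER PRIMARY KEY, latitude REAL, longitude REAL);
        CREATE TABLE event (id INTEGER PRIMARY KEY, location_id INTEGER, year INTEGER);
        CREATE TABLE specimen (bar_code TEXT PRIMARY KEY, name TEXT, event_id INTEGER);

        INSERT INTO location VALUES (1, 45.5, -73.6);
        INSERT INTO event VALUES (1, 1, 2012);
        INSERT INTO specimen VALUES ('MT0001', 'Acer saccharum', 1);
        INSERT INTO specimen VALUES ('MT0002', 'Acer rubrum', 1);

        -- Flatten the normalized tables into DwC-named columns.
        CREATE VIEW view_dwc_occurrence AS
        SELECT
          specimen.bar_code AS occurrenceID,
          specimen.name AS scientificName,
          location.latitude AS decimalLatitude,
          location.longitude AS decimalLongitude,
          event.year AS year
        FROM specimen
          JOIN event ON specimen.event_id = event.id
          JOIN location ON event.location_id = location.id;
        """
    )
    return conn


def read_occurrences(conn):
    """Run the same kind of query the IPT source would use and return (columns, rows)."""
    cur = conn.execute("SELECT * FROM view_dwc_occurrence ORDER BY occurrenceID")
    columns = [d[0] for d in cur.description]
    return columns, cur.fetchall()


if __name__ == "__main__":
    cols, rows = read_occurrences(build_demo_db())
    print(cols)
    for row in rows:
        print(row)
```
</pre>
<div>The column names returned by the query are the aliases declared in the view, so they are already DwC terms by the time a consumer sees them.</div>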


<div><br></div><div>Thanks,</div><div>Tim</div><div><br></div><div>&nbsp;</div><div><br><div><div><div><div>On 14 Mar 2014, at 16:06, Julien Husson &lt;<a href="mailto:biology.info@gmail.com" target="_blank">biology.info@gmail.com</a>&gt; wrote:</div>


<br></div></div><blockquote type="cite"><div><div><div dir="ltr"><div><div><div>Hi,<br><br></div><div>I use DwC-A to feed my DB.<br></div><div><br></div>We know the limits of the DwC star schema when it comes to representing a relational database.<br>


<br><span lang="en">For example, in the case of 1-n cardinality:&nbsp; specimen --- event/record --- location.<br>
</span><span lang="en">If I understand the concept, I need to use </span><span lang="en">the Darwin <b>Core</b> <b>Occurrence</b> <b>extension</b>, denormalize my relational model into one big raw model, and then transform / re-normalize this model to match my DB model</span>.<br>
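<div><br></div><div>To make the re-normalization step concrete, here is a minimal Python sketch that splits flat occurrence rows back into location, event and specimen records. It is only an illustration under strong assumptions: the function name renormalize is hypothetical, a location is identified by its coordinate pair, and an event by its location plus year, which a real model would replace with its own keys.</div>
<pre>
```python
def renormalize(occurrences):
    """Split flat DwC occurrence rows into location, event and specimen records.

    Identity rules here are assumptions for the example: a location is its
    (latitude, longitude) pair and an event is its (location id, year) pair.
    """
    locations = {}  # maps (latitude, longitude) to a generated location id
    events = {}     # maps (location id, year) to a generated event id
    specimens = []  # tuples of (occurrenceID, scientificName, event id)

    for row in occurrences:
        loc_key = (row["decimalLatitude"], row["decimalLongitude"])
        # setdefault keeps the existing id if this location was already seen.
        loc_id = locations.setdefault(loc_key, len(locations) + 1)

        event_key = (loc_id, row["year"])
        event_id = events.setdefault(event_key, len(events) + 1)

        specimens.append((row["occurrenceID"], row["scientificName"], event_id))

    return locations, events, specimens
```
</pre>
<div>Because the flat view carries no event or location identifiers, such keys have to be inferred from the values themselves, which is exactly the extra development step in question.</div>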


<br><span lang="en"><span>In</span> <span>order to reduce cost and development effort and to optimize SQL statements, it would be </span></span><span lang="en"><span lang="en"><span>greatly</span></span> appreciated if custom extensions could be added. In that case, I could stay very close to my relational database model and avoid multiple development steps.<br>
</span></div><span lang="en"><span></span></span></div><span lang="en"><span></span></span><div><div><br>I discovered this link, but the explanation is now outdated:<br>

<a href="http://dag-endresen.blogspot.fr/2009/06/adding-extension-for-germplasm-to-gbif.html" target="_blank">http://dag-endresen.blogspot.fr/2009/06/adding-extension-for-germplasm-to-gbif.html</a><br><br></div><div>Thanks,<br>


<br></div>

<div>J.<br></div></div></div></div></div>

_______________________________________________<br>IPT mailing list<br><a href="mailto:IPT@lists.gbif.org" target="_blank">IPT@lists.gbif.org</a><br><a href="http://lists.gbif.org/mailman/listinfo/ipt" target="_blank">http://lists.gbif.org/mailman/listinfo/ipt</a><br>


</blockquote></div><br></div></div></blockquote></div><br></div>

</div></div></blockquote></div><br></div>