[GloBI] Ontology of Biotic Interactions?
Jorrit Poelen
jhpoelen at xs4all.nl
Wed Dec 19 01:54:31 CET 2018
Hey John -
I've shared your desire to get a table of the interaction terms of the
Relations Ontology (RO) with Chris Mungall, the maintainer of RO (cc-ed)
via https://github.com/oborel/obo-relations/issues/295 .
Meanwhile, I've created a minimal table of RO interaction terms for you
to consider:
https://github.com/globalbioticinteractions/nomer/blob/master/nomer/src/test/resources/org/globalbioticinteractions/nomer/match/ro.tsv
or
https://raw.githubusercontent.com/globalbioticinteractions/nomer/master/nomer/src/test/resources/org/globalbioticinteractions/nomer/match/ro.tsv
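
For anyone who wants to pull a table like this into a script, a minimal
sketch follows. It assumes a two-column, tab-separated layout (term id,
then label), which may not match the actual ro.tsv. The sample rows, the
helper names read_term_table and index_by_label, and the column positions
are illustrative assumptions and should be checked against the file and
against RO before use.

```python
import csv
import io

# Illustrative sample only: the real ro.tsv may use different columns,
# and the ids below should be verified against RO before relying on them.
SAMPLE_TSV = (
    "http://purl.obolibrary.org/obo/RO_0002470\teats\n"
    "http://purl.obolibrary.org/obo/RO_0002439\tpreys on\n"
)


def read_term_table(text):
    """Parse tab-separated text into a list of rows (lists of strings)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]  # skip empty lines


def index_by_label(rows, id_col=0, label_col=1):
    """Build a label -> id lookup; the column positions are assumptions."""
    return {row[label_col].strip().lower(): row[id_col].strip() for row in rows}


rows = read_term_table(SAMPLE_TSV)
terms = index_by_label(rows)
print(terms["eats"])
```

The same rows could be imported into a relational table instead of a
dictionary; the point is only that a plain tab-separated file needs no
ontology tooling to read.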
If you have suggestions on adding terms (I am sure the list is
incomplete), please do share / document them via
https://github.com/oborel/obo-relations#requesting-new-relations . If
that doesn't work for you, please feel free to share a list here and
we can probably figure something out.
Hope this helps,
-jorrit
On 11/27/18 3:41 PM, Jorrit Poelen wrote:
>
> Hi John:
>
> Thanks for sharing the access database file. I was able to convert the
> file to tsv files without too much trouble and had a look at the
> examples you shared.
>
> I appreciate how you normalized the life stages and sex of the
> interacting things. Also, the reverse interactions and the pairs help
> to more intuitively understand and parse the data.
>
> Re: associationTypes - When looking at the association kinds (from
> tblAssocKinds), I realized that bodyPart (e.g., "leaf"), life stage
> (e.g., "nymph") and physiological state (e.g., "dead") are mixed into
> the interaction type (e.g., "feeding on"). I can see how this notation
> can be handy for data entry or for capturing writing on labels. However,
> I would tend to map the interaction phrases so that the different kinds
> of things end up in separate columns, just like you did with life stage
> and sex. That said, I'd always want to keep the
> verbatim interaction phrase around to preserve the original language.
>
> Re: list of OBO relations terms - they are listed, but in specialized
> formats (e.g., OBO, OWL) at http://purl.obolibrary.org/obo/ro.obo and
> http://purl.obolibrary.org/obo/ro.owl respectively. I've opened an
> issue to remind myself to make it easier to provide a list (see
> <https://github.com/jhpoelen/eol-globi-data/issues/386>). For the time
> being, you might be inspired by a subset of the supported interaction
> types via
> <https://api.globalbioticinteractions.org/interactionTypes.csv?type=csv>
> . More on that later.
>
> Re: mapping interaction terms - I agree that automated mapping is
> tricky business. What I had in mind is more of a static translation
> table that is used to maintain how one system's interaction terms (like
> yours) would translate into another naming scheme (like OBO Relations
> Ontology). In our case, an automation would use the static translation
> table to link the RO terms. So, no fancy methods here. An example of
> such a translation table can be found at
> https://github.com/globalbioticinteractions/inaturalist :
> https://github.com/globalbioticinteractions/inaturalist/blob/master/interaction_types.csv
> translates terms native to iNaturalist into RO.
> https://github.com/globalbioticinteractions/inaturalist/blob/master/interaction_types_ignored.csv
> contains a list of terms that are explicitly ignored.
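
A static translation table of this kind can be sketched in a few lines.
This is only an illustration under stated assumptions: the column names
(local_term, ro_id, ro_label), the sample rows, and the function names are
hypothetical and do not reproduce the layout of the iNaturalist files. The
RO ids shown should be verified before use.

```python
import csv
import io

# Hypothetical translation table; column names and rows are illustrative
# and do not reproduce the layout of the iNaturalist files.
TRANSLATIONS_CSV = """local_term,ro_id,ro_label
feeding on,http://purl.obolibrary.org/obo/RO_0002470,eats
preying on,http://purl.obolibrary.org/obo/RO_0002439,preys on
"""

# Hypothetical list of local terms that are deliberately left unmapped.
IGNORED_CSV = """local_term
collected with
"""


def normalize(term):
    """Lower-case and collapse whitespace so lookups are forgiving."""
    return " ".join(term.lower().split())


def load_translations(text):
    """Return {local term: (RO id, RO label)} from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {normalize(r["local_term"]): (r["ro_id"], r["ro_label"]) for r in reader}


def load_ignored(text):
    """Return the set of local terms that are explicitly ignored."""
    reader = csv.DictReader(io.StringIO(text))
    return {normalize(r["local_term"]) for r in reader}


def translate(term, translations, ignored):
    """Look up a local term; None means 'explicitly ignored'."""
    key = normalize(term)
    if key in ignored:
        return None
    if key not in translations:
        # Fail loudly so a person extends the table instead of guessing.
        raise KeyError(f"no mapping for {term!r}; add it to the translation table")
    return translations[key]


translations = load_translations(TRANSLATIONS_CSV)
ignored = load_ignored(IGNORED_CSV)
print(translate("Feeding  on", translations, ignored))
```

Unmapped terms raise an error rather than being matched automatically, so
the mapping stays under the control of whoever maintains the table.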
>
> Re: next steps - in my experience, highly normalized data structures
> are important and useful when actively managing and curating data.
> However, when exporting data to other systems, often a denormalized
> format (aka "wide single table") really makes life a lot easier for
> moving snapshots of the data around . . . as long as it's automated
> and the identifiers are preserved. So, my suggested next step in our
> integration would be to figure out how to create a method that
> automatically generates a de-normalized table from your wealth of
> association data in a similar form as outlined in
> https://github.com/globalbioticinteractions/template-dataset and, more
> specifically in
> https://github.com/globalbioticinteractions/template-dataset/blob/master/interactions.tsv
> . Once the de-normalization is complete, terms (like assocKinds, but
> also lifestage, bodypart, physiological state) can be translated using
> static translation tables (see above).
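
The de-normalization step could look roughly like the sketch below. The
in-memory tables, their field names, and the output column names are all
hypothetical: they stand in for the real Access tables and are only
loosely modelled on a wide interaction table, so the template dataset
remains the place to check the actual column names.

```python
import csv
import io

# Hypothetical stand-ins for normalized lookup tables, keyed by local id.
taxa = {1: "species X", 2: "species Y"}
assoc_kinds = {10: "feeding on"}

# Hypothetical association records holding ids plus a few attributes.
associations = [
    {"assoc_id": 100, "left_taxon_id": 1, "assoc_kind_id": 10,
     "right_taxon_id": 2, "left_life_stage": "larva"},
]

# Output columns are assumptions; check the template dataset for the
# actual column names before producing a real export.
FIELDS = ["localAssociationId", "sourceTaxonId", "sourceTaxonName",
          "sourceLifeStage", "verbatimInteraction",
          "targetTaxonId", "targetTaxonName"]


def denormalize(records):
    """Resolve ids into names while keeping the local identifiers."""
    for rec in records:
        yield {
            "localAssociationId": rec["assoc_id"],
            "sourceTaxonId": rec["left_taxon_id"],
            "sourceTaxonName": taxa[rec["left_taxon_id"]],
            "sourceLifeStage": rec.get("left_life_stage", ""),
            "verbatimInteraction": assoc_kinds[rec["assoc_kind_id"]],
            "targetTaxonId": rec["right_taxon_id"],
            "targetTaxonName": taxa[rec["right_taxon_id"]],
        }


def to_tsv(rows):
    """Write the wide rows as tab-separated text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


wide_rows = list(denormalize(associations))
print(to_tsv(wide_rows))
```

Keeping the local ids in every row is what lets a consumer link back to
the source database; the verbatim interaction phrase can then be
translated in a separate step using a static translation table.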
>
> In short - in my view, an integration would preserve the autonomy of
> local data management, including terms, while automating a translation
> into a de-normalized format friendly for integration. This allows
> updates to flow easily through the system without manual intervention
> and without affecting the existing data management platform. Most
> importantly perhaps, the process contains sufficient information for
> GloBI (or others) to link back into your database as well as the
> original sources / specimen.
>
> I hope this helps and I'd be interested to help document this
> integration between your Neuropterida database with GloBI as a use
> case to share with our peers.
>
> Curious to hear your thoughts,
>
> -jorrit
>
>
> On 11/27/18 12:29 PM, John Oswald wrote:
>>
>> Hi Jorrit,
>>
>> See below…
>>
>> I think that sharing the Access database file would be a great start.
>>
>> ---I have just shared a Dropbox folder with you that contains an
>> Access database that contains the several tables that I am currently
>> working with to try to formulate a strategy for capturing
>> relationship data. I’m happy to receive any comments/suggestions for
>> improvements. The three tables are briefly discussed below.
>>
>> tblAssocKinds (association kinds) – Contains a list of 544
>> “association kind” text strings. Many (ca. 200) of these originated
>> with an initial 2010 dataset of associations that had been extracted
>> from insect specimen labels by Norm Johnson of Ohio State University.
>> I subsequently tried to standardize the phrasing of the original
>> association strings, added additional strings, then tried to write
>> “reverse association” strings so that there were pairs of phrases
>> that could be used to state the relationship from the “opposite
>> sides” of the association. The pairings are held in tblAssocPairs.
>> Not all AssocKinds have reverse associations, so some are not
>> currently included in a pair. The AssocKinds are mostly
>> taxon-to-taxon or taxon-to-inanimate, but there are some outliers as
>> I was experimenting with different kinds of associations. Is this
>> kind of “reverse association” information useful in a broader context?
>>
>> tblAssocPairs (association pairs) – Contains pairings of values from
>> tblAssocKinds.
>>
>> frmFlexAssocPairs – A simple query that contains AssocPairs as both
>> AssocKind IDs and text strings (easier to read).
>>
>> tblAssociations – My current working table for capturing association
>> data from the literature. This table currently contains only ca. 150
>> records of test data entered from the literature (I want to make sure
>> that I get things set up optimally before extending the data capture
>> effort). The table is generally structured around a “left” associate
>> and a “right” associate, “separated” by an AssocPair ID that links to
>> tblAssocPairs. The table is currently structured for ease of data
>> capture from the literature, and includes some other kinds of
>> desirable data (e.g., sex, life stage, geography, literature source)
>> that would be captured from the same literature sources. As I get
>> into this though, it seems clear that many of those other data, which
>> will not be available for all associations, should probably be
>> removed to other relationally-linked tables in order to keep the data
>> normalized. In the current structure the “left” associate is assumed
>> to be an insect species of the superorder Neuropterida, which is
>> specified by a “combination object” ID (field LeftNidaCombObjID). For
>> data entry purposes this ID links to an episodically re-calculated
>> lookup table of >20,000 genus-group/species-group scientific name
>> combinations (essentially a master lookup table of almost every
>> Neuropterida combination that has ever been used in the literature).
>> This “combination” ID can be used to link into most of the related
>> taxonomic and nomenclatural data in my database. The link to the
>> literature source is specified in field BibObjPageCiteID
>> (Bibliographic Object Page Citation ID), which is an identifier that
>> specifies a particular page/plate in the neuropterid literature (from
>> which can be looked up all of the typical bibliographic information
>> about the literature source, plus other data linked to individual
>> literature pages, e.g., figures, chresonyms, and other annotations).
>> This links to my bibliographic dataset of 17,000+ published works
>> that contain information on the Neuropterida.
>>
>> I imagine that a first GloBI integration (or any other integration)
>> would preserve the existing system and implement an automated
>> translation or mapping procedures (e.g., scripts).
>>
>> ---To the extent possible I would prefer to not have to rely on
>> automated translations, which are prone to interpretation errors (or
>> maybe I’m misunderstanding what is automated here…). I would prefer
>> to “oversplit” the associations that I use in my database, then
>> re-group those associations into aggregate sets that correspond to
>> other set(s) of association types used by other projects. This would
>> give me more flexibility for defining associations that are useful
>> for my purposes, and more control over how those associations are
>> mapped when used in (potentially multiple) other contexts. We’re
>> probably saying basically the same thing here, but I would like to
>> retain the ability to control/influence the basic mapping of
>> associations through relationships defined in my database.
>>
>> One of such procedures would include a mapping from your internal
>> association types into another naming scheme such as OBO Relation
>> Ontology (http://obofoundry.org/ontology/ro.html):
>> the closer the terms are in meaning, the easier the mapping is.
>>
>> ---Right, see above. How can I get a complete listing of the relation
>> strings used in the Relation Ontology, their definitions (meanings),
>> and their hierarchical organization? I don’t know how to do that. Is
>> that something you can conveniently download, put into a table-based
>> format, then send to me so that I can incorporate it into my database
>> to experiment with?
>>
>> ---I expect that the relations in the Relation Ontology are fairly
>> general. Is there another ontology (or some other source of relation
>> terms) that deals more specifically (or at least includes) more
>> specific relations that would pertain between insect taxa and either
>> insect or non-insect taxa, and/or insect taxa and inanimate objects?
>> That is something that I would find useful. Does something like that
>> exist? Outside of the Relation Ontology, where do you source other
>> kinds of relations? Or, how does one go about contributing to the
>> development of a more specific ontology of relations that pertain to
>> insects? Would that be a useful thing?
>>
>> With such a translation / mapping method in place, others can more
>> easily find your data and you can more easily find similar projects.
>> Then, hopefully, over time, we'll learn from each other in discussion
>> forums, professional meetings or workshops and make it easier to
>> share and capture these complex datasets.
>>
>> ---I’m interested to know if others have already developed
>> well-defined sets of terms/phrases that describe
>> relations/associations among insects and other organisms and
>> inanimate objects. Can you make any recommendations for where I might
>> look to find such sets of terms/phrases? Do any of the other
>> projects involved in GloBI have such term/phrase sets that are
>> available for examination?
>>
>> Perhaps we'll even settle on some best practices!
>>
>> So, yes, please send a (complete/partial) copy of your native
>> database files or raw datasets, so I can get a sense of what your
>> datasets look like.
>>
>> ---You should have received an e-mail from Dropbox on this today, or
>> will soon.
>>
>> Cheers,
>>
>> John
>>
>> Curious to hear your thoughts,
>>
>> -jorrit
>>
>> On 11/21/18 11:32 AM, John Oswald wrote:
>>
>> Hi Jorrit,
>>
>> Thanks for responding. Deborah Paul mentioned to me at the
>> recent ESA meeting in Vancouver, BC, that GloBI would be one
>> place that I should look, so I hoped to make contact with you by
>> posting to the GloBI listserver. See more interleaved below…
>>
>> ---oo0oo---
>>
>> John D. Oswald
>>
>> Professor of Entomology
>>
>> Curator, Texas A&M University Insect Collection
>>
>> Department of Entomology
>>
>> Texas A&M University
>>
>> College Station, TX 77843-2475
>>
>> E-mail: j-oswald at tamu.edu
>>
>> Phone: 1-979-862-3507
>>
>> Lacewing Digital Library: http://lacewing.tamu.edu/
>>
>> Bibliography of the Neuropterida:
>> http://lacewing.tamu.edu/Biblio/Main
>>
>> Neuropterida Species of the World:
>> http://lacewing.tamu.edu/SpeciesCatalog/Main
>>
>> *From:* GloBI <globi-bounces at lists.gbif.org> *On Behalf Of* Jorrit Poelen
>> *Sent:* Tuesday, November 20, 2018 8:59 PM
>> *To:* globi at lists.gbif.org
>> *Subject:* Re: [GloBI] Ontology of Biotic Interactions?
>>
>> Hey John -
>>
>> Like Robert (hi Robert!) mentioned, GloBI is also using the OBO
>> Relations Ontology for defining biotic and abiotic interaction
>> types, just like many other projects.
>>
>> ---As a newbie to ontologies (and an entomologist, not a computer
>> scientist) I’m still trying to wrap my head around ontologies. In
>> a nutshell, so far as I can determine, an ontology is at heart an
>> extended web of controlled vocabulary terms with relationships
>> defined between terms at different web nodes (terms), together
>> with facilities to link to other ontologies. Is that about right?
>>
>> I've been attempting to collect my thoughts on data format and
>> models at
>> https://github.com/jhpoelen/globis-b-interactions/blob/master/text/on-species-interaction-data-models-and-formats.md
>> .
>>
>> ---Thanks for the link. I have briefly looked this over just now,
>> but will go back for a detailed read later. There appears much
>> there that would be good for me to consider as I continue
>> development on my end.
>>
>> In my experience, an effective way to figure out how to
>> capture/share your data is to use what you have today (as is!)
>> and try to integrate a subset of it with other projects (like
>> GloBI, GBIF).
>>
>> ---For the present, as briefly explained in my initial e-mail,
>> I’m mostly trying to set up an efficient way to capture data on
>> interactions/associations of neuropterid species with other taxa,
>> inanimate objects, and concepts in my personal Neuropterida
>> research database on the three insect orders Neuroptera,
>> Megaloptera, and Raphidioptera. The Neuropterida database
>> (currently built in Access) is highly parsed (ca. 350
>> relationally linked tables, including ca. 150 standardized
>> ‘lookup’ tables covering various subjects), and significantly
>> normalized (ca. 300 tables normalized to 3NF or better). The core
>> data are bibliographic, nomenclatural, taxonomic, and ‘agent
>> based’, but there are extensions going off in many other
>> directions. I currently share parts of these data via episodic
>> downloads to support a variety of projects – particularly the
>> Neuropterida domain of the Catalogue of Life (used by GBIF and
>> many other projects globally), and the various modules of the
>> Lacewing Digital Library project (lacewing.tamu.edu). I am an
>> insect systematist/taxonomist by training, but I got into
>> databasing in the early 1990’s and have been capturing data on
>> the Neuropterida ever since. I am currently involved in a project
>> whose primary product will be a new module in the Lacewing
>> Digital Library that is devoted to interactions/associations
>> (mostly predator/prey) of neuropterid insects and hemipterous
>> insects. Thus, my primary motivation at the current time is to
>> develop an effective and efficient database subschema for
>> capturing these kinds of data, and to relationally link that
>> subschema into my existing overall database schema. I would like
>> to do this in a fairly general way so that I can (1) capture a
>> wide variety of different kinds of interaction/association data
>> into the same subschema of my database, (2) standardize the
>> terminology/phrasing that I use so that terms/phrases are based
>> on explicit definitions and are consistent with other similar
>> controlled vocabularies for similar projects (to the extent that
>> this may be possible; thus my foray here into ontologies…), and
>> (3) be reasonably sure that there is a fairly straightforward
>> pathway through which whatever interaction/association subschema
>> I develop within my database can be linked out to other
>> projects that I might get involved with in the future.
>>
>> Your project sounds very similar to other projects that have
>> already been integrated into GloBI - a mix of specimen and
>> literature data with their own way of describing interaction terms.
>>
>> ---Yes, I am sure that there are lots of other projects that are
>> capturing similar data. I’d like to learn from those projects so
>> that I can avoid any common pitfalls and ‘start off on the right
>> foot’ as I get into this. To the extent possible, as I get
>> started, I’d like to tap into any well-defined sets of
>> interaction terms that may already exist. My initial thought has
>> been to build out this subschema in my database starting from my
>> current table of ‘association kinds’. But I’m open to new ideas,
>> and looking for someone who might be interested to discuss such
>> issues.
>>
>> Do you have some samples that you can share so that I can get a
>> sense of what you currently have?
>>
>> ---Sure. I can export a few tables to a simple Access database
>> and send that to you. Can you work with that? Please confirm. If
>> so, I can e-mail it to you or post it to you via Dropbox
>> (depending on size).
>>
>> ---One of the difficulties that I am currently having is how one
>> might extract the “data items” (to me this would be the set of
>> controlled terms/phrases and their linking
>> interaction/association terms) from one or more ontologies so
>> that those data can be placed into and manipulated within a
>> relational database structure. Is there an easy way to do that?
>>
>> Cheers,
>>
>> John
>>
>> thx,
>>
>> -jorrit
>>
>> https://globalbioticinteractions.org
>>
>>
>> On 11/19/18 6:21 PM, Bates, Robert P wrote:
>>
>> Hi John,
>>
>> We’ve been working with subsets of the OBO Relation Ontology
>> (which if I’m not mistaken is also what GloBI uses) to
>> provide the concepts for interaction relationships in our
>> VERA modeling system:
>>
>> http://www.obofoundry.org/ontology/ro.html
>>
>> -R
>>
>> *Robert Bates*
>>
>> Research Scientist
>>
>> Design & Intelligence Lab
>>
>> *Georgia Institute of Technology*
>>
>> Technology Square Research Building, 85 5th Street NW,
>> Atlanta, GA 30308
>>
>> e: rbates8 at gatech.edu
>>
>> m: 770.713.8531
>>
>> *From:* GloBI <globi-bounces at lists.gbif.org> on behalf of John Oswald
>> <j-oswald at tamu.edu>
>> *Date:* Monday, November 19, 2018 at 9:18 PM
>> *To:* "globi at lists.gbif.org" <globi at lists.gbif.org>
>> *Subject:* [GloBI] Ontology of Biotic Interactions?
>>
>> I’m growing and extending interaction/association data in my
>> research database on the species of the superorder
>> Neuropterida (Insecta: orders Neuroptera, Megaloptera, and
>> Raphidioptera) of the world. The data is primarily drawn from
>> the published literature; some also from specimen labels. I’m
>> interested in standardizing the terminology that I use to
>> describe interactions and associations. I’m interested in
>> taxon to taxon interactions (e.g., species X eats species Y;
>> species X is phoretic on species Y), taxon to inanimate
>> object interactions/associations (e.g., species X oviposits
>> on substrate Y [say, rocks]), and taxon to concept
>> associations (e.g., species X exhibits behavior Y). Can
>> anyone recommend any good lists of standardized terms (with
>> definitions) for this sort of thing? Are there any good, well
>> developed, ontologies for general taxon-taxon and/or
>> taxon-inanimate interactions? I have a list of 500+
>> “association kinds” (without well-standardized definitions)
>> that I have scraped together over the years. I’d like to plug
>> these into (or convert them into) something more standardized
>> if something more standard exists. Thanks for any suggestions
>> on where I might go next on this.
>>
>>
>>
>> _______________________________________________
>>
>> GloBI mailing list
>>
>> GloBI at lists.gbif.org <mailto:GloBI at lists.gbif.org>
>>
>> https://lists.gbif.org/mailman/listinfo/globi
>>
>