Hey John -

I've shared your request for a table of interaction terms from the Relations Ontology (RO) with Chris Mungall, the maintainer of RO (cc-ed), via https://github.com/oborel/obo-relations/issues/295 .

Meanwhile, I've created a minimal table of RO interaction terms for you to consider:

https://github.com/globalbioticinteractions/nomer/blob/master/nomer/src/test/resources/org/globalbioticinteractions/nomer/match/ro.tsv

or

https://raw.githubusercontent.com/globalbioticinteractions/nomer/master/nomer/src/test/resources/org/globalbioticinteractions/nomer/match/ro.tsv

If you have suggestions for adding terms (I am sure the list is incomplete), please share / document them via https://github.com/oborel/obo-relations#requesting-new-relations . If that doesn't work for you, please feel free to share a list here and we can probably figure something out.

Hope this helps,

-jorrit

On 11/27/18 3:41 PM, Jorrit Poelen wrote:

Hi John:

Thanks for sharing the access database file. I was able to convert the file to tsv files without too much trouble and had a look at the examples you shared.

I appreciate how you normalized the life stages and sex of the interacting things. Also, the reverse interactions and the pairs make the data more intuitive to understand and parse.

Re: associationTypes - When looking at the association kinds (from tblAssocKinds), I realized that bodyPart (e.g., "leaf"), life stage (e.g., "nymph") and physiological state (e.g., "dead") are mixed into the interaction type (e.g., "feeding on"). I can see how this notation is handy for data entry or for capturing the writing on labels. However, I would tend to map the interaction phrases so that the different kinds of things end up in separate columns, just like you did with lifestage and sex. That said, I'd always want to keep the verbatim interaction phrase around to preserve the original language.
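
A minimal sketch of that kind of split, assuming small hand-maintained vocabularies for body parts, life stages and physiological states. The vocabularies, the column names and the example phrase are hypothetical and only illustrate the idea; word-by-word matching like this is naive and would need review for real phrases.

```python
# Hypothetical vocabularies; a real list would be curated from tblAssocKinds.
BODY_PARTS = {"leaf"}
LIFE_STAGES = {"nymph"}
PHYSIOLOGICAL_STATES = {"dead"}


def split_phrase(verbatim):
    """Split a verbatim association phrase into separate columns.

    Words found in one of the vocabularies move to their own column;
    whatever is left over is treated as the interaction type.
    """
    record = {
        "verbatim": verbatim,  # always keep the original language
        "interactionType": "",
        "bodyPart": "",
        "lifeStage": "",
        "physiologicalState": "",
    }
    remainder = []
    for word in verbatim.lower().split():
        if word in BODY_PARTS:
            record["bodyPart"] = word
        elif word in LIFE_STAGES:
            record["lifeStage"] = word
        elif word in PHYSIOLOGICAL_STATES:
            record["physiologicalState"] = word
        else:
            remainder.append(word)
    record["interactionType"] = " ".join(remainder)
    return record


example = split_phrase("feeding on dead nymph")
print(example)
```

The verbatim phrase travels with every record, so a questionable split can always be checked against the original wording.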

Re: list of OBO relations terms - they are listed, but only in specialized formats (OBO and OWL), at http://purl.obolibrary.org/obo/ro.obo and http://purl.obolibrary.org/obo/ro.owl respectively. I've opened an issue to remind myself to make it easier to provide a list (see <https://github.com/jhpoelen/eol-globi-data/issues/386>). For the time being, you might be inspired by a subset of the supported interaction types via <https://api.globalbioticinteractions.org/interactionTypes.csv?type=csv> . More on that later.
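
To give a sense of how an OBO file could be flattened into a table, here is a rough sketch that reads [Typedef] stanzas (the stanza type used for relations in OBO files) and writes id, name and parent ids as tab-separated rows. The sample text and its EX: identifiers are made up for illustration and are not actual RO terms; a real run would read the contents of ro.obo instead.

```python
import csv
import io

# Made-up sample in OBO stanza syntax; the EX: ids are placeholders,
# not real Relations Ontology identifiers.
SAMPLE_OBO = """\
format-version: 1.2

[Typedef]
id: EX:0000001
name: interacts with

[Typedef]
id: EX:0000002
name: eats
is_a: EX:0000001 ! interacts with
"""


def parse_typedefs(obo_text):
    """Collect id, name and is_a parents from each [Typedef] stanza."""
    rows = []
    current = None
    for line in obo_text.splitlines():
        line = line.strip()
        if line.startswith("["):
            # A new stanza starts; only [Typedef] stanzas are kept.
            current = None
            if line == "[Typedef]":
                current = {"id": "", "name": "", "is_a": []}
                rows.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            value = value.split(" ! ", 1)[0]  # drop the trailing "! label"
            if tag == "is_a":
                current["is_a"].append(value)
            elif tag in ("id", "name"):
                current[tag] = value
    return rows


def to_tsv(rows):
    """Render the parsed rows as tab-separated text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["id", "name", "is_a"])
    for row in rows:
        writer.writerow([row["id"], row["name"], "|".join(row["is_a"])])
    return out.getvalue()


relations = parse_typedefs(SAMPLE_OBO)
print(to_tsv(relations))
```

The is_a column carries the hierarchy, so the resulting table can be loaded into a relational database and joined to itself to walk from a specific relation up to its more general parents.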

Re: mapping interaction terms - I agree that automated mapping is tricky business. What I had in mind is more of a static translation table that records how one system's interaction terms (like yours) translate into another naming scheme (like the OBO Relations Ontology). In our case, an automation would use the static translation table to link the RO terms. So, no fancy methods here. An example of such a translation table can be found at https://github.com/globalbioticinteractions/inaturalist : https://github.com/globalbioticinteractions/inaturalist/blob/master/interaction_types.csv translates terms native to iNaturalist into RO, and https://github.com/globalbioticinteractions/inaturalist/blob/master/interaction_types_ignored.csv contains a list of terms that are explicitly ignored.
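
As a rough illustration of how little machinery a static translation table needs, the sketch below looks up a local term in a hand-maintained table and reports whether it is mapped, explicitly ignored, or not yet handled. The column names, the example terms and the EX: identifiers are assumptions made for this example; they do not reproduce the layout of the iNaturalist files.

```python
import csv
import io

# Hypothetical translation table: local phrase -> target term id and label.
TRANSLATION_TSV = """\
local_term\ttarget_id\ttarget_label
feeding on\tEX:0000002\teats
fed upon by\tEX:0000003\teaten by
"""

# Hypothetical list of local terms that are deliberately left unmapped.
IGNORED_TERMS = {"collected with"}


def load_table(text):
    """Read the tab-separated table into a dict keyed by local term."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {
        row["local_term"]: (row["target_id"], row["target_label"])
        for row in reader
    }


def translate(term, table, ignored):
    """Return a (status, mapping) pair for a single local term."""
    key = term.strip().lower()
    if key in ignored:
        return ("ignored", None)
    if key in table:
        return ("mapped", table[key])
    # Unknown terms are reported rather than guessed at.
    return ("unmapped", None)


table = load_table(TRANSLATION_TSV)
for phrase in ["Feeding on", "collected with", "resting on"]:
    print(phrase, "->", translate(phrase, table, IGNORED_TERMS))
```

Because the table is plain data, the mapping stays under the control of whoever maintains the local terms, and unmapped terms surface as a to-do list instead of being translated silently.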

Re: next steps - in my experience, highly normalized data structures are important and useful when actively managing and curating data. However, when exporting data to other systems, often a denormalized format (aka "wide single table") really makes life a lot easier for moving snapshots of the data around . . . as long as it's automated and the identifiers are preserved. So, my suggested next step in our integration would be to figure out how to create a method that automatically generates a de-normalized table from your wealth of association data in a similar form as outlined in https://github.com/globalbioticinteractions/template-dataset and, more specifically in https://github.com/globalbioticinteractions/template-dataset/blob/master/interactions.tsv . Once the de-normalization is complete, terms (like assocKinds, but also lifestage, bodypart, physiological state) can be translated using static translation tables (see above).
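
The de-normalization step could look roughly like the sketch below, which joins three small tables into one wide row per association. The table names come from John's description; every column name, the sample record and the output column names are assumptions for illustration and should be checked against the actual database and against interactions.tsv in the template dataset.

```python
import sqlite3

# In-memory stand-in for the Access tables; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblAssocKinds (
    AssocKindID INTEGER PRIMARY KEY,
    AssocKindText TEXT
);
CREATE TABLE tblAssocPairs (
    AssocPairID INTEGER PRIMARY KEY,
    LeftKindID INTEGER,
    RightKindID INTEGER
);
CREATE TABLE tblAssociations (
    AssocID INTEGER PRIMARY KEY,
    LeftName TEXT,
    AssocPairID INTEGER,
    RightName TEXT
);
INSERT INTO tblAssocKinds VALUES (1, 'feeding on'), (2, 'fed upon by');
INSERT INTO tblAssocPairs VALUES (10, 1, 2);
INSERT INTO tblAssociations VALUES (100, 'Lacewing sp. A', 10, 'Aphid sp. B');
""")

# One wide row per association; the local id is kept so that a consumer
# of the export can link back to the source record.
WIDE_QUERY = """
SELECT a.AssocID       AS localId,
       a.LeftName      AS sourceTaxonName,
       k.AssocKindText AS interactionTypeName,
       a.RightName     AS targetTaxonName
FROM tblAssociations AS a
JOIN tblAssocPairs AS p ON p.AssocPairID = a.AssocPairID
JOIN tblAssocKinds AS k ON k.AssocKindID = p.LeftKindID
ORDER BY a.AssocID
"""

wide_rows = conn.execute(WIDE_QUERY).fetchall()
for row in wide_rows:
    print("\t".join(str(value) for value in row))
```

Running a query like this on a schedule would regenerate the wide table from the normalized data, so the curated tables stay untouched while each export remains a reproducible snapshot.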

In short - in my view, an integration would preserve the autonomy of local data management, including terms, while automating a translation into a de-normalized format friendly for integration. This allows updates to flow easily through the system without manual intervention and without affecting the existing data management platform. Most importantly perhaps, the process retains sufficient information for GloBI (or others) to link back into your database as well as to the original sources / specimens.

I hope this helps. I'd be interested to help document this integration of your Neuropterida database with GloBI as a use case to share with our peers.

Curious to hear your thoughts,

-jorrit


On 11/27/18 12:29 PM, John Oswald wrote:

Hi Jorrit,

     See below…

I think that sharing the Access database file would be a great start.

---I have just shared a Dropbox folder with you that contains an Access database that contains the several tables that I am currently working with to try to formulate a strategy for capturing relationship data. I’m happy to receive any comments/suggestions for improvements. The three tables are briefly discussed below.

tblAssocKinds (association kinds) – Contains a list of 544 “association kind” text strings. Many (ca. 200) of these originated with an initial 2010 dataset of associations that had been extracted from insect specimen labels by Norm Johnson of Ohio State University. I subsequently tried to standardize the phrasing of the original association strings, added additional strings, then tried to write “reverse association” strings so that there were pairs of phrases that could be used to state the relationship from the “opposite sides” of the association. The pairings are held in tblAssocPairs. Not all AssocKinds have reverse associations, so some are not currently included in a pair. The AssocKinds are mostly taxon-to-taxon or taxon-to-inanimate, but there are some outliers as I was experimenting with different kinds of associations. Is this kind of “reverse association” information useful in a broader context?

tblAssocPairs (association pairs) – Contains pairings of values from tblAssocKinds.

frmFlexAssocPairs – A simple query that contains AssocPairs as both AssocKind IDs and text strings (easier to read).

tblAssociations – My current working table for capturing association data from the literature. This table currently contains only ca. 150 records of test data entered from the literature (I want to make sure that I get things set up optimally before extending the data capture effort). The table is generally structured around a “left” associate and a “right” associate, “separated” by an AssocPair ID that links to tblAssocPairs. The table is currently structured for ease of data capture from the literature, and includes some other kinds of desirable data (e.g., sex, life stage, geography, literature source) that would be captured from the same literature sources. As I get into this though, it seems clear that many of those other data, which will not be available for all associations, should probably be removed to other relationally-linked tables in order to keep the data normalized. In the current structure the “left” associate is assumed to be an insect species of the superorder Neuropterida, which is specified by a “combination object” ID (field LeftNidaCombObjID). For data entry purposes this ID links to an episodically re-calculated lookup table of >20,000 genus-group/species-group scientific name combinations (essentially a master lookup table of almost every Neuropterida combination that has ever been used in the literature). This “combination” ID can be used to link into most of the related taxonomic and nomenclatural data in my database. The link to the literature source is specified in field BibObjPageCiteID (Bibliographic Object Page Citation ID), which is an identifier that specifies a particular page/plate in the neuropterid literature (from which can be looked up all of the typical bibliographic information about the literature source, plus other data linked to individual literature pages, e.g., figures, chresonyms, and other annotations). This links to my bibliographic dataset of 17,000+ published works that contain information on the Neuropterida.

I imagine that a first GloBI integration (or any other integration) would preserve the existing system and implement an automated translation or mapping procedures (e.g., scripts).

---To the extent possible I would prefer to not have to rely on automated translations, which are prone to interpretation errors (or maybe I’m misunderstanding what is automated here…). I would prefer to “oversplit” the associations that I use in my database, then re-group those associations into aggregate sets that correspond to other set(s) of association types used by other projects. This would give me more flexibility for defining associations that are useful for my purposes, and more control over how those associations are mapped when used in (potentially multiple) other contexts. We’re probably saying basically the same thing here, but I would like to retain the ability to control/influence the basic mapping of associations through relationships defined in my database.

One such procedure would be a mapping from your internal association types into another naming scheme such as the OBO Relation Ontology (http://obofoundry.org/ontology/ro.html): the closer the terms are in meaning, the easier the mapping is.

---Right, see above. How can I get a complete listing of the relation strings used in the Relation Ontology, their definitions (meanings), and their hierarchical organization? I don’t know how to do that. Is that something you can conveniently download, put into a table-based format, then send to me so that I can incorporate it into my database to experiment with?

---I expect that the relations in the Relation Ontology are fairly general. Is there another ontology (or some other source of relation terms) that deals with (or at least includes) more specific relations that would pertain between insect taxa and either insect or non-insect taxa, and/or between insect taxa and inanimate objects? That is something that I would find useful. Does something like that exist? Outside of the Relation Ontology, where do you source other kinds of relations? Or, how does one go about contributing to the development of a more specific ontology of relations that pertain to insects? Would that be a useful thing?

With such a translation / mapping method in place, others can more easily find your data and you can more easily find similar projects. Then, hopefully, over time, we'll learn from each other in discussion forums, professional meetings or workshops and make it easier to share and capture these complex datasets.

---I’m interested to know if others have already developed well-defined sets of terms/phrases that describe relations/associations among insects and other organisms and inanimate objects. Can you make any recommendations for where I might look to find such sets of terms/phrases? Do any of the other projects involved in GloBI have such term/phrase sets that are available for examination?

Perhaps we'll even settle on some best practices!

So, yes, please send a (complete/partial) copy of your native database files or raw datasets, so I can get a sense of what your datasets look like.

---You should have received an e-mail from Dropbox on this today, or will soon.

Cheers,

John

Curious to hear your thoughts,

-jorrit

On 11/21/18 11:32 AM, John Oswald wrote:

Hi Jorrit,

    Thanks for responding. Deborah Paul mentioned to me at the recent ESA meeting in Vancouver, BC, that GloBI would be one place that I should look, so I hoped to make contact with you by posting to the GloBI listserver. See more interleaved below…

 

---oo0oo---

 

John D. Oswald

Professor of Entomology

Curator, Texas A&M University Insect Collection

Department of Entomology

Texas A&M University

College Station, TX  77843-2475

 

E-mail: j-oswald@tamu.edu

Phone: 1-979-862-3507

 

Lacewing Digital Library: http://lacewing.tamu.edu/

Bibliography of the Neuropterida: http://lacewing.tamu.edu/Biblio/Main

Neuropterida Species of the World: http://lacewing.tamu.edu/SpeciesCatalog/Main

 

From: GloBI <globi-bounces@lists.gbif.org> On Behalf Of Jorrit Poelen
Sent: Tuesday, November 20, 2018 8:59 PM
To: globi@lists.gbif.org
Subject: Re: [GloBI] Ontology of Biotic Interactions?

 

Hey John -

Like Robert (hi Robert!) mentioned, GloBI is also using the OBO Relations Ontology for defining biotic and abiotic interaction types, just like many other projects.

---As a newbie to ontologies (and an entomologist, not a computer scientist) I’m still trying to wrap my head around ontologies. In a nutshell, so far as I can determine, an ontology is at heart an extended web of controlled vocabulary terms with relationships defined between terms at different web nodes (terms), together with facilities to link to other ontologies. Is that about right?

I've been attempting to collect my thoughts on data format and models at https://github.com/jhpoelen/globis-b-interactions/blob/master/text/on-species-interaction-data-models-and-formats.md .

---Thanks for the link. I have briefly looked this over just now, but will go back for a detailed read later. There appears much there that would be good for me to consider as I continue development on my end.

In my experience, an effective way to figure out how to capture/share your data is to use what you have today (as is!) and try to integrate a subset of it with other projects (like GloBI, GBIF). 

---For the present, as briefly explained in my initial e-mail, I’m mostly trying to set up an efficient way to capture data on interactions/associations of neuropterid species with other taxa, inanimate objects, and concepts in my personal Neuropterida research database on the three insect orders Neuroptera, Megaloptera, and Raphidioptera. The Neuropterida database (currently built in Access) is highly parsed (ca. 350 relationally linked tables, including ca. 150 standardized ‘lookup’ tables covering various subjects), and significantly normalized (ca. 300 tables normalized to 3NF or better). The core data are bibliographic, nomenclatural, taxonomic, and ‘agent based’, but there are extensions going off in many other directions. I currently share parts of these data via episodic downloads to support a variety of projects – particularly the Neuropterida domain of the Catalogue of Life (used by GBIF and many other projects globally), and the various modules of the Lacewing Digital Library project (lacewing.tamu.edu). I am an insect systematist/taxonomist by training, but I got into databasing in the early 1990’s and have been capturing data on the Neuropterida ever since. I am currently involved in a project whose primary product will be a new module in the Lacewing Digital Library that is devoted to interactions/associations (mostly predator/prey) of neuropterid insects and hemipterous insects. Thus, my primary motivation at the current time is to develop an effective and efficient database subschema for capturing these kinds of data, and to relationally link that subschema into my existing overall database schema. 
I would like to do this in a fairly general way so that I can (1) capture a wide variety of different kinds of interaction/association data into the same subschema of my database, (2) standardize the terminology/phrasing that I use so that terms/phrases are based on explicit definitions and are consistent with other similar controlled vocabularies for similar projects (to the extent that this may be possible; thus my foray here into ontologies…), and (3) be reasonably sure that there is a fairly straightforward pathway through which whatever interaction/association subschema I develop within my database can be linked out to other projects that I might get involved with in the future.

Your project sounds very similar to other projects that have already been integrated into GloBI - a mix of specimen and literature data with their own way of describing interaction terms.

---Yes, I am sure that there are lots of other projects that are capturing similar data. I’d like to learn from those projects so that I can avoid any common pitfalls and ‘start off on the right foot’ as I get into this. To the extent possible, as I get started, I’d like to tap into any well-defined sets of interaction terms that may already exist. My initial thought has been to build out this subschema in my database starting from my current table of ‘association kinds’. But I’m open to new ideas, and looking for someone who might be interested to discuss such issues.

Do you have some samples that you can share so that I can get a sense of what you currently have?

---Sure. I can export a few tables to a simple Access database and send that to you. Can you work with that? Please confirm. If so, I can e-mail it to you or post it to you via Dropbox (depending on size).

---One of the difficulties that I am currently having is how one might extract the “data items” (to me this would be the set of controlled terms/phrases and their linking interaction/association terms) from one or more ontologies so that those data can be placed into and manipulated within a relational database structure. Is there an easy way to do that?

Cheers,

John

thx,

-jorrit

https://globalbioticinteractions.org

On 11/19/18 6:21 PM, Bates, Robert P wrote:

Hi John,

 

We’ve been working with subsets of the OBO Relation Ontology (which if I’m not mistaken is also what GloBI uses) to provide the concepts for interaction relationships in our VERA modeling system:

 

http://www.obofoundry.org/ontology/ro.html

 

-R

 

Robert Bates

Research Scientist

 

Design & Intelligence Lab

Georgia Institute of Technology

Technology Square Research Building, 85 5th Street NW, Atlanta, GA 30308

e: rbates8@gatech.edu

m: 770.713.8531

 

 

 

From: GloBI <globi-bounces@lists.gbif.org> on behalf of John Oswald <j-oswald@tamu.edu>
Date: Monday, November 19, 2018 at 9:18 PM
To: "globi@lists.gbif.org" <globi@lists.gbif.org>
Subject: [GloBI] Ontology of Biotic Interactions?

 

I’m growing and extending interaction/association data in my research database on the species of the superorder Neuropterida (Insecta: orders Neuroptera, Megaloptera, and Raphidioptera) of the world. The data is primarily drawn from the published literature; some also from specimen labels. I’m interested in standardizing the terminology that I use to describe interactions and associations. I’m interested in taxon to taxon interactions (e.g., species X eats species Y; species X is phoretic on species Y), taxon to inanimate object interactions/associations (e.g., species X oviposits on substrate Y [say, rocks]), and taxon to concept associations (e.g., species X exhibits behavior Y). Can anyone recommend any good lists of standardized terms (with definitions) for this sort of thing? Are there any good, well developed, ontologies for general taxon-taxon and/or taxon-inanimate interactions? I have a list of 500+ “association kinds” (without well-standardized definitions) that I have scraped together over the years. I’d like to plug these into (or convert them into) something more standardized if something more standard exists. Thanks for any suggestions on where I might go next on this.

 


 




_______________________________________________
GloBI mailing list
GloBI@lists.gbif.org
https://lists.gbif.org/mailman/listinfo/globi
