In an Operational Data Hub (ODH), semantics and RDF triples help you manage data and tame the complexities of data integration. They support flexible, agile modeling by letting you capture the facts, concepts, and complex relationships associated with data, all within a rich context.
“I can link any document of any type to any other document of any other type at any time—design time, run time—anywhere I want with any kind of relationship I can imagine. You cannot do that in relational [databases],” said Mike Bowers, principal architect at the Church of Latter Day Saints.
Let’s see how semantics and RDF triples help express more complex and interesting sets of data facts and relationships.
The Power of Triples
Expressing data as triples is powerful enough to underpin the promise of the semantic web. The potential of semantics and RDF triples extends far beyond data integration, too far to cover fully here. However, we can look at a narrower (yet still expansive) set of capabilities that matter for data integration, such as:
The ability to represent relationships in a natural and agile way. In a relational database management system (RDBMS), representing a relationship between two entities is quite rigid. A modeler essentially chooses a cardinality (e.g., one-to-one, one-to-many, many-to-many) and encodes these choices as constraints—via primary keys and foreign keys—across a number of tables. Once these constraints are in place, they may not be violated unless and until a database administrator gets involved to change the rules.
Triples, on the other hand, are not constraints; they are data items that can be created at any time for any entity. Whenever a business relationship arises, a triple can be created to record it.
The ability to explicitly encode context and intent. In an RDBMS world, relationships are typically devoid of explicit context, requiring some implicit knowledge of the designer’s intent. RDF triples, on the other hand, encode full context by naming the type of relationship (i.e., the predicate) between entities (the subject and object).
The ability to create complex graphs of relationships between things. For example, representing a social graph (e.g., friends, friends of friends, etc.) is quite easy using RDF, and there are W3C standard ways to do so.
The ability to encode facts at any time. As mentioned previously, triples don’t necessarily have to be relationships between things; they may also be additional context about an entity. Such a capability expands what’s possible with respect to metadata representation.
The ability to make inferences about data and complex relationships via rule sets. For example, we might create rules so that when two facts are true (or two relationships exist), we may infer that a third fact or relationship exists without it being explicitly encoded in the data.
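To make the list above a little more concrete, here is a minimal sketch in plain Python of triples treated as ordinary data. It is not how an ODH or any RDF engine stores or queries triples (real systems typically use a triple store and SPARQL). The names `find` and `friends_of_friends`, the predicates, and the sample data are all hypothetical and chosen only for illustration.

```python
# A toy in-memory triple store. All names and data are hypothetical.
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

triples: Set[Triple] = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksFor", "acme"),
}


def find(s: Optional[str] = None,
         p: Optional[str] = None,
         o: Optional[str] = None) -> Set[Triple]:
    """Return every triple matching the pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def friends_of_friends(person: str) -> Set[str]:
    """Follow the 'knows' predicate exactly two hops out from person."""
    direct = {o for (_, _, o) in find(s=person, p="knows")}
    two_hops = {o for f in direct for (_, _, o) in find(s=f, p="knows")}
    return two_hops - direct - {person}


# A new relationship or fact is just another data item; no schema change is needed.
triples.add(("carol", "knows", "dave"))
triples.add(("acme", "headquarteredIn", "Berlin"))
```

In this sketch the predicate (`knows`, `worksFor`, `headquarteredIn`) names the relationship explicitly. Calling `friends_of_friends("alice")` walks the small social graph by following `knows` twice.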
Representing facts and relationships as triples is intrinsically powerful, particularly when combined with documents. However, the ability to reason over these representations in ways that simply were not possible before is perhaps the most compelling aspect of data integration with an ODH.
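The rule-based reasoning described above can be illustrated with a small sketch in plain Python. It assumes a single hand-written rule: if x worksFor y and y subsidiaryOf z, then x affiliatedWith z. The predicates, the function name `infer_affiliations`, and the data are hypothetical. A production system would rely on its own rule language and inference engine rather than nested loops like these.

```python
# A toy rule: (x worksFor y) and (y subsidiaryOf z) => (x affiliatedWith z).
# Predicates and data are hypothetical, for illustration only.
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def infer_affiliations(facts: Set[Triple]) -> Set[Triple]:
    """Return triples implied by the rule that are not already in facts."""
    inferred: Set[Triple] = set()
    for (x, p1, y) in facts:
        if p1 != "worksFor":
            continue
        for (y2, p2, z) in facts:
            if p2 == "subsidiaryOf" and y2 == y:
                inferred.add((x, "affiliatedWith", z))
    return inferred - facts


facts: Set[Triple] = {
    ("alice", "worksFor", "acme_labs"),
    ("acme_labs", "subsidiaryOf", "acme"),
    ("bob", "worksFor", "globex"),
}

# The affiliation is never stored explicitly; it is derived from two existing facts.
new_facts = infer_affiliations(facts)
```

Here `new_facts` holds the one relationship the rule derives for alice. Nothing is inferred for bob, because no `subsidiaryOf` fact exists for globex.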
Next Up
Are you still scratching your head about this whole ODH thing? Well, in our next blog post, we’ll discuss some fundamentals, namely how you can fit an ODH into your existing architecture.
Download our ebook, “Data Hub Guide for Architects,” and take a journey from the beginning, including how and why the ODH emerged and how the pattern can solve your data-management challenges.
Kate Ranta
Kate Ranta is a Solutions Marketing Manager at MarkLogic. She is a communications and marketing professional with a focus on digital content strategy, inbound marketing, social media campaign management, SEO, and project management.