Previous Posts:
Part 1: How to Model and Manage Entities with UML
Part 2: Introduction to the UML-to-Entity Services Toolkit: UML Modeling with MarkLogic’s Entity Services
Welcome to the third post in my series on modeling for MarkLogic’s Entity Services using the Unified Modeling Language (UML). In Part 1, I introduced the concept of using UML notation to depict the model visually and explained how to transform the UML model into MarkLogic’s Entity Services model descriptor format. In Part 2, I demonstrated that transformation using movies as an example and introduced my UML-to-Entity Services toolkit for model-driven MarkLogic development.
Here, we will examine modeling for a mixed document/semantic database. Besides showcasing semantics, my example also demonstrates how to use UML modeling with MarkLogic’s Data Hub Framework (DHF). The source code for this UML modeling example is on GitHub.
Our data model (Figure 1) describes employees and departments in a company’s human resources repository. The company is the fictional GlobalCorp. Our model is based on an example included in DHF’s GitHub repository.
Figure 1: Sample data model with departments and employees
The two main classes are Department and Employee. A department has an ID (departmentId) and a name (departmentName). An employee has an ID (employeeId), name (firstName, lastName), salary (baseSalary, bonus), hire status and dates (status, effectiveDate, hireDate, title), plus addresses, phone numbers, and emails. The latter are complex types, hence the four additional classes (Address, Phone, Email, and GeoCoordinates, a type within Address), which are datatypes used by Employee.
Relationships are particularly significant in this example: an employee reportsTo another employee, and an employee is a memberOf a department. We might physically represent these relationships by using document references or containment. The employee document, for example, could contain an attribute called memberOf, whose value is the ID of the corresponding department. However, GlobalCorp has decided to represent these relationships using semantic triples instead for the following reasons:
If you look carefully at the model, you will see it is peppered with semantic, or “sem”, stereotypes. The model makes use of the Entity Services UML profile included in the toolkit. The profile defines semantic and other stereotypes used to map UML to Entity Services. Using these stereotypes, GlobalCorp is able to describe in the model the IRIs, RDF types, and RDFS labels of employees and departments. The model also relates these entities using predicates defined by the W3C organization ontology.
Here is a breakdown of the semantic stereotypes used in GlobalCorp’s model:
The Department and Employee classes bear the stereotype semType. This stereotype associates an RDF semantic type with each class. Department’s RDF type is https://www.w3.org/ns/org#OrganizationalUnit. Thus, from a semantic perspective, a department is an organizational unit as defined by W3C’s organization ontology. Employee’s RDF type is http://xmlns.com/foaf/0.1/Agent, from the friend-of-a-friend (FOAF) ontology.

The Department and Employee classes also define an IRI. The purpose of the IRI is to uniquely identify a department or employee when we use it as the subject or object of a semantic triple. In each class we nominate one attribute to serve as the IRI, stereotyping that attribute as semIRI. For Department, that attribute is deptIRI; for Employee, it is empIRI. Notice that each of the IRI attributes also bears the stereotypes xCalculated and exclude. Thus, these IRI attributes are merely calculated fields, used to help construct triples; they will not be included in the XML document representation of the department or employee. The concat tag indicates how the IRI’s value is calculated. For example, deptIRI is the concatenation of “http://www.w3.org/ns/org#d” and the department ID.

Each class also nominates one attribute to serve as its RDFS label, stereotyping that attribute as semLabel. For Department, that attribute is departmentName. For Employee, it is empLabel. Notice that departmentName is not a calculated field; it is a full-fledged attribute that will also appear in the department’s XML document. empLabel, on the other hand, is an excluded field whose value is calculated from the firstName and lastName attributes.

reportsTo, which relates one employee to another, is a semProperty with the predicate https://www.w3.org/ns/org#reportsTo. Thus, if employee A reports to employee B, we construct a triple whose subject is the IRI of employee A (employee A’s empIRI), whose predicate is the one given, and whose object is the IRI of employee B (employee B’s empIRI). Notice the exclude stereotype; the XML representation of an employee will not contain the reportsTo element. We will maintain the relationship solely using a triple.

The relationship between Employee and Department, shown as memberOf, is a semProperty with the predicate https://www.w3.org/ns/org#memberOf. The triple we create has the employee’s empIRI as subject, the predicate given, and the department’s deptIRI as object. This relationship is excluded from the document.

Specifying the stereotypes in the model is beneficial because the toolkit’s transform module, which maps the UML model to Entity Services, understands these semantic stereotypes and generates code to create triples based on the content of the document. For example, Figure 2 shows the code the toolkit generates to create employee triples; every aspect of this code arises from the semantic stereotypes:
let $semIRI := map:get($options, "empIRI")
return (
  sem:triple(sem:iri($semIRI), sem:iri("http://www.w3.org/2000/01/rdf-schema#label"), map:get($options, "empLabel")),
  sem:triple(sem:iri($semIRI), sem:iri("http://www.w3.org/1999/02/22-rdf-syntax-ns#type"), sem:iri("http://xmlns.com/foaf/0.1/Agent")),
  sem:triple(sem:iri($semIRI), sem:iri("http://www.w3.org/ns/org#memberOf"), sem:iri(map:get($options, "memberOf"))),
  sem:triple(sem:iri($semIRI), sem:iri("http://www.w3.org/ns/org#reportsTo"), sem:iri(map:get($options, "reportsTo"))),
  sem:triple(sem:iri($semIRI), sem:iri("http://xmlns.com/foaf/0.1/name"), map:get($options, "empLabel"))
)
Figure 2: Auto-generated code creating triples based on semantic stereotypes
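The generated code in Figure 2 reads empIRI, empLabel, memberOf, and reportsTo from the $options map, so something upstream has to compute those calculated fields first. The following is only a sketch of what that step might look like, not the toolkit's actual output. The function local:employee-options is hypothetical, the "e" prefix for employee IRIs is inferred from the org#e114 subject in Figure 3, and joining firstName and lastName with a space is an assumption based on the label “Earl Garza”.

```xquery
xquery version "1.0-ml";

(: Namespace used for department and employee IRIs in this example. :)
declare variable $ORG-NS := "http://www.w3.org/ns/org#";

(: Hypothetical helper, not part of the toolkit: builds the map of
   calculated fields that the generated triple code reads from $options. :)
declare function local:employee-options(
  $employeeId as xs:string,
  $firstName as xs:string,
  $lastName as xs:string,
  $departmentId as xs:string,
  $managerId as xs:string
) as map:map
{
  let $options := map:map()
  return (
    (: semIRI: assumed concat of the org namespace, "e", and the employee ID :)
    map:put($options, "empIRI", fn:concat($ORG-NS, "e", $employeeId)),
    (: semLabel: assumed to be first name, a space, then last name :)
    map:put($options, "empLabel", fn:concat($firstName, " ", $lastName)),
    (: memberOf: the department's deptIRI, i.e. namespace + "d" + department ID :)
    map:put($options, "memberOf", fn:concat($ORG-NS, "d", $departmentId)),
    (: reportsTo: the manager's empIRI, built the same way as the employee's :)
    map:put($options, "reportsTo", fn:concat($ORG-NS, "e", $managerId)),
    (: map:put returns nothing, so the sequence evaluates to the map itself :)
    $options
  )
};

(: Employee 114, Earl Garza, in department 4, reporting to employee 1. :)
local:employee-options("114", "Earl", "Garza", "4", "1")
```

Keeping these values in a map rather than in the document mirrors the exclude stereotype: the fields exist only long enough to build triples and never appear in the employee's XML.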
Figure 3 shows some example triples describing employee 114, his superior, and his department. He is a FOAF agent named Earl Garza who reports to Ruth Shaw (employee 1) and is a member of R&D (department 4).
Subject | Predicate | Object |
---|---|---|
org#e114 | rdf:type | FOAF Agent |
org#e114 | rdfs:label | “Earl Garza” |
org#e114 | foaf/name | “Earl Garza” |
org#e114 | org#reportsTo | org#e1 |
org#e114 | org#memberOf | org#d4 |
org#e1 | rdfs:label | “Ruth Shaw” |
org#d4 | rdfs:label | “R&D” |
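
Once triples like those in Figure 3 exist, the reportsTo relationship can be followed with SPARQL instead of a document lookup. The sketch below is illustrative only: it assumes the triples are already loaded in the database and that employee IRIs use the full http://www.w3.org/ns/org# namespace, as in the generated code. sem:sparql and map:entry are MarkLogic built-ins; the query itself is not taken from the example project.

```xquery
xquery version "1.0-ml";

(: Find the label of the person a given employee reports to.
   The ?emp variable is bound from the map passed as the second argument. :)
sem:sparql('
  PREFIX org:  <http://www.w3.org/ns/org#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
  SELECT ?managerName
  WHERE {
    ?emp     org:reportsTo ?manager .
    ?manager rdfs:label    ?managerName .
  }',
  (: Assumed IRI for employee 114, matching the org#e114 subject in Figure 3 :)
  map:entry("emp", sem:iri("http://www.w3.org/ns/org#e114"))
)
```

With the Figure 3 triples in place, the single solution should bind ?managerName to “Ruth Shaw”.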
Subject | Predicate | Object |
---|---|---|
org#Global | rdf:type | org#Organization |
org#ACME | rdf:type | org#Organization |
org#ACMETakeover | rdf:type | org#ChangeEvent |
org#ACMETakeover | org:originalOrganization | org#ACME |
org#ACMETakeover | org:resultingOrganization | org#Global |
To learn more about semantics and the MarkLogic Data Hub Framework, refer to the following resources: