From Concept to Connected Data with Data Hub Central Community Edition (DHCCE)

May 11, 2020 Data & AI, MarkLogic

Do you like to work in a simple graphical user interface? Do you want a visual representation of your data models that everyone in your company can understand, and that your architects and DBAs can use without writing code?

With the community tool Data Hub Central Community Edition (DHCCE), you can. DHCCE allows you to create a semantically enriched Entity Services data model for your Data Hub by simply drawing graphs. DHCCE provides a no-code solution to designing entities for your MarkLogic Data Hub.

What is DHCCE?

DHCCE is a suite of applications for the MarkLogic Data Hub that provides graphical data modeling and exploration of your Data Hub.

Note: DHCCE for MarkLogic Data Hub is a community tool. As such, it is not supported by MarkLogic Corporation and is only updated and corrected on a best-effort basis. Contributions and feedback that make the tool better are welcome. DHCCE is designed to run on MarkLogic Server 10.0-2 with MarkLogic Data Hub 5.1.0 installed.

Who is DHCCE for?

DHCCE is for data architects and business stakeholders who collaborate on data models. It gives them a graphical user interface to create representations of business entities and the relationships between them. Instead of creating hard-to-read ERDs or UMLs that DBAs must then translate into tables and keys, DHCCE makes simple designs that plug directly into MarkLogic’s Data Hub. That gives you elegant simplicity and raw power. It’s like drawing a diagram on a whiteboard, if the whiteboard were connected to a flexible multi-model database.

How does it work?

It used to be that only a very exclusive group of experts concerned themselves with data. Data was like one of those compartmented organizers for wing nuts and lag bolts that your dad kept in the basement when you were a kid, with rows and columns of identical little drawers; every so often you’d open a bunch of those drawers and fish around for a nail to hang a picture.

Now, data is your enterprise’s most important asset, and everyone uses it to get their job done. As more and more people care about and work with data, a need has emerged for a simple visual tool for data modeling and exploration.

What is data modeling?

Data modeling is making new forms of data for new uses. At MarkLogic we talk about using Data Hubs to integrate data from silos. That’s a way of envisioning a new holistic form, with pieces that never fit together before. Modeling often happens during a whiteboard discussion that’s more or less detailed, depending on who is in the room and what roles they hold. I’ve been in many of these in the course of my career, and they’re a lot of fun: creative and fluid. No matter who’s attending, these sessions start the same way: a couple of circles connected with a line.

DHCCE Connect

With DHCCE’s powerful Connect application, you also use circles and lines to make your data model. Circles are your entities and lines are the relationships between them. If you want to model a customer order connecting with a product, you start by dropping a circle on your canvas and naming it Customer.

You can click and type in a field to add properties like orderId, and draw lines to make relationships that describe semantic associations like “places order.” You can also create purely conceptual entities, such as a product category that links orders and new product catalogs.

DHCCE Connect represents entities as circles and relationships as lines.
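
To make the circles-and-lines idea concrete, here is a minimal sketch of the same structure as plain Python data. It is only an illustration of the concept, not DHCCE's storage format or the Entity Services descriptor format; the class names, the `customerId` property, and the `contains` relationship are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """A circle on the canvas: a named entity with property names."""
    name: str
    properties: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A line on the canvas: a labeled, directed link between two entities."""
    source: str
    label: str
    target: str


@dataclass
class Model:
    """A whole canvas: the entities plus the lines drawn between them."""
    entities: dict[str, Entity] = field(default_factory=dict)
    relationships: list[Relationship] = field(default_factory=list)

    def add_entity(self, name, properties=()):
        self.entities[name] = Entity(name, list(properties))

    def connect(self, source, label, target):
        # A line can only join circles that already exist on the canvas.
        for end in (source, target):
            if end not in self.entities:
                raise KeyError(f"unknown entity: {end}")
        self.relationships.append(Relationship(source, label, target))


# Hypothetical model: a customer places an order that contains a product.
model = Model()
model.add_entity("Customer", ["customerId"])
model.add_entity("Order", ["orderId"])
model.add_entity("Product")
model.connect("Customer", "places order", "Order")
model.connect("Order", "contains", "Product")
```

Keeping relationships as their own labeled records, rather than as fields on an entity, mirrors the way a line on the canvas carries its own meaning.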

These circles and lines are a fast way to create data models in a collaborative environment. They’re also a very efficient way to share models across different groups in a company.

No worries about someone erasing the whiteboard for another meeting: all your work is saved in MarkLogic.

Now, pictures are great, and simple diagrams are worth a thousand lines of SQL, but DHCCE also integrates directly with the MarkLogic Data Hub framework. Entities you design with DHCCE Connect can be used directly in mapping, harmonization, and mastering flows that you code yourself using our Data Hub Framework, or lay out with another one of our graphical tools, QuickStart.

DHCCE Explore

Because DHCCE is so tightly integrated with the MarkLogic Data Hub, DHCCE’s Explore application can provide a searchable graph that shows your data in the shape of the data model you created. Once you’ve run your Data Hub’s mapping and mastering flows, you can visualize graphs in your Data Hub. You’ll see the same circles and lines you used when you were modeling, but now you’ll see a circle for every instance you added to the Data Hub. You can explore more deeply by clicking and dragging the graph, or by typing in the search field. Using all the power of MarkLogic’s multi-model indexing, you can filter your view to show only specific entities. You can also expand entity-to-entity and entity-to-concept relationships with the graphical user interface to your data.

If you have previously mastered data and merged two entity instances, you can unmerge using DHCCE Explore.

Explore is also the place where you can view provenance information, for example, the specific Data Hub flow that created a particular entity instance.

DHCCE Know

MarkLogic Data Hub makes use of its native triplestore. There’s a sophisticated semantic model underneath your entities and their relationships, and DHCCE Know is the place to go to explore the ontological dimensions of your Data Hub.
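
As a rough illustration of what a triple-based semantic model looks like, the sketch below stores relationships as subject-predicate-object triples and matches them against a pattern. It uses plain Python rather than MarkLogic's triplestore or SPARQL, and every name in it (`placesOrder`, `inCategory`, the `match` helper) is an assumption made for the example.

```python
# Each fact is a (subject, predicate, object) triple. All names are hypothetical.
triples = {
    ("Customer", "placesOrder", "Order"),
    ("Order", "contains", "Product"),
    ("Product", "inCategory", "ProductCategory"),
}


def match(pattern, store):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


# Everything that is stated about Order as a subject.
outgoing = match(("Order", None, None), triples)

# Every pair of things linked by the inCategory predicate.
categorized = match((None, "inCategory", None), triples)
```

Pattern matching of this kind is the basic operation behind graph queries: fix the parts you know and leave the rest open.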

Get started

You can build DHCCE yourself, or just run it as a .jar file against any Data Hub. Get started today with this GitHub Wiki guide for DHCCE.

Frank Rubino

Frank first joined MarkLogic in 2006 after a ten-year career as a Computer Scientist at Adobe Systems, building collaboration, XML, and data-driven features for Creative Suite. At MarkLogic he was a Senior Principal Consultant, working for customers like Pearson, HMH, Publishers Press, McGraw-Hill and Congressional Quarterly. He left MarkLogic to serve as CTO at Spectrum Chemical & Laboratory Products, where he led an Oracle EBS migration and an e-commerce website re-architecture that used MarkLogic for content marketing. After Spectrum, he was Executive Director of Technology and UX at Kaplan Publishing, where he built a mobile content delivery platform for 200,000 students. In 2011, he rejoined MarkLogic and took a Solutions Director role, where he enjoys a mix of development, architecture, and sales projects. He tweets at @xmlnovelist.