MarkLogic from a Relational Perspective: Part 4

The Right Partner for Your Data Journey

This is the fourth in a series of blog posts for people coming from a relational background, to help you understand the differences in how MarkLogic handles data integration and access.

TL;DR: Building the best solutions requires both the right technology and the right support. With MarkLogic, you benefit from the full assortment of capabilities, tools and experience you need to succeed with complex data-integration projects.

The data technology landscape offers more capabilities and choices than ever before. How do you build solutions and a platform that let you exploit your options without being overwhelmed?

The reality is that even with perfect technology, you can fall short. In every situation, you have a variety of choices in how to model and implement. Picking the best development path depends on understanding the pros and cons of each option and how to combine them into a coherent whole.

Enter MarkLogic. MarkLogic is tightly focused on helping its customers build platforms and applications to integrate data and build data hubs using new data-management technologies.

MarkLogic has unparalleled experience in integrating not just data but data-management technologies, so that you can fully exploit the capabilities of each and let their benefits reinforce one another.

Consider MarkLogic as your partner in your data journey.

The Big Picture

MarkLogic is an exceptional software platform for integrating data and is also a partner to many on their data journeys. Both of these are critical to success in navigating today’s new data world.

The Data Journey – the Past

In the business world of the past, data and data technology were simpler than they are today. A typical IT project started with a well-defined set of data and requirements, and the work consisted of implementing those requirements, with a relational database providing access to the data. Individual applications were largely technology and data silos, with well-defined access points or queries used to link them.

The Need for More

Today, the world is a very different place. Government regulations require reports that span the entire enterprise. Risk analysis requires the incorporation of non-relational data. Sales and marketing demand a 360-degree view of prospects and customers, from order histories to their activity on social media. Loan approval and tracking need access to wider and broader types of data.

In today’s and tomorrow’s world, a simple relational solution will not suffice. To create world-class solutions, it is necessary to augment yesterday’s data-management technologies with a variety of new ones including semantics, document, search, geospatial and alerting.

Semantics allows a firm to incorporate its understanding of its industry and internal organization into its data-management system, making searches and queries far more powerful. Semantics also allows related documents to be linked, making it easy to find all of the data that relates to a topic. Together, these make it possible to determine precisely what data applies to a question or issue.

Document systems make it possible to handle data complexity far more efficiently than is possible with a relational model. Document-based, structured queries provide full query abilities within a single entity type, even when documents have structures that vary from each other.
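To make the idea of querying documents with varying structures concrete, here is a minimal sketch in plain Python. It is not MarkLogic code and does not use any MarkLogic API; the customer documents, the `city` field and the helper names are all hypothetical. It only illustrates how one query can cover a single entity type even when each document is shaped differently.

```python
from typing import Any, Iterator


def find_values(doc: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth of a nested document."""
    if isinstance(doc, dict):
        for k, v in doc.items():
            if k == key:
                yield v
            # Keep descending, since the key may also appear deeper down.
            yield from find_values(v, key)
    elif isinstance(doc, list):
        for item in doc:
            yield from find_values(item, key)


# Hypothetical customer documents: same entity type, different structures.
customers = [
    {"name": "Acme", "address": {"city": "Boston"}},
    {"name": "Globex", "locations": [{"city": "Boston"}, {"city": "Austin"}]},
    {"name": "Initech", "address": {"city": "Denver"}},
]


def customers_in_city(docs, city):
    """Return the names of customers with `city` anywhere in their document."""
    return [d["name"] for d in docs if city in find_values(d, "city")]


print(customers_in_city(customers, "Boston"))
```

In a relational design, the second customer's multiple locations would typically force a separate table and a join. Here the query does not care where in the document the city appears.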

Search makes it possible to find what you are looking for without any understanding of the data’s schema (the data can be free-flowing text). It is also efficient and scalable, allowing large numbers of concurrent users to query simultaneously.

Geospatial, of course, allows for queries to include location, and alerting enables a data management platform to push data to users instead of forcing them to ask for what they want. All of these capabilities make it possible to create applications more powerful than what has been possible before.

A Complicated New World

Along with new abilities comes increased choice. Today, when a new use case comes up, there are many approaches for implementation.

Some do not believe this is an issue. A common belief is that “there are many types of data, so there need to be many types of databases.” Implicit in this is the idea that each data type is a silo and needs to be handled with a separate technology silo. The thinking is that while there may be more moving parts, each can be compartmentalized, keeping the overall complexity manageable.

There are a number of issues with this approach. First, by building separate data and technology silos, it becomes difficult to achieve the benefits possible from a truly consolidated, multi-model approach.

If all you have is a hammer, everything looks like a nail. If you build technology and data silos, the owners and developers of each silo will tend to look at the world from that silo’s perspective. In this environment, applications will tend not to take full advantage of the capabilities of a truly integrated, multi-model (not multi-product) approach.

Second, it is often up to the application builder to decide what kinds of data to use. Do you bring hierarchical data into a single document, or do you use semantic triples to link separate documents? Do you convert tables to triples and query with SPARQL? Do you convert documents to SQL tables and allow users to query them?

The choice is up to the architect/developer. It is not a pre-existing fact of nature.
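As a rough illustration of that choice, the sketch below models the same order two ways in plain Python: once as a single hierarchical document, and once as separate documents linked by subject-predicate-object triples. The order data, the `hasItem` and `placedBy` predicates and the function names are all made up for this example, and nothing here is MarkLogic or SPARQL syntax.

```python
# Option 1: bring the hierarchy into a single document.
order_doc = {
    "id": "order-1",
    "customer": "cust-7",
    "items": [{"sku": "A100", "qty": 2}, {"sku": "B200", "qty": 1}],
}

# Option 2: keep the items as separate documents and link them with triples.
item_docs = {
    "item-1": {"sku": "A100", "qty": 2},
    "item-2": {"sku": "B200", "qty": 1},
}
triples = [
    ("order-1", "placedBy", "cust-7"),
    ("order-1", "hasItem", "item-1"),
    ("order-1", "hasItem", "item-2"),
]


def skus_from_document(doc):
    """Read the SKUs straight out of the nested document."""
    return [item["sku"] for item in doc["items"]]


def skus_from_triples(order_id, links, docs):
    """Follow hasItem links from the order to the separate item documents."""
    return [
        docs[obj]["sku"]
        for subj, pred, obj in links
        if subj == order_id and pred == "hasItem"
    ]


print(skus_from_document(order_doc))
print(skus_from_triples("order-1", triples, item_docs))
```

Both models answer the same question. The single document is simpler to read and write as a unit, while the linked version lets items be shared or queried independently. Neither is dictated by the data itself.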

Finally, it is a lot more work to build and integrate separate technology silos into a single application than to start with a data-management approach that stores all of the data in a single database and treats each technology as part of the whole. (For a real-life example of this, read my colleague Damon Feldman’s post, Avoiding the Franken-beast: Polyglot Persistence Done Right.)

Moving Forward

MarkLogic does not just have a hammer; it has a full assortment of tools. MarkLogic implementations use each of the new data-management technologies in an optimal, integrated way.

MarkLogic’s skills in integrating the new data-management technologies benefit users in a variety of ways.

First, there is the MarkLogic Data Hub software. This incorporates best practices, code generation and advanced functionality (like deduplication of data) into a front end on top of the core MarkLogic multi-model database. With the Data Hub, many aspects of the optimal integration of the underlying technologies have been thought out in advance. Users diagram their setup, and the Data Hub generates harmonization code that automatically generates triples to link entities, populates documents with lineage information, creates a canonical view from multiple underlying data sets and much more. In some cases, users may never need to understand the underlying technologies.

Using the Data Hub allows firms to jump-start their journeys into the new world of data management.
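For readers who have not seen harmonization before, here is a small conceptual sketch in plain Python. It is not the code the Data Hub generates and does not use any Data Hub API; the source records, field mappings and the `harmonize` function are hypothetical. It only shows the general idea of mapping differently shaped source records onto one canonical form while recording where each came from.

```python
def harmonize(record, source, mapping):
    """Map a source record onto canonical field names and attach lineage."""
    canonical = {
        target: record[field]
        for field, target in mapping.items()
        if field in record
    }
    # Lineage: which system the record came from and which fields were used.
    canonical["lineage"] = {
        "source": source,
        "fields": sorted(field for field in mapping if field in record),
    }
    return canonical


# Hypothetical records for the same customer in two source systems.
crm_record = {"cust_name": "Acme Corp", "zip": "02110"}
billing_record = {"CustomerName": "Acme Corp", "PostalCode": "02110", "Balance": 120.5}

# Hypothetical mappings from source field names to canonical field names.
crm_mapping = {"cust_name": "name", "zip": "postalCode"}
billing_mapping = {
    "CustomerName": "name",
    "PostalCode": "postalCode",
    "Balance": "balance",
}

crm_view = harmonize(crm_record, "crm", crm_mapping)
billing_view = harmonize(billing_record, "billing", billing_mapping)

print(crm_view)
print(billing_view)
```

Once both records share the same field names, they can be compared, deduplicated or merged, and the lineage block still shows which system each value originated in.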

As useful as it is, the MarkLogic Data Hub software does not handle every situation you may come across. In those cases, having MarkLogic as a formal partner is invaluable. MarkLogic is focused 100% on complex integration projects. Our presales teams, support, consulting and senior management spend their days understanding the interactions between business needs, data and technology. They are eager to share their knowledge, background and experience.

When you call upon MarkLogic for help, you are not accessing an academic understanding of technology. MarkLogic is built upon the accumulated experience of customer interactions across many industries, covering a broad spectrum of use cases.

Learn More

Read the other posts in this series:
- Part 1 – MarkLogic in the Technology Landscape
- Part 2 – Data Modeling – From Relational to MarkLogic
- Part 3 – Moving from Relational to MarkLogic
[Free Training] MarkLogic University
MarkLogic Consulting Services

David Kaaret

David Kaaret has worked with major investment banks, mutual funds, and online brokerages for over 15 years in technical and sales roles.

He has helped clients design and build high-performance, cutting-edge database systems and provided guidance on issues including performance, optimal schema design, security, failover, messaging, and master data management.