The Power of Abstractions

November 12, 2021 Data & AI, MarkLogic

One recurring theme throughout the history of computing, indeed of human thought, has been the impact of powerful abstractions and how they help us do more. Computing abstractions reduce friction and make entirely new things possible on a broad scale.

Not to bore you with history, but consider that COBOL was one of the first portable programming languages, essentially separating program logic from the underlying hardware. More recently in the infrastructure domain, virtualization has abstracted compute, storage, and network, and given rise to a new abstraction: the container. New languages led to better abstractions of logic, browsers led to an abstraction of the user experience, and so on.

Each abstraction did the same thing: reduced friction and made entirely new things possible.

But what about data itself? How do we abstract data to make it more useful? More helpful? Reduce its friction, and make new things possible?

The answer, as many people will tell you, is to create metadata: tags and labels about the information, rather than trying to use the information directly. Dramatic examples abound: smart search, query, and discovery are foundational in domains like financial services, complex manufacturing, pharma, public safety, and more. For these domains, a powerful data abstraction based on metadata is a necessity, not a luxury.
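To make that a bit more concrete, here is a minimal sketch, in Python with invented field names and values, of what a metadata envelope might look like: the content stays as-is, and the tags and labels layered on top are what search and discovery actually work against.

```python
# A minimal, hypothetical sketch of a metadata envelope.
# The field names and values are invented for illustration only.
source_document = "...full text of a lengthy incident report..."

envelope = {
    "content": source_document,          # the data itself, kept as-is
    "metadata": {                        # the abstraction we actually work with
        "type": "incident-report",
        "entities": ["Acme Holdings", "J. Doe"],
        "jurisdiction": "US",
        "risk_level": "high",
    },
}

# Search and discovery now run against the labels, not the raw text.
def find_high_risk(docs):
    return [d for d in docs if d["metadata"].get("risk_level") == "high"]

print(find_high_risk([envelope]))
```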

The notion of using metadata to abstract data, or creating a metadata abstraction if you prefer, is not new; it has been with us for a while, especially in complex data landscapes.

Most of these historical approaches have been found wanting for a reason: they could not keep up with what people wanted to do with their metadata, namely to quickly capture key aspects of whatever data was at hand and put it to work in new ways. These platforms were inherently rigid and couldn't lay claim to being a useful abstraction.

When you look at all the different ways people have tried to make metadata more agile and abstraction-like, you'll notice something: they used a database underneath that wasn't designed for the purpose. Many of the inflexibility challenges arose from trying to describe a rich, agile metadata world with static relational tables or similar rigid structures.
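Here is one small, hypothetical sketch of that mismatch, using SQLite and a plain Python dictionary as stand-ins: with a fixed relational schema, every new descriptor means a migration, while a document-style record simply grows as understanding grows.

```python
# A minimal sketch of the rigidity mismatch, using SQLite purely for illustration.
# The table, columns, and documents below are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_metadata (doc_id TEXT, title TEXT, owner TEXT)")
conn.execute("INSERT INTO doc_metadata VALUES ('doc-1', 'Q3 filing', 'finance')")

# A new need appears: tag documents by regulatory jurisdiction.
# With a fixed relational schema, every new descriptor means a migration.
conn.execute("ALTER TABLE doc_metadata ADD COLUMN jurisdiction TEXT")

# A document-style metadata record, by contrast, simply accepts new descriptors.
flexible_metadata = {
    "doc_id": "doc-2",
    "title": "Clinical trial summary",
    "owner": "pharma-research",
    "jurisdiction": "EU",       # added without a migration
    "trial_phase": "III",       # and so can the next descriptor nobody predicted
}
```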

You see the same unproductive pattern arise when larger IT groups attempt to solve a complex metadata management problem with tools that — while they might be familiar — aren’t designed for the task.

Can We Talk About Agility?

Look at the history of abstractions and you'll see that each one delivered agility in a new form, and people liked it.

It is reasonable to ask why agility is so important when creating a metadata abstraction. Simply put: because what you are really supporting is an organizational learning process. Somebody wants to get smarter about something important, do so quickly, and keep going. That is never, ever a linear process, no matter how many times someone tries to put it on a schedule.

Inflexibility essentially undermines the desired outcome, and the effort joins the list of well-intentioned IT-led initiatives that didn’t work out so well. At least now you understand the problem somewhat better.

Turn back for a moment to the interesting assortment of industries and fascinating use cases that are dead serious about using metadata to create data agility, and the common thread is clear: they were all trying to get smarter about something that was very, very important to them. The answers lay in data that couldn't easily be used directly; it was complex and had to be abstracted first.

No ordinary database, data warehouse, data mart, data lake, etc. is going to solve that problem, as none were intended to create a metadata abstraction of complex data.

The specifics of what is important to these organizations, and how they talk about it, will vary. What doesn't vary is the level of importance, which is typically quite high: think executive involvement in outcomes.

Look across the broader industry analyst community and you'll see passing thoughts here and there regarding the need for metadata abstractions, but that's about it. They haven't connected the dots: these abstractions solve an interesting problem, namely enabling organizational learning in complex data domains that matter greatly to the people who have them.

As a result, there's really no visible framework today that addresses this specific problem or offers useful guidance. Someone will see the pattern before long, never fear. Like I said, abstracting data is a powerful concept.

Where Does That Leave Us?

Powerful abstractions have a long history of being really good things. We know how to abstract data: with metadata. But when we try to create and use that metadata abstraction with something that wasn't designed for the purpose, we end up with unexpected complexity and quickly lose agility.

That’s important, because the problems being solved inherently require agility, as there’s organizational learning going on. Fail to be agile, fail the mission.

If there is one plug I could make here, it's that MarkLogic is a metadata-centric data management abstraction, a "database" if you insist, that is built for complex data agility. That's what sets it apart, and that's why customers like it.

Go back in history to when MarkLogic was founded and you'll find that this was one of the more powerful ideas floating around at the time. The team took it and ran with it, along with several other related ones: search-centric design, schema-on-use, and so on. It turns out that the need for a metadata-centric data management platform shows up in many shapes and forms.

The argument around powerful ideas usually boils down to timing: the right idea at the right time, hopefully when there's a big problem to be solved. There is plenty of evidence of complex data problems with burning issues attached.

The challenge — as with any new abstraction — is changing people’s mindset.

Chuck Hollis

Chuck joined the MarkLogic team in 2021, coming from Oracle as SVP Portfolio Management. Prior to Oracle, he was at VMware working on virtual storage. Chuck came to VMware after almost 20 years at EMC, working in a variety of field, product, and alliance leadership roles.

Chuck lives in Vero Beach, Florida with his wife and three dogs. He enjoys discussing the big ideas that are shaping the IT industry.