I joined MarkLogic three months ago, and it is an exciting time to be selling the only Enterprise NOSQL database that manages “any-structured data” (including XML) with its ACID compliance, JSON, SQL and REST interfaces, and HA and DR functionality. Exciting, that is, once I can decipher the acronyms. More pertinently, I need to understand the capabilities those acronyms represent and how they deliver the value that around 400 MarkLogic customers benefit from every day.
I decided to get myself to a place where the alphabet soup became a bit more palatable – but where to begin? Taking guidance from the words of Julie Andrews: “When you read you begin with A-B-C, when you sing you begin with Do-Re-Mi,” I believe that in the MarkLogic world it all begins with S-Q-L. This may seem counter-intuitive, but unless you understand SQL, NOSQL will mean very little to you and Enterprise NOSQL even less.
According to Wikipedia, SQL, or Structured Query Language, is a “special-purpose programming language designed for managing data held in relational database management systems (RDBMS).” Thanks to the likes of IBM and Oracle (formerly Relational Software Inc), SQL was developed during the 1970s and became commercially available in various systems in the late 1970s and early 1980s. It became an ANSI (American National Standards Institute) standard in 1986 but, in the tradition of all great standards, it can differ slightly between databases and vendors. SQL is used to define and manipulate data, and it was one of the first commercial languages designed specifically for the relational database model. In very simplistic terms, this model stores data in tables made up of columns and rows, and the tables are related to one another through keys, such as a primary key that uniquely identifies each row.
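To make the tables-and-keys idea a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are all invented for illustration and have nothing to do with MarkLogic; the point is only to show two tables being related through a primary key.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with two related tables (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    # Each customer row is uniquely identified by its primary key, "id".
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # Each order points back at a customer through customer_id.
    conn.execute(
        """CREATE TABLE orders (
               id INTEGER PRIMARY KEY,
               customer_id INTEGER NOT NULL REFERENCES customers(id),
               item TEXT NOT NULL
           )"""
    )
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO orders (id, customer_id, item) VALUES (?, ?, ?)",
        [(10, 1, "keyboard"), (11, 1, "monitor"), (12, 2, "laptop")],
    )
    conn.commit()
    return conn


def orders_by_customer(conn):
    """Join the two tables on the key that relates them."""
    cursor = conn.execute(
        """SELECT c.name, o.item
           FROM customers AS c
           JOIN orders AS o ON o.customer_id = c.id
           ORDER BY c.name, o.item"""
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for name, item in orders_by_customer(build_demo()):
        print(name, item)
```

The JOIN is where the "relational" part shows up: neither table holds the whole picture, and the key is what stitches the rows back together.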
(Enter NOSQL, Stage Right)
Let’s clear up the name first. NOSQL, originally meaning “No SQL”, is now generally taken to mean “Not Only SQL”, since a number of databases of this type do allow an SQL-like language to be used.
My colleague David Cassel delivered a presentation on NOSQL and the various types of NOSQL databases, and he explains that there are (roughly) four different types – Graph, Key Value Store, Column Store and Document Store. MarkLogic is a Document Store, which means that it uses identifiers called URIs, each of which has a document associated with it. A document is a unit of storage and is analogous to a row in a relational database. To see how they each stack up, check out MarkLogic Founders Award winner Mike Bowers's presentation on the pros and cons of each.
The key differentiator for NOSQL databases is that they do not use the relational model or any fixed schema. Most of them also don’t use SQL to access data. This can bring a great deal of flexibility, performance and speed, particularly when dealing with unstructured data – that is, data that does not fit well into columns and rows without losing much of its meaning, such as an email, a document or a tweet. In a world where Big Data has entered common parlance, NOSQL is coming into its own: we are constantly reminded that 80% of our data is unstructured, and so a new generation of databases is needed to help deal with it.
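For a rough feel of what "a URI with a document attached, and no fixed schema" looks like, here is a toy sketch in Python. To be clear, this is not MarkLogic's API or anything like its implementation: the DocumentStore class, its method names, the URIs and the sample documents are all hypothetical, and a plain dictionary stands in for the database.

```python
import json


def _strings(value):
    """Yield every string found anywhere inside a nested document."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for child in value.values():
            yield from _strings(child)
    elif isinstance(value, list):
        for child in value:
            yield from _strings(child)


class DocumentStore:
    """Toy document store: each URI maps to one document (illustration only)."""

    def __init__(self):
        self._docs = {}

    def insert(self, uri, document):
        # Round-trip through JSON to store an independent copy of the document.
        self._docs[uri] = json.loads(json.dumps(document))

    def get(self, uri):
        return self._docs[uri]

    def search(self, word):
        """Return the URIs of documents containing the word in any string value."""
        needle = word.lower()
        return sorted(
            uri
            for uri, doc in self._docs.items()
            if any(needle in text.lower() for text in _strings(doc))
        )


store = DocumentStore()
# The two documents have completely different shapes; no schema was declared.
store.insert("/emails/1.json", {
    "from": "ada@example.com",
    "to": ["grace@example.com"],
    "subject": "Lunch?",
    "body": "Are you free on Friday?",
})
store.insert("/tweets/42.json", {
    "user": "grace",
    "text": "Friday deploys, what could go wrong?",
    "hashtags": ["ops"],
})

if __name__ == "__main__":
    print(store.search("friday"))
```

An email and a tweet sit side by side here without anyone first having to agree on a common set of columns, which is the flexibility the paragraph above is getting at.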
So-fah-So-Good? Next time I will take a look at some acronyms that represent the features that make MarkLogic the only Enterprise NOSQL database.
*With thanks to Tony Hughes for inspiration.