As the co-founder of Semaphore, I’ve been at the forefront of driving the company’s strategy and product direction since before it was a company, and long before our 2021 acquisition by MarkLogic. Back in 2001, we focused on search engine development and knowledge management. But with Google on the rise, we quickly realized our competitive differentiation would come from elevating both the data quality and data agility we could offer customers, across all types of formats.
Both data quality and data agility are intrinsically delivered by appending accurate and wide-ranging metadata to whatever data elements are pertinent to the task at hand, and then exploiting that metadata. After all, metadata is the backbone of more modern data architectures such as data hubs, data fabrics, data meshes and knowledge graphs. More on this later in the customer examples, since Semaphore is a recognized leader in this field.
Fast forward to today and our new home at Progress, and I still see enormous potential for organizations to make better use of the data they already have.
According to IDC figures, 80% of data within an organization will be unstructured by 2025. Yet, most decisions are based on the 20% of data that is structured. This raises an interesting question: is the 20% they’re using sufficiently representative of the 80%? Put in different terms, is the data companies use steering them in the best possible direction? It’s hard to know because the costs of sufficiently describing unstructured data have, historically, been out of reach.
Our goal at MarkLogic and Semaphore has been to solve for that delta and enable organizations to make better use of the data they already have. Too often, customers ask questions about data that are framed within the parameters of their current use case. The trick is to approach customer conversations without limiting the inquiry to narrow data concerns, staying open to the vast potential for data to empower organizational decision making. There are endless applications for this, but to get us started, I’ll present four key areas that can fundamentally transform how organizations use their data.
Reactive decisions are well and good, but, by definition, they leave practitioners chasing crises that could have been avoided. Evaluating data and using it to forecast and mitigate maintenance problems before they arise has delivered dramatic cost savings and efficiency gains for our manufacturing clients.
For example, shutdowns of highly temperamental CPU fabrication machinery created production delays and forced one of our customers to scrap batches of products because they didn’t meet purity standards. A little dust can wreak havoc, apparently. But this problem was later mitigated by following a preventative maintenance regimen, made possible by smart use of data.
Using data, engineers could predict when parts would fail and when cleaning would be needed, then address those concerns in advance during emergency maintenance of other parts, or during scheduled system downtime. To do this, the customer created a knowledge graph with all the data from their processes and documentation, as well as all engineering data and IoT data. They could then query the data in real time to see what systems were identified as needing proactive service. By leveraging their data, the company enjoyed massive cost savings. They moved away from reacting to emergencies and discarding batches of products—the latter of which racked up financial penalties—and embraced proactive maintenance. Yet this is only one of the many potential applications for our technology.
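To make the idea of querying a knowledge graph for proactive service a little more concrete, here is a minimal sketch in plain Python. It is not the customer’s system and does not use any MarkLogic or Semaphore API. The triples, the predicate names (`hasPart`, `ratedHours`, `hoursInUse`) and the 90% threshold are all assumptions made up for illustration.

```python
# Hypothetical triples (subject, predicate, object) standing in for a
# knowledge graph that links engineering data with IoT telemetry.
TRIPLES = [
    ("etcher-1", "hasPart", "filter-A"),
    ("etcher-1", "hasPart", "pump-B"),
    ("filter-A", "ratedHours", 500),    # assumed figure from documentation
    ("pump-B", "ratedHours", 2000),
    ("filter-A", "hoursInUse", 470),    # assumed figure from IoT telemetry
    ("pump-B", "hoursInUse", 900),
]


def objects(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def parts_due_for_service(machine, threshold=0.9):
    """List parts whose usage has reached `threshold` of their rated life."""
    due = []
    for part in objects(machine, "hasPart"):
        rated = objects(part, "ratedHours")[0]
        used = objects(part, "hoursInUse")[0]
        if used / rated >= threshold:
            due.append(part)
    return due


print(parts_due_for_service("etcher-1"))  # → ['filter-A']
```

The point of the sketch is only that once documentation, engineering data and telemetry share one linked structure, "what needs service soon?" becomes a simple traversal rather than a cross-system investigation.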
Many industries have products that are essentially loss-leaders but endure because they open opportunities for recurring revenue. Printer manufacturers make money on the ink, for example. So, minimizing costs for customer support on loss-leaders is paramount. Every moment spent wrangling with customers over simple issues loses the company money. Customer self-service, therefore, becomes essential to sustaining this business model.
Using a knowledge graph that tapped their documentation, knowledge base and support data, this customer used Semaphore and MarkLogic to create a system where customers could explain their problem using natural language. They would then be served the precise passages they needed to remedy their issue, aggregated into a handy step-by-step troubleshooting guide. This empowered customers to quickly solve problems and reduced service costs, all via a smart, data-powered web application.
Many critical business systems run on mainframes, but mainframe MIPS are really expensive because of the hardware, software and maintenance implications. Few, if any, of these systems are designed to manage transactional workloads for the web era. We have one healthcare client that has hundreds of requests per minute during its open enrollment period. To scale the mainframe for these peak periods would be extremely expensive, costing in the hundreds of millions. Yet it would sit idle the rest of the year and still be expensive to maintain. For these contexts, MarkLogic provides an operational data hub that acts as a cache, abstracting the enterprise systems and delivering high scalability and easier access to simpler data. After all the data is ingested, it’s denormalized so it’s much closer to the target use case needs and is therefore faster to serve. The MarkLogic server and clusters are then scaled according to demand. Outside of peak times, it can be scaled down, offering both power and agility.
One of the key advantages of MarkLogic vs. expanding source systems is that typically none of those systems cross-reference each other. They are data silos, with no contextual information about the data they hold. With MarkLogic, you have a fully linked, harmonized and queryable system. Instead of a data black hole, you have full transparency. And you have much more power because you are joining several systems together.
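As a rough illustration of what denormalizing ingested data buys you, the sketch below joins records from two source systems into one document per member. It uses plain Python dictionaries as a stand-in for the hub and is not MarkLogic’s API. The record shapes, the field names (`member_id`, `plan`, `year`) and the function `build_documents` are hypothetical.

```python
# Hypothetical extracts from two separate source systems that do not
# cross-reference each other.
members = [
    {"member_id": "m1", "name": "Ada"},
    {"member_id": "m2", "name": "Lin"},
]
plans = [
    {"member_id": "m1", "plan": "Gold", "year": 2024},
    {"member_id": "m1", "plan": "Dental", "year": 2024},
    {"member_id": "m2", "plan": "Silver", "year": 2024},
]


def build_documents(members, plans):
    """Denormalize: embed each member's plans in a single document per member."""
    docs = {m["member_id"]: {**m, "plans": []} for m in members}
    for p in plans:
        doc = docs.get(p["member_id"])
        if doc is not None:  # skip plans that reference an unknown member
            doc["plans"].append({"plan": p["plan"], "year": p["year"]})
    return docs


# Built once at ingestion time, not per request.
hub = build_documents(members, plans)

# At peak load, a request is a single key lookup with no join against
# the source systems.
print(hub["m1"])
```

The trade-off is the usual one for a cache: the join cost is paid once at ingestion, and the hub has to be refreshed when the source systems change.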
Tagging is a continual pain point for anyone using web content management systems. Authors add tags without regard to whether they are compatible or meaningful. Frankly, the same author can apply divergent tags to analogous content from day to day, creating an untenable metadata mess. Instead, we can use MarkLogic to work behind the scenes applying consistent metadata, delivering a content database and a document database that make much more sense. Now, scale this scenario up beyond a single web instance to thousands. We have clients using Semaphore’s semantic metadata capabilities to manage hundreds of websites in 27 languages. At this scale, a disciplined approach to organizing metadata becomes a business imperative.
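One simple way to picture that discipline is a controlled vocabulary that maps whatever authors type onto a single preferred label. The sketch below is a deliberately small illustration in plain Python, not how Semaphore classifies content. The vocabulary entries and the function `normalize_tags` are assumptions invented for the example.

```python
# Hypothetical controlled vocabulary: preferred label -> known variants.
VOCABULARY = {
    "Printers": {"printer", "printers", "printing devices"},
    "Ink Cartridges": {"ink", "ink cartridge", "cartridges"},
}

# Invert once so each variant resolves directly to its preferred label.
VARIANT_TO_LABEL = {
    variant: label
    for label, variants in VOCABULARY.items()
    for variant in variants
}


def normalize_tags(raw_tags):
    """Map free-form author tags to preferred labels; collect the rest for review."""
    accepted, unmatched = [], []
    for tag in raw_tags:
        key = " ".join(tag.lower().split())  # ignore case and stray whitespace
        label = VARIANT_TO_LABEL.get(key)
        if label is None:
            unmatched.append(tag)
        elif label not in accepted:
            accepted.append(label)
    return accepted, unmatched


print(normalize_tags(["Printer", " INK ", "printers", "toner"]))
# → (['Printers', 'Ink Cartridges'], ['toner'])
```

Unmatched tags are returned rather than dropped so that someone who owns the vocabulary can decide whether they are new concepts or just new spellings.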
The next evolution of data isn’t necessarily about finding new data sources, but about making thoughtful use of the data organizations already have. I look forward to continuing to work with clients to find new applications for their business data, now as part of Progress.
View all posts from Matthieu Jonglez on the Progress blog.