Extracting Hidden Value from Your Content

February 16, 2023 Data & AI, Semaphore

"Employees spend 1.8 hours every day—9.3 hours per week, on average—searching and gathering information."

McKinsey, 2023

McKinsey’s finding points to a common problem: information is often hard to find because it is not organized in a way that makes it easy to surface. You can make information easier to find within your content by creating descriptive metadata based on the meaning of concepts and the relationships between them. Why is discoverability so important? I’m glad you asked!

Discoverability enables:

  • Re-use – Enables you to use your content in different contexts quickly and for different use cases
  • Auditing – Allows you to stay ahead of your competition and FULLY access the volume of knowledge in your content
  • Creation – Helps you focus resources on creating missing content that fills gaps vs recreating content that already exists

If you are looking to get value from your content, discoverability is key. Markets are shifting fast, and expectations from customers are always growing. For this reason, organizations need to stand out from the crowd. If not, they will eventually be replaced.

Historically, when people heard “content creators,” they would think of the publishing and media industries. But a content creator is anyone who makes and publishes digital content, whether it is created from scratch or repurposed from existing content. That means anyone in any industry can be a content creator, and all of them should be thinking about new ways to use their content.

How successful is your organization at extracting and leveraging the hidden value in your content?

Companies typically start getting control over their content with a structured authoring process. Structured authoring is an authoring workflow that lets you define and enforce consistent organization of information in documents. It is also flexible: structured content is often authored online and can be understood by software as well as by humans. In structured authoring, content can be broken down into smaller, bite-sized pieces and re-used elsewhere.

Smarter Content Roadmap

While content creators are on a journey to smarter content, structure is not enough. They want content that delivers more. Through the power of enriched metadata and semantic relationships, content creators can make their content more useful to humans and computers, and, in turn, power new applications. This means you can transition from machine-readable data to machine-interpretable data. Unfortunately, many are not accessing these tools—which means their content isn’t fully discoverable.

In addition, creators are often poor at tagging their content and often wind up re-creating it rather than re-using it. The beauty of the Progress MarkLogic platform is that it manages, indexes, stores and searches content with ease AND builds upon that content with powerful new use cases. With the help of the Progress Semaphore semantic platform, customers can auto-classify, tag content and extract information to create new use cases, applications and user experiences.

Many content creators in enterprise organizations have already:

  • Created structured or semi-structured content
  • Stored their content in powerful repositories that can index and search
  • Published their content online to allow their customers to access, purchase or use some of their content

While these steps are beneficial, they are simply not enough to fully leverage content. As we noted earlier, getting the most value out of content means making it discoverable, and that requires several steps that many organizations have not yet taken. In our work with content creators in enterprise organizations, we’ve found those additional steps to be:

  • Create or adopt knowledge models (ontologies, taxonomies, vocabularies) or entities to define content
  • Use knowledge models to drive classification, fact extraction and entity extraction to enrich your content with metadata and understand it in different contexts. This… is… key!
  • Combine knowledge models, metadata and data to build a knowledge graph
  • Use that knowledge graph to find new ways to create, recommend, consume and monetize your content

What Is Semantics?

Semantic technology comprises a set of methods and tools that provide advanced means for categorizing and processing data, as well as for discovering relationships within data sets. Semantic searches can use an alignment to a term to navigate relationships in a hierarchy, returning more relevant results.

Ultimately, semantics provides context, meaning and insight to your data—making your data machine-interpretable.
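To make the idea of navigating relationships in a hierarchy concrete, here is a minimal sketch in plain Python. It is not Semaphore or MarkLogic code: the taxonomy, the document tags and the function names are all hypothetical, and a real semantic platform does far more. It only illustrates how a search for a broad term can also return content tagged with narrower terms.

```python
# Hypothetical toy taxonomy: each concept maps to its narrower concepts.
NARROWER = {
    "comedy": ["sketch comedy", "stand-up"],
    "sketch comedy": ["recurring character sketch"],
}

# Hypothetical documents, each already tagged with a single concept.
DOCS = {
    "doc-1": "stand-up",
    "doc-2": "recurring character sketch",
    "doc-3": "drama",
}


def expand(concept):
    """Return the concept followed by every narrower concept below it."""
    found = [concept]
    for child in NARROWER.get(concept, []):
        found.extend(expand(child))
    return found


def semantic_search(concept):
    """Return ids of documents tagged with the concept or a narrower one."""
    wanted = set(expand(concept))
    return sorted(doc_id for doc_id, tag in DOCS.items() if tag in wanted)


print(semantic_search("comedy"))  # ['doc-1', 'doc-2']
```

A plain keyword match on “comedy” would miss both documents, because neither tag contains that word. The hierarchy is what connects them to the search term.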

Knowledge Graphs

Knowledge graphs model content that is important to your organization, including the entities, concepts and topics it contains and the relationships between them. A knowledge graph creates a complete picture of the information that organizations can use to gain new insights, drive business processes, govern information and create new revenue streams.
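One common way to picture a knowledge graph is as a set of subject, predicate, object statements. The sketch below uses plain Python with made-up article ids, entity names and predicates (all hypothetical, and not the storage model of any particular product) to show how linking content to entities lets you recommend related content.

```python
# Hypothetical statements: (subject, predicate, object).
TRIPLES = [
    ("article-1", "mentions", "Acme Corp"),
    ("article-2", "mentions", "Acme Corp"),
    ("article-2", "has_topic", "mergers"),
    ("article-3", "mentions", "Globex"),
    ("Acme Corp", "is_a", "Organization"),
]


def objects(subject, predicate):
    """Everything the subject points to through the given predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """Everything that points to the object through the given predicate."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def related_content(doc_id):
    """Other documents that mention at least one of the same entities."""
    related = set()
    for entity in objects(doc_id, "mentions"):
        related.update(subjects("mentions", entity))
    related.discard(doc_id)  # a document is not related to itself
    return sorted(related)


print(related_content("article-1"))  # ['article-2']
```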

Content Classification

We have discussed the importance of metadata, semantics and knowledge graphs for content enrichment. Now let’s discuss classification as a key element in producing metadata that is machine-interpretable. Since content creators usually have a lot of content, we want to make sure the right content is served for the right concept. Doing this manually would take too much time. A classifier reads and interprets each document for you and proposes an appropriate classification.

Semaphore does exactly this! Semaphore classifies your documents against the model, while MarkLogic Server indexes the content for search. The taxonomies and ontologies that make up the model can be expanded quickly to support more detailed classification and extraction, and they can be produced internally or sourced externally.
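As a rough illustration of what “classifying against a model” means, here is a deliberately simple label-matching sketch in plain Python. It is a stand-in, not how Semaphore works internally: the model, the labels and the `classify` function are hypothetical, and a production classifier uses much richer rules and language processing.

```python
import re

# Hypothetical model: each concept lists the labels that signal it.
MODEL = {
    "Taxonomy": ["taxonomy", "taxonomies"],
    "Metadata": ["metadata", "tags"],
}


def classify(text):
    """Return {concept: hit count} for concepts whose labels appear in text."""
    lowered = text.lower()
    scores = {}
    for concept, labels in MODEL.items():
        hits = 0
        for label in labels:
            # Whole-word match so "tags" does not match inside "hashtags".
            pattern = r"\b" + re.escape(label) + r"\b"
            hits += len(re.findall(pattern, lowered))
        if hits:
            scores[concept] = hits
    return scores


print(classify("Taxonomies drive the metadata and tags applied to content."))
# {'Taxonomy': 1, 'Metadata': 2}
```

The resulting concept names are the kind of metadata that could then be stored alongside the document and indexed for search.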

Entity Extraction and Fact Extraction

Beyond classifying content, you can also extract the entities and facts that matter in each document.

Entity extraction lets you design applications around real-world concepts, or entities, such as Customers and Orders, Trades and Counter Parties or Providers and Outcomes. The entity extraction process in Semaphore aligns the business analysts who define the entities and the developers who combine them in application code.

When a user understands their content—the documents, their structure and the data and facts they contain—they can model specific aspects of that content to allow them to extract relevant facts. The fact extraction capability in Semaphore generates units of information (“facts”) contained within content (paragraphs, sentences or terms) that have a specific meaning and can have a simple or complex structure.

This is important because the linked entities can provide additional information. Again, by classifying, you are validating that you are matching the correct documents to concepts.

Entity extraction and fact extraction are key capabilities to improve search and discovery so you can provide a better service to your customers. They allow end users to use search terms THEY understand—this is HUGE!
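To show what a structured “fact” pulled out of a sentence might look like, here is a small sketch using a regular expression in plain Python. The pattern, the company names and the field names are hypothetical, and this is only an approximation of the idea, not Semaphore’s fact extraction.

```python
import re

# Hypothetical fact pattern: "<Buyer> acquired <Target> in <year>".
ACQUISITION = re.compile(
    r"(?P<buyer>[A-Z]\w+) acquired (?P<target>[A-Z]\w+) in (?P<year>\d{4})"
)


def extract_facts(text):
    """Return one dict per acquisition fact found in the text."""
    return [match.groupdict() for match in ACQUISITION.finditer(text)]


sample = "Acme acquired Globex in 2019. Later, Initech acquired Hooli in 2021."
for fact in extract_facts(sample):
    print(fact)
# {'buyer': 'Acme', 'target': 'Globex', 'year': '2019'}
# {'buyer': 'Initech', 'target': 'Hooli', 'year': '2021'}
```

Each fact carries named parts (buyer, target, year), so an application can filter or search on them instead of relying on the raw sentence.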

“I’m fifty! Fifty years old!” — A Use Case Example

Several years ago, NBCUniversal decided they wanted to do something special for Saturday Night Live’s 40th anniversary by highlighting its content. The challenge was they had so much wonderful content from different eras and weren’t sure which content to highlight. So, they decided to have an application built to connect fans to all their content and provide an entirely new experience consuming SNL content.

When I heard about the application, I was so excited and wanted to try it immediately. There was a set of skits that I so enjoyed watching, but I couldn’t remember the actress’s name or her character’s name. But I had the skit’s catch phrase engraved in my memory. Have you ever had a phrase stuck in your head that you couldn’t stop thinking about? This is exactly what I was experiencing! In fact, every time I thought of it, I would walk around my home, kick my leg in the air and say, “I’m Fifty! Fifty years old!” And yes, it would bring me great joy! (Don’t judge me.)

On this day, I would find out if I could benefit from this phrase that was stuck in my head. I typed “I’m Fifty” in the search bar and, to my delight, ALL of the Sally O’Malley skits, played by Molly Shannon, popped right up! I spent multiple hours discovering skits I never even knew existed. All because the application provided maximum discoverability.

This, my friends, is the power of fully leveraging your content!

The question remains, how successful is your organization at extracting and using the hidden value in your content?

If your organization has content that needs to be accessed by internal or external people, are you confident they are discovering the content they need as easily as possible? If not, you should consider whether your organization has:

  • Created or adopted knowledge models (ontologies, taxonomies, vocabularies) or entities to define content
  • Used knowledge models to drive classification, fact extraction and entity extraction to create metadata that helps you understand your content in different contexts
  • Combined models, metadata and data to build a knowledge graph
  • Used that knowledge graph to find new ways to create, recommend, consume and monetize your content

If not, I assure you that your organization is not getting the most value from your content—and may be leaving money on the table or creating frustrated consumers who might be looking at other solutions!

Unlock the Full Power of Your Content

To learn more about how to unlock valuable content by adding semantic knowledge, watch our on-demand webinar.

Contact us to speak with someone about getting more value from your content.

La-Verne Chambers

La-Verne is a Customer Development Representative on the Customer Success Management team. She serves as a trusted advisor to our customers and supports their continued use of MarkLogic by keeping them current on new developments, helping them succeed and streamline their businesses with our products, sharing new use cases, and acting as a liaison between customers and MarkLogic resources.

When she is not working, La-Verne loves deep sea fishing and usually averages about 25 – 30 catches per fishing trip. She also loves organizing events and spending time outdoors with her dog, Brooklyn.