Embracing Open Standards for Accessible Data in the Pharma Industry

August 14, 2019 Data & AI, MarkLogic

This blog is the second of a four-part series that explores the importance of FAIR (Findable, Accessible, Interoperable, Reusable) principles for pharmaceutical companies looking for a data platform to tackle their most critical business problems.

FAIR Data Principles for Pharmas: Accessible

Making data accessible is as much about restricting access as granting it: who can use the data and how they can use it matters just as much as ensuring that everyone else has limited or no access to the same data. Data accessibility therefore covers a range of concerns.

At the user level, we want to ensure that only the people who need the data have access to it: it must be secure and governed. Different user roles should have different permissions, and data should be blocked or partially redacted for certain users. For example, some users may only be able to read the data, whereas others have full write access to author new data or update existing records.

Imagine a researcher looking at patient records of a specific illness. They may want to know all the details about symptoms, condition, prognosis and treatment, but in most cases, it’s probably best that any personally identifiable information is redacted.
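To make the researcher scenario concrete, here is a minimal sketch of role-based redaction in plain Python. It is not MarkLogic's implementation or API; the field names, role names and the sample record are all made up for illustration.

```python
from copy import deepcopy

# Hypothetical field and role names, chosen only for this illustration.
PII_FIELDS = {"name", "date_of_birth", "address"}
ROLES_WITH_PII_ACCESS = {"clinician"}
REDACTED = "[REDACTED]"


def view_for_role(record: dict, role: str) -> dict:
    """Return a copy of the record, masking PII unless the role may see it."""
    if role in ROLES_WITH_PII_ACCESS:
        return deepcopy(record)
    return {
        key: (REDACTED if key in PII_FIELDS else deepcopy(value))
        for key, value in record.items()
    }


# Made-up sample record, not real patient data.
patient = {
    "name": "Jane Doe",
    "date_of_birth": "1970-01-01",
    "condition": "type 2 diabetes",
    "treatment": "metformin",
}

researcher_view = view_for_role(patient, "researcher")
clinician_view = view_for_role(patient, "clinician")
```

The researcher still sees the condition and treatment, while the identifying fields are masked. The stored record itself is never modified, so the same data can serve both audiences.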

Having role-based security is one thing, but what’s more powerful is being able to control access at the most granular level, right down to part of a single record within a dataset. In the MarkLogic Pharma Research Hub, we have created a user interface built on top of the MarkLogic Data Hub that embraces security best practices. Users see only the data they are permitted to see. By eliminating legacy concerns around sharing data, this level of granular security is actually the key to driving collaboration that accelerates value across the organisation.

Another concern is accessibility at the interface level. We want to use standards and protocols so that it is easy to get data into and out of a particular system, and for multiple systems to interface with each other. A non-standard interface takes time to learn and may lack support, both of which hinder accessibility. MarkLogic, by contrast, embraces well-established open standards. Thanks to its advanced data integration capabilities, you can load all kinds of structured and unstructured data from disparate data sources into MarkLogic without having to do any costly extract, transform, load (ETL) upfront.

MarkLogic offers flexible options for delivering data to end-user applications: REST endpoints over HTTPS, SPARQL for linked data and ontology queries, JavaScript and Java interfaces, and SQL for BI tools like Tableau. JavaScript can be used for all manner of data queries, manipulation and insertion, and because it is one of the most commonly used programming languages in the world, the barrier to entry for developers is low.
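As a rough sketch of what the REST and SPARQL routes look like from a client, the Python below builds (but does not send) two HTTPS requests. The host name, port and document URI are hypothetical, the `/v1/documents` and `/v1/graphs/sparql` paths are assumed from the MarkLogic REST Client API and should be checked against the documentation for your version, and authentication is deliberately left out.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical host and port; replace with your own REST app server.
BASE_URL = "https://marklogic.example.com:8000"


def document_request(uri: str) -> Request:
    """Build a GET request for a single document by its URI."""
    query = urlencode({"uri": uri})
    return Request(f"{BASE_URL}/v1/documents?{query}", method="GET")


def sparql_request(sparql: str) -> Request:
    """Build a POST request that submits a SPARQL query as the body."""
    return Request(
        f"{BASE_URL}/v1/graphs/sparql",
        data=sparql.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


doc_req = document_request("/patients/123.json")  # hypothetical document URI
query_req = sparql_request("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
```

Passing either request to `urllib.request.urlopen` (with credentials configured) would send it. Because both routes are plain HTTPS, any language with an HTTP client can use them in the same way.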

RDF and SPARQL are both W3C standards. Semantics is a powerful way to describe pieces of your data. With MarkLogic, you are not limited to semantic triples alone; you can also enrich existing datasets with an ontology of your choosing. Metadata, triples and documents can thus all live together in a single envelope. We find that this flexible envelope pattern is a great way to achieve FAIR data.
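To illustrate the envelope pattern, the sketch below wraps a source record together with its metadata and a triple in one JSON document. The `headers` / `triples` / `instance` layout follows a common envelope convention and is an assumption here rather than a required schema; the IRIs, field names and values are made up.

```python
import json


def make_envelope(instance: dict, headers: dict, triples: list) -> dict:
    """Wrap a record with its metadata and (subject, predicate, object) triples."""
    return {
        "envelope": {
            "headers": headers,
            "triples": [
                {"triple": {"subject": s, "predicate": p, "object": o}}
                for (s, p, o) in triples
            ],
            "instance": instance,
        }
    }


# Hypothetical compound record and ontology IRIs, for illustration only.
envelope = make_envelope(
    instance={"compound_id": "C-001", "name": "example compound"},
    headers={"source": "lab-system-a", "ingested": "2019-08-14"},
    triples=[
        (
            "http://example.org/compound/C-001",
            "http://example.org/ontology/targets",
            "http://example.org/protein/P-42",
        )
    ],
)

serialised = json.dumps(envelope, indent=2)
```

The original record sits untouched under `instance`, while the headers and triples add context that can be searched and queried alongside it.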

When it comes to security, MarkLogic has one of the most secure architectures and is the only NoSQL database to be awarded Common Criteria certification, a government-grade standard for security that is engineered from the ground up and by which vendors demonstrate their commitment and ability to provide security to their customers. This is just one of many reasons why five of the largest global pharmaceutical companies trust us to manage their most sensitive data.

MarkLogic also supports protocols that keep data fully encrypted and secure in transit. It is fully ACID compliant as well: transactions are atomic, consistent, isolated and durable, so your data stays consistent and reliable.
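For readers less familiar with ACID, the sketch below shows the atomicity part: a multi-step update either completes in full or leaves the data untouched. It uses Python's built-in SQLite module purely as a stand-in, not MarkLogic, and the table and values are made up.

```python
import sqlite3


def transfer_samples(conn, source, target, count):
    """Move `count` samples between two freezers as one atomic transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE inventory SET samples = samples - ? WHERE freezer = ?",
            (count, source),
        )
        conn.execute(
            "UPDATE inventory SET samples = samples + ? WHERE freezer = ?",
            (count, target),
        )
        (remaining,) = conn.execute(
            "SELECT samples FROM inventory WHERE freezer = ?", (source,)
        ).fetchone()
        if remaining < 0:
            raise ValueError("not enough samples in source freezer")


conn = sqlite3.connect(":memory:")
with conn:
    conn.execute("CREATE TABLE inventory (freezer TEXT PRIMARY KEY, samples INTEGER)")
    conn.executemany("INSERT INTO inventory VALUES (?, ?)", [("A", 5), ("B", 0)])

try:
    transfer_samples(conn, "A", "B", 10)  # fails: only 5 samples available
except ValueError:
    pass
after_failed = dict(conn.execute("SELECT freezer, samples FROM inventory"))

transfer_samples(conn, "A", "B", 3)  # succeeds
after_success = dict(conn.execute("SELECT freezer, samples FROM inventory"))
```

After the failed transfer both rows are exactly as they were, with no half-applied update. That guarantee is what makes data dependable when many users and systems write to the same store.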

MarkLogic’s role-based security also allows you to implement roles as a hierarchy. You can specify that individual documents require one or more roles in order to access them. Overall, our approach to security and accessibility means that you have greater flexibility and control over your data. All of this makes MarkLogic a great solution for integrating data from silos: more data becomes accessible in a centralised store without compromising security.
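A role hierarchy combined with per-document role requirements can be sketched in a few lines of Python. This is an illustration of the idea, not MarkLogic's security model or API: the role names are hypothetical, and the sketch assumes a user must hold every role a document lists.

```python
# Hypothetical hierarchy: each role maps to the roles it inherits from.
ROLE_PARENTS = {
    "study-lead": {"researcher"},
    "researcher": {"reader"},
    "reader": set(),
}


def effective_roles(role: str) -> set:
    """Return the role plus every role it inherits, directly or indirectly."""
    seen = set()
    stack = [role]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(ROLE_PARENTS.get(current, ()))
    return seen


def can_access(user_role: str, required_roles: set) -> bool:
    """True if the user's effective roles include every role the document requires."""
    return required_roles <= effective_roles(user_role)


lead_allowed = can_access("study-lead", {"researcher", "reader"})
reader_allowed = can_access("reader", {"researcher"})
```

If a document should instead be readable by anyone holding at least one of the listed roles, the subset check can be swapped for an intersection test.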

What’s more, earlier this month, MarkLogic announced its Title 21 CFR Part 11 compliance for the MarkLogic Data Hub, which supports leading healthcare, pharmaceutical and life sciences companies in meeting U.S. Food and Drug Administration (FDA) regulatory guidance for the management and storage of electronic records and signatures.


Duncan Grant

Duncan is a Solutions Engineer at MarkLogic with a particular interest in Linked Data, web applications and visualisations. Duncan has a background in web application development, having worked in several UK startups before specialising in Linked Data and enterprise data integration.

Duncan has spoken at meetups and conferences on topics such as graph visualisation, databases, Linked Data and JavaScript development.