The inescapable headlines about data breaches mean that we’re all aware of the proliferation of data security issues as well as industry and government efforts to stem them. In addition, data crises at Equifax, Anthem, eBay, JP Morgan Chase, Yahoo, and the U.S. Office of Personnel Management indicate that being a large organization with ample resources does not preclude security vulnerabilities. At a time when integrating and sharing data is increasingly critical to risk management in healthcare, financial services and national security, protecting data means taking measures at the company, database, document and element levels.
Element level security, also known as field-level security, granular-level security, and even cell level security in relational databases, allows you to identify and protect sensitive information within documents. Consider a medical record. We certainly need to protect the entire record from external parties. But it is also possible to protect personally identifiable information (PII), such as a social security number, within the medical record. This allows medical staff and even researchers to see health information while hiding information that is relevant only to the billing process. The reverse is also true: we can limit the amount of health information that billing personnel see.
Schema-Driven Security
Some database platforms, particularly relational databases, achieve this level of security using a schema. If we think of all of the forms we have to fill out at a doctor’s office, it’s easy to see how a schema can help label the information. Last name, first name, date of birth, gender, married, insurance number, social security number … it all has a label and it fits neatly into columns and rows. Adding a flag or tag to one piece of that information, or even a series of pieces of information, is easy to do. From there, saying that person A can see the information in column H makes sense, right?
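To make the column idea concrete, here is a minimal sketch in Python. It is not how any particular database implements cell level security; the column names, the role names, and the `visible_columns` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical mapping of roles to the columns each role may read.
COLUMN_ACCESS = {
    "clinician": {"last_name", "first_name", "date_of_birth", "gender"},
    "billing": {"last_name", "first_name", "insurance_number",
                "social_security_number"},
}


def visible_columns(row, role):
    """Return only the columns of `row` that `role` is allowed to see."""
    allowed = COLUMN_ACCESS.get(role, set())  # unknown roles see nothing
    return {col: val for col, val in row.items() if col in allowed}


# One row of the intake form, stored as column -> value (made-up data).
patient_row = {
    "last_name": "Doe",
    "first_name": "Jane",
    "date_of_birth": "1980-01-01",
    "gender": "F",
    "insurance_number": "INS-0000",
    "social_security_number": "000-00-0000",
}

clinician_view = visible_columns(patient_row, "clinician")
billing_view = visible_columns(patient_row, "billing")
```

The sketch works only because every piece of information already sits in a named column, which is exactly what a schema provides.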
But schemas can be rigid and difficult to change, and we haven’t talked about the free-form parts of those doctor’s forms, where you can write in information under the “please explain” options, or about the text written in the rest of the record. Further, it is more difficult to secure textual information with cell level security on a relational database platform.
Element Level Security
A better approach is to use XML or JSON to identify pieces of information within a document or other entity, since both of those formats are self-describing. That is to say, every field has an element name in XML (or a property name in JSON) as well as a value that is assigned on import. Now you have some context for every piece of data just by looking at the element or property name.
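As a small illustration of what self-describing means, the sketch below parses the same made-up patient fragment as JSON and as XML using only the Python standard library, then lists the names that travel with the data. The field names and values are assumptions chosen for the example.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record expressed in both formats.
record_json = (
    '{"patient": {"last_name": "Doe", "ssn": "000-00-0000", '
    '"notes": "Reports mild headaches."}}'
)
record_xml = (
    "<patient><last_name>Doe</last_name><ssn>000-00-0000</ssn>"
    "<notes>Reports mild headaches.</notes></patient>"
)

# JSON: the property names are the keys of the parsed object.
json_fields = list(json.loads(record_json)["patient"].keys())

# XML: the element names are the tags of the child elements.
xml_fields = [child.tag for child in ET.fromstring(record_xml)]
```

Both lists contain `last_name`, `ssn`, and `notes`, so the label for each value can be read from the document itself, with no external schema needed to interpret it.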
An electronic medical record likely comprises data of all sorts: Word documents, PDFs, and relational data spring easily to mind. Access to a Word document gives you access to all of the information in it, because Word does not define elements within documents. Converting Word documents to XML and/or JSON documents gives us an opportunity to label the contents of each document, and those labels become the elements and properties. Relational data gets the same treatment: the information that lives in columns and rows in a relational database is stored within each XML and JSON document.
That means that each document holds all of the information on how to use it—all without having to define a schema. For security, that means that we can create rules using every defined element within the XML and JSON documents. Those can be added to and changed, but with no impact on the information already described within them.
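The following is a minimal sketch of such element-based rules, applied to a JSON document in Python. The rule table, the role names, and the `redact` helper are hypothetical, and the choice to leave elements without a rule visible is an assumption made for brevity; a real system might deny by default instead.

```python
# Hypothetical rules: element name -> roles allowed to read that element.
ELEMENT_RULES = {
    "ssn": {"billing"},
    "insurance_number": {"billing"},
    "diagnosis": {"clinician", "researcher"},
    "notes": {"clinician"},
}


def redact(node, role):
    """Return a copy of `node` without the elements `role` may not read."""
    if isinstance(node, dict):
        result = {}
        for name, value in node.items():
            allowed = ELEMENT_RULES.get(name)
            # Assumption: an element with no rule is visible to everyone.
            if allowed is not None and role not in allowed:
                continue
            result[name] = redact(value, role)
        return result
    if isinstance(node, list):
        return [redact(item, role) for item in node]
    return node  # plain values are returned unchanged


# Made-up record with nested elements and free-form text.
record = {
    "patient": {
        "name": "Jane Doe",
        "ssn": "000-00-0000",
        "visits": [
            {"diagnosis": "migraine", "notes": "Patient reports stress."}
        ],
    }
}

researcher_view = redact(record, "researcher")
billing_view = redact(record, "billing")
```

Because the rules key on element names rather than on positions in a fixed schema, adding a new element to a document, or a new entry to the rule table, leaves the handling of existing elements unchanged. Free-form text such as `notes` is protected in the same way as any other element.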