Your data tells a story: where it started and how it has been transformed over time to reach its present state. This is data lineage. Data lineage is a critical component of any data-driven business, but especially so for firms operating in heavily regulated industries such as capital markets.
Evolving Regulatory Landscape Places Premium on Data Lineage
Regulatory compliance requirements are placing greater transparency demands on firms to trace and audit data. For capital markets trading firms, data lineage must be implemented to support risk management, data governance and reporting for regulations such as BCBS 239 and MiFID II. Although these regulations have been in effect for some time, financial firms are still struggling to meet the ongoing requirements, and ad hoc reporting requests are exposing the limitations of existing compliance systems.
We’re not just talking about financial firms and requirements for surveilling capital markets trading activities. Every company is feeling pressure from data privacy regulations, such as GDPR, and would benefit from having data lineage built into its data-governance systems.
It is not enough to have a “line of sight” into individual lines of business. Data must be governed at an enterprise level, across every line of business, in every jurisdiction where business is conducted. Moreover, as regulators undergo their own digital transformation, they will increasingly require better data integration and lineage between financial firms and regulatory agencies in order to automate reporting, lower costs and facilitate greater transparency.
Data Lineage Makes Compliance Easier for Financial Firms
For financial firms, data lineage is an imperative. Implementing it with the right technology, however, is paramount. Dr. Giles Nelson, CTO of Financial Services at MarkLogic, shares his perspective: “Governance of customer information, financial trade records and data usage threatens to become a crippling burden unless technology can keep pace. Data lineage features heavily in meeting these regulatory requirements and, in fact, can be part of promising paths towards making compliance easier (believe it or not).”
With regulators continuing to demand more effective data governance, financial firms may need to reassess their data-management capabilities around data lineage. Here are a few things to consider when implementing a data-lineage development approach:
1. Building the Business Case
One of the biggest challenges to adopting data lineage is gaining management buy-in for initial projects. To increase the likelihood of project funding, enterprise architects and data managers should make the case not only at the technical level but also at the operational and business-user levels. The business opportunities of sustainable data lineage should justify the implementation cost, but they can be difficult to prove in the planning phase. Here are some areas that can help make the business case for data-lineage development:
- Enterprise-wide data lineage leads to better data quality and reliability, which leads to more accurate analytics and decision-making.
- Improved data-discovery capabilities are critical to meeting evolving regulatory requirements and ad hoc compliance requests.
- Data-lineage projects have the potential to reduce the costs associated with redundant data and systems and to identify reusable data and processes.
- The insights that data lineage provides make it easier to identify new business opportunities, such as the creation of new products or services, through better understanding and visualization of data and processes.
2. Selecting a Technology Solution for Data Lineage
While many data-lineage projects in financial-services firms started as in-house, manual developments, an increasing number of firms have graduated to a mix of in-house and vendor solutions. When deciding on what technology to deploy, organizations should consider existing data-lineage systems, whether those systems are automated, semi-automated or manual, and what target outcomes the firm is aiming to achieve.
Existing systems may include data-lineage functionality built on top of relational database technology, but this combination can be limiting in large and complex environments where huge amounts of frequently changing data must be traced. In this case, a multi-model NoSQL database that can more easily integrate various schemas for structured and unstructured data and metadata is likely to be a better alternative.
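As a rough illustration of what tracing data involves, the sketch below models lineage as a small graph in plain Python: each dataset points back to the sources and transformations that produced it. The class, method and dataset names are hypothetical and chosen only for illustration. This is not MarkLogic's API, and a production system would persist this metadata in a database rather than hold it in memory.

```python
from dataclasses import dataclass, field


@dataclass
class LineageGraph:
    """Toy in-memory lineage store (hypothetical, for illustration only)."""

    # Maps a dataset name to the (source, transformation) pairs that produced it.
    parents: dict = field(default_factory=dict)

    def record(self, source: str, target: str, transformation: str) -> None:
        """Register that `target` was derived from `source` via `transformation`."""
        self.parents.setdefault(target, []).append((source, transformation))

    def trace(self, dataset: str) -> list:
        """Return every (source, transformation, target) hop upstream of `dataset`."""
        hops = []
        seen = set()
        stack = [dataset]
        while stack:
            target = stack.pop()
            if target in seen:
                continue  # avoid revisiting shared ancestors or cycles
            seen.add(target)
            for source, transformation in self.parents.get(target, []):
                hops.append((source, transformation, target))
                stack.append(source)
        return hops

    def origins(self, dataset: str) -> list:
        """Return the upstream datasets that have no recorded source of their own."""
        sources = {source for source, _, _ in self.trace(dataset)}
        return sorted(s for s in sources if s not in self.parents)


# Hypothetical flow from raw inputs to a regulatory report.
graph = LineageGraph()
graph.record("trade_feed", "normalized_trades", "normalize currency and timestamps")
graph.record("normalized_trades", "enriched_trades", "join on client id")
graph.record("client_master", "enriched_trades", "join on client id")
graph.record("enriched_trades", "regulatory_report", "aggregate by desk")

print(graph.origins("regulatory_report"))  # ['client_master', 'trade_feed']
```

An ad hoc question such as "which source systems feed this report?" then becomes a graph traversal rather than a manual investigation. Representing lineage as flexible records like these, rather than as fixed relational tables, is one way to picture why schema flexibility matters as the number of hops grows and sources change.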
3. Implementing a Data-Lineage Approach
For financial services firms that have already implemented data lineage to meet regulatory compliance requirements, there is an opportunity to adopt the discipline for other business requests, such as use cases that involve a need for increased automation. For firms at the beginning of the process, here are some high-level implementation guidelines that should be considered ahead of any project:
- Consider the scope of a data-lineage project. What are the main drivers of the project? How many data flows will be covered? How will structured and unstructured data be handled? What systems will be impacted?
- Start small. Identify a pilot project with a well-defined scope that will have a relatively large impact on the firm. Successfully executing a smaller project quickly that shows true business value will help secure management buy-in and funding for new use cases.
- Determine whether the skills required to implement and maintain a data-lineage project exist in-house or whether external help is needed. Also ensure that the project involves not only the technical implementation team but also business stakeholders and the right project champion.
Data-Lineage Resources
The object of this post is to provide an introduction to data lineage and help you begin thinking about best practices for implementation. For more detailed information on data lineage, please refer to the 2019 Data Lineage Handbook. The handbook is a comprehensive guide to understanding the business and technical challenges and opportunities of data lineage and why a solution like MarkLogic® should be a core component of any implementation approach.
In addition to the handbook, you may be interested in listening to a webinar hosted by Data Management Insights on data lineage as a driver of compliance and as a business imperative.
Ed Downs
Ed Downs is responsible for customer solutions marketing at MarkLogic. He draws on his considerable experience, having delivered large-scale big data projects and operational and analytical solutions for public and private sector organizations, to drive awareness and accelerate adoption of the MarkLogic platform.