Mortgage Document and Metadata Repositories – Securitization

April 11, 2016 Data & AI, MarkLogic

This is one in a series of blogs on building a Mortgage Document or Metadata Repository to help firms in the mortgage industry handle today’s needs and exploit its opportunities.

Painful Securitization Background

Throughout its history the mortgage industry has undergone periodic upheavals. For example, in the 1980s and 1990s, 1,043 out of the 3,234 savings and loan associations failed at a cost of $160 billion, including $132.1 billion from taxpayers.

A major reason the mortgage industry has been so unstable is that in the past it funded itself through short-term deposits while making long-term loans, often at fixed interest rates. This mismatch between borrowing and lending meant that when interest rates suddenly spiked, banks’ revenue remained fixed while their costs skyrocketed. Bankruptcy was a common result.

Because banks generally held mortgages for the life of the loans, their ability to make loans was limited by their available capital, which limited economies of scale in loan origination. This potentially made the loan origination process less efficient than it otherwise would have been.

After the S&L crisis it was felt that mortgage securitization offered a way to overcome these issues. A firm could make a mortgage loan and then securitize it, freeing up its capital and allowing it to make many more loans. It was expected that this would improve the quality of risk analysis because publicly traded instruments would be priced and analyzed by the market as a whole (including risk assessment firms like Standard & Poor’s and Moody’s) instead of just the originating bank.

Investors were to benefit from securitization because mortgages could be held by individual investors and non-mortgage firms.

While these potential benefits are real, mortgage securitization took an almost fatal hit when it was blamed for causing the 2008 mortgage crisis. In the period leading up to 2008 there was a breakdown in both the lending and securitization processes. Improper procedures were sometimes followed with mortgage documentation, leading to mortgages being granted inappropriately. Risk models predicting default risk sometimes contained unrealistic assumptions, leading to inflated pricing for mortgage-backed securities, and the complexity inherent in securitization can limit investors’ ability to monitor those risks.

How It Can Be Different Now

Today, mortgage securitization is still widely used. In 2011, of the $13.7 trillion in mortgage debt, about $7 trillion was securitized or guaranteed by government-sponsored enterprises or government agencies, with only about $5 trillion being held as direct mortgages by the private sector.

Given the potential benefits and the importance of mortgage securitization, it is critical that the mortgage industry have an infrastructure that can track and provide consolidated access to all of the documentation relating to mortgages.

It has become less and less common for the originator of a mortgage to hold it for its entire life; mortgages are commonly sold to other firms or bundled into packages and securitized. The ability to quickly pull all the information relating to a specific mortgage and access it in an integrated fashion is critical in this model. The ability to utilize all available data when pricing mortgages also offers a competitive advantage in this highly competitive industry.

Complexities of Securitization

Consolidating mortgage data is difficult because the mortgage securitization process is highly complex. There are large numbers of document types, different states have different rules on how mortgages are handled, and many systems are used to manage the origination and processing of mortgages. Mortgages can last as long as 30 years and information about them needs to be maintained for years afterwards. All this makes querying, collecting, moving, and loading the data into a new system costly, time-consuming, and error-prone. See “How to Drive the Mortgage Industry Forward” for a more in-depth review of the data challenges faced by the mortgage industry.

In addition to efficiency and competitive concerns, when securitizing mortgages some firms will have a responsibility to transfer mortgage documents and deeds of trust, and even to provide alerts when default events or other factors impacting price valuations occur. When the needed data is held in inaccessible silos it can be difficult for a firm to meet its responsibilities.

Onboarding a Basket of Mortgages

Onboarding large baskets of mortgages can require extensive ETL, sometimes to the point of physically moving data sets from machine to machine so the data can be enriched, risk analysis performed, mortgages broken up into sub pools so they can be separately sold, and otherwise processed. Fully processing a new pool of mortgages can take up to a week. With a Mortgage Metadata Repository the promise is to get that to under a single day.

One reason the ETL process is so convoluted is that the rest points between ETL steps are relational systems designed around different schemas. Because of this it can be necessary to transform the data between processing steps so the needed enrichment or other processing can occur.

With a NoSQL database like MarkLogic, the separate steps in processing the incoming basket can be performed against a single database. This is because each step in the process can access the underlying data as it needs to and not be bound by a rigid predefined schema – no transformations needed between steps. For a more in-depth explanation see “Building a Universal Mortgage Repository.”
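To make the idea concrete, here is a minimal sketch of several processing steps working against one schema-flexible store. It is an illustration only: a plain Python dictionary stands in for the database, and every name in it (onboard, enrich, score_risk, sub_pools, the field names, and the risk thresholds) is a hypothetical chosen for the example, not a MarkLogic API or a real underwriting rule.

```python
from typing import Any, Dict, List

# Hypothetical in-memory stand-in for a document database: loan id -> document.
Store = Dict[str, Dict[str, Any]]


def onboard(store: Store, loans: List[Dict[str, Any]]) -> None:
    """Load incoming loan documents as-is; no target schema is enforced."""
    for loan in loans:
        store[loan["loan_id"]] = dict(loan)


def enrich(store: Store, unemployment_by_zip: Dict[str, float]) -> None:
    """Add a new field to documents in place when reference data is available."""
    for doc in store.values():
        rate = unemployment_by_zip.get(doc.get("zip"))
        if rate is not None:
            doc["local_unemployment"] = rate


def score_risk(store: Store) -> None:
    """Tag each document with a risk bucket (thresholds are made up)."""
    for doc in store.values():
        loan_to_value = doc["balance"] / doc["appraised_value"]
        unemployment = doc.get("local_unemployment", 0.0)
        high = loan_to_value > 0.9 or unemployment > 0.08
        doc["risk"] = "high" if high else "standard"


def sub_pools(store: Store) -> Dict[str, List[str]]:
    """Group loan ids by risk bucket so pools can be handled separately."""
    pools: Dict[str, List[str]] = {}
    for loan_id, doc in store.items():
        pools.setdefault(doc["risk"], []).append(loan_id)
    return pools


if __name__ == "__main__":
    store: Store = {}
    onboard(store, [
        {"loan_id": "A", "zip": "10001", "balance": 95000, "appraised_value": 100000},
        {"loan_id": "B", "zip": "94105", "balance": 50000, "appraised_value": 100000,
         "notes": "second document carries an extra field"},
    ])
    enrich(store, {"94105": 0.04})
    score_risk(store)
    print(sub_pools(store))
```

Each step reads and writes the same documents, and a step that needs a field another step did not supply simply falls back to a default. Nothing is reshaped between steps, which is the point the paragraph above makes.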

Pricing a Mortgage

When pricing a basket of mortgages potential investors are bound by their analytical skills and the data they have access to. When the data relating to a basket of mortgages is spread out amongst a wide variety of independent silos, pricing algorithms can be constrained because valuable information within a firm may not be accessible to the pricing algorithms.

Data that may be considered useful in pricing a basket of mortgages can include documents, numeric data, geospatial data, and semantic triples. A relational-based repository will struggle to provide modelers with access to all the types of information they may consider important.

As an example of the data issues involved with pricing mortgages, properly modeling prepayment risk is a concern for any basket of mortgages. Modeling of prepayment risk can be highly complex and often involves variables such as demographic trends, local unemployment, the behavior of housing prices in homes close to those in the mortgage basket, etc. As can be imagined, many types of data including geospatial, numeric, and textual are required to fully allow modelers to do their job. The ability to pull together all the available data for a mortgage can help validate that the algorithms used to price a basket of mortgages were used correctly and were not biased in a way that would result in inaccurate pricing.

Dave Grant put together a terrific demo that lets you draw polygons in a geographic area and see the risks to your portfolio in the event of a flood or plant closings. You can see exactly what I am referring to in this webinar where we show that and much, much more.
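The kind of question that demo answers can be sketched in a few lines. The example below is not the demo's code and does not use MarkLogic's geospatial search; it is a plain-Python illustration that assumes each loan document carries hypothetical lon, lat, and balance fields, and it treats coordinates as flat x/y values, which is only a rough approximation over small areas. It uses the standard ray-casting test to decide whether a property falls inside a drawn polygon.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (lon, lat), treated as planar x/y for this sketch


def point_in_polygon(lon: float, lat: float, polygon: List[Point]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray heading east."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def exposure_in_area(loans: List[Dict[str, float]], polygon: List[Point]) -> float:
    """Sum outstanding balances for properties located inside the polygon."""
    return sum(
        loan["balance"]
        for loan in loans
        if point_in_polygon(loan["lon"], loan["lat"], polygon)
    )


if __name__ == "__main__":
    # Hypothetical flood zone drawn as a simple square.
    flood_zone = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
    portfolio = [
        {"lon": 5.0, "lat": 5.0, "balance": 200000.0},
        {"lon": 15.0, "lat": 5.0, "balance": 100000.0},
    ]
    print(exposure_in_area(portfolio, flood_zone))
```

In a real repository the polygon test would be pushed down to the database's geospatial index rather than run loan by loan in application code.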

Most legacy infrastructures cannot handle these needs. A MarkLogic-based mortgage document or metadata repository, on the other hand, can not only handle all this data but also provide integrated access to it. This can allow a firm to consolidate siloed data into a single source of truth and make all of its data queryable as an integrated whole.

David Kaaret

David Kaaret has worked with major investment banks, mutual funds, and online brokerages for over 15 years in technical and sales roles.

He has helped clients design and build high-performance, cutting-edge database systems and provided guidance on issues including performance, optimal schema design, security, failover, messaging, and master data management.