Changing market dynamics, including regulatory compliance, mergers and acquisitions, and information exchange, are driving the need for new approaches to managing enterprise information. In this podcast, Rob Steward explains why traditional approaches to data integration, such as ETL and data replication, do not meet requirements for real-time information access.
Rob:
Within the software industry, and within business generally, there is a need for operational responsiveness, and the demands keep increasing.
For example, say I go to an airline and I’m standing at the ticket counter talking to the agent, and let’s say my flight has been delayed. That agent needs to be able to see whether this passenger took a flight a week ago and his baggage was lost, and whether the passenger is in the frequent flyer program at a silver or platinum level, or not at all. Businesses are starting to demand the kind of responsiveness that says: OK, when I get in that situation, and I have a passenger who may be platinum in my frequent flyer program, and we may have lost his luggage last week, and now he’s got a delayed or canceled flight, we want to make sure that, as one of our most important customers, he is well taken care of. We know he’s not going to be happy now.
I use that example for a reason: we recently did business with an airline that had exactly that problem. The agent is standing there and has no way to see into all of the airline’s systems at the same time. Their baggage system was separate from their revenue system, which was separate from their ticketing system, which was separate from their frequent flyer system. The ticket agent standing in front of the customer cannot get a single view of that customer.
But not only that, they can’t necessarily get up-to-date data. So, Mike, specifically on your question about traditional approaches to data integration such as ETL and data replication: say I have a silo of data that I need to share with another system or, in the case of mergers and acquisitions, two companies that need to share data. The traditional approach has been to make a copy of that data into a data warehouse, typically using an ETL tool, and then allow the second system to see the data from the first system. But what it’s really looking at is a copy of the data.
Now, the problems are starting to happen because of, again, that operational responsiveness that’s being demanded of these IT systems. What we’re starting to see is that people can’t afford to look at replicated data that’s a week old or a day old, or in some cases even half a day old. Typically, data warehouses get refreshed every so often. That might be once a month, once a week, or once a day. Whatever I’m looking at from that second system is as old as the last refresh. That introduces a lot of problems.
One, how do I respond? Let’s go back to my original example. I’m standing in front of that ticket agent and they lost my bags last week. If the agent only had a copy of the data from two weeks ago, or if that flight had been yesterday and the copy was from a week ago, the agent wouldn’t know that. The systems wouldn’t show that we had lost the passenger’s bags yesterday or last week. And that’s where the traditional approaches are starting to fall down. Real-time access is just not there.
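To make the stale-copy problem concrete, here is a minimal Python sketch. It is not the airline's actual system: the `lost_bags` table, the passenger ID `P123`, and the `refresh_warehouse` helper are all hypothetical, and two in-memory SQLite databases stand in for an operational system and a warehouse copy of it.

```python
import sqlite3

# Hypothetical operational baggage system (the system of record).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE lost_bags (passenger TEXT, flight_date TEXT)")

# Hypothetical warehouse that only ever sees periodic copies of the source.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE lost_bags (passenger TEXT, flight_date TEXT)")


def refresh_warehouse():
    """Batch copy in the ETL style: replace the warehouse rows with the current source rows."""
    rows = source.execute("SELECT passenger, flight_date FROM lost_bags").fetchall()
    warehouse.execute("DELETE FROM lost_bags")
    warehouse.executemany("INSERT INTO lost_bags VALUES (?, ?)", rows)
    warehouse.commit()


def lost_bag_count(conn, passenger):
    """Count lost-bag records for one passenger in the given database."""
    return conn.execute(
        "SELECT COUNT(*) FROM lost_bags WHERE passenger = ?", (passenger,)
    ).fetchone()[0]


refresh_warehouse()  # the last scheduled refresh runs first

# The bag is lost after that refresh has already run.
source.execute("INSERT INTO lost_bags VALUES ('P123', '2024-05-01')")
source.commit()

stale_view = lost_bag_count(warehouse, "P123")  # what a reader of the copy sees
live_view = lost_bag_count(source, "P123")      # what the system of record knows
print(stale_view, live_view)
```

Until the next scheduled refresh runs, the copy reports no lost bags for a passenger whose bag the source system already knows is lost.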
In addition, I ran into an interesting problem with a customer a couple of years ago. It sums up what I hear from a lot of people, and it’s the best example that I have. It was a large financial company with a large IT shop in London and a large IT shop in New York. The company had come out of a merger of banks, which is why it had both of those IT shops, and every single night they would copy data back and forth between them.
It never occurred to me until I was talking to them, but when the daylight saving time rules changed in the US, we started going onto daylight saving time earlier in the year and staying on it later, and Europe didn’t make that change. The normal time difference between London and New York was five hours, except during those stretches of the year when the two were out of sync on daylight saving time and the difference became four hours. What this customer told me was, “We don’t think we can actually copy that data in four hours. We know five hours is fine, but four hours is not enough.” That one hour they lost made a huge difference. Why was it a huge deal for them? Because they were building data warehouses. They were doing data integration by copying pieces of data back and forth, and over time the data had gotten so large that it took more than four hours to do the copy.
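The shrinking window can be checked with a little date arithmetic. The sketch below assumes the rules in force since the 2007 US change (US daylight saving time from the second Sunday in March to the first Sunday in November; UK and EU summer time from the last Sunday in March to the last Sunday in October). It works at whole-day granularity and ignores the exact hour of each switch. The function names are illustrative only.

```python
from datetime import date, timedelta


def nth_sunday(year, month, n):
    """Return the n-th Sunday (1-based) of the given month."""
    first = date(year, month, 1)
    # weekday(): Monday is 0, Sunday is 6.
    offset = (6 - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def last_sunday(year, month):
    """Return the last Sunday of the given month (month must be before December)."""
    last_day = date(year, month + 1, 1) - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() - 6) % 7)


def hours_apart(day):
    """Approximate London-minus-New-York clock difference on a calendar date."""
    us_dst = nth_sunday(day.year, 3, 2) <= day < nth_sunday(day.year, 11, 1)
    uk_dst = last_sunday(day.year, 3) <= day < last_sunday(day.year, 10)
    # London is UTC+0 (UTC+1 in summer); New York is UTC-5 (UTC-4 in summer).
    return (1 if uk_dst else 0) - (-4 if us_dst else -5)


def short_window_days(year):
    """List the dates in a year on which the difference is four hours."""
    day, days = date(year, 1, 1), []
    while day.year == year:
        if hours_apart(day) == 4:
            days.append(day)
        day += timedelta(days=1)
    return days


days = short_window_days(2007)
print(days[0], days[-1], len(days))
```

Under these assumed rules, a nightly copy that needs more than four hours overruns its window only on the dates returned by `short_window_days`, which is why a job that is fine most of the year can fail for a stretch in spring and again in autumn.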
I hear this over and over. That was just an example that I like to bring up because it really exemplifies the point: not only is the business environment getting to where it demands that responsiveness, but at the same time, those traditional data warehouses and the amount of data that we copy keep getting bigger and bigger. And you just don’t have enough hours in the day to move that stuff around.
I would say, in a nutshell, those two things are really driving the need for more of a real-time approach to data integration. With a real-time approach, I don’t copy any data anywhere. I just access that data where it lives, in real time, rather than having to make those copies and move them around. So we don’t have stale data anymore, and we don’t have problems with how long it actually takes to make those copies.
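As a rough illustration of accessing data where it lives, the Python sketch below builds a single customer view at request time by querying two separate systems directly. Everything in it is an assumption made for illustration: the two in-memory SQLite databases stand in for a baggage system and a frequent flyer system, and the table names, the passenger ID `P123`, and the `customer_view` function are hypothetical. It is not a description of any particular product.

```python
import sqlite3


def make_system(ddl, insert_sql, rows):
    """Create one stand-in operational system as its own in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Hypothetical, separate systems of record.
baggage = make_system(
    "CREATE TABLE lost_bags (passenger TEXT, flight_date TEXT)",
    "INSERT INTO lost_bags VALUES (?, ?)",
    [("P123", "2024-05-01")],
)
loyalty = make_system(
    "CREATE TABLE members (passenger TEXT, tier TEXT)",
    "INSERT INTO members VALUES (?, ?)",
    [("P123", "platinum")],
)


def customer_view(passenger):
    """Build a single view at request time by querying each system directly."""
    tier_row = loyalty.execute(
        "SELECT tier FROM members WHERE passenger = ?", (passenger,)
    ).fetchone()
    bags = baggage.execute(
        "SELECT flight_date FROM lost_bags WHERE passenger = ? ORDER BY flight_date",
        (passenger,),
    ).fetchall()
    return {
        "passenger": passenger,
        "tier": tier_row[0] if tier_row else None,
        "lost_bag_dates": [flight_date for (flight_date,) in bags],
    }


before = customer_view("P123")

# A new event lands in the baggage system; no copy or refresh step is involved.
baggage.execute("INSERT INTO lost_bags VALUES ('P123', '2024-05-08')")
baggage.commit()

after = customer_view("P123")
print(before)
print(after)
```

Because each call reads from the systems of record, the second view includes the new lost-bag record immediately. The trade-off in this sketch is that every request touches every source system, so availability and query latency of those systems matter more than they do with a warehouse copy.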
Again, in a nutshell, I would say those are the two big trends that I see in business that are causing those traditional approaches to start to fall down. I hear these same stories over and over.