Data Access and Virtualization

by Rob Steward Posted on January 04, 2010

In this podcast Rob Steward explains why data access middleware makes a difference within a virtualization environment. The podcast lasts for 9:14.

Rob Steward:

I’ve been talking to a lot of folks who have undergone very large virtualization projects. So, for example, they may be taking their entire infrastructure and trying to virtualize as much of it as they can. And obviously, the reason for doing virtualization is to lower costs. When you say virtualization, most people think of VMWare, a Windows hypervisor, or some kind of virtualization environment where they take multiple machines and put them on a single piece of hardware. Interestingly enough, as people have started to virtualize their entire environments, I’m starting to see much more maturity in how they approach what they’re going to virtualize, and in how they decide what to move to those environments.

I’m seeing one big trend: people are having issues getting the databases themselves onto virtualized environments while maintaining anywhere close to the throughput that they’re used to in a non-virtualized environment. So, for example, one of the guys who works for me went to the VMWorld Conference that VMWare recently held and noticed that in a lot of the sessions people brought up the question: can I run an Oracle database efficiently in a VMWare environment? And obviously, from the approach that VMWare themselves took at that conference, this is an issue that’s near and dear to their heart. The reason I bring that up is that it’s also a concern I’ve heard multiple times in talking to customers, specifically with Oracle, but I’m sure the same concerns exist with every database. And that is, when we put the database on a virtual machine, we don’t get the same amount of throughput that we do in the non-virtualized environment.

For example, I visited one customer who was, again, virtualizing their whole environment. They came to me and said, ‘Rob, can you help with a problem we’re having? When we put our databases on virtual machines, we can’t get the throughput that we need to support our business.’ This particular customer had set up a very nice testing infrastructure. They would take a machine that was not virtualized, move it over with its applications, its operating system, everything, put it on a VMWare machine, run their tests, and measure whether the throughput they got through that application, or through that entire application stack, was at least 90% as good as it was in the non-virtualized environment. For any app or application stack that reached that 90% throughput, they would then virtualize it and put it in production in a virtual environment. I thought that was a very interesting way to go about it, because they had done a lot of setup for a lab to measure these things. Of course, the VMWare tools give you a really good ability to measure these things, but they were testing every single application and every single application stack. Now, they never could get that 90% number out of any of the Oracle databases that they had virtualized. So they approached me to ask, ‘Rob, can the database middleware help us get that database server running at 90%?’ The answer I had for them was that, to some degree, we can help with the processing that goes on in the database server. The database middleware influences the amount of processing on the database server through the way it executes a statement, the way it may minimize the amount of data that has to be returned, or the way it may minimize the number of network round trips.
There are also options within database middleware, such as your JDBC drivers, that can affect whether certain processing happens on the database server itself or whether it happens on the client. Probably one of the biggest things the database middleware can influence, in terms of how much processing goes on in the database, is whether character set conversions happen within the drivers, that is, within the database middleware, or whether they happen on the database server. Typically, there are options you can set to affect where that processing occurs. So where do you want it? Do you want to offload it to the database, or do you want it in the middleware? So I sat with this customer and looked at some things, and we were able to make a difference, but we still were not able to get that database running at 90% throughput.
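
The customer's acceptance rule described above (virtualize a stack only if it keeps at least 90% of its non-virtualized throughput) can be written down as a small check. This is only a sketch of that rule, not the customer's actual tooling: the function name, the workload names, and the throughput figures are all made up for illustration.

```python
def passes_virtualization_gate(baseline_tps, virtual_tps, threshold=0.90):
    """Return True when virtualized throughput is at least `threshold`
    of the non-virtualized baseline (90% in the customer's rule)."""
    if baseline_tps <= 0:
        raise ValueError("baseline throughput must be positive")
    return virtual_tps / baseline_tps >= threshold


# Hypothetical measurements in transactions per second:
# (non-virtualized baseline, virtualized result). Not figures from the podcast.
measurements = {
    "order-entry app stack": (1200.0, 1130.0),
    "database server": (5000.0, 3900.0),
}

for name, (baseline, virtual) in measurements.items():
    ratio = virtual / baseline
    verdict = "virtualize" if passes_virtualization_gate(baseline, virtual) else "keep on hardware"
    print(f"{name}: {ratio:.0%} of baseline -> {verdict}")
```

With these invented numbers the application stack clears the bar and the database server does not, which mirrors the pattern the customer kept running into.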

This is one of the areas where I'm doing some research with my team: how we can help virtualized environments, and how Progress DataDirect can help to minimize the amount of processing that happens on the database server. Why does data access middleware matter at all? Why does it make a difference in a virtualization environment? In the book and on this blog we’ve talked a lot about why it really matters and what kind of difference it can make in terms of performance. If you think through what the value of virtualization really is, you see we've got machines sitting around that are only being utilized part of the time. So instead of letting a machine sit there doing nothing 80% of the time, let's put it to work. The idea is I don't have to have five machines, I can have one. That saves you not only in terms of the machine cost but in terms of electricity, air conditioning, and space in your data center. These things can add up to millions of dollars for companies of any size. That's why everyone is looking to virtualization. If you think through any sort of normal business application, it typically spends between 60 and 80 percent of its time accessing data, so if we can cut that time and those resources in half, it makes a huge difference in the overall resource usage and throughput of your application.

Let's say your application spends 60% of its time, and 60% of its resources, doing data access. If we can cut that in half, then we've just reduced the resource usage of that application by 30%. As we start to add additional virtual machines and additional applications to those same virtual machines, that 30% difference means we can start to put a lot more virtual machines on the same piece of hardware. That is the kind of difference the data access middleware makes to resource usage within the application.
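
To make the arithmetic above concrete, here is a minimal sketch. The 60% share and the cut in half come from the example above. The last step, how many more applications fit on a host, is an added assumption: it treats resource usage as scaling linearly and ignores hypervisor overhead. The function name is made up for illustration.

```python
def data_access_savings(data_access_share, reduction):
    """Fraction of an application's total resource usage saved when the
    data access portion is reduced by `reduction` (both given as 0..1)."""
    if not 0.0 <= data_access_share <= 1.0:
        raise ValueError("data_access_share must be between 0 and 1")
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return data_access_share * reduction


# 60% of resources spent on data access, cut in half.
saved = data_access_savings(0.60, 0.50)
remaining = 1.0 - saved

# Assumption: idealized linear scaling, no hypervisor overhead.
density_gain = 1.0 / remaining

print(f"saved: {saved:.0%}, remaining footprint: {remaining:.0%}")
print(f"idealized applications per host: {density_gain:.2f}x")
```

Under that idealized assumption, an application that needs 30% less of the machine lets the same hardware hold a bit over 1.4 times as many applications of that kind.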

