Criteria for a Highly Effective Database Driver: Part 2 JDBC

May 19, 2009 Data & AI, DataDirect

This podcast is the second of a two-part series. In it, Rob Steward discusses what to look for in a highly effective JDBC database driver. The podcast runs for 6:42.

Click on the following link to listen to the podcast: http://dataaccesshandbook.com/media/Rob10.mp3

Rob Steward:

Now in JDBC terms, the JDBC specification formalized this concept of the architecture of the driver itself. They called it Type 1, Type 2, Type 3, and Type 4. So a Type 1 driver – of which there was only ever one – was a bridge from JDBC to ODBC. Type 2 is what I just described, where the Java piece of that driver sits on top of some native Windows or Solaris or Linux client piece that talks to the database. Type 3 was pure Java, and it talked to some intermediate server. So it may be just a JDBC driver that’s pure Java, but it had some server component that it would talk to, which would then in turn talk to the Oracle or the DB2 database, or whatever database. And then Type 4 – which is most common – is pure Java, opens up that TCP/IP socket to the database and talks directly to the database. In ODBC terms we call that wire protocol. In JDBC it’s a Type 4. In the ADO.NET world, we call that 100% managed: something that is completely running within the CLR and that opens up that socket to the database server. So that architecture makes a big difference.
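As a rough illustration of the four categories just described, the following Java sketch models them as an enum. The enum, its field names, and the `isWireProtocol` helper are hypothetical and are not part of the JDBC API. The descriptions follow the talk, while treating Type 1 as not pure Java is an assumption of the sketch.

```java
// Hypothetical model of the four driver categories described above.
// These names are illustrative only; they are not part of the JDBC API.
enum JdbcDriverType {
    // Assumption: the bridge is treated as not pure Java because it
    // depends on an ODBC driver underneath it.
    TYPE_1("bridge from JDBC to ODBC", false, false),
    TYPE_2("Java layer on top of a native database client", false, false),
    TYPE_3("pure Java, talks to an intermediate server", true, true),
    TYPE_4("pure Java, talks directly to the database", true, false);

    final String description;
    final boolean pureJava;
    final boolean usesIntermediateServer;

    JdbcDriverType(String description, boolean pureJava, boolean usesIntermediateServer) {
        this.description = description;
        this.pureJava = pureJava;
        this.usesIntermediateServer = usesIntermediateServer;
    }

    // True only when the driver itself opens the socket to the database,
    // which the talk calls "wire protocol" in ODBC terms.
    boolean isWireProtocol() {
        return pureJava && !usesIntermediateServer;
    }

    public static void main(String[] args) {
        for (JdbcDriverType type : values()) {
            System.out.println(type + ": " + type.description
                    + " (wire protocol: " + type.isWireProtocol() + ")");
        }
    }
}
```

In this model only `TYPE_4` satisfies both conditions, which matches the point that a Type 4 driver is the JDBC counterpart of a wire protocol ODBC driver.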

Now this is a huge deal. The reason that I’m spending so much time in answering this question on that one particular subject is that the architecture matters not only for, say, the versioning conflict that I talked about. If you’ve got a Type 2 in JDBC or anything that’s not completely managed in .NET, then you’re giving up one of the biggest benefits of those environments: the platform independence; the ability to, within your single process, have all the assemblies or the components that you need for that application. If you have some dependence on the native operating system, then you’re giving up those big advantages. So you run into those versioning issues that I talk about, or you run into conflicts among shared objects. If you have a Type 4 JDBC or 100% pure managed .NET, you don’t have those issues. With ODBC you can eliminate a bunch of these conflicts because you just eliminate a number of the components that you would otherwise need.

In addition to this versioning and compatibility issue, it actually makes a really big difference in terms of performance and scalability. So if you think about it, in computer science we’re always taught in school to simplify things. The simpler the algorithm the better. It’s not just more elegant; it’s actually better performance.

The first class I took in college where I was dealing with data structures and sorting, the professor walked in and said, ‘okay, write a bubble sort algorithm.’ So we wrote a bubble sort, and we turned it in, and as soon as we turned it in the professor said, ‘now that you’ve done that, never do that again.’ Now why did he do that? He did that because the algorithm is somewhat complex, but the reason that we were never supposed to write it again is because it was inefficient. We can write a much better binary search or something like that, which is actually much simpler but also performs significantly better than that bubble sort. This is the kind of thing we’re taught in computer science, and that’s the reason that we’re taught it: scalability and performance.
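To put a rough number on that inefficiency, here is a minimal Java sketch that counts the comparisons made by a plain bubble sort and by the standard library sort on the same data. Using the library sort as the faster alternative is an assumption of this sketch, not the algorithm named in the talk, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical demo class; the names here are illustrative only.
class SortComparisonDemo {
    // Plain bubble sort: sorts in place and returns the comparisons made.
    static long bubbleSort(int[] values) {
        long comparisons = 0;
        for (int pass = 0; pass < values.length - 1; pass++) {
            for (int i = 0; i < values.length - 1 - pass; i++) {
                comparisons++;
                if (values[i] > values[i + 1]) {
                    int tmp = values[i];
                    values[i] = values[i + 1];
                    values[i + 1] = tmp;
                }
            }
        }
        return comparisons;
    }

    // Library sort over boxed values so that comparisons can be counted.
    static long librarySort(Integer[] values) {
        long[] comparisons = {0};
        Arrays.sort(values, (x, y) -> {
            comparisons[0]++;
            return Integer.compare(x, y);
        });
        return comparisons[0];
    }

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1000, 0, 100_000).toArray();
        Integer[] boxed = Arrays.stream(data).boxed().toArray(Integer[]::new);
        System.out.println("bubble sort comparisons:  " + bubbleSort(data));
        System.out.println("library sort comparisons: " + librarySort(boxed));
    }
}
```

Without an early exit, this bubble sort always makes n(n-1)/2 comparisons, which is 499,500 for 1,000 elements. The library sort should need far fewer on the same input.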

So if you have fewer layers and less complex interactions, what you end up with is better scalability and performance. For example, specifically, you may retrieve some data from the database and it may be buffered in that client layer. Well then it’s got to make a copy to hand up to that driver layer above it. So we may end up using twice the amount of memory that we need, as opposed to if that driver was stand-alone and didn’t have that other layer. Also, if you get a driver that’s wire protocol ODBC, Type 4 JDBC, 100% managed ADO.NET, that driver is built specifically to handle the API that you’re writing to. So if you’ve written an ODBC application, then that driver has the capabilities and the code written into it to handle ODBC. It doesn’t need to handle other things that are not ODBC. So if you have that other layer under there – which is the database client piece – which is built to handle more than ODBC or JDBC or ADO.NET underneath it, then there are complexities and code in it that you don’t need. This causes it to not perform or scale as well.
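The buffering point can be sketched in a few lines of Java. Everything here is hypothetical: the class, the method names, and the stand-in for reading from the wire are illustrative and do not correspond to any real driver or client library.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these names correspond to a real driver.
class BufferCopyDemo {
    // Stand-in for the bytes of a result arriving from the database.
    static byte[] readFromWire(int size) {
        byte[] data = new byte[size];
        Arrays.fill(data, (byte) 7);
        return data;
    }

    // Layered design: the client layer buffers the result, then the driver
    // layer above it makes its own copy before handing data to the application.
    static long bytesHeldByLayeredDriver(int size) {
        byte[] clientBuffer = readFromWire(size);
        byte[] driverBuffer = Arrays.copyOf(clientBuffer, clientBuffer.length);
        return (long) clientBuffer.length + driverBuffer.length;
    }

    // Stand-alone design: the driver reads into one buffer and serves from it.
    static long bytesHeldByStandAloneDriver(int size) {
        byte[] driverBuffer = readFromWire(size);
        return driverBuffer.length;
    }

    public static void main(String[] args) {
        int resultSize = 1_000_000;
        System.out.println("layered:     " + bytesHeldByLayeredDriver(resultSize) + " bytes");
        System.out.println("stand-alone: " + bytesHeldByStandAloneDriver(resultSize) + " bytes");
    }
}
```

In this toy model the layered path holds exactly twice the bytes of the stand-alone path. Real drivers will differ in the details, but the extra copy between layers is the cost being described.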

In a nutshell, I would say that you want to look at the architecture of the driver; that really matters. You also want to look for experience. A company that writes a single ODBC driver or a single JDBC driver is not going to do as well at writing those drivers as a company that writes 5 or 10 or 20 of them. Why is that important? Well, when you write a bunch of different drivers, you understand what ODBC or JDBC or ADO.NET applications need. You understand how they interact with the drivers better because you have a much broader area of experience. And you’re able to optimize those things within those drivers. So I would say the broad experience of the company that writes those drivers, as well as that architecture.

Another thing that I would look at is the vendor who writes the driver. Is the driver a profit source for them? If you have a vendor who, say, gives the driver away for free, then they don’t have the incentive to write as good of a driver. It’s kind of the ‘you get what you pay for’ kind of a thing; absolutely true with drivers as well. And, as we just wrote a book on the subject: What kind of difference can those drivers make? Absolutely huge.

I would say that you want to look at the vendor; you want to look at what they make; you want to look at the architecture of those drivers. Just a few tips there on what I would look for in terms of a driver.

Rob Steward
