About the Different Database Driver Architectures

August 18, 2009 Data & AI

In this podcast Rob Steward explains the differences between database drivers. The podcast runs for 6:37.

Click on the following link to listen to the podcast: http://blogs.datadirect.com/media/RobSteward_DatabaseDriverArchitecture.mp3

Rob, you talk a lot about the different database driver architectures within The Data Access Handbook. Will you provide a little bit of insight about what differences there are between these different types of architectures, for instance bridge, client based, database wire protocol and independent protocol architectures?

Rob Steward:

Sure Mike. There are certainly a number of different ways to build a database driver or provider. Some of the APIs that we talk about in The Data Access Handbook differentiate between the architectures of the driver. For example, in the JDBC specification, it actually describes the differences between what it calls a type 1, a type 2, a type 3 and a type 4 driver. I’m going to talk about this from that type 1 through type 4 metaphor; however, I will apply it to ODBC and to ADO.NET, as well as JDBC. So a type 1 JDBC driver is really just a bridge driver between JDBC and ODBC. So there really was only one type 1 JDBC driver ever built, and that’s what it was. It was shipped with Sun's Virtual Machine (VM), and it was the JDBC to ODBC bridge. Really all it was was this thin Java layer that translated from the JDBC calls to ODBC calls. The vast majority of the work was done in native code outside of the Java Virtual Machine (VM), and that’s kind of a good example of a bridge architecture. When I say bridge I mean something that translates from one API to another. Typically when we talk about data access, that bridge is from one API – one standards based API to another standards based API, or some native database API.

Another example of a bridge provider is, Microsoft produced an OLE DB to ODBC bridge provider for OLE DB. And most of us know that as MSDASQL, or COGERO was the codename for that. A lot of people in the industry call it COGERO. That’s another bridge provider. There are also bridge providers within the .NET framework for translating from ADO.NET to ODBC, and translating from ADO.NET to OLE DB. There’s a lot of other bridges out there that I could probably talk about, but I think you get the idea from the examples.

The thing about bridge providers is that they’re typically inefficient, because the real work doesn’t go on in the bridge; the real work goes on in the underlying provider, whatever it is. But you’re just sort of adding an additional layer of overhead. In the Java world – in the JDBC world – there was one type 1 driver, and that was the JDBC to ODBC bridge; however, it really only existed because not a lot of JDBC drivers were in existence when the specification was first released. So that allowed people to write JDBC code and use their existing ODBC drivers underneath it in order to get started with JDBC. Now no one in the industry is going to tell you to use that bridge provider in a case where you have a type 2 or a type 4 JDBC driver in existence, because the bridge tends to do things in an inefficient way. It tends to just add overhead that you don’t need.

A type 2 driver would be, some portion of the driver exists in Java – or is pure Java coded – and it sits on top of some native component that does the communication for the database. In a case of type 2, you’ve got partially managed code in .NET terms, or partially Java code in Java terms, and partially native whatever operating system you’re running on. And there are a lot of type 2 drivers on the market today. In that case, take the example of IBM's type 2 driver for DB2 or Oracle's type 2 driver for Oracle, those have a significant component that is Java. There is actually some work that happens in Java. But they sit on top of, in the case of Oracle, the Oracle Networking Software. It takes your JDBC call, it does some work, calls outside of the VM for those unmanaged components.

Same thing exists in the .NET world where you may have some portion of the data provider – the ADO.NET data provider – that runs within the Common Language Runtime (CLR), and then some other native component that runs on Windows that it calls out to. So that’s kind of a type 2 driver.

A type 3 solution – which there are a few but not a whole lot in the world – as defined by the JDBC specification means it’s a pure Java driver, but it talks to some server component which then in turn talks to the database. So you have a client included in your application, and it is pure Java. There are no native components that you have to distribute that are OS dependent, but there are some server components that you have to set up. So that driver – pure Java – talks to its server component, which then in turn talks to your SQL Server, or your Oracle, or your DB2 server or MySQL, or whatever it may be. So that’s a type 3.

A type 4 driver – in JDBC terms – is pure Java. It opens up network sockets, and talks directly to the database. It never calls native OS code. It stays purely within the VM. So that’s kind of a type 4. Same thing exists for ADO.NET where you have a 100% managed ADO.NET provider. It will stay completely within the CLR.

Rob Steward