Orbitz Harnessing Hadoop: Mac vs PC

July 04, 2012 Data & AI

Did you know that Mac users are 80% more likely to be vegetarian than PC users? Or that PC users are more likely to prefer reading USA Today? At the design previews for our upcoming Hadoop driver, we explored these and other interesting factoids on consumer behavior that Orbitz is using to customize users' experiences based on the computers they browse with.

Just last week, a Wall Street Journal article caused a huge stir across social media channels by reporting the same statistic we showed the week of our design previews: Mac users are more likely than PC users to spend more money on a hotel room. This is a great example of how companies are using Hadoop to create and store large databases of unstructured data, then analyze that data through a data refinery to make good business decisions. In this case, Orbitz stands to win if it can sell the right hotel to the right customer for a positive travel experience. By presenting pricier hotels to Mac users, who are stereotypically thought to have higher incomes, it hopes to boost its business using the huge amounts of data it has been collecting and analyzing for some time.

Just how much data did they use? According to tnooz, Orbitz collected 750 terabytes of unstructured data on consumer trip-planning behavior, and the cluster of 100 servers housing the Hadoop database is known in the office as the EFX database. EFX stands for “every friggin' X.”

As more insights are gleaned from mining Big Data, more companies will seek to work with Hadoop through their existing BI applications, since those BI stacks have been built and well tuned over the last decade. Not only do they need to store "every friggin' X", but they also need to analyze it all, which is what their BI suite was designed for in the first place.

Companies face many problems in adopting Hadoop this way. One issue we have seen lately: how are enterprises going to get data from their Hadoop refinery into their BI suite when most of the open source drivers are written without full ODBC spec support? Without full support for the ODBC core functions, many companies are having a hard time getting their BI suites to work with Hadoop and are having to set up special projects just to analyze Hadoop data. These companies need a fast, fully ODBC-compliant driver that was designed from the ground up for use in enterprise-critical BI suites. They need a DataDirect driver.

Jesse Davis

As Senior Director of Research & Development, Jesse is responsible for the daily operations, product development initiatives and forward-looking research for Progress DataDirect. Jesse has spent nearly 20 years creating enterprise data products and has served as an expert on several industry standards including JDBC, J2EE, DRDA and OData. Jesse holds a Bachelor of Science degree in Computer Engineering from North Carolina State University.