Hadoop Sqoop JDBC to SQL Server via AD Windows Authentication from Linux

Default Blog Top Image
by Sumit Sarkar Posted on March 31, 2014

For “big data” connectivity, I see a very high percentage of projects delivering business value from big data initiatives that require data connectivity from DataDirect.  This was specifically around providing Universal data connectivity to Hadoop Hive and related low latency interfaces (Impala, Hawq, Drill, Stinger, etc).

From recent active projects, we're connecting systems in the other direction which takes success to new levels; and that is when you can move data sets aggregated by the Hadoop big data ecosystem; and move them to relational databases managed by strict corporate security policies which means your data is above the radar now.  The most common is the requirement to connect to SQL Server from Linux using Windows Authentication (example from Hortonworks community).

How ODBC and JDBC SQL Server Windows Authentication from Linux works:

DataDirect is your exclusive partner in data connectivity providing unique support for Windows Authentication from Linux across both ODBC and JDBC connectivity.  Sqoop is a JDBC utility to move data between Hadoop Ecosystems and relational databases such as SQL Server.  On the ODBC side, we’re seeing Python/Ruby wrappers that need to connect to relational DBs.  Microsoft recommends using integrated security over mixed mode authentication for secure SQL Server platforms.

DataDirect invented “wire protocol” technology to connect to data without data source specific dependencies; and our implementation for windows authentication from Unix/Linux is no exception.  You simply need to configure the driver with your Windows credentials in plain text.  For Kerberos support, no credentials would be required.

Getting started with ODBC and JDBC SQL Server NTLM authentication from Linux:

Not even the sexiest data scientist and Hadoop platform are above corporate security, so here’s how you can connect from JDBC and ODBC to SQL Server to remain in compliance:

ODBC

Download the DataDirect Connect(64) for ODBC drivers for Unix/Linux and follow the quick start guides supplied with the download.  The SQL Server data source in the odbc.ini file should be configured as follows for NTLM authentication:

[SQLServer] Driver=/opt/Progress/DataDirect/Connect64_for_ODBC_71/lib/ddsqls27.so Description=DataDirect 7.1 SQL Server Wire Protocol AuthenticationMethod=9 Database=test Domain=AMERICAS EnableBulkLoad=0 #Connecting to SQL Server Named Instance HostName=172.30.57.75\SQL2008R2SP2 LoadBalancing=0 Pooling=0 # PortNumber not required when connecting to Named Instance #PortNumber=1433

During test connect with the DataDirect samples/example utility, specify the following credentials for windows ntlm authenticated user “AMERICAS\SUMITS” and note domain is not required as part of user name since it was configured in the data source:

sumits@nc-lnx16 /opt/Progress/DataDirect/Connect64_for_ODBC_71/samples/example> example example DataDirect Technologies, Inc. ODBC Example Application. Enter the data source name : topptt01_cs Enter the user name        : SUMITS Enter the password         : my_windows_auth_pwd Enter SQL statements (Press ENTER to QUIT) SQL>

Here is a related article on connecting with Windows Auth credentials from Unix/Linux: http://progresscustomersupport-survey.force.com/ConnectKB/articles/Article/000030786

JDBC

Download the DataDirect Connect for JDBC drivers for and follow the quick start guides supplied with the download.  The SQL Server JDBC connection URL should be configured as follows for NTLM authentication:

jdbc:datadirect:sqlserver://server1:1433;AuthenticationMethod=ntlmjava;Domain=Americas;User=SUMITS;Password=my_windows_auth_pwd

See the Connect (XE) for JDBC User’s Guide for complete reference.

Got a huge big data driven project on the horizon?

1. Download free 15 day trial of the SQL Server ODBC Driver and SQL Server JDBC Driver to securely connect to your SQL Server databases from Hadoop land; or vice versa.

2. Or call 1-800-876-3101 to speak with a Systems Engineer to learn more.


Sumit Sakar
Sumit Sarkar
Technology researcher, thought leader and speaker working to enable enterprises to rapidly adopt new technologies that are adaptive, connected and cognitive. Sumit has been working in the data access infrastructure field for over 10 years servicing web/mobile developers, data engineers and data scientists. His primary areas of focus include cross platform app development, serverless architectures, and hybrid enterprise data management that supports open standards such as ODBC, JDBC, ADO.NET, GraphQL, OData/REST. He has presented dozens of technology sessions at conferences such as Dreamforce, Oracle OpenWorld, Strata Hadoop World, API World, Microstrategy World, MongoDB World, etc.
More from the author

Related Tags

Related Articles

Progress DataDirect Achieves Google Cloud Ready—AlloyDB Designation
Progress DataDirect’s Drivers for Google AlloyDB offer a high-performing, secure and reliable connectivity solution for JDBC applications to access data in AlloyDB.
Top 5 Reasons to Use DataDirect with Salesforce
Customers pick Progress DataDirect for Salesforce connectivity because of its security, performance, high availability and more.
Prefooter Dots
Subscribe Icon

Latest Stories in Your Inbox

Subscribe to get all the news, info and tutorials you need to build better business apps and sites

Loading animation