Export SAS Data Sets into operational database using Bulk Load

June 03, 2011 Data & AI, DataDirect

One of the use cases for the ground breaking DataDirect bulk load functionality is to load data into a data warehouse at lightning fast speeds.  After speaking with prospects and existing customers, the use case works in the reverse direction as well, i.e. organizations want to move big data from SAS data set libraries into an operational database for use with existing corporate applications and websites.

To illustrate, the SAS/Access ODBC interface queries your data warehouse to solve complex business problems, achieve key objectives and more effectively manage your information assets.  Users can then move the data back into the data warehouse using the SAS BULKLOAD=YES.  However, how can the solution data sets from SAS servers running on Unix/Linux be consumed by applications running against an operational SQL Server database?

Answer: Progress DataDirect SQL Server bulk load.

There are 3 easy ways to export big data from SAS using DataDirect Connect SQL Server ODBC drivers 6.1 out of the box without additional software to expand your investment in SAS intelligence.   And don't forget about our SQL Server Wire Protocol driver on Windows since it can load data with ease faster than native utilities such as BCP currently leveraged by SAS BULKLOAD.

1. Using SAS with codeless configuration:

a. On Unix/Linux, set EnableBulkLoad=1 connection property in the data source for supported drivers in the odbc.ini configuration.

b. On Windows, check "Enable Bulk Load" box in the "bulk" tab of the ODBC driver setup for supported drivers.

c. Insert data using SAS/Access with the generic ODBC interface.

The driver transparently converts ODBC parameter array inserts into the bulk API of the target data source without any application changes.  Keep reading if performance is not what you expect, or if you have additional requirements.

2. Windows using out-of-box bulk streaming sample:

Use bulk streaming  sample with the SAS ODBC driver as the source data source to leverage bulk load functionality with codeless configuration.

a. Setup SAS ODBC driver data source as "SAS_source", and target data source with a Connect for ODBC driver version 6.1 that supports bulk load.  For this example, we'll use the SQL Server Wire Protocol driver with data source as "SQLServer_destination"

b. Check EnableBulkLoad in Bulk tab of driver setup.

c. Run %INSTALL_DIR%samplesbulkstrmbulkstrm.exe and configure source connection string, SQL to generate result set, and destination connection string.

d. Click "Bulk Load Table".  The driver transparently transforms the source result set into the required format for the target bulk API of the target data source.

3. Unix/Linux using out-of-box bulk sample:

Export SAS data sets to CSV file, and use the bulk sample to load the file.  The bulkstrm sample exists on Unix/Linux, however the source data source using the SAS ODBC driver is only available for 32-bit Windows in version 9.23 and previous.

a. Export SAS data sets to CSV.  Here are 5 ways to do the export.

b. Create a bulk load configuration file for the CSV file.  See the chapter on the bulk load configuration file in the User's Guide for reference.  Below is an example:

"http://www.datadirect.com/ns/bulk/BulkData.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">            "false">INTEGERCOL           "true">VARBINCOL           externalfilecodepage="Windows-1252" nullable="true">VCHARCOL           externalfilecodepage="Windows-1252" nullable="true">UNIVCHARCOL

c. Launch the interactive bulk sample in $INSTALL_DIR/samples/bulk and pass a connection string.

d. Choose "Load"

e. Specify the file names and accept default values.

And let's say you run into an error after loading billions of records after an hour.  With our bulk API, you can specify how many errors/warnings you can tolerate; and additionally replay the bulk insert at the point of failure!

Pretty cool, huh?  With DataDirect Bulk Load's unique standards based implementation, the above steps will work seamlessly against DB2, Sybase and Oracle as well. Download a free 15 day trial of the latest Connect for ODBC drivers today for Windows/Unix/Linux and show your colleagues today!!

Sumit Sarkar

Technology researcher, thought leader and speaker working to enable enterprises to rapidly adopt new technologies that are adaptive, connected and cognitive. Sumit has been working in the data access infrastructure field for over 10 years servicing web/mobile developers, data engineers and data scientists. His primary areas of focus include cross platform app development, serverless architectures, and hybrid enterprise data management that supports open standards such as ODBC, JDBC, ADO.NET, GraphQL, OData/REST. He has presented dozens of technology sessions at conferences such as Dreamforce, Oracle OpenWorld, Strata Hadoop World, API World, Microstrategy World, MongoDB World, etc.

Read next Progress DataDirect Now Connects to Denodo