Azure Databricks is a Spark-based analytics platform that lets you read data from multiple sources such as Azure Blob Storage, Azure Data Lake, Azure SQL Database, etc., and turn it into breakthrough insights using Spark. Beyond the data sources you can connect to natively from Azure Databricks, there are several external data sources, such as Salesforce, Eloqua, IBM Db2 and Oracle, that you may want to connect to so you can get insights from all of your data across different silos.
You can connect to these data sources using Progress DataDirect JDBC connectors. In this tutorial we will walk you through how to connect to them from Azure Databricks. To get started, we will show how to connect to Salesforce, but you can use the same steps to connect to Eloqua, IBM Db2, Oracle and other DataDirect JDBC data sources.
After installing the Progress DataDirect Salesforce JDBC driver, you will find the driver JAR at the following location (default Windows install path):
C:\Program Files\Progress\DataDirect\JDBC_60\lib\sforce.jar
Upload this sforce.jar to your Databricks cluster as a library (for example, from the cluster's Libraries tab) so the driver is available to your notebooks.
In a notebook cell, confirm that the driver is on the classpath by loading its class; if this throws a ClassNotFoundException, the library is not attached to the cluster:
Class.forName("com.ddtek.jdbc.sforce.SForceDriver")
val jdbcHostname = "login.salesforce.com"
val jdbcSecurityToken = "Your Security Token"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:datadirect:sforce://${jdbcHostname};SecurityToken=${jdbcSecurityToken}"

// Create a Properties() object to hold the parameters.
import java.util.Properties

val connectionProperties = new Properties()
connectionProperties.put("user", "Your username")
connectionProperties.put("password", "Your password")

val driverClass = "com.ddtek.jdbc.sforce.SForceDriver"
connectionProperties.setProperty("Driver", driverClass)
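Before handing the URL to Spark, you can optionally sanity check the connection with plain JDBC. The snippet below is just a quick sketch using the same URL and credentials defined above; it assumes you have filled in your real username, password and security token.

// Optional sanity check: open a plain JDBC connection to confirm the
// credentials and security token are accepted before running any Spark reads.
import java.sql.DriverManager

val testConnection = DriverManager.getConnection(jdbcUrl,
  connectionProperties.getProperty("user"),
  connectionProperties.getProperty("password"))
println(testConnection.getMetaData.getDatabaseProductName)
testConnection.close()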
// Read the Salesforce Opportunity table through the DataDirect driver.
val opportunity_table = spark.read.jdbc(jdbcUrl, "Opportunity", connectionProperties)
opportunity_table.printSchema

// Average opportunity amount by fiscal quarter.
display(opportunity_table.select("AMOUNT", "FISCALQUARTER").groupBy("FISCALQUARTER").avg("AMOUNT"))
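Once the data is in a DataFrame, you can also keep a copy inside Databricks or query it with Spark SQL. The sketch below shows one possible next step; the table and view names ("salesforce_opportunity", "opportunity") are just examples, not part of the driver.

// Persist the Salesforce data as a Databricks table so later queries do not
// have to go back to the source (example table name: salesforce_opportunity).
opportunity_table.write.mode("overwrite").saveAsTable("salesforce_opportunity")

// Or query it directly through a temporary view with Spark SQL.
opportunity_table.createOrReplaceTempView("opportunity")
display(spark.sql("SELECT FISCALQUARTER, AVG(AMOUNT) AS avg_amount FROM opportunity GROUP BY FISCALQUARTER"))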
Feel free to try the Salesforce JDBC driver and other Progress DataDirect JDBC drivers and let us know if you have any questions.