Azure Databricks is a Spark-based analytics platform that lets you read data from multiple sources such as Azure Blob Storage, Azure Data Lake, Azure SQL Database, etc., and turn it into breakthrough insights using Spark. Beyond the data sources you can connect to natively from Azure Databricks, there are several external data sources, such as Salesforce, Eloqua, IBM Db2 and Oracle, that you may want to connect to so you can get insights from all of your data across different silos.
You can connect to these data sources using Progress DataDirect JDBC connectors. In this tutorial we will walk you through how to connect to them from Azure Databricks. To get started, we will show how to connect to Salesforce, but you can use the same steps to connect to Eloqua, IBM Db2, Oracle and other DataDirect JDBC data sources.
After installing the Progress DataDirect Salesforce JDBC driver, you will find the driver JAR at the following location (default Windows install path):
C:\Program Files\Progress\DataDirect\JDBC_60\lib\sforce.jar
Upload this sforce.jar to your Databricks cluster as a library (for example, from the cluster's Libraries tab) so the driver is available to your notebooks.
In a notebook cell, confirm that the driver is on the classpath by loading its class; if this throws a ClassNotFoundException, the library is not attached to the cluster:
Class.forName("com.ddtek.jdbc.sforce.SForceDriver")
val jdbcHostname = "login.salesforce.com"
val jdbcSecurityToken = "Your Security Token"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:datadirect:sforce://${jdbcHostname};SecurityToken=${jdbcSecurityToken}"

// Create a Properties() object to hold the parameters.
import java.util.Properties

val connectionProperties = new Properties()
connectionProperties.put("user", "Your username")
connectionProperties.put("password", "Your password")

val driverClass = "com.ddtek.jdbc.sforce.SForceDriver"
connectionProperties.setProperty("Driver", driverClass)
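Before handing the URL to Spark, you can optionally sanity check the connection with plain JDBC. The snippet below is just a quick sketch using the same URL and credentials defined above; it assumes you have filled in your real username, password and security token.

// Optional sanity check: open a plain JDBC connection to confirm the
// credentials and security token are accepted before running any Spark reads.
import java.sql.DriverManager

val testConnection = DriverManager.getConnection(jdbcUrl,
  connectionProperties.getProperty("user"),
  connectionProperties.getProperty("password"))
println(testConnection.getMetaData.getDatabaseProductName)
testConnection.close()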
// Read the Salesforce Opportunity table through the DataDirect driver.
val opportunity_table = spark.read.jdbc(jdbcUrl, "Opportunity", connectionProperties)
opportunity_table.printSchema

// Average opportunity amount by fiscal quarter.
display(opportunity_table.select("AMOUNT", "FISCALQUARTER").groupBy("FISCALQUARTER").avg("AMOUNT"))
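Once the data is in a DataFrame, you can also keep a copy inside Databricks or query it with Spark SQL. The sketch below shows one possible next step; the table and view names ("salesforce_opportunity", "opportunity") are just examples, not part of the driver.

// Persist the Salesforce data as a Databricks table so later queries do not
// have to go back to the source (example table name: salesforce_opportunity).
opportunity_table.write.mode("overwrite").saveAsTable("salesforce_opportunity")

// Or query it directly through a temporary view with Spark SQL.
opportunity_table.createOrReplaceTempView("opportunity")
display(spark.sql("SELECT FISCALQUARTER, AVG(AMOUNT) AS avg_amount FROM opportunity GROUP BY FISCALQUARTER"))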
Feel free to try the Salesforce JDBC driver and other Progress DataDirect JDBC drivers and let us know if you have any questions.