Apache NiFi is a powerful tool for automating data flow between systems. Importing data from a REST API into Apache NiFi is extremely useful, but it can involve a significant amount of time writing, testing, and maintaining a custom plugin. In this tutorial, we will show how you can import data from any REST API into Apache NiFi without that custom-code overhead. What's the trick? Progress DataDirect's Autonomous REST Connector, the newest addition to our JDBC suite of connectors, which lets you sample, connect to, and query any REST API through the JDBC interface.
By following along with the steps below, you'll learn how to import data from any REST API using Autonomous REST Connector and ingest that data into Hive. For this tutorial we'll be using the Alpha Vantage API, which provides stock prices at multiple intervals.
Download and install Apache NiFi on your machine.
Setup DataDirect Autonomous REST Connector
/home/<user>/Progress/DataDirect/JDBC_60/lib/autorest.jar
cp /home/<user>/Progress/DataDirect/JDBC_60/lib/autorest.jar <path-to>/nifi-x.x.0/lib/autorest.jar
Setup DataDirect Hive Connector
/home/<user>/Progress/DataDirect/JDBC_60/lib/hive.jar
cp /home/<user>/Progress/DataDirect/JDBC_60/lib/hive.jar <path-to>/nifi-x.x.0/lib/hive.jar
To connect to the Alpha Vantage API using the Autonomous REST Connector, you can point it to the endpoint with the default values:
https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol=PRGS&interval=1min&apikey=<apikey>
jdbc:datadirect:autorest:sample=https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY&symbol=PRGS&interval=1min&apikey=<apikey>
{
  "query": {
    "#path": [],
    "Meta Data": {
      "1. Information": "VarChar(84)",
      "2. Symbol": "VarChar(64),#key",
      "3. Last Refreshed": "Timestamp(0)",
      "4. Interval": "VarChar(64)",
      "5. Output Size": "VarChar(64)",
      "6. Time Zone": "VarChar(64)"
    },
    "Time Series (1min){Timestamp(0)}": {
      "1. open": "Double",
      "2. high": "Double",
      "3. low": "Double",
      "4. close": "Double",
      "5. volume": "Integer"
    },
    "function": {
      "#type": "VarChar(64),#key",
      "#virtual": true,
      "#default": "TIME_SERIES_INTRADAY",
      "#eq": "function"
    },
    "symbol": {
      "#type": "VarChar(64),#key",
      "#virtual": true,
      "#default": "PRGS",
      "#eq": "symbol"
    },
    "interval": {
      "#type": "VarChar(64),#key",
      "#virtual": true,
      "#default": "1min",
      "#eq": "interval"
    },
    "apikey": {
      "#type": "VarChar(64),#key",
      "#virtual": true,
      "#default": "<your-api-key>",
      "#eq": "apikey"
    }
  }
}
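Save the REST model above to a file named alphavantage.rest; the NiFi controller service configured below points to it through the driver's config option. If you would like to sanity-check the connection before wiring it into NiFi, a small standalone JDBC program is enough. The following is only a minimal sketch, not part of the flow itself: it assumes autorest.jar is on your classpath, that the connector exposes the TIMESERIES1MIN table used later in this tutorial, and that you substitute your own file path and API key.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AlphaVantageRestCheck {
    public static void main(String[] args) throws Exception {
        // Driver class from the controller service settings below.
        Class.forName("com.ddtek.jdbc.autorest.AutoRESTDriver");
        // Path to the REST model saved above; adjust for your machine.
        String url = "jdbc:datadirect:autorest:config=/home/<path-to>/alphavantage.rest";

        try (Connection conn = DriverManager.getConnection(url)) {
            // List the relational tables the connector derives from the REST response.
            try (ResultSet tables = conn.getMetaData()
                    .getTables(null, null, "%", new String[] { "TABLE" })) {
                while (tables.next()) {
                    System.out.println("Table: " + tables.getString("TABLE_NAME"));
                }
            }

            // Pull a handful of rows from the intraday time series table.
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT * FROM TIMESERIES1MIN")) {
                int columns = rs.getMetaData().getColumnCount();
                for (int row = 0; rs.next() && row < 5; row++) {
                    StringBuilder line = new StringBuilder();
                    for (int i = 1; i <= columns; i++) {
                        line.append(rs.getString(i)).append(" | ");
                    }
                    System.out.println(line);
                }
            }
        }
    }
}

If this prints rows, the same config URL will work unchanged in the controller service configured in the next step.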
Launch Apache NiFi by running the bin/nifi.sh run command in your terminal. Note that your working directory must be the NiFi root directory when you run this command.
Database Driver Class Name:
com.ddtek.jdbc.autorest.AutoRESTDriver
Database Connection URL:
jdbc:datadirect:autorest:config=/home/<path-to>/alphavantage.rest
Note: The Config option should point to the alphavantage.rest file we created in the previous step.
Database Driver Class Name:
com.ddtek.jdbc.hive.HiveDriver
Database Connection URL:
jdbc:datadirect:hive://<server_address>:<port>;TransactionMode=Ignore
Database User:
Fill in your Hive Username
Password:
Fill in your Hive Password
Database Connection Pooling Service:
Choose the controller service where you configured the Alpha Vantage REST connection in the previous steps. In this case, that is the AlphaVantageREST controller, where we configured the connection to the Alpha Vantage REST API using the Progress DataDirect Autonomous REST Connector.
Table Name:
Choose the table that you would like to ingest incrementally. I chose to sync the table TIMESERIES1MIN.
Below is a screenshot of the final settings for your reference.
JDBC Connection Pool:
Choose the controller service where you configured the Hive connection in the previous steps. In my case, I chose the HiveConnect controller, where I configured the connection to Hive using the Progress DataDirect JDBC driver.
Statement Type:
Insert
Table name:
The table where you want to insert the data in Hive.
Note: If you don’t have the table in Hive, create it using the below SQL statement:
CREATE TABLE TIMESERIES1MIN (
  QUERY_METADATA_2SYMBOL VarChar(64),
  QUERY_FUNCTION VarChar(64),
  QUERY_SYMBOL VarChar(64),
  QUERY_INTERVAL VarChar(64),
  QUERY_APIKEY VarChar(64),
  `KEY` Timestamp,
  OPEN_ Double,
  HIGH Double,
  LOW Double,
  CLOSE_ Double,
  VOLUME INT
)
To start the flow, right-click each processor and select Start. Once the processors are running, you can see the data flow from the Alpha Vantage REST API to Hive: Autonomous REST Connector reads the data from the API and provides it to the QueryDatabaseTable processor.
Once the PutSQL processor has finished, you should see the data in your Hive instance.
If you have scheduled QueryDatabaseTable to run at a regular interval, confirm that the incremental data is pulled from the REST API and ingested into Hive automatically.
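To double-check the ingestion outside NiFi, you can query the Hive table directly over JDBC. The snippet below is a minimal sketch that assumes the same DataDirect Hive driver and connection details used in the HiveConnect controller service; replace the server address, port, and credentials with your own.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveIngestCheck {
    public static void main(String[] args) throws Exception {
        // Same driver class and URL format as the HiveConnect controller service.
        Class.forName("com.ddtek.jdbc.hive.HiveDriver");
        String url = "jdbc:datadirect:hive://<server_address>:<port>;TransactionMode=Ignore";

        try (Connection conn = DriverManager.getConnection(url, "<hive_user>", "<hive_password>");
             Statement stmt = conn.createStatement()) {

            // Row count confirms that the PutSQL inserts reached Hive.
            try (ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM TIMESERIES1MIN")) {
                if (rs.next()) {
                    System.out.println("Rows in TIMESERIES1MIN: " + rs.getLong(1));
                }
            }

            // Spot-check a few of the ingested quotes.
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT QUERY_SYMBOL, OPEN_, CLOSE_, VOLUME FROM TIMESERIES1MIN LIMIT 5")) {
                while (rs.next()) {
                    System.out.printf("%s open=%s close=%s volume=%s%n",
                            rs.getString(1), rs.getString(2), rs.getString(3), rs.getString(4));
                }
            }
        }
    }
}

Running it again after the next scheduled QueryDatabaseTable execution should show the row count growing as new one-minute intervals are appended.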
This is just one example of how easy and painless it can be to read data from any REST API with Progress DataDirect Autonomous REST Connector. Feel free to try the connector with NiFi or any application you want. If you have any questions or issues, please contact us or comment below.