Creating Linked Services In Azure Data Factory

Introduction

This article shows how to create an Azure Data Factory and add linked services to it.

Why do we need a Data Factory?


Data Factory lets us build pipelines that copy data from one data store to another.

Requirements

  1. Microsoft Azure account

Follow the steps given below

Step 1

Log in to the Azure portal using the link given below. You will land on the home screen of the Azure portal.

Link- www.portal.azure.com

Step 2

Click New --> Databases --> Data Factory.

A new blade opens for configuring your new Data Factory.

Enter the Data Factory name, Subscription, Resource Group, and Location, and choose to pin it to the dashboard if you wish. Click Create once the details are filled in.

Azure will now deploy your Data Factory.

Once the deployment completes, you can open the new Data Factory from the portal.

Need for Linked Services

Before we can work with a pipeline, the data factory needs a few entities. First, we create linked services to link data stores and compute resources to the data factory. Next, we define input and output datasets to represent the data in those stores. Finally, we create the pipeline.

In this article, we link an Azure storage account and an on-demand Azure HDInsight cluster to our Azure Data Factory. The storage account holds the input and output data for the pipeline.

Step 3

Open the Azure Data Factory you just created and click "Author and deploy".

Now, click "New Data Store" and choose "Azure Storage".

The editor opens with a JSON script for creating an Azure Storage Linked Service.

Note

You need an existing storage account to fill in this connection string.

Step 4

In the connection string given below, replace <accountname> and <accountkey> with your storage account name and access key.

"connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"


Click "Deploy" once the connection string for the storage account is configured.
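
For reference, the connection string sits inside a larger linked service definition. The Python sketch below assembles one such definition and prints it as JSON. The function name `storage_linked_service` is hypothetical, and the surrounding structure (`name`, `properties`, `type`, `typeProperties`) is an assumption modeled on the HDInsight definition in Step 6; the template shown in the editor may contain additional fields.

```python
import json


def storage_linked_service(account_name: str, account_key: str,
                           name: str = "AzureStorageLinkedService") -> dict:
    """Build an Azure Storage linked service definition as a plain dict.

    The layout is an assumption for illustration, not a copy of the
    template that the Data Factory editor generates.
    """
    connection_string = (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key}"
    )
    return {
        "name": name,
        "properties": {
            "type": "AzureStorage",
            "typeProperties": {"connectionString": connection_string},
        },
    }


if __name__ == "__main__":
    # Placeholder values only; substitute your own storage account details.
    definition = storage_linked_service("<accountname>", "<accountkey>")
    print(json.dumps(definition, indent=4))
```

Generating the JSON this way keeps the account key out of hand-edited text, but pasting the values directly into the editor works just as well.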

Once the Linked Service is deployed, the Draft-1 editor disappears and AzureStorageLinkedService appears on the left side of the Data Factory pane.

Step 5

Next, we will add an Azure HDInsight Linked Service to the Data Factory.

Go to the Data Factory Editor and click "More" at the top right, next to "New Data Store".

Click "New compute" here.

Select "On-demand HDInsight cluster".

Step 6

Copy the code snippet given below and paste it into the Draft-1 editor.

    {
        "name": "HDInsightOnDemandLinkedService",
        "properties": {
            "type": "HDInsightOnDemand",
            "typeProperties": {
                "version": "3.2",
                "clusterSize": 1,
                "timeToLive": "3",
                "linkedServiceName": "AzureStorageLinkedService"
            }
        }
    }

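The JSON above sets four properties: version (the HDInsight version), clusterSize (the number of nodes in the cluster), timeToLive (how long the cluster may sit idle before it is deleted), and linkedServiceName (the storage linked service that the cluster uses). Once the code is pasted into the editor, click Deploy.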
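
Before deploying, it can help to confirm that the pasted text is valid JSON and that it refers to a linked service that already exists. The sketch below is one way to do that locally in Python. The helper `check_on_demand_definition` is hypothetical, and treating the four properties named above as required is an assumption made for this example, not a rule taken from the Data Factory service.

```python
import json

# The definition from Step 6, kept as text so it can be parsed and checked.
HDINSIGHT_JSON = """
{
    "name": "HDInsightOnDemandLinkedService",
    "properties": {
        "type": "HDInsightOnDemand",
        "typeProperties": {
            "version": "3.2",
            "clusterSize": 1,
            "timeToLive": "3",
            "linkedServiceName": "AzureStorageLinkedService"
        }
    }
}
"""

# Assumption for this sketch: all four properties must be present.
REQUIRED = ("version", "clusterSize", "timeToLive", "linkedServiceName")


def check_on_demand_definition(text: str, deployed: set) -> list:
    """Return a list of problems found in the definition (empty if none)."""
    try:
        definition = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(definition, dict):
        return ["top-level value must be a JSON object"]

    properties = definition.get("properties")
    if not isinstance(properties, dict):
        properties = {}
    type_props = properties.get("typeProperties")
    if not isinstance(type_props, dict):
        type_props = {}

    problems = [f"missing property: {key}" for key in REQUIRED
                if key not in type_props]

    # The cluster definition points at the storage linked service by name,
    # so that service has to be deployed first (Step 4).
    storage = type_props.get("linkedServiceName")
    if storage is not None and storage not in deployed:
        problems.append(f"linked service not deployed: {storage}")
    return problems


if __name__ == "__main__":
    print(check_on_demand_definition(HDINSIGHT_JSON,
                                     {"AzureStorageLinkedService"}))
```

A missing colon between a key and its value is reported as invalid JSON, which is the most common mistake when a snippet is copied from a web page.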

Two entries now appear under Linked Services: AzureStorageLinkedService and HDInsightOnDemandLinkedService.

Follow my next article to work on pipelines in Azure Data Factory.
