Step 1 Log in to the Azure portal using the link given below. You will land on the home screen of the Azure portal.
Link: https://portal.azure.com
Step 2 Click New --> Databases --> Data Factory.
A new blade opens for configuring your new Data Factory.
Fill in the Data Factory name, Subscription, Resource Group and Location, and select "Pin to dashboard" if you wish. Click Create once the details are filled in.
Azure now deploys your Data Factory.
Once the deployment finishes, the blade of the deployed Data Factory is displayed.
Need for Linked Services
Before we can work with a pipeline, the data factory needs a few entities. We first create Linked Services to link our data stores to the data factory and define the input and output data that those stores represent. After that, we create the pipeline.
In this walkthrough, we link an Azure Storage account and an Azure HDInsight cluster to our Azure Data Factory. The storage account holds the input and output data for the pipeline.
Step 3 Open the Azure Data Factory that was just created and click "Author and deploy".
Now, click "New Data Store" and choose "Azure Storage".
This generates a JSON script for creating the Azure Storage Linked Service.
Here is the editor with JSON script.
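For reference, the script in the editor should look roughly like the following. This is a sketch based on the Azure Storage linked service format of Data Factory (version 1), not a copy of the original screenshot. The name AzureStorageLinkedService matches the entry that appears later in this walkthrough, and <accountname> and <accountkey> are placeholders to be replaced with your own values.

```json
{
  "name": "AzureStorageLinkedService",
  "properties": {
    "type": "AzureStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
    }
  }
}
```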
Note You need an existing storage account to configure this connection string.
Step 4 In the connection string given below, replace <accountname> and <accountkey> with the name and access key of your storage account.
"connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
Click "Deploy" once the connection string for the storage account is configured.
Once the Linked Service is deployed, the Draft-1 editor disappears from the pane, and AzureStorageLinkedService appears on the left side of the Data Factory pane.
Step 5 Next, we create an Azure HDInsight Linked Service to link an HDInsight cluster to the Data Factory.
Go to the Data Factory Editor and click "More" at the far right of the top pane, next to "New Data Store".
Click "New compute" here.
Select "OnDemand HDInsight Cluster".
Step 6 Copy the code snippet given below and paste it into the Drafts/Draft-1 editor.
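The snippet itself is not reproduced in this text. A minimal sketch consistent with the properties named in this step, assuming the on-demand HDInsight linked service format of Data Factory (version 1), might look like the following. The values given for version, clusterSize and timeToLive are illustrative assumptions, not the original author's settings; linkedServiceName refers to the storage Linked Service deployed in Step 4.

```json
{
  "name": "HDInsightOnDemandLinkedService",
  "properties": {
    "type": "HDInsightOnDemand",
    "typeProperties": {
      "version": "3.5",
      "clusterSize": 1,
      "timeToLive": "00:30:00",
      "linkedServiceName": "AzureStorageLinkedService"
    }
  }
}
```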
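The code given above defines JSON properties including Version (the HDInsight version), ClusterSize (the number of nodes in the cluster), TimeToLive (how long the idle cluster is kept before it is deleted) and LinkedServiceName (the storage Linked Service used by the cluster). Once the code is pasted into the editor, click Deploy.
Under Linked Services, you can now find two entries: AzureStorageLinkedService and HDInsightOnDemandLinkedService.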