Create Storage And Hadoop Cluster From Azure Management Portal

Introduction

This article shows how to create storage and a Hadoop cluster from the Azure Management Portal. See my previous articles for details on Hadoop, HDInsight, and Hive queries in Azure.

Note: In this demo, we will be working with the old portal of Azure.

What is Hadoop?

Hadoop is a framework for distributed storage and distributed processing of very large datasets on computer clusters built from commodity hardware.

What is HDInsight?

HDInsight is Microsoft's managed Hadoop service on Azure. It provides data storage based on HDFS (Hadoop Distributed File System) concepts and a simple MapReduce programming model to process and analyze the data.
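To make the MapReduce model concrete, here is a minimal single-machine sketch of a word count in plain Python. It is not HDInsight or Hadoop code; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this illustration, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data on Azure", "big clusters process big data"]
    print(word_count(sample))
```

Each phase only sees its own slice of the data, which is what lets a cluster spread the work over many machines.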

Why do we need storage?

Azure HDInsight uses Azure Storage to store its data. When we create an HDInsight cluster, we need to specify a storage account and a blob container within it, which the cluster uses as its default file system. The cluster accesses this container through an HDFS (Hadoop Distributed File System) compatible interface.
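As an illustration of how a cluster addresses files in the blob container, HDInsight typically uses URIs of the form wasb://container@account.blob.core.windows.net/path. The sketch below only builds such a string; the helper name wasb_uri and the sample account and container names are hypothetical and not taken from this demo.

```python
def wasb_uri(container, account, path="", secure=False):
    """Build a wasb:// (or wasbs:// when secure=True) URI for a blob path.

    container and account are the blob container and storage account names;
    path is the file or folder inside the container.
    """
    scheme = "wasbs" if secure else "wasb"
    clean_path = path.lstrip("/")  # avoid a double slash after the host
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{clean_path}"


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(wasb_uri("mycontainer", "mystorageaccount", "/example/data/input.txt"))
```

A path built this way can be passed to Hadoop tools running on the cluster wherever an HDFS path is expected.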

Follow the steps below to create a cluster and its storage.

Step 1 - Log in to the new Azure portal at www.portal.azure.com and click New >> Data + Analytics >> HDInsight.



Step 2 - Enter the Cluster Name, then provide the Subscription, Cluster Type, Credentials, Data Source, Pricing, and Resource Group details.



In my case, I will be entering the following details.

  • Cluster Name: HDCCUG
  • Cluster Type: under Configure required settings, choose Cluster Type (Hadoop), Operating System (Windows), and Cluster Tier (Standard). After choosing all these options, click Select.


Step 3 - Provide the credentials for your HDInsight cluster and click “Select”.



Step 4 - For the data source, create a new storage account, then enter a container name and location. I will be deploying this storage account in Southeast Asia. Click Select once you are done.
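The portal rejects storage account and container names that break Azure's naming rules, so it can help to check them beforehand. The sketch below encodes the rules as I understand them (storage account: 3 to 24 lowercase letters and digits; container: 3 to 63 lowercase letters, digits, and single hyphens, starting and ending with a letter or digit). The function names and sample names are hypothetical, and the rules should be verified against the current Azure Storage documentation.

```python
import re

# Assumed rule: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Assumed rule: starts and ends with a letter or digit, no consecutive hyphens.
_CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])*")


def is_valid_storage_account_name(name):
    """Return True if name fits the assumed storage account naming rule."""
    return _ACCOUNT_RE.fullmatch(name) is not None


def is_valid_container_name(name):
    """Return True if name fits the assumed blob container naming rule."""
    return 3 <= len(name) <= 63 and _CONTAINER_RE.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    for account in ("hdccugstorage", "HDCCUG"):
        print(account, is_valid_storage_account_name(account))
    for container in ("hdccug-data", "hdccug--data"):
        print(container, is_valid_container_name(container))
```

Under these rules an uppercase name such as the cluster name HDCCUG would not be accepted as a storage account name, so a lowercase variant is needed there.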



Step 5 - Under Pricing, set the number of nodes to “1” and click Select.



Step 6 - Enter a Resource group name and click on Create.



The deployment now starts, and the HDInsight cluster is created. This may take up to 15 minutes.
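If you script the wait instead of watching the portal, a polling loop with a 15-minute limit is one way to do it. The sketch below is generic Python and does not call any Azure API: get_state is a hypothetical callable that you would supply to return the current deployment state, and the state names "Succeeded", "Failed", and "InProgress" are placeholders assumed for this example.

```python
import time


def wait_for_cluster(get_state, timeout_s=15 * 60, interval_s=30,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it reports a final state or the timeout passes.

    get_state is a caller-supplied function (hypothetical here) returning a
    state string. sleep and clock are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        state = get_state()
        if state in ("Succeeded", "Failed"):  # assumed final states
            return state
        if clock() >= deadline:
            raise TimeoutError(f"cluster still '{state}' after {timeout_s} seconds")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated states stand in for a real status check.
    fake_states = iter(["InProgress", "InProgress", "Succeeded"])
    print(wait_for_cluster(lambda: next(fake_states), interval_s=0))
```

The default timeout mirrors the 15 minutes mentioned above; a real deployment script would replace the simulated states with an actual status query.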



And finally, the cluster has been created.
