Hi there!

Do you want to start learning Hadoop in a multi-node environment, but you don’t have the hardware required to deploy it? If you have an AWS account, you can use it to spin up your own Ambari cluster. This series of articles will walk you through a step-by-step deployment of Ambari in AWS, based on a video in Udacity’s course.

Disclaimer before we start: running EC2 infrastructure can incur considerable charges. Don’t forget to stop your VMs and clean up when you are done. Deploy at your own risk!

With that said, let’s start by spinning up EC2 instances.

  1. In the AWS console, select the EC2 service. Click the “Launch Instance” button to create a VM. Select an Ubuntu AMI. Since we are going to deploy Ambari 2.2.2, use the Ubuntu 14.04 image.
  2. This instance will be our NameNode. So that it can handle a relatively large cluster, choose the m3.large instance type. Click Next.

  3. Set “Number of instances” to 1. Leave the other options at their defaults or change them to suit your configuration. Click Next.
  4. Choose 30 GB of storage. Click Next.

  5. Add a “Name” tag for this instance with the value “Ambari Server”, so that we can quickly recognize the instance in the console. Click Next.

  6. In the security group settings, add a new rule as follows. Here we use 0.0.0.0/0 as the Source, which opens the connection to any IP address. In a production environment you would typically restrict access to the specific IP addresses or address ranges of your organization.

  7. Click the “Review and Launch” button. Verify that everything is correct, then click the “Launch” button. When prompted, you can either create a new key pair or use an existing one.
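
If you would rather script these steps than click through the console, the same choices can be written down as the arguments of a single API call. The following is only a rough sketch, assuming you use boto3 (the AWS SDK for Python) and that the security group and key pair already exist. The helper names `build_launch_request` and `launch`, the region, the placeholder IDs, and the root device name `/dev/sda1` are my assumptions, not something taken from the course video.

```python
def build_launch_request(ami_id, key_name, security_group_id):
    """Collect the console choices from steps 1-7 as run_instances arguments.

    ami_id, key_name and security_group_id are placeholders you must supply:
    the Ubuntu 14.04 AMI ID for your region, your key pair name, and the ID
    of a security group you have already created.
    """
    return {
        "ImageId": ami_id,              # step 1: Ubuntu 14.04 AMI
        "InstanceType": "m3.large",     # step 2: instance type
        "MinCount": 1,                  # step 3: exactly one instance
        "MaxCount": 1,
        "BlockDeviceMappings": [
            {
                # Assumption: the AMI's root device is /dev/sda1.
                "DeviceName": "/dev/sda1",
                "Ebs": {"VolumeSize": 30},  # step 4: 30 GB of storage
            }
        ],
        "TagSpecifications": [
            {
                "ResourceType": "instance",
                # step 5: the "Name" tag shown in the console
                "Tags": [{"Key": "Name", "Value": "Ambari Server"}],
            }
        ],
        "SecurityGroupIds": [security_group_id],  # step 6
        "KeyName": key_name,                      # step 7
    }


def launch(request, region="us-east-1"):
    """Send the request to EC2 and return the new instance ID.

    Needs boto3 installed and AWS credentials configured. The default
    region is only an example.
    """
    import boto3  # imported here so the request can be built without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**request)
    return response["Instances"][0]["InstanceId"]


if __name__ == "__main__":
    # Placeholder values; replace them before calling launch().
    request = build_launch_request("ami-EXAMPLE", "my-key", "sg-EXAMPLE")
    print(request)
```

Running the file only prints the request, so nothing is launched (or billed) until you call `launch(request)` yourself with real IDs.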