Terraform Multi-AZ Deployments on AWS

Introduction

In AWS, deploying resources across multiple Availability Zones (AZs) is the standard way to achieve high availability and fault tolerance: if a single AZ fails, the copies of your application running in the other AZs can keep serving traffic. This guide walks through using Terraform to provision resources across multiple AZs in your AWS environment.

Step-by-Step Guide

Get Available AZs: Use the aws_availability_zones data source to fetch the list of Availability Zones your account can access in the configured AWS region.
```
data "aws_availability_zones" "available" {}
```
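
With an empty block, the data source returns every zone the account can access, regardless of its state. A minimal sketch of narrowing the list, assuming you want only zones in the available state that need no explicit opt-in; it would replace the empty block above, and the output name az_names is purely illustrative:

```hcl
# Only zones that are in the "available" state.
data "aws_availability_zones" "available" {
  state = "available"

  # Skip zones (such as Local Zones) that require an explicit opt-in.
  filter {
    name   = "opt-in-status"
    values = ["opt-in-not-required"]
  }
}

# Illustrative output for inspecting the result with `terraform output`.
output "az_names" {
  value = data.aws_availability_zones.available.names
}
```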

Create Subnets per AZ: Define a resource for your subnets and use count and element to iterate through the available AZs.
```
resource "aws_subnet" "public" {
  count             = length(data.aws_availability_zones.available.names)
  availability_zone = element(data.aws_availability_zones.available.names, count.index)
  # ... other subnet configurations (vpc_id and cidr_block are required)
}
```
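
Each subnet also needs a VPC and a non-overlapping CIDR block. One way to derive the blocks is Terraform's built-in cidrsubnet function. The fragment below is a sketch that assumes a 10.0.0.0/16 VPC named example (the same naming as the full example later in this guide) and the data source from the previous step:

```hcl
resource "aws_vpc" "example" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "public" {
  count             = length(data.aws_availability_zones.available.names)
  vpc_id            = aws_vpc.example.id
  availability_zone = element(data.aws_availability_zones.available.names, count.index)

  # Adding 8 bits to the /16 prefix yields /24 blocks:
  # index 0 -> 10.0.0.0/24, index 1 -> 10.0.1.0/24, index 2 -> 10.0.2.0/24
  cidr_block = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index)
}
```

Because the third argument is the subnet index, each AZ receives its own /24 without any hand-maintained list of CIDR strings.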

Reference Subnets: When creating resources like EC2 instances, reference a subnet by its index in the list. The example below pins the instance to the first subnet, and therefore to a single AZ.
```
resource "aws_instance" "example" {
  # ... other instance configurations
  subnet_id = aws_subnet.public[0].id
}
```

Distribute Instances: To distribute instances across AZs, use the count parameter for your EC2 instance resource and match it to the number of available AZs.
```
resource "aws_instance" "example" {
  count = length(data.aws_availability_zones.available.names)
  # ... other instance configurations
  subnet_id = aws_subnet.public[count.index].id
}
```

This setup ensures that your infrastructure spans multiple Availability Zones, enhancing fault tolerance and high availability.
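
Regions differ in how many AZs they expose, so tying count directly to the full list can create more subnets and instances than intended. A hedged sketch of capping the spread follows; the variable az_count and the locals all_azs and azs are hypothetical names that do not appear elsewhere in this guide:

```hcl
variable "az_count" {
  description = "Maximum number of AZs to spread across (hypothetical variable)"
  type        = number
  default     = 3
}

locals {
  all_azs = data.aws_availability_zones.available.names

  # Take the first az_count zones, or all of them if the region has fewer.
  azs = slice(local.all_azs, 0, min(var.az_count, length(local.all_azs)))
}
```

Subnets and instances would then use count = length(local.azs) and local.azs[count.index] in place of the full list of names.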

Code Example

This Terraform code sets up a basic AWS infrastructure with high availability in mind. It provisions a VPC with one public subnet per Availability Zone, then deploys one EC2 instance into each subnet so that the instances are spread across AZs. The setup also includes an internet gateway, a public route table, and a security group that allows inbound SSH, so the instances are reachable from the internet.

```
# Declare the AWS provider requirement
# (region and credentials come from your environment or a provider block)
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

# Get Available Availability Zones
data "aws_availability_zones" "available" {}

# Create a VPC
resource "aws_vpc" "example" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "example-vpc"
  }
}

# Create Public Subnets (one per AZ)
resource "aws_subnet" "public" {
  count                   = length(data.aws_availability_zones.available.names)
  vpc_id                  = aws_vpc.example.id
  cidr_block              = cidrsubnet(aws_vpc.example.cidr_block, 8, count.index)
  availability_zone       = element(data.aws_availability_zones.available.names, count.index)
  map_public_ip_on_launch = true

  tags = {
    Name = "example-public-subnet-${count.index + 1}"
  }
}

# Create an Internet Gateway
resource "aws_internet_gateway" "example" {
  vpc_id = aws_vpc.example.id

  tags = {
    Name = "example-internet-gateway"
  }
}

# Create a Public Route Table
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.example.id

  # Send all non-local traffic to the internet gateway
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.example.id
  }

  tags = {
    Name = "example-public-route-table"
  }
}

# Associate Public Subnets with the Public Route Table
resource "aws_route_table_association" "public" {
  count          = length(data.aws_availability_zones.available.names)
  subnet_id      = element(aws_subnet.public[*].id, count.index)
  route_table_id = aws_route_table.public.id
}

# Create Security Group for SSH access
resource "aws_security_group" "allow_ssh" {
  name        = "allow_ssh"
  description = "Allow SSH inbound traffic"
  vpc_id      = aws_vpc.example.id

  # Open to any source address for demonstration; restrict to known CIDRs in practice
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Launch EC2 Instances (one per AZ)
resource "aws_instance" "example" {
  count                  = length(data.aws_availability_zones.available.names)
  ami                    = "ami-00537a5a13e492245" # Replace with an AMI ID valid in your region
  instance_type          = "t2.micro"
  subnet_id              = element(aws_subnet.public[*].id, count.index)
  key_name               = "your_key_pair_name" # Replace with your existing key pair name
  vpc_security_group_ids = [aws_security_group.allow_ssh.id]

  tags = {
    Name = "example-instance-${count.index + 1}"
  }
}
```

Explanation:

Get Available AZs: The data "aws_availability_zones" "available" {} block retrieves the list of available AZs in your region.
Create Subnets per AZ: The aws_subnet.public resource creates a public subnet in each available AZ.
- count = length(data.aws_availability_zones.available.names) ensures that the number of subnets created matches the number of available AZs.
- availability_zone = element(data.aws_availability_zones.available.names, count.index) assigns each subnet to a different AZ.
Create EC2 Instances per AZ: The aws_instance.example resource launches an EC2 instance in each available AZ.
- count = length(data.aws_availability_zones.available.names) ensures that the number of instances launched matches the number of available AZs.
- subnet_id = element(aws_subnet.public[*].id, count.index) places each instance in a different subnet, effectively distributing them across AZs.

Before running this code:

Replace placeholders:
- Update the ami value in the aws_instance resource with the ID of the AMI you want to use.
- Replace "your_key_pair_name" with the name of your existing key pair.
Ensure prerequisites:
- You have Terraform installed and configured.
- You have an AWS account and the necessary credentials configured.
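
AMI IDs are specific to a region, so a hard-coded ID may not exist where you deploy. As one alternative to pasting an ID, the aws_ami data source can look an image up at plan time. This is a sketch only: the data source label amazon_linux is hypothetical, and the name pattern assumes you want an Amazon Linux 2 x86_64 image, so adjust it for the image you actually use.

```hcl
# Look up the most recent matching AMI in the configured region.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"] # assumed name pattern
  }
}
```

The instance resource would then set ami = data.aws_ami.amazon_linux.id instead of a literal ID.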

This setup creates a basic VPC with public subnets spanning multiple AZs and launches EC2 instances distributed across those AZs, improving the fault tolerance and availability of your infrastructure.

Additional Notes

General:

Fault Tolerance: Distributing resources across AZs is fundamental to AWS architecture for high availability and fault tolerance. This setup ensures redundancy and minimizes the impact of outages in a single AZ.
Regional Services: Remember that some AWS services operate at a regional or global level rather than inside a single AZ (e.g., S3 is regional, Route 53 is global). These services provide high availability without explicit AZ distribution on your part.
Cost Optimization: Be mindful of potential data transfer costs between AZs, especially for high-volume traffic.

Code Specific:

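aws_availability_zones Filtering: The aws_availability_zones data source accepts a state argument ("available", "information", "impaired", or "unavailable") to select zones by their current state; without it, zones are returned regardless of state. To exclude zones that require an explicit opt-in, such as Local Zones, add a filter block on opt-in-status with the value "opt-in-not-required".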
Subnet CIDR Planning: Carefully plan your subnet CIDR blocks to accommodate future growth and avoid IP address exhaustion. Consider using a smaller CIDR block for your VPC if you don't anticipate a large number of subnets.
Security Groups: The provided code includes a basic security group allowing SSH access. Tailor your security group rules to match your specific application requirements.
Instance Size and Type: Choose an appropriate EC2 instance size and type based on your workload needs. Consider using Auto Scaling groups for dynamic scaling and cost optimization.
Data Persistence: For stateful applications, ensure data persistence by using services like EBS volumes with appropriate backups or managed database services like RDS.
Modularization: For larger infrastructures, consider breaking down your Terraform code into modules for better organization and reusability.
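
As a rough illustration of the Auto Scaling suggestion above, the sketch below replaces fixed per-AZ instances with an Auto Scaling group that spans several subnets. The variables ami_id and subnet_ids, the resource names, and the size values are all hypothetical placeholders; with the full example in this guide, subnet_ids could be supplied as aws_subnet.public[*].id.

```hcl
variable "ami_id" {
  type = string # hypothetical: AMI to launch
}

variable "subnet_ids" {
  type = list(string) # hypothetical: one subnet per AZ
}

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = var.ami_id
  instance_type = "t2.micro"
}

resource "aws_autoscaling_group" "example" {
  min_size            = 2
  max_size            = 6
  desired_capacity    = 3
  vpc_zone_identifier = var.subnet_ids # the group places instances across these subnets

  launch_template {
    id      = aws_launch_template.example.id
    version = "$Latest"
  }
}
```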

Beyond the Basics:

Load Balancing: Implement a load balancer (e.g., Elastic Load Balancer) to distribute traffic evenly across instances in different AZs, further enhancing availability and fault tolerance.
Database Replication: For databases, configure replication across AZs to ensure data redundancy and high availability.
Disaster Recovery: Consider implementing a multi-region disaster recovery strategy for critical applications to withstand regional outages.
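
The load balancing idea can be sketched on top of the full example by placing an Application Load Balancer in the public subnets and registering each instance with it. This fragment assumes the aws_vpc.example, aws_subnet.public, and aws_instance.example resources defined earlier, and that the instances serve HTTP on port 80. The example security group only allows SSH, so a real deployment would also need rules permitting HTTP and a dedicated security group for the load balancer.

```hcl
resource "aws_lb" "example" {
  name               = "example-alb"
  load_balancer_type = "application"
  subnets            = aws_subnet.public[*].id # one subnet per AZ
}

resource "aws_lb_target_group" "example" {
  name     = "example-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.example.id
}

# Register every instance (one per AZ) with the target group.
resource "aws_lb_target_group_attachment" "example" {
  count            = length(aws_instance.example)
  target_group_arn = aws_lb_target_group.example.arn
  target_id        = aws_instance.example[count.index].id
  port             = 80
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.example.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.example.arn
  }
}
```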

This setup provides a solid foundation for building highly available and fault-tolerant applications on AWS. Remember to adapt and expand upon these concepts based on your specific application requirements and best practices.

Summary

This guide demonstrates how to use Terraform to provision resources across multiple Availability Zones (AZs) in AWS, enhancing fault tolerance and high availability.

Key Steps:

Fetch Available AZs: The aws_availability_zones data source retrieves a list of available AZs in your region.
Create Subnets per AZ: The aws_subnet resource creates a subnet in each available AZ using count and element for iteration.
Reference Subnets: Resources like EC2 instances can reference specific subnets using their index within the aws_subnet resource.
Distribute Instances: By setting the count parameter of the aws_instance resource to match the number of AZs, instances are automatically distributed across them, ensuring redundancy and high availability.

Benefits:

Enhanced Fault Tolerance: Distributing resources across AZs minimizes the impact of outages within a single AZ.
Improved High Availability: Multiple instances across AZs ensure service continuity even during failures.
Automated Provisioning: Terraform simplifies the process of creating and managing resources across multiple AZs.

Conclusion

By combining the aws_availability_zones data source with count and element, Terraform enables you to effortlessly provision subnets and EC2 instances across multiple AZs. This approach significantly enhances the fault tolerance and high availability of your AWS infrastructure, ensuring that your applications can withstand disruptions at the AZ level. Remember to adapt security groups, instance types, and other configurations to align with your specific application requirements and best practices for optimal performance and security.
