Auto Scaling & Load Balancing

Auto Scaling Overview

EC2 Auto Scaling adds or removes EC2 instances automatically as demand changes, keeping the instance count between a configured minimum and maximum.

┌─────────────────────────────────────────────┐
│           Auto Scaling Group                │
│                                             │
│  Min: 2    Desired: 3    Max: 10            │
│                                             │
│  ┌──────┐  ┌──────┐  ┌──────┐              │
│  │ EC2  │  │ EC2  │  │ EC2  │  ← Current   │
│  └──────┘  └──────┘  └──────┘              │
│                                             │
│  Scale Out ──▶ when CPU > 70%               │
│  Scale In  ──▶ when CPU < 30%               │
└─────────────────────────────────────────────┘
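The Min and Max values in the diagram act as hard bounds on whatever capacity a scaling policy requests. A minimal sketch of that clamping, using a hypothetical helper named clamp_capacity (an illustration only, not part of the AWS CLI):

```shell
# clamp_capacity REQUESTED MIN MAX
# Prints REQUESTED limited to the inclusive range [MIN, MAX].
clamp_capacity() {
  if [ "$1" -lt "$2" ]; then
    echo "$2"
  elif [ "$1" -gt "$3" ]; then
    echo "$3"
  else
    echo "$1"
  fi
}

clamp_capacity 3 2 10    # within bounds, prints 3
clamp_capacity 14 2 10   # above Max, prints 10
clamp_capacity 1 2 10    # below Min, prints 2
```

With Min 2 and Max 10, a policy that asks for 14 instances still results in 10.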

Components

Component	Purpose
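Launch Template	Defines the instance configuration (AMI, instance type, security groups)
Auto Scaling Group	Maintains the instance count between the minimum and maximum
Scaling Policy	Defines when and by how much to scale
Health Check	Detects unhealthy instances so they can be replaced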

Launch Template

aws ec2 create-launch-template \
  --launch-template-name nextgen-web-template \
  --version-description "v1" \
  --launch-template-data '{
    "ImageId": "ami-0c55b159cbfafe1f0",
    "InstanceType": "t3.small",
    "KeyName": "nextgen-key",
    "SecurityGroupIds": ["sg-0123456789abcdef0"],
    "UserData": "'$(base64 -w0 userdata.sh)'",
    "TagSpecifications": [{
      "ResourceType": "instance",
      "Tags": [{"Key": "Name", "Value": "nextgen-web"}]
    }]
  }'
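The UserData field expects base64 text on a single line, which is why the command above passes -w0 (a GNU coreutils option that disables line wrapping; other base64 implementations may differ). The sketch below checks the round trip locally, using a made-up two-line script as a stand-in for userdata.sh:

```shell
# Hypothetical stand-in for the contents of userdata.sh
script='#!/bin/bash
echo "hello from user data"'

# Encode without line wrapping so the result fits inside a JSON string
encoded=$(printf '%s' "$script" | base64 -w0)

# Decode again to confirm nothing was lost
decoded=$(printf '%s' "$encoded" | base64 -d)

[ "$decoded" = "$script" ] && echo "round trip OK"
```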

Auto Scaling Group

aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name nextgen-web-asg \
  --launch-template LaunchTemplateName=nextgen-web-template,Version='$Latest' \
  --min-size 2 \
  --max-size 10 \
  --desired-capacity 3 \
  --availability-zones us-east-1a us-east-1b us-east-1c \
  --target-group-arns arn:aws:elasticloadbalancing:... \
  --health-check-type ELB \
  --health-check-grace-period 300

Scaling Policies

Target Tracking (Recommended)

# Keep average CPU at 60%.
# EC2 Auto Scaling target tracking takes no cooldown fields;
# --estimated-instance-warmup sets how long a new instance is
# excluded from the aggregated metric after launch.
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name nextgen-web-asg \
  --policy-name cpu-target-tracking \
  --policy-type TargetTrackingScaling \
  --estimated-instance-warmup 60 \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 60.0
  }'
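Target tracking scales roughly in proportion to how far the metric is from the target. The sketch below is a simplified model of that idea, not the service's actual algorithm: it assumes load spreads evenly across instances and rounds up. The function name estimate_capacity is hypothetical.

```shell
# estimate_capacity CURRENT_INSTANCES CURRENT_CPU_PERCENT TARGET_CPU_PERCENT
# Prints ceil(CURRENT_INSTANCES * CURRENT_CPU / TARGET_CPU) using integer math.
estimate_capacity() {
  echo $(( ($1 * $2 + $3 - 1) / $3 ))
}

estimate_capacity 3 90 60   # 3 instances at 90% CPU, prints 5
estimate_capacity 4 30 60   # 4 instances at 30% CPU, prints 2
```

The result would still be limited by the group's minimum and maximum size.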

Step Scaling

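# Scale out in steps as CPU rises above a CloudWatch alarm threshold.
# Step bounds are offsets from that threshold; the policy takes effect only
# once an alarm lists the policy ARN as an alarm action.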
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name nextgen-web-asg \
  --policy-name cpu-step-scaling \
  --policy-type StepScaling \
  --adjustment-type ChangeInCapacity \
  --step-adjustments '[
    {"MetricIntervalLowerBound": 0, "MetricIntervalUpperBound": 20, "ScalingAdjustment": 1},
    {"MetricIntervalLowerBound": 20, "ScalingAdjustment": 3}
  ]'
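To make the step bounds concrete, the sketch below maps a CPU reading to the adjustment the two steps above would select. It assumes a CloudWatch alarm that fires when CPU is greater than 70% (the threshold from the overview diagram; the alarm itself is not defined in this lesson), and the helper name step_adjustment is hypothetical.

```shell
# step_adjustment CPU_PERCENT ALARM_THRESHOLD
# Prints how many instances the step policy would add.
step_adjustment() {
  breach=$(( $1 - $2 ))          # distance above the alarm threshold
  if [ "$breach" -le 0 ]; then
    echo 0                       # alarm not breached, nothing to add
  elif [ "$breach" -lt 20 ]; then
    echo 1                       # first step: offset 0 up to (not including) 20
  else
    echo 3                       # second step: offset 20 and above
  fi
}

step_adjustment 60 70   # prints 0
step_adjustment 85 70   # prints 1
step_adjustment 95 70   # prints 3
```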

Scheduled Scaling

# Scale out for business hours (cron times are UTC unless --time-zone is set)
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name nextgen-web-asg \
  --scheduled-action-name scale-up-morning \
  --recurrence "0 8 * * MON-FRI" \
  --desired-capacity 5
 
# Scale in at night
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name nextgen-web-asg \
  --scheduled-action-name scale-down-night \
  --recurrence "0 20 * * MON-FRI" \
  --desired-capacity 2
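A scheduled action only sets the desired capacity at the moment it fires, so the value persists until the next action runs. The sketch below models the two schedules above under the assumption that nothing else changes the desired capacity in between; the helper name scheduled_capacity is hypothetical.

```shell
# scheduled_capacity DAY_OF_WEEK HOUR
# DAY_OF_WEEK: 1=Monday .. 7=Sunday, HOUR: 0-23 in the schedule's time zone.
# Prints the desired capacity left in place by the most recent scheduled action.
scheduled_capacity() {
  if [ "$1" -ge 1 ] && [ "$1" -le 5 ] && [ "$2" -ge 8 ] && [ "$2" -lt 20 ]; then
    echo 5    # between the 08:00 and 20:00 actions on a weekday
  else
    echo 2    # nights and weekends keep the value set at 20:00
  fi
}

scheduled_capacity 3 9    # Wednesday 09:00, prints 5
scheduled_capacity 3 21   # Wednesday 21:00, prints 2
scheduled_capacity 6 12   # Saturday noon, prints 2
```

Note that the weekend stays at 2 only because Friday's 20:00 action was the last one to run.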

Application Load Balancer

# Create ALB
aws elbv2 create-load-balancer \
  --name nextgen-alb \
  --subnets subnet-aaa subnet-bbb subnet-ccc \
  --security-groups sg-0123456789abcdef0 \
  --scheme internet-facing \
  --type application
 
# Create target group
aws elbv2 create-target-group \
  --name nextgen-web-tg \
  --protocol HTTP \
  --port 80 \
  --vpc-id vpc-0123456789abcdef0 \
  --health-check-path /health \
  --health-check-interval-seconds 30 \
  --healthy-threshold-count 2 \
  --unhealthy-threshold-count 3
 
# Create listener
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:... \
  --protocol HTTP \
  --port 80 \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:...
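The health check settings on the target group determine how quickly a target changes state. As a rough estimate (ignoring check timeouts and timing jitter), the delay is the interval multiplied by the threshold count. The helper name seconds_to_change is hypothetical.

```shell
# seconds_to_change INTERVAL_SECONDS THRESHOLD_COUNT
# Prints the approximate time for a target to change health state.
seconds_to_change() {
  echo $(( $1 * $2 ))
}

echo "unhealthy after ~$(seconds_to_change 30 3)s"   # prints: unhealthy after ~90s
echo "healthy after ~$(seconds_to_change 30 2)s"     # prints: healthy after ~60s
```

With the values above, a failing instance keeps receiving traffic for about 90 seconds before the load balancer stops routing to it.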

Path-Based Routing

# Route /api/* to API target group
aws elbv2 create-rule \
  --listener-arn arn:aws:elasticloadbalancing:... \
  --priority 10 \
  --conditions '[{"Field":"path-pattern","Values":["/api/*"]}]' \
  --actions '[{"Type":"forward","TargetGroupArn":"arn:aws:elasticloadbalancing:..."}]'
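Listener rules are evaluated in ascending priority order, and the first matching rule wins; requests that match no rule go to the listener's default action. The sketch below imitates that evaluation with shell pattern matching. The rule table, the target group names api-tg and api-v2-tg, and the assumption that nextgen-web-tg is the default are all illustrative.

```shell
# route REQUEST_PATH
# Prints the target group that would receive REQUEST_PATH.
route() {
  # Hypothetical rules, one per line: "priority path-pattern target-group"
  match=$(printf '%s\n' \
      '50 /api/v2/* api-v2-tg' \
      '10 /api/* api-tg' |
    sort -n |
    while read -r _priority pattern target; do
      case "$1" in
        $pattern) echo "$target"; break ;;   # first match wins
      esac
    done)
  # Fall back to the listener's default action when no rule matched
  echo "${match:-nextgen-web-tg}"
}

route /api/users      # prints api-tg
route /index.html     # prints nextgen-web-tg
route /api/v2/items   # prints api-tg (priority 10 is checked before 50)
```

The last call shows a common pitfall: a more specific pattern needs a lower priority number than the broader one, or it never matches.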

High Availability Architecture

┌─────────────────────────────────────────────┐
│          Application Load Balancer          │
└──────┬──────────────┬──────────────┬────────┘
       │              │              │
┌──────▼─────┐ ┌─────▼──────┐ ┌────▼───────┐
│   AZ-1a    │ │   AZ-1b    │ │   AZ-1c    │
│  ┌──────┐  │ │  ┌──────┐  │ │  ┌──────┐  │
│  │ EC2  │  │ │  │ EC2  │  │ │  │ EC2  │  │
│  └──────┘  │ │  └──────┘  │ │  └──────┘  │
└────────────┘ └────────────┘ └────────────┘
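An Auto Scaling group tries to keep instances balanced across its Availability Zones. The sketch below shows an even split of a desired capacity across zones; it is a simplified model that assumes every zone has capacity available, and the helper name spread_across_azs is hypothetical.

```shell
# spread_across_azs DESIRED_CAPACITY AZ_COUNT
# Prints the per-zone instance counts, separated by spaces.
spread_across_azs() {
  base=$(( $1 / $2 ))     # instances every zone gets
  extra=$(( $1 % $2 ))    # leftover instances, one each for the first zones
  i=1
  out=""
  while [ "$i" -le "$2" ]; do
    if [ "$i" -le "$extra" ]; then n=$(( base + 1 )); else n=$base; fi
    out="$out${out:+ }$n"
    i=$(( i + 1 ))
  done
  echo "$out"
}

spread_across_azs 3 3   # prints 1 1 1
spread_across_azs 5 3   # prints 2 2 1
```

With a desired capacity of 3 across three zones, losing one zone still leaves two instances serving traffic while replacements launch.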

Summary

You've learned:

Auto Scaling Groups for dynamic capacity
Scaling policies (target tracking, step, scheduled)
Application Load Balancer setup and routing
Health checks and high availability patterns
Multi-AZ architecture for fault tolerance

Next Steps

Next, we'll explore AWS networking with VPCs, subnets, and security architecture.
