What is CloudWatch?
Amazon CloudWatch is AWS's monitoring and observability service. It collects metrics, logs, and events from AWS resources and applications.
CloudWatch Components
| Component | Purpose |
|---|---|
| Metrics | Numerical data points over time |
| Alarms | Trigger actions based on thresholds |
| Logs | Centralized log collection |
| Dashboards | Visual monitoring displays |
| Events | React to state changes (now offered as Amazon EventBridge) |
EC2 Default Metrics
With basic monitoring, CloudWatch automatically collects these EC2 metrics at 5-minute intervals (status check metrics are reported every minute):
| Metric | Description |
|---|---|
| CPUUtilization | CPU usage percentage |
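| NetworkIn / NetworkOut | Bytes received and sent on all network interfaces |
| DiskReadOps / DiskWriteOps | Completed I/O operations on instance store volumes |
| StatusCheckFailed | 1 if the instance failed a status check, 0 otherwise |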
Enabling Detailed Monitoring
# Enable 1-minute metrics (additional cost)
aws ec2 monitor-instances --instance-ids i-0123456789abcdef0
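# Verify: the monitoring state should read "enabled" (it can show "pending" briefly)
aws ec2 describe-instances \
--instance-ids i-0123456789abcdef0 \
--query 'Reservations[].Instances[].Monitoring.State'
# List the metrics CloudWatch holds for the instance
aws cloudwatch list-metrics \
--namespace AWS/EC2 \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0
CloudWatch Alarms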
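With default settings (no `--datapoints-to-alarm`), an alarm changes to ALARM only after the metric breaches its threshold for `--evaluation-periods` consecutive periods of `--period` seconds each. The shortest breach that can trigger it is therefore the product of the two values. A minimal sketch of that arithmetic follows; `alarm_window_seconds` is a hypothetical helper name used for illustration, not part of the AWS CLI.

```shell
# Hypothetical helper: shortest sustained breach (in seconds) that can trigger an alarm.
# $1 = --period in seconds, $2 = --evaluation-periods
alarm_window_seconds() {
  echo $(( $1 * $2 ))
}

# A 300-second period evaluated twice means a 10-minute window:
alarm_window_seconds 300 2   # prints 600
```

Shorter periods react faster but are noisier, so it is worth checking this product against the wording of the alarm description.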
Creating a CPU Alarm
aws cloudwatch put-metric-alarm \
--alarm-name "HighCPU-nextgen-web" \
--alarm-description "CPU above 80% for 10 minutes" \
--metric-name CPUUtilization \
--namespace AWS/EC2 \
--statistic Average \
--period 300 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 2 \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
--alarm-actions arn:aws:sns:us-east-1:123456789012:alerts
Common Alarm Configurations
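After creating any alarm in this guide, it can help to confirm its state. An alarm that stays in INSUFFICIENT_DATA often references a namespace, metric name, or dimension set that does not match what is actually published. The sketch below wraps two real AWS CLI calls (`describe-alarms` and `list-metrics`) in helper functions; the function names `alarm_state` and `metric_dimensions` are hypothetical, and a configured AWS CLI is assumed.

```shell
# Print the current state of one alarm: OK, ALARM, or INSUFFICIENT_DATA.
# $1 = alarm name
alarm_state() {
  aws cloudwatch describe-alarms \
    --alarm-names "$1" \
    --query 'MetricAlarms[0].StateValue' \
    --output text
}

# Print every dimension set published for a metric.
# $1 = namespace, $2 = metric name
metric_dimensions() {
  aws cloudwatch list-metrics \
    --namespace "$1" \
    --metric-name "$2" \
    --query 'Metrics[].Dimensions' \
    --output json
}
```

For example, `alarm_state "HighCPU-nextgen-web"` shows the CPU alarm's state, and `metric_dimensions CWAgent disk_used_percent` shows which dimensions a disk alarm has to specify.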
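# Disk space alarm (requires CloudWatch Agent)
# The dimensions must match what the agent publishes exactly. Disk metrics normally
# also carry path, device, and fstype, so either list them here or roll them up
# with aggregation_dimensions in the agent config; otherwise the alarm stays in
# INSUFFICIENT_DATA.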
aws cloudwatch put-metric-alarm \
--alarm-name "LowDisk-nextgen-web" \
--metric-name disk_used_percent \
--namespace CWAgent \
--statistic Average \
--period 300 \
--threshold 85 \
--comparison-operator GreaterThanThreshold \
--evaluation-periods 2 \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0
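# System status check alarm (auto-recover)
# The recover action applies to system status checks, so alarm on StatusCheckFailed_System
aws cloudwatch put-metric-alarm \
--alarm-name "StatusCheck-nextgen-web" \
--metric-name StatusCheckFailed_System \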
--namespace AWS/EC2 \
--statistic Maximum \
--period 60 \
--threshold 1 \
--comparison-operator GreaterThanOrEqualToThreshold \
--evaluation-periods 3 \
--dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
--alarm-actions arn:aws:automate:us-east-1:ec2:recover
CloudWatch Agent
The CloudWatch Agent collects system-level metrics (memory, disk) and custom application logs.
Installation
# Download and install
sudo yum install -y amazon-cloudwatch-agent
# Or download directly
wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
sudo rpm -U ./amazon-cloudwatch-agent.rpm
Configuration
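{
"metrics": {
"namespace": "CWAgent",
"append_dimensions": {
"InstanceId": "${aws:InstanceId}"
},
"aggregation_dimensions": [["InstanceId"]],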
"metrics_collected": {
"mem": {
"measurement": ["mem_used_percent"],
"metrics_collection_interval": 60
},
"disk": {
"measurement": ["disk_used_percent"],
"metrics_collection_interval": 60,
"resources": ["/", "/data"]
}
}
},
"logs": {
"logs_collected": {
"files": {
"collect_list": [
{
"file_path": "/var/log/messages",
"log_group_name": "nextgen-system-logs",
"log_stream_name": "{instance_id}"
},
{
"file_path": "/var/log/nginx/access.log",
"log_group_name": "nextgen-nginx-access",
"log_stream_name": "{instance_id}"
}
]
}
}
}
}
# Start the agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
-a fetch-config \
-m ec2 \
-c file:/opt/aws/amazon-cloudwatch-agent/etc/config.json \
-s
CloudWatch Logs
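Once the agent has shipped its first events, the log groups named in its configuration should exist. The sketch below checks for them with the AWS CLI's `describe-log-groups` call. The function name `log_groups_with_prefix` is hypothetical, and it assumes the instance role is allowed to create log groups and write log events.

```shell
# List log groups whose names start with a prefix.
# $1 = log group name prefix, e.g. "nextgen-"
log_groups_with_prefix() {
  aws logs describe-log-groups \
    --log-group-name-prefix "$1" \
    --query 'logGroups[].logGroupName' \
    --output text
}
```

Running `log_groups_with_prefix nextgen-` should list `nextgen-system-logs` and `nextgen-nginx-access`. An empty result usually points to an agent that is not running or to missing IAM permissions.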
Querying Logs with Insights
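Insights queries such as the ones in this section can also be started from the AWS CLI instead of the console. The sketch below uses the real `aws logs start-query` and `aws logs get-query-results` commands. The function name `run_insights_query`, its argument order, and the one-second polling interval are assumptions made for illustration.

```shell
# Run a Logs Insights query over the last N minutes and print the results as JSON.
# $1 = log group name, $2 = query string, $3 = minutes to look back (default 60)
run_insights_query() {
  local log_group="$1" query="$2" minutes="${3:-60}"
  local end_time start_time query_id status

  end_time=$(date +%s)
  start_time=$(( end_time - minutes * 60 ))

  # start-query returns immediately with an ID; the query runs asynchronously.
  query_id=$(aws logs start-query \
    --log-group-name "$log_group" \
    --start-time "$start_time" \
    --end-time "$end_time" \
    --query-string "$query" \
    --query 'queryId' --output text) || return 1

  # Poll until the query is no longer scheduled or running.
  while :; do
    status=$(aws logs get-query-results --query-id "$query_id" \
      --query 'status' --output text) || return 1
    case "$status" in
      Scheduled|Running) sleep 1 ;;
      *) break ;;
    esac
  done

  aws logs get-query-results --query-id "$query_id" --output json
}
```

For example, `run_insights_query nextgen-system-logs 'fields @timestamp, @message | filter @message like /ERROR/ | limit 50' 60` covers the last hour of system logs.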
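# Find errors in the last hour (the time range is set in the console or the API call)
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
| limit 50
# Count requests by status code
# (assumes the log events expose a status field, for example JSON-formatted access logs)
fields @timestamp, status
| stats count(*) as request_count by status
| sort request_count desc
# Average and maximum response time in 5-minute buckets
# (assumes the log events expose a response_time field)
fields @timestamp, response_time
| stats avg(response_time) as avg_time, max(response_time) as max_time by bin(5m)
CloudWatch Dashboards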
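The dashboard body in this section hard-codes one instance ID. A sketch of generating the same one-widget body for any instance and Region follows. The function name `cpu_dashboard_body` is hypothetical, and including a `region` property in the widget is an assumption about what the dashboard needs.

```shell
# Hypothetical helper: print a one-widget dashboard body for an instance.
# $1 = instance ID, $2 = AWS Region
cpu_dashboard_body() {
  cat <<EOF
{
  "widgets": [
    {
      "type": "metric",
      "properties": {
        "metrics": [["AWS/EC2", "CPUUtilization", "InstanceId", "$1"]],
        "region": "$2",
        "period": 300,
        "stat": "Average",
        "title": "CPU Utilization"
      }
    }
  ]
}
EOF
}
```

The output can be passed straight to the CLI, for example `aws cloudwatch put-dashboard --dashboard-name "NextGen-Overview" --dashboard-body "$(cpu_dashboard_body i-0123456789abcdef0 us-east-1)"`.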
aws cloudwatch put-dashboard \
--dashboard-name "NextGen-Overview" \
--dashboard-body '{
"widgets": [
{
"type": "metric",
"properties": {
"metrics": [
["AWS/EC2", "CPUUtilization", "InstanceId", "i-0123456789abcdef0"]
],
"region": "us-east-1",
"period": 300,
"stat": "Average",
"title": "CPU Utilization"
}
}
]
}'
SNS Notifications
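Alarms reference a topic by its ARN in `--alarm-actions`, so it is convenient to capture the ARN when the topic is created rather than typing it by hand. The sketch below relies on `aws sns create-topic` returning the existing ARN when a topic with the same name already exists. The function name `create_alert_topic` is hypothetical.

```shell
# Hypothetical helper: create (or reuse) a topic, subscribe an email address, print the ARN.
# $1 = topic name, $2 = email address
create_alert_topic() {
  local topic_arn
  topic_arn=$(aws sns create-topic --name "$1" \
    --query 'TopicArn' --output text) || return 1

  # The recipient must confirm the subscription before any alert is delivered.
  aws sns subscribe \
    --topic-arn "$topic_arn" \
    --protocol email \
    --notification-endpoint "$2" >/dev/null || return 1

  echo "$topic_arn"
}
```

A typical use is `TOPIC_ARN=$(create_alert_topic nextgen-alerts team@nextgenplayground.org)`, followed by `--alarm-actions "$TOPIC_ARN"` in a `put-metric-alarm` call.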
# Create an SNS topic for alerts
aws sns create-topic --name nextgen-alerts
# Subscribe email
aws sns subscribe \
--topic-arn arn:aws:sns:us-east-1:123456789012:nextgen-alerts \
--protocol email \
--notification-endpoint team@nextgenplayground.org
# Confirm the subscription from the email, then use the topic ARN in --alarm-actions
Summary
You've learned:
- CloudWatch metrics, alarms, and dashboards
- Monitoring EC2 with default and custom metrics
- Installing and configuring the CloudWatch Agent
- Centralized logging with CloudWatch Logs
- SNS notifications for alert delivery
Next Steps
You now have a solid AWS compute foundation. Combine these skills with your Terraform knowledge to provision and monitor infrastructure as code.