In cloud architecture, backups are a critical part of protecting data. EBS snapshots provide an easy way to back up the volumes attached to EC2 instances. However, creating snapshots manually is time-consuming and error-prone. Automating the backups with AWS Lambda and CloudWatch Events (now part of Amazon EventBridge) removes that manual work.
This article will guide you step-by-step on how to:
- Create a Lambda function to execute snapshot operations
- Configure CloudWatch Events to schedule monthly backups
- Verify snapshot creation success
Why Automate EC2 Backups?
Use Cases:
- Disaster Recovery: Quickly restore EC2 instances from snapshots when failures occur
- Compliance Requirements: Many industry regulations require regular data backups
- Data Migration: Migrate EC2 instances to different regions or accounts
- Version Management: Maintain system states at different points in time
Why Choose Lambda + CloudWatch Events?
- Serverless Architecture: No need to manage backup execution servers
- Cost-Effective: Pay only for execution time, no cost when idle
- High Reliability: Lambda and CloudWatch Events are managed services designed for high availability
- Flexible Scheduling: Configure monthly, weekly, daily, or custom frequencies
Architecture Overview
System Components:
- Lambda Function: Core logic for snapshot operations
- CloudWatch Events (EventBridge): Define schedule (e.g., monthly execution)
- IAM Role: Provide Lambda with necessary EC2 permissions
- CloudWatch Logs: Record execution logs and error messages
- EC2 Snapshots: Store backup data
Workflow:
- CloudWatch Events triggers Lambda function based on schedule
- Lambda function obtains EC2 permissions through IAM role
- Lambda reads all volumes from specified EC2 instance
- Creates snapshot for each volume
- Verifies snapshot status until completion
- Records execution results to CloudWatch Logs
Step 1: Create Lambda Function
- Log in to AWS Lambda Console
- Click Create function
- Configure the following parameters:
  - Function name: MonthlyEC2Backup
  - Runtime: Python 3.12 (or latest version)
  - Permissions: Select Create a new role with basic Lambda permissions
- Click Create function
Step 2: Write Lambda Code
In the Function code section, add the following code:
import boto3
import datetime
import time

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')
    # Set EC2 instance IDs to backup
    instances = ['<INSTANCE_ID>']  # Replace with your EC2 instance ID
    for instance_id in instances:
        # Create snapshot description
        description = f"Backup of {instance_id} - {datetime.datetime.now().strftime('%Y-%m-%d')}"
        # Get all volumes associated with the instance
        volumes = ec2.describe_volumes(Filters=[
            {'Name': 'attachment.instance-id', 'Values': [instance_id]}
        ])
        # Create snapshots for each volume and check status
        for volume in volumes['Volumes']:
            volume_id = volume['VolumeId']
            print(f"Creating snapshot for volume: {volume_id}")
            # Create snapshot
            response = ec2.create_snapshot(
                VolumeId=volume_id,
                Description=description
            )
            snapshot_id = response['SnapshotId']
            print(f"Snapshot {snapshot_id} created for volume {volume_id}")
            # Verify snapshot status
            print(f"Verifying snapshot {snapshot_id} status...")
            while True:
                snapshot_status = ec2.describe_snapshots(
                    SnapshotIds=[snapshot_id]
                )['Snapshots'][0]['State']
                if snapshot_status == 'completed':
                    print(f"Snapshot {snapshot_id} for volume {volume_id} completed successfully!")
                    break
                elif snapshot_status == 'error':
                    print(f"Snapshot {snapshot_id} for volume {volume_id} failed!")
                    break
                else:
                    print(f"Snapshot {snapshot_id} is in progress...")
                    time.sleep(5)
    return {
        'statusCode': 200,
        'body': 'EC2 backup completed successfully'
    }
Code Explanation:
- boto3.client('ec2'): Create EC2 client
- describe_volumes(): Get all volumes from EC2 instance
- create_snapshot(): Create snapshot
- describe_snapshots(): Check snapshot status
- time.sleep(5): Check status every 5 seconds
Important: Replace <INSTANCE_ID> with your actual instance ID (e.g., i-0abcd1234efgh5678), then click Deploy.
Step 3: Configure IAM Permissions
- Go to Lambda function’s Configuration → Permissions
- Click the execution role link
- In IAM Console, click Add permissions → Create inline policy
- Use the following JSON policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:CreateSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
Permission Explanation:
- ec2:DescribeVolumes: Read volume information
- ec2:CreateSnapshot: Create snapshots
- ec2:DescribeSnapshots: Check snapshot status
- ec2:CreateTags: Add tags to snapshots (optional)
- logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents: Write to CloudWatch Logs
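The same inline policy can also be attached from a script rather than the console. The following is a minimal sketch, not the article's own method: it assumes an IAM client created with boto3.client('iam') is passed in, and the policy name EC2SnapshotBackup and the include_cleanup switch are hypothetical names chosen for illustration. The optional ec2:DeleteSnapshot action is only needed if the function also deletes old snapshots.

```python
import json

def backup_policy_document(include_cleanup=False):
    """Build the policy from this step as a Python dict."""
    ec2_actions = [
        "ec2:DescribeVolumes",
        "ec2:CreateSnapshot",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
    ]
    if include_cleanup:
        # Only required when the function deletes old snapshots
        ec2_actions.append("ec2:DeleteSnapshot")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ec2_actions, "Resource": "*"},
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "arn:aws:logs:*:*:*",
            },
        ],
    }

def attach_backup_policy(iam, role_name, policy_name="EC2SnapshotBackup",
                         include_cleanup=False):
    """Attach the policy inline to the function's execution role."""
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,  # hypothetical policy name
        PolicyDocument=json.dumps(backup_policy_document(include_cleanup)),
    )

# Usage (assumes boto3 and credentials are available; the role name is yours):
#   import boto3
#   attach_backup_policy(boto3.client("iam"), "<EXECUTION_ROLE_NAME>")
```

Keeping the document in a function makes it easy to review or diff the exact permissions before they are applied.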
Step 4: Create CloudWatch Events Schedule
- Go to Amazon EventBridge Console → Rules
- Click Create rule
- Configure rule:
  - Name: MonthlyEC2BackupSchedule
  - Rule type: Schedule
- Set schedule pattern:
  - Select Cron-based schedule
  - Enter schedule expression: cron(0 0 1 * ? *)
- Select target:
  - Target: Lambda function
  - Function: MonthlyEC2Backup
- Click Create rule
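The console steps above can also be scripted. The sketch below is one possible equivalent, under stated assumptions: the events and lambda_client arguments are boto3 clients for 'events' and 'lambda', function_arn is the ARN of the MonthlyEC2Backup function, and the statement ID is a hypothetical name. Unlike the console, a script has to grant EventBridge permission to invoke the function explicitly.

```python
def schedule_monthly_backup(events, lambda_client, function_arn,
                            rule_name="MonthlyEC2BackupSchedule",
                            expression="cron(0 0 1 * ? *)"):
    """Create the schedule rule, allow it to invoke the function, attach the target."""
    # put_rule creates or updates the rule and returns its ARN
    rule_arn = events.put_rule(
        Name=rule_name,
        ScheduleExpression=expression,
        State="ENABLED",
    )["RuleArn"]

    # Resource-based permission so the rule is allowed to invoke the function
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",  # hypothetical statement ID
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Point the rule at the Lambda function
    events.put_targets(Rule=rule_name, Targets=[{"Id": "1", "Arn": function_arn}])
    return rule_arn

# Usage (assumes boto3 and credentials are available):
#   import boto3
#   schedule_monthly_backup(boto3.client("events"), boto3.client("lambda"),
#                           "<FUNCTION_ARN>")
```

Note that add_permission raises an error if a statement with the same ID already exists, so a rerun of this sketch would need to handle or remove the earlier statement.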
Cron Expression Explanation:
cron(0 0 1 * ? *)
     │ │ │ │ │ │
     │ │ │ │ │ └─ Year (* means any year)
     │ │ │ │ └─── Day of week (? means not specified)
     │ │ │ └───── Month (* means every month)
     │ │ └─────── Day of month (1 means the 1st)
     │ └───────── Hour (0 means midnight; rule schedules are evaluated in UTC)
     └─────────── Minute (0 means on the hour)
Other Common Schedule Examples:
- Every Sunday at 2 AM: cron(0 2 ? * SUN *)
- Every day at 3 AM: cron(0 3 * * ? *)
- 15th of every month at 1 AM: cron(0 1 15 * ? *)
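A quick way to sanity-check an expression before pasting it into the console is to split it into its six fields. The helper below is a small sketch written for this article (parse_schedule_cron is a hypothetical name, not an AWS API). It checks only the field count and the rule that day-of-month and day-of-week cannot both carry a value, so one of them must be ?. It does not validate ranges or names inside each field.

```python
import re

FIELD_NAMES = ["minute", "hour", "day_of_month", "month", "day_of_week", "year"]

def parse_schedule_cron(expression):
    """Split a cron(...) schedule expression into its six named fields."""
    match = re.fullmatch(r"cron\((.+)\)", expression.strip())
    if not match:
        raise ValueError("expected the form cron(<six fields>)")
    fields = match.group(1).split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    parsed = dict(zip(FIELD_NAMES, fields))
    # If one day field has a value or '*', the other one must be '?'
    if parsed["day_of_month"] != "?" and parsed["day_of_week"] != "?":
        raise ValueError("day-of-month and day-of-week cannot both be set; use '?' in one")
    return parsed

if __name__ == "__main__":
    for expr in ("cron(0 0 1 * ? *)", "cron(0 2 ? * SUN *)"):
        print(expr, parse_schedule_cron(expr))
```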
Step 5: Testing and Verification
Test Lambda Function
- In Lambda Console, click Test
- Create test event (can use empty JSON: {})
- Execute test and check output
Check CloudWatch Logs
- Go to CloudWatch Console → Log groups
- Find /aws/lambda/MonthlyEC2Backup
- Check execution logs for error messages
Verify Snapshots
- Go to EC2 Console → Snapshots
- Confirm snapshot status is Completed
- Check snapshot description and creation time
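The console check can be complemented with a small script. This is a sketch that assumes the snapshots carry the "Backup of <instance ID> - <date>" description written by the function in Step 2, and that ec2 is a boto3 EC2 client. The function names are hypothetical, and only the first page of results is read.

```python
def snapshot_states(ec2, instance_id):
    """Return {snapshot_id: state} for snapshots whose description names the instance."""
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        # Filter values accept '*' as a wildcard
        Filters=[{"Name": "description", "Values": [f"Backup of {instance_id} - *"]}],
    )
    return {s["SnapshotId"]: s["State"] for s in response["Snapshots"]}

def all_completed(states):
    """True only if at least one snapshot exists and every one is completed."""
    return bool(states) and all(state == "completed" for state in states.values())

# Usage (assumes boto3 and credentials are available):
#   import boto3
#   states = snapshot_states(boto3.client("ec2"), "<INSTANCE_ID>")
#   print(states, all_completed(states))
```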
Common Issues and Solutions
Issue 1: Lambda Timeout
Cause:
- Default Lambda timeout is 3 seconds
- Creating large snapshots requires more time
Solution:
- Increase Timeout under Lambda Configuration → General configuration to 5-10 minutes (Lambda allows at most 15 minutes)
- Or remove the snapshot status verification loop and check snapshot status asynchronously, for example in a later invocation
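A middle ground between the two options is to keep the verification loop but stop it before the function runs out of time. The sketch below assumes it is called from the handler with context.get_remaining_time_in_millis as the remaining_ms argument. The function name, the 10-second safety margin, and the "pending" return value for an unfinished snapshot are choices made for this example.

```python
import time

def wait_for_snapshot(ec2, snapshot_id, remaining_ms, poll_seconds=5,
                      safety_margin_ms=10_000, sleep=time.sleep):
    """Poll until the snapshot completes, fails, or the time budget runs low.

    remaining_ms is a zero-argument callable returning the milliseconds left;
    inside a handler, pass context.get_remaining_time_in_millis.
    """
    while remaining_ms() > safety_margin_ms:
        state = ec2.describe_snapshots(
            SnapshotIds=[snapshot_id]
        )["Snapshots"][0]["State"]
        if state in ("completed", "error"):
            return state
        sleep(poll_seconds)
    # Budget exhausted: report the snapshot as unfinished instead of timing out
    return "pending"

# Usage inside lambda_handler:
#   state = wait_for_snapshot(ec2, snapshot_id, context.get_remaining_time_in_millis)
```

Returning "pending" lets the handler log the unfinished snapshot ID and exit cleanly, so the snapshot can be checked later rather than the invocation being killed mid-loop.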
Issue 2: Permission Denied Error
Error Message:
An error occurred (UnauthorizedOperation) when calling the CreateSnapshot operation
Solution:
- Confirm IAM role includes the ec2:CreateSnapshot permission
- Check for SCP (Service Control Policy) restrictions
Issue 3: How to Backup Multiple EC2 Instances?
Solution:
Add every instance ID to the instances list:
instances = [
    'i-0abcd1234efgh5678',
    'i-0ijkl5678mnop1234',
    'i-0qrst9012uvwx3456'
]
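When the list grows, maintaining IDs by hand becomes tedious. One alternative, sketched here as an assumption rather than part of the original setup, is to select instances by tag. The tag Backup=true is a hypothetical convention, ec2 is a boto3 EC2 client, and the execution role would additionally need the ec2:DescribeInstances permission.

```python
def instances_to_back_up(ec2, tag_key="Backup", tag_value="true"):
    """Return the IDs of all instances carrying the given tag."""
    # The paginator follows NextToken so large accounts are fully listed
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": f"tag:{tag_key}", "Values": [tag_value]}]
    )
    return [
        instance["InstanceId"]
        for page in pages
        for reservation in page["Reservations"]
        for instance in reservation["Instances"]
    ]

# Usage inside lambda_handler:
#   instances = instances_to_back_up(ec2)
```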
Issue 4: How to Automatically Clean Up Old Snapshots?
Solution:
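Add cleanup logic to the Lambda function (the execution role also needs the ec2:DeleteSnapshot permission):
# Delete snapshots older than 30 days that this function created
retention_days = 30
cutoff_date = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(days=retention_days)
# Match only descriptions written by this function; without the filter,
# every old snapshot owned by the account would be deleted
snapshots = ec2.describe_snapshots(
    OwnerIds=['self'],
    Filters=[{'Name': 'description', 'Values': ['Backup of *']}]
)['Snapshots']
for snapshot in snapshots:
    # StartTime is timezone-aware (UTC), so compare it with an aware cutoff
    if snapshot['StartTime'] < cutoff_date:
        print(f"Deleting old snapshot: {snapshot['SnapshotId']}")
        ec2.delete_snapshot(SnapshotId=snapshot['SnapshotId'])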
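Because deletion is irreversible, it can help to separate the decision about which snapshots have expired from the API call that deletes them. The sketch below is a pure function that can be tested without touching AWS. The function name and the "Backup of " prefix check are assumptions based on the description format used in Step 2.

```python
import datetime

def expired_snapshot_ids(snapshots, now, retention_days=30, prefix="Backup of "):
    """Pick snapshot IDs created by the backup function and older than the cutoff.

    snapshots: items shaped like describe_snapshots()['Snapshots'] entries.
    now: a timezone-aware datetime, e.g. datetime.datetime.now(datetime.timezone.utc).
    """
    cutoff = now - datetime.timedelta(days=retention_days)
    return [
        s["SnapshotId"]
        for s in snapshots
        if s.get("Description", "").startswith(prefix) and s["StartTime"] < cutoff
    ]

# Usage inside lambda_handler:
#   now = datetime.datetime.now(datetime.timezone.utc)
#   for snapshot_id in expired_snapshot_ids(snapshots, now):
#       ec2.delete_snapshot(SnapshotId=snapshot_id)
```

Logging the returned IDs for a few runs before enabling the delete call is a cheap dry run.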
Cost Considerations
Cost Components:
- Lambda Execution: Free tier includes 1 million requests + 400,000 GB-seconds compute time per month
- EBS Snapshot Storage: Approximately $0.05 per GB per month (varies by region)
- CloudWatch Logs: Approximately $0.50 per GB
Cost Estimation Example:
Assuming backup of 1 EC2 instance with 100GB, executed once per month:
- Lambda execution cost: Nearly $0 (within free tier)
- Snapshot storage cost: up to 100GB × $0.05 = $5.00/month for the first snapshot (only used blocks are stored, and later snapshots are incremental)
- CloudWatch Logs: Negligible (< $0.10)
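The arithmetic above can be turned into a small calculator to see how storage grows when old snapshots are never deleted. This is a rough sketch: the $0.05 price is the article's approximate figure, the 10 GB of changed data per month is an invented assumption, and the model simply adds each month's changed blocks to the first full snapshot.

```python
PRICE_PER_GB_MONTH = 0.05  # approximate figure from the article; varies by region

def snapshot_storage_cost(stored_gb, price=PRICE_PER_GB_MONTH):
    """Monthly storage cost in dollars for the given amount of snapshot data."""
    return round(stored_gb * price, 2)

def stored_gb_without_cleanup(months, first_snapshot_gb, changed_gb_per_month):
    """Data stored after `months` monthly snapshots if none are ever deleted."""
    return first_snapshot_gb + changed_gb_per_month * (months - 1)

if __name__ == "__main__":
    # First month: one full 100 GB snapshot
    print(snapshot_storage_cost(100))
    # After 12 months, assuming 10 GB of changed blocks per month (hypothetical)
    print(snapshot_storage_cost(stored_gb_without_cleanup(12, 100, 10)))
```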
Cost Optimization Recommendations:
- Regularly clean up old snapshots (e.g., retain last 3 months)
- Use EBS Snapshot Lifecycle Management (Data Lifecycle Manager)
- Evaluate need for cross-region snapshot replication (adds transfer costs)
Advanced Optimizations
1. Add SNS Notifications
Send a notification after snapshot completion (the execution role also needs the sns:Publish permission on the topic):
sns = boto3.client('sns')
sns.publish(
    TopicArn='arn:aws:sns:us-east-1:123456789012:EC2BackupNotification',
    Subject='EC2 Backup Completed',
    Message=f'Snapshot {snapshot_id} created successfully'
)
2. Add Tags to Snapshots
ec2.create_tags(
    Resources=[snapshot_id],
    Tags=[
        {'Key': 'Name', 'Value': f'Backup-{instance_id}'},
        {'Key': 'CreatedBy', 'Value': 'Lambda'},
        {'Key': 'BackupDate', 'Value': datetime.datetime.now().strftime('%Y-%m-%d')}
    ]
)
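As an alternative to a separate create_tags call, tags can be applied in the same request that creates the snapshot through the TagSpecifications parameter of create_snapshot. The sketch below assumes ec2 is a boto3 EC2 client. The function name and the today parameter are hypothetical and exist only to make the example testable. The ec2:CreateTags permission is still required.

```python
import datetime

def create_tagged_snapshot(ec2, volume_id, instance_id, today=None):
    """Create a snapshot and tag it in a single API call."""
    today = today or datetime.date.today()
    date_text = f"{today:%Y-%m-%d}"
    response = ec2.create_snapshot(
        VolumeId=volume_id,
        Description=f"Backup of {instance_id} - {date_text}",
        TagSpecifications=[{
            "ResourceType": "snapshot",
            "Tags": [
                {"Key": "Name", "Value": f"Backup-{instance_id}"},
                {"Key": "CreatedBy", "Value": "Lambda"},
                {"Key": "BackupDate", "Value": date_text},
            ],
        }],
    )
    return response["SnapshotId"]
```

Tagging at creation time means a snapshot never exists untagged, which matters if cleanup or reporting relies on those tags.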
3. Use Environment Variables
Set instance IDs as environment variables to avoid hardcoding:
import os
instances = os.environ['INSTANCE_IDS'].split(',')
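A plain split(',') keeps stray spaces and empty entries, and raises a KeyError if the variable is missing. A slightly more defensive version might look like the following sketch. The function name and the environ parameter are hypothetical, and the check only confirms the "i-" prefix rather than the full ID format.

```python
import os

def instance_ids_from_env(name="INSTANCE_IDS", environ=None):
    """Read a comma-separated list of instance IDs from an environment variable."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name, "")
    # Drop surrounding whitespace and empty entries such as a trailing comma
    ids = [part.strip() for part in raw.split(",") if part.strip()]
    if not ids:
        raise ValueError(f"{name} is empty or not set")
    invalid = [i for i in ids if not i.startswith("i-")]
    if invalid:
        raise ValueError(f"not instance IDs: {invalid}")
    return ids

# Usage inside lambda_handler:
#   instances = instance_ids_from_env()
```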
Conclusion
Through this implementation, you have learned to:
- Build automated EC2 backup system: Using Lambda + CloudWatch Events
- Configure appropriate IAM permissions: Granting the function only the actions it needs
- Verify snapshot status: Ensuring backup completion
- Handle common issues: Resolve permissions, timeouts, etc.
- Optimize costs: Reduce expenses through automatic cleanup of old snapshots
Next Steps:
- Implement cross-region snapshot replication for enhanced disaster recovery
- Integrate AWS Backup service for centralized management
- Configure CloudWatch alarms to monitor backup failures
- Establish snapshot restore testing procedures to verify backup usability