ECS AutoScaling with CloudFormation
Autoscaling ECS services with a combination of target tracking and step scaling policies.
The topic of ECS autoscaling is a vast area of heated discussions and broken dreams. It is quite hard to come up with efficient scaling policies for your ECS services, and the more distributed your architecture, the more issues with cascading load and increasing latency you are going to face. But fear not, the promised salvation in the form of autoscaling for your services is here to save the day and distribute your computing load evenly across your microservices. So let's examine what we have to work with to achieve that.
Scaling services
Autoscaling of ECS services is implemented as an automated action executed upon an event: scale in or scale out. The source of such an event can be either an alarm attached to a StepScaling policy, or a TargetTrackingScaling policy. Usage of target tracking is very similar to the implementation for DynamoDB, with the ECSServiceAverageCPUUtilization and ECSServiceAverageMemoryUtilization metrics available for tracking. Notice that ECS can track only average metrics of the service, which means you need to make sure the load balancer distributes load evenly across tasks. Significant gaps between maximum and average consumption can cause a task to be terminated for running out of memory or CPU, leading to 502 errors.
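As a minimal sketch, a target tracking policy on average memory utilization could look like the following. The scalable target name and the 70% target value are assumptions for illustration; a full scalable target definition appears later in this article:

```yaml
ExampleMemoryAutoScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: ExampleMemoryAutoScalingPolicy
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ExampleAutoScalingTarget # assumed scalable target
    TargetTrackingScalingPolicyConfiguration:
      TargetValue: 70 # assumed target: keep average memory utilization around 70%
      ScaleInCooldown: 60
      ScaleOutCooldown: 60
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageMemoryUtilization
```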
New and Old
ECS scaling policies can be combined to produce even greater efficiency in load distribution. Use `StepScaling` policies to handle scale-out events based on ALB or SQS metrics, estimating the load from the input source. ALB target group metrics such as `AWS/ApplicationELB/RequestCountPerTarget` are a good baseline for policies. The size of an SQS queue is another deterministic metric for estimating the incoming load of a service.
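For an SQS-driven worker service, a sketch of such an alarm could look like this. The queue name, the backlog threshold, and the referenced step scaling policy are assumptions for illustration:

```yaml
ExampleQueueDepthAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    MetricName: ApproximateNumberOfMessagesVisible
    Namespace: AWS/SQS
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 100 # assumed backlog size that warrants scaling out
    ComparisonOperator: GreaterThanOrEqualToThreshold
    AlarmActions:
      - !Ref ExampleWorkerAutoScalingPolicy # hypothetical StepScaling policy
    Dimensions:
      - Name: QueueName
        Value: example-worker-queue # hypothetical queue name
```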
Combining StepScaling with TargetTrackingScaling on ECSServiceAverageCPUUtilization or ECSServiceAverageMemoryUtilization allows greater flexibility in how your service reacts to load. If you can determine whether the service in question is mostly CPU or memory bound, then selecting a threshold for one of these average metrics should be pretty easy after observing the service under a generated test load.
CloudFormation support for ECS scaling
To define an ECS service with scaling policies in CloudFormation you need a cluster, an instance role for the EC2 hosts, and other essentials, which are omitted from this example.
First we need a service role that Application Auto Scaling will assume to perform scaling actions on our behalf.
```yaml
ScalingRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: ScalingRole
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - application-autoscaling.amazonaws.com
          Action:
            - sts:AssumeRole
ScalingRolePolicy:
  Type: AWS::IAM::Policy
  Properties:
    Roles:
      - !Ref ScalingRole
    PolicyName: ScalingRolePolicy
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Resource: '*'
          Action:
            - application-autoscaling:*
            - ecs:RunTask
            - ecs:UpdateService
            - ecs:DescribeServices
            - cloudwatch:PutMetricAlarm
            - cloudwatch:DescribeAlarms
            - cloudwatch:GetMetricStatistics
            - cloudwatch:SetAlarmState
            - cloudwatch:DeleteAlarms
```
Now we're going to have a look at a service definition, its target group for the ALB, the scaling target, the scaling policies, and a CloudWatch alarm. For this example we define ExampleCPUAutoScalingPolicy, which grows capacity so that ECSServiceAverageCPUUtilization stays around 50%, and ExampleRequestsAutoScalingPolicy, which triggers when we have more than 1000 requests per target in a minute.
```yaml
ExampleTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    Port: 80
    Protocol: HTTP
    VpcId: !Ref VpcId
    HealthCheckIntervalSeconds: 30
    HealthCheckPath: /status
    HealthCheckTimeoutSeconds: 15
    HealthyThresholdCount: 2
    UnhealthyThresholdCount: 6
    Matcher:
      HttpCode: 200
    TargetGroupAttributes:
      - Key: deregistration_delay.timeout_seconds
        Value: 30
ExampleService:
  Type: AWS::ECS::Service
  Properties:
    TaskDefinition: !Ref ExampleTask # omitted
    PlacementStrategies:
      - Field: attribute:ecs.availability-zone
        Type: spread
    DesiredCount: 1
    Cluster: example-cluster # omitted
    LoadBalancers:
      - TargetGroupArn: !Ref ExampleTargetGroup
        ContainerPort: 8080
        ContainerName: example-service
ExampleAutoScalingTarget:
  Type: AWS::ApplicationAutoScaling::ScalableTarget
  Properties:
    MaxCapacity: !Ref MaxServicesCount # parameters
    MinCapacity: !Ref MinServicesCount
    ResourceId: !Sub
      - service/ExampleCluster/${ServiceName}
      - ServiceName: !GetAtt ExampleService.Name
    RoleARN: !GetAtt ScalingRole.Arn
    ScalableDimension: ecs:service:DesiredCount
    ServiceNamespace: ecs
ExampleCPUAutoScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: ExampleCPUAutoScalingPolicy
    PolicyType: TargetTrackingScaling
    ScalingTargetId: !Ref ExampleAutoScalingTarget
    TargetTrackingScalingPolicyConfiguration:
      DisableScaleIn: true
      TargetValue: 50
      ScaleInCooldown: 60
      ScaleOutCooldown: 60
      PredefinedMetricSpecification:
        PredefinedMetricType: ECSServiceAverageCPUUtilization
ExampleRequestsAutoScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: ExampleRequestsAutoScalingPolicy
    PolicyType: StepScaling
    ScalingTargetId: !Ref ExampleAutoScalingTarget
    StepScalingPolicyConfiguration:
      AdjustmentType: ChangeInCapacity
      Cooldown: 60
      MetricAggregationType: Average
      StepAdjustments:
        - MetricIntervalLowerBound: 0
          ScalingAdjustment: 1
        - MetricIntervalUpperBound: 0
          ScalingAdjustment: -1
```
```yaml
ExampleRequestsAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    MetricName: RequestCountPerTarget
    Namespace: AWS/ApplicationELB
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 1000
    ComparisonOperator: GreaterThanOrEqualToThreshold
    AlarmActions:
      - !Ref ExampleRequestsAutoScalingPolicy
    OKActions:
      - !Ref ExampleRequestsAutoScalingPolicy
    Dimensions:
      - Name: TargetGroup
        Value: !GetAtt ExampleTargetGroup.TargetGroupFullName
```
Notice that the ExampleCPUAutoScalingPolicy resource sets DisableScaleIn: true for a specific reason: to guarantee that requests-per-target scaling events take priority over target tracking, the scale-in logic of the tracking policy is disabled completely.
Stability is the key
Ok, so now we have the service scaling up and down based on the number of requests per target on the ALB. However, you will notice that in StepAdjustments the scale-out range starts right where the scale-in range ends. This means that your service's desired count will oscillate around some value, going up and down as new tasks are spun up and torn down. To allow for a window of stability, you need a range with ScalingAdjustment: 0, separating the boundaries for increasing and decreasing the desired count. That way the alarm can alert on the scale-in boundary, and StepAdjustments interpret the range. Let's see an example where we want to scale out on more than RequestsScaleOutThreshold requests per target, and scale in on fewer than RequestsScaleInThreshold:
```yaml
ExampleRequestsAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    MetricName: RequestCountPerTarget
    Namespace: AWS/ApplicationELB
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 1
    Threshold: 500 # scale in boundary to trigger the alarm
    ComparisonOperator: GreaterThanOrEqualToThreshold
    AlarmActions:
      - !Ref ExampleRequestsAutoScalingPolicy
    Dimensions:
      - Name: TargetGroup
        Value: !GetAtt ExampleTargetGroup.TargetGroupFullName
ExampleRequestsAutoScalingPolicy:
  Type: AWS::ApplicationAutoScaling::ScalingPolicy
  Properties:
    PolicyName: ExampleRequestsAutoScalingPolicy
    PolicyType: StepScaling
    ScalingTargetId: !Ref ExampleAutoScalingTarget
    StepScalingPolicyConfiguration:
      AdjustmentType: ChangeInCapacity
      Cooldown: 60
      MetricAggregationType: Average
      StepAdjustments:
        - MetricIntervalLowerBound: !Ref RequestsScaleOutThreshold
          ScalingAdjustment: 1
        - MetricIntervalLowerBound: !Ref RequestsScaleInThreshold
          MetricIntervalUpperBound: !Ref RequestsScaleOutThreshold
          ScalingAdjustment: 0
        - MetricIntervalUpperBound: !Ref RequestsScaleInThreshold
          ScalingAdjustment: -1
```
Here we have a range between MetricIntervalLowerBound=RequestsScaleInThreshold and MetricIntervalUpperBound=RequestsScaleOutThreshold where ScalingAdjustment=0, so no changes are made to the desired count. This ensures the desired count does not oscillate. Keep in mind that MetricIntervalLowerBound and MetricIntervalUpperBound are interpreted as offsets relative to the alarm's Threshold, not as absolute metric values, so the threshold parameters should be defined accordingly.
Further reading
Another approach would be to define two alarms, one to scale out and one to scale in, each with its own range and an associated policy. This approach is in fact used quite a lot, but the problem is that CloudWatch alarms are not free; at scale they can get pretty expensive.
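A sketch of the scale-in half of that two-alarm setup could look like this. The threshold, evaluation periods, and the dedicated scale-in policy reference are assumptions for illustration:

```yaml
ExampleScaleInAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    MetricName: RequestCountPerTarget
    Namespace: AWS/ApplicationELB
    Statistic: Sum
    Period: 60
    EvaluationPeriods: 3 # require a sustained lull before scaling in
    Threshold: 200 # assumed scale-in boundary
    ComparisonOperator: LessThanThreshold
    AlarmActions:
      - !Ref ExampleScaleInPolicy # hypothetical policy with ScalingAdjustment: -1
    Dimensions:
      - Name: TargetGroup
        Value: !GetAtt ExampleTargetGroup.TargetGroupFullName
```

A mirrored alarm with GreaterThanThreshold and a scale-out policy would cover the other boundary.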
Additional details can be found in the AWS::ApplicationAutoScaling::ScalableTarget and AWS::CloudWatch::Alarm documentation.