Fargate Spot tasks offer significant cost savings by running on spare AWS compute capacity, but they come with the caveat that AWS can interrupt them when it needs that capacity back.

Let’s see this in action. Imagine a typical web application running on Fargate. Normally, you’d provision tasks with cpu: 1024 and memory: 2048 and pay on-demand rates. This is safe but expensive.

# Original on-demand Fargate service and task definition
Resources:
  TargetGroup:
    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
      # ... other properties ...
      TargetType: ip # Required for awsvpc (Fargate) tasks
      Port: 80
      Protocol: HTTP
      VpcId: vpc-1234567890abcdef0
  Service:
    Type: AWS::ECS::Service
    Properties:
      Cluster: arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster
      ServiceName: my-web-app
      TaskDefinition: !Ref TaskDefinition
      DesiredCount: 3
      LaunchType: FARGATE
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets:
            - subnet-abcdef1234567890
            - subnet-fedcba0987654321
          SecurityGroups:
            - sg-0123456789abcdef0
      LoadBalancers:
        - TargetGroupArn: !Ref TargetGroup
          ContainerName: webapp
          ContainerPort: 80
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-web-app-task
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
      Cpu: "1024" # 1 vCPU
      Memory: "2048" # 2 GiB
      ContainerDefinitions:
        - Name: webapp
          Image: public.ecr.aws/nginx/nginx:latest
          PortMappings:
            - ContainerPort: 80
              HostPort: 80

Now, let’s introduce cost-saving measures.

1. Fargate Spot

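The most immediate saving comes from using Fargate Spot. Note that FARGATE_SPOT is not a launch type: LaunchType accepts only EC2, FARGATE, or EXTERNAL. Instead, you drop LaunchType from the service and add a CapacityProviderStrategy that names the FARGATE_SPOT capacity provider, which must also be associated with the cluster. AWS will then run your tasks on spare capacity. The key difference is that Fargate Spot tasks can be interrupted with a two-minute warning if AWS needs the capacity back, so this is suitable for stateless applications or those that can gracefully handle interruptions. The cost savings can be substantial: AWS advertises discounts of up to 70% compared to on-demand.

# Fargate Spot configuration
  Service:
    Type: AWS::ECS::Service
    Properties:
      # ... other properties ...
      # LaunchType removed; it cannot be combined with CapacityProviderStrategy
      CapacityProviderStrategy:
        - CapacityProvider: FARGATE_SPOT
          Weight: 1
      # ... other properties ...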
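
Handling the two-minute warning gracefully usually means reacting to the SIGTERM that ECS sends to the container when a task is being stopped, which includes Spot interruptions. The sketch below is a minimal illustration, not a complete implementation: install_handlers, run_worker, and process_one_item are hypothetical names, and it assumes the work can be split into short units that finish well inside the container’s stop timeout.

```python
import signal
import threading

# Set once ECS asks the task to stop (Spot reclaim, deployment, scale-in).
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the stop request; the main loop does the actual cleanup."""
    shutdown_requested.set()


def install_handlers():
    """Register the SIGTERM handler. Must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(process_one_item, poll_interval=0.1):
    """Run work units until a stop is requested; return how many were handled.

    process_one_item is a hypothetical callable standing in for real work.
    """
    handled = 0
    while not shutdown_requested.is_set():
        process_one_item()
        handled += 1
        # wait() returns early as soon as the event is set.
        shutdown_requested.wait(poll_interval)
    # Cleanup such as flushing buffers or closing connections would go here.
    return handled
```

Call install_handlers() once at startup, before entering run_worker. ECS follows SIGTERM with SIGKILL after the container’s StopTimeout, which can be raised to at most 120 seconds on Fargate, so the cleanup step has to fit inside that window.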

2. ARM (Graviton2) Processors

Fargate supports AWS Graviton2 processors, which are ARM-based. These processors offer better price-performance than comparable x86 processors: the per-vCPU and per-GB rates are lower, and many workloads perform as well or better at the same task size. To use Graviton2, you set CpuArchitecture: ARM64 inside the RuntimePlatform block of your task definition.

# Task definition with ARM architecture
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-web-app-task-arm
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
      RuntimePlatform:
        CpuArchitecture: ARM64 # Added for ARM support
        OperatingSystemFamily: LINUX
      Cpu: "1024"
      Memory: "2048"
      ContainerDefinitions:
        - Name: webapp
          Image: public.ecr.aws/nginx/nginx:latest # Ensure your image is multi-arch or ARM-native
          # ...

Important: Ensure your container images are built for linux/arm64 or are multi-architecture. Many official images, like the Nginx one shown, are already multi-arch and will automatically pull the correct ARM variant. If you build your own images, you’ll need to configure your build pipeline (e.g., Docker Buildx) to produce ARM64 artifacts.

3. Right-Sizing

This is arguably the most impactful, yet often overlooked, cost-saving technique. It involves accurately determining the CPU and memory resources your application actually needs, rather than over-provisioning.

Diagnosis:

  1. Enable CloudWatch Container Insights: Ensure Container Insights is enabled for your ECS cluster. This provides detailed metrics on CPU and memory utilization per task.
  2. Analyze Metrics: In the CloudWatch console, open Metrics and browse the ECS/ContainerInsights namespace by cluster and service. Container Insights reports CpuUtilized and CpuReserved (in CPU units) and MemoryUtilized and MemoryReserved (in MiB); divide utilized by reserved to get a utilization percentage. The service-level CPUUtilization and MemoryUtilization percentages are in the AWS/ECS namespace. Pay attention to the 95th percentile over a representative period (e.g., a week or a full business cycle). This accounts for peak usage without being overly sensitive to transient spikes.
  3. Identify Bottlenecks (or lack thereof): If your 95th percentile CPU utilization is consistently below 70-80% and memory utilization is below 80-90%, you are likely over-provisioned, provided the next smaller task size still covers those peaks. If your application is experiencing performance issues and utilization is consistently high, you might be under-provisioned.
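
The check in step 3 can be sketched as a small script. This is an illustration under assumptions: the samples are utilization percentages already exported from CloudWatch, the function names are hypothetical, and the default thresholds take the lower ends of the ranges above (70% for CPU, 80% for memory).

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def assess(cpu_pct_samples, mem_pct_samples, cpu_threshold=70.0, mem_threshold=80.0):
    """Summarize p95 utilization and flag likely over-provisioning.

    Thresholds are assumptions; tune them to your own risk tolerance.
    """
    cpu_p95 = percentile(cpu_pct_samples, 95)
    mem_p95 = percentile(mem_pct_samples, 95)
    return {
        "cpu_p95": cpu_p95,
        "mem_p95": mem_p95,
        "likely_over_provisioned": cpu_p95 < cpu_threshold and mem_p95 < mem_threshold,
    }
```

A single spike does not move the result much: with nineteen samples at 20% CPU and one at 90%, the nearest-rank p95 is still 20%, which is the behavior the 95th percentile is chosen for.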

Action: Adjust the Cpu and Memory values in your ECS task definition.

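For example, if your analysis shows the 95th percentile CPU at about 300 CPU units (roughly 0.3 vCPU) and memory at about 700 MiB on a task provisioned with 1024 CPU units and 2048 MiB, you could downsize to the next size that still covers those peaks.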

# Right-sized task definition
  TaskDefinition:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-web-app-task-optimized
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
      Cpu: "512"      # Reduced from 1024
      Memory: "1024"  # Reduced from 2048
      # ...

Why it works: You are paying for the provisioned CPU and memory, not just what you use. By matching provisioned resources to actual needs, you reduce the baseline cost per task. Fargate’s smallest task size is 256 CPU units (0.25 vCPU) with 512 MiB of memory, and larger sizes come in fixed CPU and memory combinations rather than arbitrary values. Choosing the smallest valid combination that still covers your actual need is key.
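
Picking a size within those increments can be sketched as a lookup. The table below covers the Fargate combinations up to 4 vCPU (larger sizes are left out for brevity), and the function name and the 20% headroom are assumptions for illustration, not a recommendation from AWS.

```python
# Valid Fargate task sizes: CPU units -> allowed memory values in MiB.
# Sizes above 4096 CPU units are omitted from this sketch.
FARGATE_SIZES = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def smallest_fit(cpu_units_needed, memory_mib_needed, headroom=0.2):
    """Return the smallest (cpu, memory) size covering the need plus headroom.

    Returns None when nothing in the table is large enough.
    """
    cpu_target = cpu_units_needed * (1 + headroom)
    mem_target = memory_mib_needed * (1 + headroom)
    for cpu in sorted(FARGATE_SIZES):
        if cpu < cpu_target:
            continue
        for mem in FARGATE_SIZES[cpu]:
            if mem >= mem_target:
                return cpu, mem
    return None
```

With these assumptions, a p95 of 300 CPU units and 700 MiB fits in (512, 1024), while a p95 of 600 CPU units and 1500 MiB still needs (1024, 2048), because 512 CPU units would sit below the observed peak.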

Combining Techniques

The real power comes from combining these. Running a right-sized, ARM-based task on Fargate Spot can yield dramatic cost reductions.

Consider a scenario where you’ve right-sized to cpu: 512, memory: 1024, and are using ARM.

# Combined Fargate Spot, ARM, and Right-Sizing
  Service:
    Type: AWS::ECS::Service
    Properties:
      Cluster: arn:aws:ecs:us-east-1:123456789012:cluster/my-cluster
      ServiceName: my-web-app-optimized
      TaskDefinition: !Ref TaskDefinitionOptimized # Reference the new optimized task definition
      DesiredCount: 3
      CapacityProviderStrategy: # Using Spot; replaces LaunchType
        - CapacityProvider: FARGATE_SPOT
          Weight: 1
      NetworkConfiguration:
        AwsvpcConfiguration:
          Subnets:
            - subnet-abcdef1234567890
            - subnet-fedcba0987654321
          SecurityGroups:
            - sg-0123456789abcdef0
      LoadBalancers:
        - TargetGroupArn: !Ref TargetGroupOptimized
          ContainerName: webapp
          ContainerPort: 80
  TaskDefinitionOptimized:
    Type: AWS::ECS::TaskDefinition
    Properties:
      Family: my-web-app-task-optimized
      RequiresCompatibilities:
        - FARGATE
      NetworkMode: awsvpc
      RuntimePlatform:
        CpuArchitecture: ARM64 # Using ARM
        OperatingSystemFamily: LINUX
      Cpu: "512"             # Right-sized
      Memory: "1024"         # Right-sized
      ContainerDefinitions:
        - Name: webapp
          Image: public.ecr.aws/nginx/nginx:latest
          PortMappings:
            - ContainerPort: 80
              HostPort: 80

You can quantify the savings from the Fargate pricing page. For example, on-demand Linux/x86 Fargate in us-east-1 is listed at roughly $0.04048 per vCPU-hour and $0.004445 per GB-hour; rates vary by region and change over time, so check the current page. Fargate Spot is discounted by up to 70% from on-demand, ARM tasks are billed at lower per-vCPU and per-GB rates than x86, and right-sizing reduces the number of vCPU-hours and GB-hours you consume in the first place.
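
The combined effect can be estimated with a few lines of arithmetic. In this sketch every rate is an assumption to verify against the pricing page (us-east-1 Linux figures are used), the function name is hypothetical, and Spot is modeled as a flat 60% discount even though real Spot prices fluctuate.

```python
HOURS_PER_MONTH = 730

# Assumed us-east-1 Linux rates in USD; verify on the Fargate pricing page.
X86_VCPU_HOUR = 0.04048
X86_GB_HOUR = 0.004445
ARM_VCPU_HOUR = 0.03238
ARM_GB_HOUR = 0.00356


def monthly_cost(tasks, cpu_units, memory_mib, vcpu_rate, gb_rate, spot_discount=0.0):
    """Estimate the monthly cost of always-on tasks at the given size and rates."""
    vcpu = cpu_units / 1024
    gb = memory_mib / 1024
    hourly = tasks * (vcpu * vcpu_rate + gb * gb_rate)
    return hourly * HOURS_PER_MONTH * (1 - spot_discount)


# Three on-demand x86 tasks at 1024 CPU units / 2048 MiB.
baseline = monthly_cost(3, 1024, 2048, X86_VCPU_HOUR, X86_GB_HOUR)
# Three right-sized ARM tasks on Spot, with an assumed flat 60% discount.
optimized = monthly_cost(3, 512, 1024, ARM_VCPU_HOUR, ARM_GB_HOUR, spot_discount=0.60)

print(f"baseline:  ${baseline:.2f}/month")
print(f"optimized: ${optimized:.2f}/month")
```

Under these assumed rates the three-task service drops from roughly $108 to roughly $17 per month. The exact figures matter less than the structure: the three techniques multiply rather than add.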

A common pitfall when optimizing is looking at only one of CPU or memory. Many applications are memory-bound. If your application has a fixed memory overhead for its runtime (e.g., JVM heap, Python interpreter) and then uses memory dynamically, provisioning memory slightly above the peak dynamic usage plus a buffer is often more stable than aggressively capping it. If you cap memory too low, the container will be killed for exceeding its memory limit and the task will restart, which is worse than paying a little more for stability.

The next challenge you’ll face is managing the interruptions inherent in Fargate Spot and ensuring your application’s resilience.
