AWS Step Functions Distributed Map: Massive Parallel Processing

AWS Step Functions Distributed Map, announced at re:Invent 2022, runs up to 10,000 child workflow executions in parallel from a single Map state, compared with the 40 concurrent iterations an Inline Map supports. This makes Step Functions viable for large-scale ETL, data processing, and batch workflows that previously required custom orchestration or EMR. This guide covers architecture patterns, S3 integration, and cost optimization for high-volume processing.

Inline Map vs Distributed Map

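| Feature         | Inline Map           | Distributed Map     |
|-----------------|----------------------|---------------------|
| Max Concurrency | 40                   | 10,000              |
| Max Items       | ~40,000              | Unlimited (S3)      |
| Input Source    | JSON array in state  | S3 objects/manifest |
| Output          | JSON array           | S3 (ResultWriter)   |
| Billing         | Per state transition | Per child execution |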

Architecture Pattern: S3 Inventory Processing

flowchart TB
    S3Inventory["S3 Inventory (CSV)"] --> DistMap["Distributed Map (10K concurrent)"]
    
    subgraph ChildExecutions ["Child Executions"]
        CE1["Process File 1"]
        CE2["Process File 2"]
        CE3["..."]
        CE10000["Process File 10000"]
    end
    
    DistMap --> ChildExecutions
    ChildExecutions --> ResultWriter["ResultWriter (S3)"]
    ResultWriter --> S3Output["S3 Output Bucket"]
    
    style DistMap fill:#FFF3E0,stroke:#E65100

State Machine Definition

{
  "StartAt": "ProcessS3Objects",
  "States": {
    "ProcessS3Objects": {
      "Type": "Map",
      "ItemReader": {
        "Resource": "arn:aws:states:::s3:getObject",
        "ReaderConfig": {
          "InputType": "CSV",
          "CSVHeaderLocation": "FIRST_ROW"
        },
        "Parameters": {
          "Bucket": "my-inventory-bucket",
          "Key": "manifest.csv"
        }
      },
      "ItemProcessor": {
        "ProcessorConfig": {
          "Mode": "DISTRIBUTED",
          "ExecutionType": "EXPRESS"
        },
        "StartAt": "TransformRecord",
        "States": {
          "TransformRecord": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:...:transform-function",
            "End": true
          }
        }
      },
      "MaxConcurrency": 1000,
      "ResultWriter": {
        "Resource": "arn:aws:states:::s3:putObject",
        "Parameters": {
          "Bucket": "my-output-bucket",
          "Prefix": "results/"
        }
      },
      "End": true
    }
  }
}
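To make the CSV input concrete, here is a minimal Python sketch of how a manifest with a header row maps to items. It assumes that, with CSVHeaderLocation set to FIRST_ROW, each data row becomes one item keyed by the header names. The column names (bucket, key), the sample object keys, and both helper functions are hypothetical and only serve as illustration.

```python
import csv
import io


def build_manifest(rows, fieldnames):
    """Serialize rows to CSV text with a header row (matches FIRST_ROW)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def preview_items(manifest_text):
    """Approximate the per-row items handed to child executions.

    Every value is a string, because CSV carries no type information.
    """
    return [dict(row) for row in csv.DictReader(io.StringIO(manifest_text))]


# Hypothetical manifest contents; upload the text as manifest.csv yourself.
manifest = build_manifest(
    [
        {"bucket": "my-data-bucket", "key": "raw/file-0001.json"},
        {"bucket": "my-data-bucket", "key": "raw/file-0002.json"},
    ],
    fieldnames=["bucket", "key"],
)
items = preview_items(manifest)
```

Previewing items locally like this is a cheap way to check that the transform function receives the field names it expects before starting a large run.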

Key Configuration Options

ItemReader Sources

  • S3 Object List: Process every object in a prefix
  • S3 Manifest: CSV/JSON file listing objects to process
  • S3 Inventory: AWS-generated inventory report
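The three sources differ mainly in the ItemReader resource and the InputType. The sketch below builds the corresponding ItemReader fragment as a Python dict. The function name and the source labels ("object_list", "csv", "json", "inventory") are hypothetical, and the field values reflect my reading of the Amazon States Language documentation, so check them against the current docs before relying on them.

```python
def build_item_reader(source, bucket, key_or_prefix):
    """Return an ItemReader fragment for a Distributed Map state.

    source: "object_list", "csv", "json", or "inventory" (hypothetical labels).
    key_or_prefix: an S3 prefix for "object_list", otherwise an object key.
    """
    if source == "object_list":
        # Iterate over every object under a prefix.
        return {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "Parameters": {"Bucket": bucket, "Prefix": key_or_prefix},
        }

    input_types = {"csv": "CSV", "json": "JSON", "inventory": "MANIFEST"}
    if source not in input_types:
        raise ValueError(f"unknown ItemReader source: {source!r}")

    reader_config = {"InputType": input_types[source]}
    if source == "csv":
        # Take column names from the first row of the file.
        reader_config["CSVHeaderLocation"] = "FIRST_ROW"

    return {
        "Resource": "arn:aws:states:::s3:getObject",
        "ReaderConfig": reader_config,
        "Parameters": {"Bucket": bucket, "Key": key_or_prefix},
    }


reader = build_item_reader("csv", "my-inventory-bucket", "manifest.csv")
```

With these arguments, the result matches the ItemReader block in the state machine definition above.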

ProcessorConfig Options

{
  "Mode": "DISTRIBUTED",
  "ExecutionType": "EXPRESS"
}

ExecutionType accepts "EXPRESS" or "STANDARD". Use EXPRESS for high-volume, short-duration child executions (up to 5 minutes each). Use STANDARD for long-running work or when you need exactly-once semantics.
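The rule of thumb above can be written as a small decision helper. This is only a sketch of the guidance in this section, not an AWS API. The function name and parameters are hypothetical, and the 300-second constant is the 5-minute Express limit mentioned above.

```python
EXPRESS_MAX_SECONDS = 5 * 60  # Express child executions are capped at 5 minutes


def choose_execution_type(max_item_seconds, needs_exactly_once=False):
    """Pick the ExecutionType for ProcessorConfig from two workload traits."""
    if needs_exactly_once or max_item_seconds > EXPRESS_MAX_SECONDS:
        return "STANDARD"
    return "EXPRESS"


processor_config = {
    "Mode": "DISTRIBUTED",
    "ExecutionType": choose_execution_type(max_item_seconds=20),
}
```

Basing the decision on the worst-case item duration rather than the average keeps slow outliers from timing out under EXPRESS.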

Error Handling

{
  "ItemProcessor": {
    "StartAt": "TryProcess",
    "States": {
      "TryProcess": {
        "Type": "Task",
        "Resource": "arn:aws:lambda:...:process",
        "Retry": [
          {
            "ErrorEquals": ["Lambda.TooManyRequestsException"],
            "IntervalSeconds": 1,
            "MaxAttempts": 5,
            "BackoffRate": 2.0
          }
        ],
        "Catch": [
          {
            "ErrorEquals": ["States.ALL"],
            "Next": "RecordFailure"
          }
        ],
        "End": true
      },
      "RecordFailure": {
        "Type": "Pass",
        "Result": {"status": "FAILED"},
        "End": true
      }
    }
  },
  "ToleratedFailurePercentage": 5
}

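ToleratedFailurePercentage lets the Map Run succeed even if some items fail, which is essential for large-scale processing. The run fails only once the share of failed items exceeds the threshold (5% here). Note that an item whose error is caught and routed to RecordFailure completes its child execution successfully, so it does not count toward that percentage. Only child executions that actually fail or time out do.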
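The threshold check can be sketched in a few lines of Python. It assumes the run fails only when the failed share strictly exceeds the tolerated percentage, which is how I read the documentation. The function name is hypothetical.

```python
def map_run_succeeds(total_items, failed_items, tolerated_failure_percentage):
    """Return True while the failed share stays within the tolerated percentage.

    Cross-multiplying avoids division, so total_items == 0 and
    floating-point rounding are not a concern.
    """
    return failed_items * 100 <= tolerated_failure_percentage * total_items


# With a 5% tolerance over 1,000,000 items, up to 50,000 failures are tolerated.
ok = map_run_succeeds(1_000_000, 50_000, 5)
too_many = map_run_succeeds(1_000_000, 50_001, 5)
```

The same arithmetic helps when choosing a threshold: decide how many failed items you can afford to reprocess and work backwards to a percentage.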

Cost Optimization

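Child executions are billed according to their workflow type (list prices vary by region, so check the pricing page):

  • EXPRESS: $1.00 per 1 million requests ($0.000001 each), plus a duration charge based on run time and memory
  • STANDARD: $0.025 per 1,000 state transitions

For 1 million items with EXPRESS child executions, one item per child execution:

1,000,000 items × $0.000001 = ~$1.00 in Step Functions request charges
+ Express duration charges
+ Lambda costs for processing each item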
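For other volumes, the same arithmetic can be wrapped in a small estimator. This is a rough sketch. The per-request price is the figure used in the calculation above, while the per-GB-second duration price and the 64 MB default memory are assumptions that should be checked against the current Step Functions pricing page. The function and constant names are hypothetical, and Lambda costs are not included.

```python
# Figure used in the calculation above.
EXPRESS_PRICE_PER_REQUEST = 0.000001
# Assumption: not stated in this article; verify before use.
EXPRESS_PRICE_PER_GB_SECOND = 0.00001667


def estimate_express_cost(items, avg_seconds=0.0, memory_mb=64):
    """Estimate Step Functions charges for EXPRESS child executions.

    Assumes one child execution per item (no batching).
    """
    request_cost = items * EXPRESS_PRICE_PER_REQUEST
    gb_seconds = items * avg_seconds * (memory_mb / 1024)
    duration_cost = gb_seconds * EXPRESS_PRICE_PER_GB_SECOND
    return {
        "requests": request_cost,
        "duration": duration_cost,
        "total": request_cost + duration_cost,
    }


estimate = estimate_express_cost(1_000_000, avg_seconds=2.0)
print(f"requests ${estimate['requests']:.2f}, total ${estimate['total']:.2f}")
```

Keeping the request and duration components separate makes it easy to see which one dominates as item duration grows.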

Key Takeaways

  • Distributed Map enables up to 10,000 concurrent child executions
  • Read directly from S3 objects, manifests, or inventory
  • Use EXPRESS mode for high-volume, short processing
  • ResultWriter aggregates output to S3
  • ToleratedFailurePercentage prevents total workflow failure
