AWS S3 Object Lambda is one of the most innovative yet underutilized AWS services. It lets you intercept S3 GET requests and transform data on the fly with a Lambda function, without storing multiple copies of the data. Use cases include PII redaction, image resizing, format conversion, decompression, and dynamic watermarking. This guide covers architecture patterns, implementation details, performance considerations, and production deployment strategies.
Architecture Deep Dive
S3 Object Lambda works by inserting a Lambda function between the S3 bucket and the client. Requests are routed through an “Object Lambda Access Point” rather than directly to the bucket.
flowchart LR
Client["Application"] --> OLAP["Object Lambda Access Point"]
OLAP --> Lambda["Transform Lambda"]
Lambda --> SAP["Supporting Access Point"]
SAP --> S3["S3 Bucket (Original Data)"]
Lambda --> Response["Transformed Response"]
Response --> Client
style Lambda fill:#FFF3E0,stroke:#E65100
style OLAP fill:#E1F5FE,stroke:#0277BD
The flow is:
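- Step 1: Client requests an object via the Object Lambda Access Point
- Step 2: S3 invokes your Lambda function with an event that carries request metadata, a presigned URL for the original object (inputS3Url), and a route and token for the response
- Step 3: Lambda fetches the original object through the Supporting Access Point using that presigned URL
- Step 4: Lambda transforms the data
- Step 5: Lambda writes the transformed data back via WriteGetObjectResponse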
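The fields a handler needs from the event in Step 2 can be pulled out in one place. The sketch below is illustrative only: the `GetObjectRequest` dataclass, the `parse_event` helper, and the sample event values are hypothetical names and data, while the event keys (`getObjectContext`, `inputS3Url`, `outputRoute`, `outputToken`, `userRequest`) are the ones used by the handlers in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GetObjectRequest:
    """The parts of an Object Lambda event a transform handler typically needs."""
    input_s3_url: str   # presigned URL for the original object
    output_route: str   # passed back as RequestRoute
    output_token: str   # passed back as RequestToken
    user_url: str       # URL the client originally requested


def parse_event(event: dict) -> GetObjectRequest:
    """Extract routing details from an Object Lambda event (hypothetical helper)."""
    ctx = event["getObjectContext"]
    return GetObjectRequest(
        input_s3_url=ctx["inputS3Url"],
        output_route=ctx["outputRoute"],
        output_token=ctx["outputToken"],
        user_url=event["userRequest"]["url"],
    )


# Trimmed, made-up event for illustration; real events carry more fields.
SAMPLE_EVENT = {
    "getObjectContext": {
        "inputS3Url": "https://example.invalid/presigned",
        "outputRoute": "io-example-route",
        "outputToken": "example-token",
    },
    "userRequest": {"url": "https://example.invalid/customers.json"},
}

request = parse_event(SAMPLE_EVENT)
```

Keeping this parsing separate from the transformation makes it easier to test each step of the flow without invoking S3.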
Use Case 1: PII Redaction
Imagine storing customer records in S3, but different applications need different levels of access. A support dashboard should see masked SSNs, while the billing system needs full access. Object Lambda enables this without duplicating data:
import re

import boto3
import requests  # not part of the Lambda Python runtime; bundle it in the package or a layer

s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Routing details and the presigned URL for the original object
    object_get_context = event['getObjectContext']
    request_route = object_get_context['outputRoute']
    request_token = object_get_context['outputToken']
    s3_url = object_get_context['inputS3Url']

    # Fetch original object; raise on HTTP errors instead of redacting an error body
    response = requests.get(s3_url, timeout=10)
    response.raise_for_status()
    original_data = response.text

    # Redact SSN patterns (XXX-XX-XXXX)
    redacted_data = re.sub(
        r'\b\d{3}-\d{2}-\d{4}\b',
        'XXX-XX-XXXX',
        original_data
    )

    # Redact credit card patterns (16 digits, optionally separated by spaces or hyphens)
    redacted_data = re.sub(
        r'\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b',
        'XXXX-XXXX-XXXX-XXXX',
        redacted_data
    )

    # Write transformed object back to the caller
    s3.write_get_object_response(
        Body=redacted_data.encode('utf-8'),
        RequestRoute=request_route,
        RequestToken=request_token
    )
    return {'statusCode': 200}
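The redaction rules are easier to verify when they live in a pure function that can be exercised without S3 or Lambda. The following is a minimal sketch using the same two patterns as the handler; the `redact` function name and the sample string are made up for illustration.

```python
import re

# Same patterns as the handler above, compiled once at import time.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b")


def redact(text: str) -> str:
    """Mask SSN-shaped and 16-digit card-shaped substrings."""
    text = SSN_PATTERN.sub("XXX-XX-XXXX", text)
    return CARD_PATTERN.sub("XXXX-XXXX-XXXX-XXXX", text)


sample = "SSN 123-45-6789, card 4111 1111 1111 1111"
print(redact(sample))  # SSN XXX-XX-XXXX, card XXXX-XXXX-XXXX-XXXX
```

Pattern matching of this kind only catches values in the expected format, so it is worth testing against samples of your real record layouts before relying on it.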
Use Case 2: Dynamic Image Resizing
Store one high-resolution image and serve different sizes based on query parameters:
import io
from urllib.parse import parse_qs, urlsplit

import boto3
import requests  # bundle with the deployment package or a layer, like Pillow
from PIL import Image

s3 = boto3.client('s3')

def lambda_handler(event, context):
    object_get_context = event['getObjectContext']
    request_route = object_get_context['outputRoute']
    request_token = object_get_context['outputToken']
    s3_url = object_get_context['inputS3Url']

    # Parse query parameters (e.g., ?width=200&height=200);
    # a URL without a query string falls back to the defaults
    params = parse_qs(urlsplit(event['userRequest']['url']).query)
    width = int(params.get('width', ['800'])[0])
    height = int(params.get('height', ['600'])[0])

    # Download original image
    response = requests.get(s3_url, timeout=10)
    response.raise_for_status()
    image = Image.open(io.BytesIO(response.content))

    # Resize to the exact requested size (aspect ratio is not preserved)
    image = image.resize((width, height), Image.LANCZOS)

    # JPEG has no alpha channel, so convert RGBA or palette images first
    if image.mode != 'RGB':
        image = image.convert('RGB')

    # Convert to bytes
    buffer = io.BytesIO()
    image.save(buffer, format='JPEG', quality=85)

    s3.write_get_object_response(
        Body=buffer.getvalue(),
        RequestRoute=request_route,
        RequestToken=request_token,
        ContentType='image/jpeg'
    )
    return {'statusCode': 200}
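Because width and height come straight from the caller, it may be worth validating them before resizing. The sketch below is one possible approach, not part of the handler above: `requested_size` and the `MAX_DIMENSION` limit are hypothetical, and the 4096-pixel cap is an arbitrary assumption chosen for illustration.

```python
from urllib.parse import parse_qs, urlsplit

MAX_DIMENSION = 4096  # hypothetical upper bound to keep memory use predictable


def requested_size(url: str, default: tuple = (800, 600)) -> tuple:
    """Read width/height from the query string, falling back to defaults
    for missing or non-numeric values and clamping to 1..MAX_DIMENSION."""
    query = parse_qs(urlsplit(url).query)

    def dimension(name: str, fallback: int) -> int:
        try:
            value = int(query[name][0])
        except (KeyError, ValueError):
            return fallback
        return max(1, min(value, MAX_DIMENSION))

    return dimension("width", default[0]), dimension("height", default[1])


size = requested_size("https://example.invalid/photo.jpg?width=200&height=100")
```

Clamping keeps a single request such as `?width=100000` from forcing the function to allocate an enormous image.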
Infrastructure Setup with Terraform
resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"
}

# Standard access point the Lambda reads the original object through
resource "aws_s3_access_point" "supporting" {
  bucket = aws_s3_bucket.data.id
  name   = "supporting-ap"
}

# Object Lambda Access Point that routes GetObject through the transform function
resource "aws_s3control_object_lambda_access_point" "redact" {
  name = "redact-pii-ap"

  configuration {
    supporting_access_point = aws_s3_access_point.supporting.arn

    transformation_configuration {
      actions = ["GetObject"]

      content_transformation {
        aws_lambda {
          function_arn = aws_lambda_function.redact.arn
        }
      }
    }
  }
}

# Note: callers of the Object Lambda Access Point also need
# lambda:InvokeFunction on this function in their own IAM policies.
resource "aws_lambda_permission" "allow_s3" {
  statement_id  = "AllowS3ObjectLambda"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.redact.function_name
  principal     = "s3-object-lambda.amazonaws.com"
  source_arn    = aws_s3control_object_lambda_access_point.redact.arn
}
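Once the access point exists, clients read through it by passing the Object Lambda Access Point ARN where a bucket name would normally go. The sketch below shows that call shape; the helper names, region, account ID, and object key are placeholders, and the client is passed in so the function can be exercised with a stub.

```python
def object_lambda_arn(region: str, account_id: str, name: str) -> str:
    """Build the ARN of an Object Lambda Access Point."""
    return f"arn:aws:s3-object-lambda:{region}:{account_id}:accesspoint/{name}"


def get_transformed(s3_client, arn: str, key: str) -> bytes:
    """Fetch an object through the access point; the ARN is used as the Bucket."""
    response = s3_client.get_object(Bucket=arn, Key=key)
    return response["Body"].read()


# Typical use, assuming boto3 is installed and credentials are configured:
#   import boto3
#   arn = object_lambda_arn("us-east-1", "111122223333", "redact-pii-ap")
#   data = get_transformed(boto3.client("s3"), arn, "customers/records.json")
```

Applications that need the untransformed data keep reading from the bucket or the supporting access point, so the same object serves both views.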
Performance Considerations
- Latency: Object Lambda adds 50-200ms latency depending on Lambda cold start and transformation complexity
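- Memory: The examples above buffer the entire object in memory. For large files, stream the object through the function in chunks instead
- Concurrency: Lambda concurrency limits apply. For high-throughput use cases, request a concurrency limit increase and consider Provisioned Concurrency to reduce cold starts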
- Cost: You pay for Lambda invocations + S3 requests + data transfer
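To illustrate the streaming advice in the Memory bullet, here is a minimal sketch of a chunked transform, using gzip decompression as the example. It relies only on the standard library; `gunzip_chunks` is a hypothetical helper. In a handler, the input chunks could come from `requests.get(url, stream=True).iter_content(chunk_size=...)`. How the output is handed to `write_get_object_response` as a streaming body is left out here and should be checked against the boto3 documentation.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def gunzip_chunks(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Decompress a gzip stream piece by piece, so only one chunk
    (plus zlib's internal window) is held in memory at a time."""
    # Adding 16 to the window size tells zlib to expect gzip framing.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    for chunk in chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    tail = decompressor.flush()
    if tail:
        yield tail


# Simulate a large object arriving in 64-byte network chunks.
original = b"customer record\n" * 1000
compressed = gzip.compress(original)
pieces = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
restored = b"".join(gunzip_chunks(pieces))
```

Transforms that match patterns across chunk boundaries, such as the regex redaction earlier, need extra care because a match can straddle two chunks.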
Security Best Practices
- Grant s3-object-lambda:WriteGetObjectResponse only to the specific Lambda
- Use VPC endpoints if Lambda needs to access private resources
- Enable CloudTrail for access point requests
- Limit who can create Object Lambda Access Points via SCPs
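As a sketch of the first practice, the policy attached to the transform function's execution role can be limited to the single Object Lambda action it needs. The helper name and `Sid` below are made up, and using `"*"` as the resource is an assumption to confirm against the current AWS documentation for this action.

```python
import json


def write_response_policy() -> dict:
    """IAM policy document allowing only WriteGetObjectResponse (illustrative)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowObjectLambdaResponse",
                "Effect": "Allow",
                "Action": "s3-object-lambda:WriteGetObjectResponse",
                "Resource": "*",  # assumption; verify whether narrower scoping applies
            }
        ],
    }


policy_json = json.dumps(write_response_policy(), indent=2)
```

The function's role also needs the usual CloudWatch Logs permissions, which are omitted here to keep the example focused.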
Key Takeaways
- Object Lambda transforms data on-read without storing duplicates
- Use for PII redaction, image resizing, format conversion
- Expect 50-200ms additional latency
- Transforms that buffer the whole object must fit it in Lambda memory; stream large objects instead
- Combine with IAM policies to serve different views to different consumers