Azure Functions Flex Consumption is the newest hosting tier, combining the best of Consumption (pay-per-use, scale-to-zero) and Premium (always-ready instances, VNET support). Now in General Availability, Flex Consumption addresses the main pain points of both existing tiers: Consumption’s cold starts and Premium’s minimum monthly cost. This guide covers when to choose Flex, configuration strategies, and migration from existing plans.
Tier Comparison
| Feature | Consumption | Premium | Flex Consumption |
|---|---|---|---|
| Scale to Zero | ✅ | ❌ (min 1) | ✅ |
| Always Ready | ❌ | ✅ | ✅ (configurable) |
| VNET | ❌ | ✅ | ✅ |
| Instance Memory | 1.5 GB | 3.5-14 GB | 2-4 GB |
| Max Timeout | 10 min | Unbounded (30 min default) | Unbounded (30 min default) |
| Min Monthly Cost | $0 | ~$150 | $0 |
Configuration
// Assumes `location`, `flexPlan`, `storageAccount`, and `vnetSubnet` are declared elsewhere in the template.
// `functionAppConfig` requires API version 2023-12-01 or later.
resource flexFunctionApp 'Microsoft.Web/sites@2023-12-01' = {
  name: 'myapp-func'
  location: location
  kind: 'functionapp,linux'
  identity: {
    type: 'SystemAssigned' // Needed for the SystemAssignedIdentity deployment auth below
  }
  properties: {
    serverFarmId: flexPlan.id
    functionAppConfig: {
      deployment: {
        storage: {
          type: 'blobContainer'
          value: '${storageAccount.properties.primaryEndpoints.blob}deployments'
          authentication: {
            type: 'SystemAssignedIdentity'
          }
        }
      }
      scaleAndConcurrency: {
        alwaysReady: [
          {
            name: 'http'
            instanceCount: 2 // Keep 2 instances warm
          }
        ]
        maximumInstanceCount: 100
        instanceMemoryMB: 2048
        triggers: {
          http: {
            perInstanceConcurrency: 16
          }
        }
      }
      runtime: {
        name: 'dotnet-isolated'
        version: '8.0'
      }
    }
    virtualNetworkSubnetId: vnetSubnet.id
  }
}
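The `flexPlan` symbol referenced by `serverFarmId` is not defined in the snippet above. A minimal sketch of what such a plan resource might look like follows. The resource name `myapp-plan` and the `location` parameter are assumptions for illustration, and the API version is worth checking against the current Bicep reference.

```bicep
// Hypothetical parameter; the snippet above assumes `location` already exists.
param location string = resourceGroup().location

// Flex Consumption plan: the FC1 SKU in the FlexConsumption tier.
resource flexPlan 'Microsoft.Web/serverfarms@2023-12-01' = {
  name: 'myapp-plan' // hypothetical name
  location: location
  kind: 'functionapp'
  sku: {
    tier: 'FlexConsumption'
    name: 'FC1'
  }
  properties: {
    reserved: true // marks the plan as Linux
  }
}
```

With a plan like this in the same template, `flexPlan.id` resolves to the plan's resource ID.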
Always Ready Instances
The key innovation is per-trigger “Always Ready” configuration. You can keep HTTP triggers warm while letting queue triggers scale to zero:
{
  "alwaysReady": [
    { "name": "http", "instanceCount": 1 },
    { "name": "serviceBus", "instanceCount": 0 }
  ]
}
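Always Ready instances are billed at a baseline rate for as long as they are provisioned, even when idle, so a trigger with a non-zero count never scales to zero. Only triggers without Always Ready instances scale to zero and cost nothing between executions.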
Concurrency Model
Flex Consumption uses per-instance concurrency rather than per-function:
flowchart TB
    subgraph Instance1 ["Instance 1 (concurrency: 16)"]
        R1["Request 1"]
        R2["Request 2"]
        R3["..."]
        R16["Request 16"]
    end
    subgraph Instance2 ["Instance 2 (concurrency: 16)"]
        R17["Request 17"]
        R18["Request 18"]
    end
    LB["Load Balancer"] --> Instance1
    LB --> Instance2
    style Instance1 fill:#E1F5FE,stroke:#0277BD
With perInstanceConcurrency: 16, each instance handles up to 16 concurrent requests. When all instances are saturated, Flex scales out.
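As a rough sizing aid, the rule above can be written as a ceiling division. Here $N$ (peak concurrent requests) and $c$ (the `perInstanceConcurrency` value) are illustrative symbols, and the load of 100 concurrent requests is an assumed figure, not a measurement. The platform's actual scaling decisions depend on factors this estimate ignores.

```latex
\[
\text{instances} = \left\lceil \frac{N}{c} \right\rceil,
\qquad
N = 100,\; c = 16
\;\Rightarrow\;
\left\lceil \frac{100}{16} \right\rceil = \lceil 6.25 \rceil = 7
\]
```

Under the same assumptions, the two warm HTTP instances in the Bicep configuration can absorb up to $2 \times 16 = 32$ concurrent requests before any scale-out is needed, and `maximumInstanceCount: 100` caps capacity at $100 \times 16 = 1600$ concurrent requests.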
Migration from Premium
Flex Consumption can replace Premium for many workloads at lower cost:
- Same: VNET integration, managed identity, private endpoints
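- Different: smaller instances (2-4 GB vs. up to 14 GB); Durable Functions is supported, and the 30-minute timeout is a default rather than a hard limit
- Cost savings: On-demand instances are billed only for active execution time; Always Ready instances add a baseline charge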
Key Takeaways
- Flex Consumption combines Consumption pricing with Premium features
- Configure Always Ready per trigger type
- VNET integration available without Premium pricing
- 30-minute default timeout, configurable as on Premium
- Ideal for HTTP APIs that need occasional warm instances