Azure Functions Flex Consumption: Complete Guide

Azure Functions Flex Consumption is the newest hosting tier, combining the best of Consumption (pay-per-use, scale-to-zero) and Premium (always-ready instances, VNET support). Now in General Availability, Flex Consumption addresses the main pain points of both existing tiers: Consumption’s cold starts and Premium’s minimum monthly cost. This guide covers when to choose Flex, configuration strategies, and migration from existing plans.

Tier Comparison

Feature           Consumption   Premium       Flex Consumption
Scale to Zero     ✅            ❌ (min 1)    ✅
Always Ready      ❌            ✅            ✅ (configurable)
VNET              ❌            ✅            ✅
Instance Memory   1.5 GB        3.5-14 GB     2-4 GB
Max Timeout       10 min        60+ min       30 min
Min Monthly Cost  $0            ~$150         $0

Configuration

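// flexPlan, storageAccount, vnetSubnet, and location are declared elsewhere in the template.
// functionAppConfig requires API version 2023-12-01 or later.
resource flexFunctionApp 'Microsoft.Web/sites@2023-12-01' = {
  name: 'myapp-func'
  location: location
  kind: 'functionapp,linux'
  // System-assigned identity used by the deployment storage authentication below
  identity: {
    type: 'SystemAssigned'
  }
  properties: {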
    serverFarmId: flexPlan.id
    functionAppConfig: {
      deployment: {
        storage: {
          type: 'blobContainer'
          value: '${storageAccount.properties.primaryEndpoints.blob}deployments'
          authentication: {
            type: 'SystemAssignedIdentity'
          }
        }
      }
      scaleAndConcurrency: {
        alwaysReady: [
          {
            name: 'http'
            instanceCount: 2  // Keep 2 instances warm
          }
        ]
        maximumInstanceCount: 100
        instanceMemoryMB: 2048
        triggers: {
          http: {
            perInstanceConcurrency: 16
          }
        }
      }
      runtime: {
        name: 'dotnet-isolated'
        version: '8.0'
      }
    }
    virtualNetworkSubnetId: vnetSubnet.id
  }
}

Always Ready Instances

The key innovation is per-trigger “Always Ready” configuration. You can keep HTTP triggers warm while letting queue triggers scale to zero:

{
  "alwaysReady": [
    { "name": "http", "instanceCount": 1 },
    { "name": "serviceBus", "instanceCount": 0 }
  ]
}

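Always Ready instances stay provisioned and are billed at a baseline rate even when idle, with an additional execution charge while they process requests. Only on-demand instances scale to zero, so set instanceCount to 0 for any trigger that should cost nothing when there is no traffic.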

Concurrency Model

Flex Consumption uses per-instance concurrency rather than per-function:

flowchart TB
    subgraph Instance1 ["Instance 1 (concurrency: 16)"]
        R1["Request 1"]
        R2["Request 2"]
        R3["..."]
        R16["Request 16"]
    end
    
    subgraph Instance2 ["Instance 2 (concurrency: 16)"]
        R17["Request 17"]
        R18["Request 18"]
    end
    
    LB["Load Balancer"] --> Instance1
    LB --> Instance2
    
    style Instance1 fill:#E1F5FE,stroke:#0277BD

With perInstanceConcurrency: 16, each instance handles up to 16 concurrent requests. When all instances are saturated, Flex scales out.
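The relationship between load and instance count can be sketched with a little arithmetic. The function below is a simplified model, not the platform's actual scaling algorithm: it assumes requests are packed evenly across instances, and the name `instances_needed` and its parameters are hypothetical, chosen to mirror the settings in the configuration above.

```python
import math


def instances_needed(
    concurrent_requests: int,
    per_instance_concurrency: int = 16,
    always_ready: int = 0,
    maximum_instance_count: int = 100,
) -> int:
    """Estimate how many instances a given level of concurrency implies.

    Simplified model (an assumption): each instance is filled to
    per_instance_concurrency before another one is added.
    """
    if concurrent_requests < 0:
        raise ValueError("concurrent_requests must be non-negative")
    if per_instance_concurrency < 1:
        raise ValueError("per_instance_concurrency must be at least 1")

    # Instances required to serve the load, rounded up.
    demand = math.ceil(concurrent_requests / per_instance_concurrency)

    # Never drop below the always-ready floor, never exceed the configured cap.
    return min(max(demand, always_ready), maximum_instance_count)


if __name__ == "__main__":
    # 18 concurrent requests at concurrency 16 need a second instance,
    # matching the diagram above.
    print(instances_needed(18))
    # With no traffic, the always-ready floor still applies.
    print(instances_needed(0, always_ready=2))
```

Lowering `per_instance_concurrency` in this model increases the instance count for the same load, which is the trade-off to weigh for CPU- or memory-heavy handlers.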

Migration from Premium

Flex Consumption can replace Premium for many workloads at lower cost:

  • Same: VNET integration, managed identity, private endpoints
  • Different: No Durable Functions support at the time of writing (verify against current documentation), 30-min timeout limit
  • Cost savings: Pay only for active execution time
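Whether the move actually saves money depends on volume. The sketch below estimates a usage-based monthly bill and the number of executions at which it would match a fixed Premium cost. It is a rough model under stated assumptions: the rates passed in are placeholders, not published Azure prices, and the model ignores free grants and any baseline charge for Always Ready instances. The function names are hypothetical.

```python
def flex_monthly_cost(
    executions_per_month: float,
    avg_duration_s: float,
    memory_gb: float,
    rate_per_gb_second: float,
    rate_per_million_executions: float = 0.0,
) -> float:
    """Usage-based cost: GB-seconds consumed plus a per-execution charge."""
    gb_seconds = executions_per_month * avg_duration_s * memory_gb
    execution_charge = executions_per_month / 1_000_000 * rate_per_million_executions
    return gb_seconds * rate_per_gb_second + execution_charge


def break_even_executions(
    premium_monthly_cost: float,
    avg_duration_s: float,
    memory_gb: float,
    rate_per_gb_second: float,
    rate_per_million_executions: float = 0.0,
) -> float:
    """Executions per month at which usage-based cost equals the fixed cost."""
    cost_per_execution = (
        avg_duration_s * memory_gb * rate_per_gb_second
        + rate_per_million_executions / 1_000_000
    )
    if cost_per_execution <= 0:
        raise ValueError("rates must produce a positive cost per execution")
    return premium_monthly_cost / cost_per_execution


if __name__ == "__main__":
    # Placeholder rates for illustration only; look up current pricing.
    RATE_PER_GB_SECOND = 0.00002
    RATE_PER_MILLION = 0.40

    cost = flex_monthly_cost(1_000_000, 0.5, 2.0, RATE_PER_GB_SECOND, RATE_PER_MILLION)
    limit = break_even_executions(150.0, 0.5, 2.0, RATE_PER_GB_SECOND, RATE_PER_MILLION)
    print(f"Estimated usage-based cost: ${cost:.2f}")
    print(f"Break-even vs. $150 fixed cost: {limit:,.0f} executions/month")
```

Below the break-even volume the usage-based plan is cheaper in this model; above it, a fixed-cost plan may win, particularly once Always Ready instances are added.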

Key Takeaways

  • Flex Consumption combines Consumption pricing with Premium features
  • Configure Always Ready per trigger type
  • VNET integration available without Premium pricing
  • 30-minute timeout limit (shorter than Premium)
  • Ideal for HTTP APIs that need occasional warm instances
