String-based logs are grep nightmares. “User 12345 failed
login” across millions of log lines? Good luck filtering, aggregating, or alerting on that. Structured logging
transforms logs from unstructured text into queryable JSON, enabling instant filtering, powerful aggregations, and
automated alerting—turning log chaos into observability.
This guide covers production-ready structured logging
patterns that enable fast debugging, performance analysis, and proactive monitoring. We’ll build observable systems
where finding issues takes seconds, not hours.
Why Structured Logging Transforms Observability
The String Log Problem
Unstructured logs suffer from:
- Hard to search: Grepping for a user ID across inconsistent formats
- No aggregation: Can’t count errors by type
- Fragile parsing: Log format changes break queries
- No correlation: Can’t link related events
- Poor performance: Full-text search on millions of lines
- No alerting: Can’t reliably detect patterns
Structured Logging Benefits
- Instant filtering: WHERE userId = '12345'
- Powerful aggregation: Count errors by type, user, endpoint
- Fast queries: Index on any field
- Correlation: TraceId links entire request chain
- Automated alerts: Trigger on specific field values
- Machine readable: Tools can parse automatically
Pattern 1: Basic Structured Logging
JSON Instead of Strings
using Microsoft.Extensions.Logging;
using Serilog;
using Serilog.Formatting.Json;
// ❌ BAD: Unstructured string logging
logger.LogInformation($"User {userId} placed order {orderId} for ${amount}");
// Output: "User 12345 placed order ORD-999 for $49.99"
// Hard to query, parse, or filter
// ✅ GOOD: Structured logging with message templates
// (named placeholders become queryable properties)
logger.LogInformation(
    "User {UserId} placed order {OrderId} for {Amount}",
    userId, orderId, amount
);
// Same template syntax with Serilog's static API (format specifiers work too)
Log.Information(
    "User {UserId} placed order {OrderId} for {Amount:C}",
    userId, orderId, amount
);
// JSON output (shape produced by Serilog's JsonFormatter):
// {
//   "Timestamp": "2024-01-15T10:30:45.123Z",
//   "Level": "Information",
//   "MessageTemplate": "User {UserId} placed order {OrderId} for {Amount:C}",
//   "Properties": {
//     "UserId": "12345",
//     "OrderId": "ORD-999",
//     "Amount": 49.99
//   }
// }
// Now can query: WHERE Properties.UserId = '12345'
// Or aggregate: COUNT(*) GROUP BY Properties.OrderId
// Configure Serilog for JSON output
Log.Logger = new LoggerConfiguration()
.WriteTo.Console(new JsonFormatter())
.WriteTo.File(
new JsonFormatter(),
"logs/app-.json",
rollingInterval: RollingInterval.Day
)
.CreateLogger();
Pattern 2: Enrichment and Context
Add Contextual Information Automatically
// Using Winston with structured logging
const { randomUUID } = require('crypto');
const winston = require('winston');
const logger = winston.createLogger({
format: winston.format.combine(
winston.format.timestamp(),
winston.format.errors({ stack: true }),
winston.format.json()
),
defaultMeta: {
service: 'order-service',
environment: process.env.NODE_ENV,
version: process.env.APP_VERSION
},
transports: [
new winston.transports.Console(),
new winston.transports.File({ filename: 'app.log' })
]
});
// Middleware to enrich with request context
function loggingMiddleware(req, res, next) {
// Generate trace ID for request correlation
const traceId = req.headers['x-trace-id'] || randomUUID();
// Create child logger with request context
req.logger = logger.child({
traceId,
userId: req.user?.id,
requestId: randomUUID(),
ip: req.ip,
userAgent: req.headers['user-agent'],
path: req.path,
method: req.method
});
// Log request
req.logger.info('Request started', {
query: req.query,
body: sanitize(req.body)
});
// Track duration
const startTime = Date.now();
res.on('finish', () => {
const duration = Date.now() - startTime;
req.logger.info('Request completed', {
statusCode: res.statusCode,
duration,
responseSize: res.get('content-length')
});
});
next();
}
// Usage in handlers
app.post('/orders', async (req, res) => {
try {
const order = await createOrder(req.body);
// Logger already has traceId, userId, etc.
req.logger.info('Order created', {
orderId: order.id,
amount: order.total,
itemCount: order.items.length
});
res.json(order);
} catch (error) {
req.logger.error('Order creation failed', {
error: error.message,
stack: error.stack,
orderData: req.body
});
res.status(500).json({ error: 'Failed to create order' });
}
});
// Every log now includes:
// - traceId (correlate entire request)
// - userId
// - service name
// - environment
// - request details
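One note on the middleware above: it logs the request body through a sanitize() helper that the snippet never defines. Here is a minimal sketch; the redacted field list is an assumption to adapt to your payloads:

// Minimal sketch of the sanitize() helper used in loggingMiddleware above.
// The field list is an assumption - extend it for your payloads.
const SENSITIVE_FIELDS = new Set(['password', 'creditCard', 'ssn', 'token']);

function sanitize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(sanitize);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(([key, val]) =>
        SENSITIVE_FIELDS.has(key) ? [key, '[REDACTED]'] : [key, sanitize(val)]
      )
    );
  }
  return value;
}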
Pattern 3: Correlation with Distributed Tracing
Link Logs Across Microservices
import json
import logging
import uuid
from contextvars import ContextVar
from typing import Optional

import httpx

# Context variable for trace ID (safe across threads and async tasks)
trace_id_context: ContextVar[Optional[str]] = ContextVar('trace_id', default=None)
# Custom log formatter
class StructuredFormatter(logging.Formatter):
def format(self, record):
log_data = {
'timestamp': self.formatTime(record),
'level': record.levelname,
'logger': record.name,
'message': record.getMessage(),
'trace_id': trace_id_context.get(),
'module': record.module,
'function': record.funcName,
'line': record.lineno
}
# Include custom fields passed via `extra` (covers the calls below)
for key in ('user_id', 'order_id', 'service', 'operation',
            'status_code', 'duration'):
    if hasattr(record, key):
        log_data[key] = getattr(record, key)
if record.exc_info:
log_data['exception'] = self.formatException(record.exc_info)
return json.dumps(log_data)
# Setup structured logging
handler = logging.StreamHandler()
handler.setFormatter(StructuredFormatter())
logger = logging.getLogger('app')
logger.addHandler(handler)
logger.setLevel(logging.INFO)
# Middleware to set trace ID
class TraceMiddleware:
def __init__(self, app):
self.app = app
async def __call__(self, scope, receive, send):
# Get or generate trace ID
headers = dict(scope.get('headers', []))
trace_id = headers.get(b'x-trace-id', str(uuid.uuid4()).encode()).decode()
# Set in context
trace_id_context.set(trace_id)
# Add to response headers
async def send_with_trace(message):
if message['type'] == 'http.response.start':
message['headers'].append((b'x-trace-id', trace_id.encode()))
await send(message)
await self.app(scope, receive, send_with_trace)
# Service A calls Service B
async def call_service_b(data):
trace_id = trace_id_context.get()
logger.info('Calling service B', extra={
'service': 'service-b',
'operation': 'process_order'
})
# Pass trace ID to next service
async with httpx.AsyncClient() as client:
response = await client.post(
'http://service-b/process',
json=data,
headers={'x-trace-id': trace_id}
)
logger.info('Service B responded', extra={
'status_code': response.status_code,
'duration': response.elapsed.total_seconds()
})
return response.json()
# Now all logs have same trace_id - can follow request across services!
# Query: WHERE trace_id = 'abc-123' to see entire request flow
Pattern 4: Performance Metrics in Logs
Embed Timing and Performance Data
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;
import java.util.UUID;
// kv() from logstash-logback-encoder renders arguments as JSON fields
import static net.logstash.logback.argument.StructuredArguments.kv;
public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);
    // Collaborators used below (declared here so the example compiles)
    private final CustomerRepository customerRepo;
    private final InventoryService inventoryService;
    private final OrderRepository orderRepo;

    public OrderService(CustomerRepository customerRepo,
                        InventoryService inventoryService,
                        OrderRepository orderRepo) {
        this.customerRepo = customerRepo;
        this.inventoryService = inventoryService;
        this.orderRepo = orderRepo;
    }
public Order createOrder(CreateOrderRequest request) {
// Add context to MDC (Mapped Diagnostic Context)
MDC.put("userId", request.getUserId());
MDC.put("operation", "createOrder");
long startTime = System.currentTimeMillis();
try {
// Database query timing
long dbStart = System.currentTimeMillis();
Customer customer = customerRepo.findById(request.getUserId());
long dbDuration = System.currentTimeMillis() - dbStart;
log.info("Customer lookup completed",
kv("duration_ms", dbDuration),
kv("customer_id", customer.getId()),
kv("customer_tier", customer.getTier())
);
// Validate inventory
long invStart = System.currentTimeMillis();
boolean available = inventoryService.checkAvailability(request.getItems());
long invDuration = System.currentTimeMillis() - invStart;
if (!available) {
log.warn("Insufficient inventory",
kv("item_count", request.getItems().size()),
kv("check_duration_ms", invDuration)
);
throw new InsufficientInventoryException();
}
// Create order
Order order = new Order();
order.setId(UUID.randomUUID().toString());
order.setCustomerId(customer.getId());
order.setTotal(calculateTotal(request.getItems()));
orderRepo.save(order);
long totalDuration = System.currentTimeMillis() - startTime;
log.info("Order created successfully",
kv("order_id", order.getId()),
kv("total_amount", order.getTotal()),
kv("item_count", request.getItems().size()),
kv("total_duration_ms", totalDuration),
kv("db_duration_ms", dbDuration),
kv("inventory_check_ms", invDuration)
);
return order;
} catch (Exception e) {
long totalDuration = System.currentTimeMillis() - startTime;
log.error("Order creation failed",
kv("error_type", e.getClass().getSimpleName()),
kv("error_message", e.getMessage()),
kv("duration_ms", totalDuration),
e
);
throw e;
} finally {
MDC.clear();
}
}
// Note: kv(key, value) is statically imported from StructuredArguments above;
// a hand-rolled Map.Entry would not be emitted as a JSON field by the
// Logstash encoder.
}
// Now can query:
// - AVG(duration_ms) WHERE operation = 'createOrder'
// - MAX(db_duration_ms) WHERE customer_tier = 'VIP'
// - COUNT(*) WHERE error_type = 'InsufficientInventoryException'
Pattern 5: Log Levels and Sampling
Control Verbosity in Production
import pino from 'pino';
const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  // Tag ~99% of debug logs so a downstream shipper can drop them.
  // A pino mixin can only add fields, not suppress lines; hard
  // sampling is handled by SmartLogger below.
  mixin(_mergeObject, level) {
    if (level === 20 && Math.random() > 0.01) { // 20 = debug in pino
      return { sampled: false };
    }
    return {};
  },
// Redact sensitive fields
redact: {
paths: [
'password',
'creditCard',
'ssn',
'authorization',
'req.headers.authorization'
],
remove: true
},
// Custom serializers
serializers: {
req: (req) => ({
method: req.method,
url: req.url,
headers: {
host: req.headers.host,
'user-agent': req.headers['user-agent']
},
remoteAddress: req.remoteAddress
}),
err: pino.stdSerializers.err,
user: (user) => ({
  id: user.id
  // email, name, address omitted - don't log PII
})
}
});
// Smart log levels
class SmartLogger {
constructor(private logger: pino.Logger) {}
// Always log errors
error(message: string, data?: any) {
this.logger.error(data, message);
}
// Log warnings
warn(message: string, data?: any) {
this.logger.warn(data, message);
}
// Info for important business events
info(message: string, data?: any) {
this.logger.info(data, message);
}
// Debug sampled in production, full in dev
debug(message: string, data?: any) {
if (process.env.NODE_ENV === 'development') {
this.logger.debug(data, message);
} else {
// Sample 1% in production
if (Math.random() < 0.01) {
this.logger.debug({ ...data, sampled: true }, message);
}
}
}
// Trace only in development
trace(message: string, data?: any) {
if (process.env.NODE_ENV === 'development') {
this.logger.trace(data, message);
}
}
}
// Usage
const smartLogger = new SmartLogger(logger);
smartLogger.error('Payment failed', {
userId: '123',
amount: 99.99,
gateway: 'stripe'
});
// Full debug in dev, sampled in prod
smartLogger.debug('Cache hit', {
key: 'user:123',
hitRate: 0.85
});
Pattern 6: Log Aggregation and Querying
Elasticsearch/Splunk/Datadog Integration
# ELK Stack configuration
# Filebeat ships logs to Logstash
filebeat.inputs:
- type: log
enabled: true
paths:
- /var/log/app/*.json
json.keys_under_root: true
json.add_error_key: true
output.logstash:
hosts: ["logstash:5044"]
# Logstash processes and enriches
input {
beats {
port => 5044
}
}
filter {
# Parse JSON
json {
source => "message"
}
# Add GeoIP for IP addresses
geoip {
source => "ip"
}
# Extract user agent details
useragent {
source => "userAgent"
}
}
output {
elasticsearch {
hosts => ["elasticsearch:9200"]
index => "app-logs-%{+YYYY.MM.dd}"
}
}
# Now query in Kibana:
# - userId: "12345" AND level: "ERROR"
# - COUNT(*) BY statusCode WHERE path: "/api/orders"
# - AVG(duration) BY endpoint WHERE level: "INFO"
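The same queries can be issued programmatically. Below is a minimal sketch with the official @elastic/elasticsearch Node client (v8-style API), assuming the app-logs-* indices created by the pipeline above and the userId/level/path/duration fields from the earlier examples:

import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://elasticsearch:9200' });

// All ERROR-level logs for one user, using the default .keyword subfields
async function findUserErrors(userId: string) {
  return es.search({
    index: 'app-logs-*',
    query: {
      bool: {
        filter: [
          { term: { 'userId.keyword': userId } },
          { term: { 'level.keyword': 'ERROR' } }
        ]
      }
    }
  });
}

// Average request duration per path - an aggregation grep can't do
async function avgDurationByPath() {
  const result = await es.search({
    index: 'app-logs-*',
    size: 0, // aggregation only, no hits
    aggs: {
      by_path: {
        terms: { field: 'path.keyword' },
        aggs: { avg_duration: { avg: { field: 'duration' } } }
      }
    }
  });
  return result.aggregations;
}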
Real-World Example: Complete Observable Service
Production-Ready Structured Logging
using Serilog;
using Serilog.Context;
using System.Diagnostics;
public class OrderService
{
    private readonly ILogger _logger;
    private readonly IPaymentService _paymentService;
    private readonly IOrderRepository _orderRepository;

    public OrderService(ILogger logger, IPaymentService paymentService, IOrderRepository orderRepository)
    {
        _logger = logger;
        _paymentService = paymentService;
        _orderRepository = orderRepository;
    }
public async Task<OrderResult> ProcessOrderAsync(OrderRequest request)
{
// Add persistent context for this operation
using (LogContext.PushProperty("OrderId", request.OrderId))
using (LogContext.PushProperty("UserId", request.UserId))
using (LogContext.PushProperty("Operation", "ProcessOrder"))
{
var sw = Stopwatch.StartNew();
try
{
_logger.Information(
    "Processing order started: {ItemCount} items, {TotalAmount} total, shipping to {ShippingCity}",
    request.Items.Count,
    request.TotalAmount,
    request.ShippingAddress.City
);
// Validate
var validationStart = sw.ElapsedMilliseconds;
var validationResult = await ValidateOrderAsync(request);
var validationDuration = sw.ElapsedMilliseconds - validationStart;
if (!validationResult.IsValid)
{
_logger.Warning(
    "Order validation failed after {ValidationDuration} ms: {Errors}",
    validationDuration,
    validationResult.Errors
);
return OrderResult.Failed(validationResult.Errors);
}
_logger.Debug("Order validated in {Duration} ms", validationDuration);
// Process payment
var paymentStart = sw.ElapsedMilliseconds;
var payment = await _paymentService.ChargeAsync(
request.UserId,
request.TotalAmount
);
var paymentDuration = sw.ElapsedMilliseconds - paymentStart;
_logger.Information(
    "Payment processed: {PaymentId} via {Gateway} in {Duration} ms",
    payment.Id,
    payment.Gateway,
    paymentDuration
);
// Create order
var order = await _orderRepository.CreateAsync(request);
sw.Stop();
_logger.Information(
    "Order {OrderId} processed with status {Status} in {TotalDuration} ms (validation {ValidationDuration} ms, payment {PaymentDuration} ms)",
    order.Id,
    order.Status,
    sw.ElapsedMilliseconds,
    validationDuration,
    paymentDuration
);
// Metric log for monitoring
_logger.Information(
    "METRIC: order.processing.duration {Value} {Unit} success={Success}",
    sw.ElapsedMilliseconds,
    "ms",
    true
);
return OrderResult.Success(order);
}
catch (PaymentException ex)
{
_logger.Error(
    ex,
    "Payment processing failed on {PaymentGateway} with code {ErrorCode} after {Duration} ms",
    ex.Gateway,
    ex.Code,
    sw.ElapsedMilliseconds
);
return OrderResult.Failed("Payment failed");
}
catch (Exception ex)
{
_logger.Error(
    ex,
    "Order processing failed unexpectedly with {ErrorType} after {Duration} ms",
    ex.GetType().Name,
    sw.ElapsedMilliseconds
);
throw;
}
}
}
}
// Query examples in log aggregation tool:
// - WHERE UserId = "12345" - see all logs for user
// - AVG(TotalDuration) BY Status - avg processing time by outcome
// - COUNT(*) WHERE ErrorType != null - error rate
// - MAX(PaymentDuration) - slowest payment processing
Best Practices
- Log in JSON: Structured data from the start
- Use trace IDs: Correlate logs across services
- Enrich with context: Add userId, requestId automatically
- Log timing data: Embed performance metrics
- Redact sensitive data: Never log passwords, tokens
- Use appropriate levels: ERROR for errors, INFO for business events
- Sample debug logs: 1% in production to reduce volume
Common Pitfalls
- Logging too much: DEBUG everywhere kills performance
- Logging PII: Credit cards, SSNs, passwords in logs
- No correlation: Can't trace requests across services
- String interpolation: Loses structure (use parameters)
- Missing context: Logs without userId/orderId
- No log rotation: Disk fills up (see the rotation sketch below)
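For the rotation pitfall, file transports can roll and prune logs automatically. A minimal sketch using the winston-daily-rotate-file package alongside the Winston setup from Pattern 2; the retention values are illustrative:

const winston = require('winston');
require('winston-daily-rotate-file'); // registers the DailyRotateFile transport

// Roll the log file daily, compress old files, keep 14 days
const rotatingFile = new winston.transports.DailyRotateFile({
  filename: 'logs/app-%DATE%.log',
  datePattern: 'YYYY-MM-DD',
  zippedArchive: true,
  maxSize: '100m',
  maxFiles: '14d'
});

const logger = winston.createLogger({
  format: winston.format.json(),
  transports: [rotatingFile]
});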
Key Takeaways
- Structured logging uses JSON instead of unstructured strings
- Enables powerful filtering: WHERE userId = '123'
- Aggregations reveal patterns: COUNT(*) BY errorType
- Trace IDs correlate logs across microservices
- Enrich logs with context (userId, requestId, traceId)
- Embed performance metrics (duration, status codes)
- Redact sensitive data automatically
- Sample debug logs in production (1%) to reduce volume
Structured logging is the foundation of observability. By logging JSON instead of strings, you transform logs
from grep nightmares into queryable data that reveals system behavior instantly. Every second saved debugging is
a second gained for building features. Invest in structured logging from day one—your future self will thank
you.