C4: Container, Code, Cloud & Context

LLM Memory and Context Management: Building Conversational AI That Remembers

Posted on September 11, 2024 by Nithin Mohan TK · 9 min read

Introduction: LLMs have no inherent memory—each API call is stateless. The model doesn’t remember your previous conversation, your user’s preferences, or the context you established five messages ago. Memory is something you build on top. This guide covers implementing different memory strategies for LLM applications: buffer memory for recent context, summary memory for long conversations, […]

OpenAI API Complete Guide: From Chat Completions to Assistants

Posted on September 10, 2024 by Nithin Mohan TK · 12 min read

A comprehensive guide to the OpenAI API covering GPT-4o, function calling, the Assistants API, vision capabilities, and production best practices with code examples.

.NET 8 and C# 12: A Deep Dive into Native AOT, Primary Constructors, and Blazor United

Posted on September 9, 2024 by Nithin Mohan TK · 9 min read

Introduction: .NET 8 represents a landmark release in Microsoft’s development platform evolution, bringing Native AOT to mainstream scenarios, unifying Blazor’s rendering models, and introducing C# 12’s powerful new features. Released in November 2023, this Long-Term Support version delivers significant performance improvements, reduced memory footprint, and enhanced developer productivity. After migrating several enterprise applications to .NET […]

ML.NET for Custom AI Models: When to Use ML.NET vs Cloud APIs

Posted on September 5, 2024 by Nithin Mohan TK · 6 min read

Six months ago, I faced a critical decision: build a custom ML model with ML.NET or use cloud APIs. The project required real-time fraud detection with almost no tolerance for added latency. Cloud APIs were too slow, so ML.NET was the answer. But when should you use ML.NET vs cloud APIs? After building 15+ production ML systems, here’s what […]

LLM Application Logging and Tracing: Building Observable AI Systems

Posted on September 3, 2024 by Nithin Mohan TK · 11 min read

Introduction: Production LLM applications require comprehensive logging and tracing to debug issues, monitor performance, and understand user interactions. Unlike traditional applications, LLM systems have unique logging needs: capturing prompts and responses, tracking token usage, measuring latency across chains, and correlating requests through multi-step workflows. This guide covers practical logging patterns: structured request/response logging, distributed tracing […]

Architecting the Moment: Real-Time Data Processing in Modern Cloud Systems

Posted on September 1, 2024 by Nithin Mohan TK · 8 min read

After two decades of architecting data systems across financial services, healthcare, and e-commerce, I’ve witnessed the evolution from batch-only processing to today’s sophisticated real-time architectures. The shift isn’t just about speed—it’s about fundamentally changing how organizations make decisions and respond to events. This article shares battle-tested insights on building production-grade real-time data processing systems in […]
