August 2024 – Page 3 – C4: Container, Code, Cloud & Context

Async LLM Patterns: Concurrent Execution, Rate Limiting, and Task Queues for High-Throughput AI Applications

Posted on August 2, 2024 by Nithin Mohan TK · 12 min read

Introduction: LLM API calls are inherently I/O-bound: waiting for network responses dominates execution time. Async programming turns that waiting time into an opportunity for concurrency. Instead of waiting sequentially for each response, async patterns let an application keep hundreds of requests in flight at once while managing resources efficiently. This guide covers practical async patterns for LLM applications: concurrent request […]
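As a rough illustration of the idea in this excerpt, here is a minimal sketch of bounded concurrent execution with Python's `asyncio`. It is not taken from the post itself: `fake_llm_call`, `run_concurrently`, `demo`, and the `max_in_flight` limit are hypothetical names, and the LLM request is simulated with `asyncio.sleep` rather than a real API client.

```python
import asyncio
import time


async def fake_llm_call(prompt: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for a network-bound LLM API call."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"response to: {prompt}"


async def run_concurrently(prompts: list[str], max_in_flight: int = 10) -> list[str]:
    """Run all prompts concurrently, with at most max_in_flight active at once."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def bounded(prompt: str) -> str:
        # The semaphore caps how many simulated requests run at the same time.
        async with semaphore:
            return await fake_llm_call(prompt)

    # gather() returns results in the same order as the input prompts.
    return await asyncio.gather(*(bounded(p) for p in prompts))


def demo(n: int = 20) -> tuple[list[str], float]:
    """Run n simulated requests and report the results and elapsed seconds."""
    prompts = [f"question {i}" for i in range(n)]
    start = time.perf_counter()
    results = asyncio.run(run_concurrently(prompts))
    return results, time.perf_counter() - start
```

With 20 simulated requests of 0.05 seconds each and a limit of 10 in flight, the run should take roughly two waits rather than twenty, since the requests overlap while each one is idle on I/O. The semaphore is one simple way to keep concurrency within a provider's limits; it is an assumption here, not necessarily the approach the post takes.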

Prompt Template Management: Engineering Discipline for LLM Prompts

Posted on August 1, 2024 by Nithin Mohan TK · 9 min read

Introduction: Prompts are the interface between your application and LLMs. As applications grow, managing prompts becomes challenging: they’re scattered across code, hard to version, and difficult to test. A prompt template system brings order to this chaos. It separates prompt logic from application code, enables versioning and A/B testing, and makes prompts reusable across different contexts. […]
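To make the separation of prompt logic from application code concrete, here is a minimal sketch of a versioned template registry built on the standard library's `string.Template`. The `PromptTemplate` and `PromptRegistry` classes, the `summarize` template, and the `v1`/`v2` version labels are all hypothetical and only illustrate the idea; the post may use a different design.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class PromptTemplate:
    """Hypothetical versioned prompt stored outside application logic."""
    name: str
    version: str
    text: str

    def render(self, **variables: str) -> str:
        # substitute() raises KeyError if a placeholder has no value,
        # so a missing variable fails loudly instead of reaching the model.
        return Template(self.text).substitute(**variables)


class PromptRegistry:
    """Hypothetical in-memory registry keyed by (name, version)."""

    def __init__(self) -> None:
        self._templates: dict[tuple[str, str], PromptTemplate] = {}

    def register(self, template: PromptTemplate) -> None:
        self._templates[(template.name, template.version)] = template

    def get(self, name: str, version: str) -> PromptTemplate:
        return self._templates[(name, version)]


registry = PromptRegistry()
registry.register(PromptTemplate("summarize", "v1", "Summarize this text: $text"))
registry.register(
    PromptTemplate("summarize", "v2", "Summarize in $n bullet points: $text")
)

# Application code selects a template by name and version, which is the
# hook an A/B test would use to route traffic between variants.
prompt_v1 = registry.get("summarize", "v1").render(text="hello")
prompt_v2 = registry.get("summarize", "v2").render(n="3", text="hello")
```

Keeping templates addressable by name and version means a prompt change becomes a new registry entry rather than an edit buried in application code, which is what makes testing and comparison between versions practical.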