LLM – Page 22 – C4: Container, Code, Cloud & Context

Ollama: The Complete Guide to Running Open Source LLMs Locally

Posted on July 10, 2024 by Nithin Mohan TK (6 min read)

Introduction: Ollama has revolutionized how developers run large language models locally. With a simple command-line interface and seamless hardware acceleration, you can have Llama 3.2, Mistral, or CodeLlama running on your laptop in minutes—no cloud API keys, no usage costs, complete privacy. Built on llama.cpp, Ollama abstracts away the complexity of model quantization, memory management, […]

LLM Output Parsing: From Raw Text to Typed Objects

Posted on July 10, 2024 by Nithin Mohan TK (9 min read)

Introduction: LLMs generate text, but applications need structured data. Parsing LLM output reliably is surprisingly tricky—models don’t always follow instructions, JSON can be malformed, and edge cases abound. This guide covers robust output parsing strategies: using JSON mode for guaranteed valid JSON, Pydantic for type-safe parsing, handling partial and streaming outputs, implementing retry logic for […]

Cost Optimization for AI Workloads: Tracking and Reducing LLM Costs

Posted on July 8, 2024 by Nithin Mohan TK (5 min read)

Last quarter, our LLM costs hit $12,000. In a single month. We had no idea where the money was going. No tracking, no budgets, no alerts. That’s when I realized: cost optimization isn’t optional for AI workloads; it’s survival. Here’s how we cut costs by 65% without sacrificing quality. [Figure 1: Cost Optimization Architecture] The $12,000 […]

LLM Fine-tuning Fundamentals: When, Why, and How to Customize Language Models

Posted on July 1, 2024 by Nithin Mohan TK (16 min read)

Introduction: Fine-tuning transforms a general-purpose LLM into a specialized model for your specific use case. While prompt engineering works for many applications, fine-tuning offers advantages when you need consistent formatting, domain-specific knowledge, or reduced latency from shorter prompts. This guide covers practical fine-tuning: when to fine-tune versus prompt engineer, preparing training data, running fine-tuning jobs […]

Document Processing with LLMs: From PDFs to Structured Data (Part 1 of 2)

Posted on June 22, 2024 by Nithin Mohan TK (12 min read)

Introduction: Documents are everywhere—PDFs, Word files, scanned images, spreadsheets. Extracting structured information from unstructured documents is one of the most valuable LLM applications. This guide covers building document processing pipelines: extracting text from various formats, chunking strategies for long documents, processing with LLMs for extraction and summarization, and handling edge cases like tables, images, and […]

Prompt Performance Monitoring: Tracking LLM Response Quality

Posted on June 12, 2024 by Nithin Mohan TK (6 min read)

Three weeks after launching our AI customer support system, we noticed something strange. Response quality was degrading slowly, almost imperceptibly. Users weren’t complaining yet, but satisfaction scores were dropping. The problem? We had no way to measure prompt performance. We were optimizing blind. That’s when I built a comprehensive prompt performance monitoring system. [Figure 1: Prompt […]
