RAG – Page 2 – C4: Container, Code, Cloud & Context

Embedding Model Selection: Choosing the Right Model for Your RAG System

Posted on August 15, 2025 by Nithin Mohan TK · 11 min read

Introduction: Choosing the right embedding model is critical for RAG systems, semantic search, and similarity applications. The wrong choice leads to poor retrieval quality, high costs, or unacceptable latency. OpenAI’s text-embedding-3-small is cheap and fast but may miss nuanced similarities. Cohere’s embed-v3 excels at multilingual content. Open-source models like BGE and E5 offer privacy and […]

Retrieval Augmented Fine-Tuning (RAFT): Training LLMs to Excel at RAG Tasks

Posted on August 15, 2025 by Nithin Mohan TK · 18 min read

Introduction: Retrieval Augmented Fine-Tuning (RAFT) represents a powerful approach to improving LLM performance on domain-specific tasks by combining the benefits of fine-tuning with retrieval-augmented generation. Traditional RAG systems retrieve relevant documents at inference time and include them in the prompt, but the base model wasn’t trained to effectively use retrieved context. RAFT addresses this by […]

RAG Patterns: Advanced Retrieval Augmented Generation Strategies

Posted on August 15, 2025 by Nithin Mohan TK · 20 min read

Introduction: Retrieval Augmented Generation (RAG) has become the standard pattern for grounding LLM responses in factual, up-to-date information. But basic RAG—retrieve chunks, stuff into prompt, generate—often falls short in production. Queries get misunderstood, irrelevant chunks pollute context, and answers lack coherence. This guide covers advanced RAG patterns that address these challenges: query transformation to improve […]

LlamaIndex: The Data Framework for Building Production RAG Applications

Posted on April 15, 2025 by Nithin Mohan TK · 8 min read

Introduction: LlamaIndex (formerly GPT Index) is the leading data framework for building LLM applications over your private data. While LangChain focuses on chains and agents, LlamaIndex specializes in data ingestion, indexing, and retrieval—the core components of Retrieval Augmented Generation (RAG). With over 160 data connectors through LlamaHub, sophisticated indexing strategies, and production-ready query engines, LlamaIndex […]

Advanced RAG Patterns: From Naive Retrieval to Production-Grade Systems (Part 1 of 2)

Posted on April 7, 2025 by Nithin Mohan TK · 12 min read

Introduction: Retrieval-Augmented Generation (RAG) has become the go-to architecture for building LLM applications that need access to private or current information. By retrieving relevant documents and including them in the prompt, RAG grounds LLM responses in factual content, reducing hallucinations and enabling knowledge that wasn’t in the training data. But naive RAG implementations often disappoint—the […]

Enterprise Machine Learning in Production: Healthcare and Financial Services Case Studies

Posted on March 31, 2025 by Nithin Mohan TK · 4 min read

Real-world enterprise ML implementations in healthcare diagnostics and financial fraud detection. The post explores RAG and LLM integration patterns, ML maturity frameworks, and strategic recommendations for building ML-enabled organizations.
