Category: Cloud Computing

Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, as with the electricity grid.
Cloud computing is a natural evolution of the widespread adoption of virtualization, Service-oriented architecture and utility computing. Details are abstracted from consumers, who no longer have need for expertise in, or control over, the technology infrastructure “in the cloud” that supports them.[1] Cloud computing describes a new supplement, consumption, and delivery model for IT services based on the Internet, and it typically involves over-the-Internet provision of dynamically scalable and often virtualized resources.[2][3] It is a byproduct and consequence of the ease-of-access to remote computing sites provided by the Internet.[4] This frequently takes the form of web-based tools or applications that users can access and use through a web browser as if it were a program installed locally on their…

Deploying LLM Applications on Cloud Run: A Complete Guide

Posted on November 5, 2024 by Nithin Mohan TK 6 min read

Last year, I deployed our first LLM application to Cloud Run. What should have taken hours took three days. Cold starts killed our latency. Memory limits caused crashes. Timeouts broke long-running requests. After deploying 20+ LLM applications to Cloud Run, I’ve learned what works and what doesn’t. Here’s the complete guide. Figure 1: Cloud Run […]

Mastering AWS EKS Deployment with Terraform: A Comprehensive Guide

Posted on October 29, 2024 by Nithin Mohan TK 3 min read

Introduction: Amazon Elastic Kubernetes Service (EKS) simplifies the process of deploying, managing, and scaling containerized applications using Kubernetes on AWS. In this guide, we’ll explore how to provision an AWS EKS cluster using Terraform, an Infrastructure as Code (IaC) tool. We’ll cover essential concepts, Terraform configurations, and provide hands-on examples to help you get started […]

Vector Databases: Why They Matter in the Age of Generative AI

Posted on October 26, 2024 by Nithin Mohan TK 8 min read

After two decades of architecting enterprise systems and spending the past year deeply immersed in Generative AI implementations, I can state with confidence that vector databases have become the cornerstone of modern AI infrastructure. If you’re building anything involving Large Language Models, semantic search, or Retrieval-Augmented Generation (RAG), understanding vector databases isn’t optional—it’s essential. This […]

.NET AI Performance Optimization: Reducing Latency and Costs

Posted on October 10, 2024 by Nithin Mohan TK 7 min read

Last year, I inherited a .NET AI application that was struggling. Response times averaged 2.3 seconds, costs were spiraling, and users were complaining. After three months of optimization, we cut latency by 87% and reduced costs by 72%. Here’s what I learned about optimizing .NET AI applications for production. Figure 1: .NET AI Performance Optimization […]

Harnessing AWS CDK for Python: Streamlining Infrastructure as Code

Posted on September 29, 2024 by Nithin Mohan TK 6 min read

After two decades of managing cloud infrastructure across enterprises of all sizes, I’ve witnessed the evolution of Infrastructure as Code from simple shell scripts to sophisticated declarative frameworks. AWS Cloud Development Kit (CDK) represents a paradigm shift that fundamentally changes how we think about infrastructure provisioning. Rather than wrestling with YAML or JSON templates, CDK […]

ML.NET for Custom AI Models: When to Use ML.NET vs Cloud APIs

Posted on September 5, 2024 by Nithin Mohan TK 6 min read

Six months ago, I faced a critical decision: build a custom ML model with ML.NET or use cloud APIs. The project required real-time fraud detection with zero latency tolerance. Cloud APIs were too slow. ML.NET was the answer. But when should you use ML.NET vs cloud APIs? After building 15+ production ML systems, here’s what […]

Searching in