December 2024 – Page 3 – C4: Container, Code, Cloud & Context

GitHub Copilot: A Solutions Architect’s Guide to AI-Assisted Development

Posted on December 8, 2024 by Nithin Mohan TK · 4 min read

GitHub Copilot has fundamentally changed how I approach software development. After integrating it into my daily workflow over the past year, I want to share practical insights on maximizing its value while understanding its limitations. As someone who has been writing code for over two decades, I initially approached AI-assisted development with skepticism, but Copilot […]

Google Gemini API: Building Multimodal AI Applications with 2M Token Context

Posted on December 8, 2024 by Nithin Mohan TK · 7 min read

Introduction: Google’s Gemini API represents a significant leap in multimodal AI capabilities. Launched in December 2023, Gemini models are natively multimodal, trained from the ground up to understand and generate text, images, audio, and video. With context windows up to 2 million tokens and native Google Search grounding, Gemini offers unique capabilities for building sophisticated […]

Mastering AWS, EKS, Python, Kubernetes, and Terraform for Monitoring and Observability for SRE: Unveiling the Secrets of Cloud Infrastructure Optimization

Posted on December 8, 2024 by Nithin Mohan TK · 6 min read

As the world of software development continues to evolve, the need for robust infrastructures and efficient monitoring systems cannot be overemphasized. Whether you are an engineer, a site reliability engineer (SRE), or an IT manager, the need to harness the power of tools like Amazon Web Services (AWS), Elastic Kubernetes Service (EKS), Kubernetes, Terraform, and […]

Serverless AI Architecture: Building Scalable LLM Applications

Posted on December 5, 2024 by Nithin Mohan TK · 6 min read

Three years ago, I built my first serverless LLM application. It failed spectacularly. Cold starts made responses take 15 seconds. Timeouts killed long-running requests. Costs spiraled out of control. After architecting 30+ serverless AI systems, I’ve learned what works. Here’s the complete guide to building scalable serverless LLM applications.

Figure 1: Serverless AI Architecture Overview […]

Tips and Tricks – Implement Retry Logic for LLM API Calls

Posted on December 2, 2024 by Nithin Mohan TK · 2 min read

Handle rate limits and transient failures gracefully with exponential backoff.
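
As a rough illustration of the idea, here is a minimal sketch of exponential backoff with jitter using only the Python standard library. It is not the code from the full post: `TransientError`, `call_with_backoff`, and all parameter names and defaults are hypothetical, and a real client would map its own rate-limit and timeout exceptions onto the retryable case.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit (HTTP 429) or transient server error."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn(), retrying on TransientError with exponential backoff and jitter.

    sleep and rng are injectable so the behaviour can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # The delay ceiling doubles each attempt (1s, 2s, 4s, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random fraction of the ceiling so that many
            # clients hitting the same limit do not all retry at the same moment.
            sleep(delay * rng())
```

To use it, wrap the API call in a zero-argument callable, for example `call_with_backoff(lambda: ask_model(prompt))`, where `ask_model` is a placeholder for whatever function performs the request and raises `TransientError` on retryable failures. Non-retryable errors (bad credentials, malformed requests) should be left to propagate immediately rather than retried.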

AWS Bedrock: Building Enterprise Generative AI Applications on AWS

Posted on December 1, 2024 by Nithin Mohan TK · 4 min read

AWS re:Invent 2024 brought significant updates to Amazon Bedrock, and after spending the past month integrating these capabilities into production systems, I want to share what actually matters for enterprise adoption. Having built generative AI applications across multiple cloud platforms over the past two decades, Bedrock represents a meaningful shift in how we can deploy […]
