Advanced Patterns

AI Engineering in Production

From RAG pipelines and vector databases to MCP servers and agent security — the operational patterns for shipping LLM-backed systems that survive contact with real traffic.

6 Lessons

Articles in this series

17 min read • Hard

Building Production RAG Pipelines: Chunking, Embeddings, and Retrieval at Scale

Build RAG systems that work in production: chunking strategies, embedding selection, pgvector ops, and retrieval quality evaluation.

19 min read • Hard

Vector Databases Compared: pgvector vs Pinecone vs Weaviate

Compare pgvector, Pinecone, Weaviate, Qdrant, Milvus, and Chroma on performance, cost, and operational fit — with real code and each database's documented performance envelope.

14 min read • Medium

LLM API Integration Patterns for Backend Engineers

Production LLM API patterns: streaming, function calling, retries, token budgets, cost optimization, and observability for backend engineers.

14 min read • Hard

Spring AI in Production: RAG Pipelines, Reliability, and Observability for Java Backends

Spring AI 1.1 deep-dive: production RAG pipeline with PII scrubbing, circuit breakers, Micrometer observability, and answer evaluation.

14 min read • Hard

Building an MCP Server in Go with Code Mode: From 1.17M Tokens to 1,000

Expose 2,500 API endpoints through one MCP server without blowing the context window. The Code Mode pattern uses search + execute to cut token cost by roughly 1,000x.

13 min read • Hard

Securing AI Agent Infrastructure: MCP Servers, Tool Calls, and the Attack Surface You're Not Watching

AI agents calling tools via MCP create new attack surfaces: prompt injection through tool responses, credential leakage, and unauthorized execution.
