Building an MCP Server in Go with Code Mode: From 1.17M Tokens to 1,000
Serve 2,500 API endpoints from one MCP server without blowing the context window. The Code Mode pattern uses search + execute to cut token cost by roughly 1,000x, from 1.17M tokens to about 1,000.
Lessons from running systems in production — war stories and hard-won patterns.
AI agents calling tools via MCP create new attack surfaces: prompt injection through tool responses, credential leakage, and unauthorized execution.
Handle unpredictable JSON in Go: map[string]any, json.RawMessage, type switches, and defensive patterns for shifting schemas.
Go graceful shutdown: SIGTERM handling, health probe coordination, and Kubernetes drain patterns for zero dropped requests.
Stop fighting java.util.Date. Master LocalDateTime, ZonedDateTime, Instant, and Duration — predictable, thread-safe time handling.
Production LLM API patterns: streaming, function calling, retries, token budgets, cost optimization, and observability for backend engineers.
Build RAG systems that work in production: chunking strategies, embedding selection, pgvector ops, and retrieval quality evaluation.
B-tree internals, composite index ordering, GIN for full-text search, partial indexes, and preventing index bloat in production.
Go error handling: sentinel errors, wrapping, errors.Is/As, custom types, and production patterns that prevent silent failures.
Master context.Context in Go: cancellation propagation, deadline inheritance, goroutine leak patterns, and debugging with pprof.