Production-Grade Go API Design: Clean Architecture, Custom Errors, and Middleware That Actually Works
Key Takeaways
- An un-paginated handler that eagerly joins multiple tables and serialises tens of thousands of rows per request will pin CPU within minutes when a frontend client toggles a query param; pagination, projection, and bounded response size are non-negotiable for production endpoints
- Handler-service-repository layers with dependency inversion enable testing without infrastructure: mock the service in handler tests, mock the repo in service tests, and exercise real database behavior only in repository tests
- AppError with code, message, detail, cause, and fields lets on-call know exactly what broke without leaking sensitive data: log the detail internally, return only the safe message to clients
- /healthz (liveness: is the process running) and /readyz (readiness: is it safe to route traffic) enable Kubernetes to manage pod restarts and rolling deployments without dropping requests
Here is the classic Friday-afternoon outage on a Go API. A frontend release ships a new dashboard widget that toggles a query param like ?include_details=true. Within minutes the Go backend's CPU pegs and P99 latency climbs into the seconds. The handler, which had never been paginated or optimised for deep relationship loading because "we never need that much data at once," eagerly joins multiple tables and attempts to serialise tens of thousands of rows per request. We've debugged this pattern on multiple Go services.
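The guard that was missing in that handler can be sketched in a few lines. This is a minimal illustration, assuming the endpoint accepts limit and offset query parameters; the names pageParams, defaultLimit and maxLimit, and the values 50 and 200, are assumptions for the example and not taken from the service described above.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// Hypothetical bounds; tune them to the endpoint's real payload size.
const (
	defaultLimit = 50
	maxLimit     = 200
)

// pageParams reads "limit" and "offset" from the query string and clamps
// them so that no single request can ask for an unbounded result set.
func pageParams(q url.Values) (int, int) {
	limit, err := strconv.Atoi(q.Get("limit"))
	if err != nil || limit <= 0 {
		limit = defaultLimit
	}
	if limit > maxLimit {
		limit = maxLimit
	}
	offset, err := strconv.Atoi(q.Get("offset"))
	if err != nil || offset < 0 {
		offset = 0
	}
	return limit, offset
}

func main() {
	q, _ := url.ParseQuery("limit=100000&include_details=true")
	limit, offset := pageParams(q)
	fmt.Println(limit, offset) // prints: 200 0
}
```

Clamping silently is one choice; rejecting an oversized limit with a 400 is equally defensible if clients should learn about the bound.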
In our experience, most Go API tutorials stop when the server responds to a request [Go Language Specification]. That's maybe 20% of production work. The remaining 80% is structured error handling so on-call knows what broke, middleware chains that don't collapse under load, health probes Kubernetes trusts, and layered architecture that lets you change things without touching everything.
Organize code into handler, service, and repository layers inside internal/. Define custom error types with HTTP [RFC 9110, 2022] status codes and user-facing messages. Stack middleware in order: RequestID → Logging → Recovery → RateLimiter → Auth. Implement /healthz and /readyz probes plus graceful shutdown so rolling deployments don't drop requests.
- Handler, service, repository layers enforce dependency boundaries — no circular imports or reaching up the stack
- Custom AppError with code, message, internal detail, and cause enables logging and consistent JSON responses
- Middleware order matters: each layer depends on the ones above it, and Recovery sits outside RateLimiter and Auth so a panic in either still produces a logged 500
- Health probes and graceful shutdown handle Kubernetes lifecycle and prevent request loss during rolling deployments
graph LR
Req[HTTP request] --> M1[RequestID]
M1 --> M2[Logging]
M2 --> M3[Recovery<br/>panic catcher]
M3 --> M4[RateLimiter]
M4 --> M5[Auth]
M5 --> H[Handler:<br/>parse + validate]
H --> S[Service:<br/>business logic]
S --> R[Repository:<br/>SQL / cache / APIs]
R -.no callback up the stack.-> S
S -.no HTTP types.-> H
style M3 fill:#fee
style H fill:#eef
style S fill:#eef
style R fill:#eef
The diagram is the dependency-flow lesson in one picture: request enters left, traverses the middleware stack in order (Recovery before Auth so panics in expensive auth checks are caught), then drops through Handler → Service → Repository with no upward calls and no leaked HTTP types into the lower layers. Get the order wrong and a panic in middleware skips your error handler; get the layering wrong and your service can't be tested without a real HTTP server.
The Quick Start — API Design Layers
[Go context] A production Go API needs three layers: HTTP transport, business logic, and data access. Dependencies flow inward only.
| Layer | Owns | Touches | Tested with |
|---|---|---|---|
| Handler (internal/handler/) | HTTP parsing, validation, status codes, response shape | Service interface only | httptest.NewRecorder + mocked service |
| Service (internal/service/) | Business logic, transaction boundaries, domain invariants | Repository interface only | Pure unit tests + mocked repository |
| Repository (internal/repository/) | SQL, cache, external API calls | Database driver, HTTP client, redis client | testcontainers-go for real Postgres / Redis |
| Middleware (internal/middleware/) | RequestID, logging, recovery, auth, rate-limit | None (pure http.Handler wrappers) | httptest.NewServer + integration |
| Domain (internal/domain/) | Shared types, sentinel errors, value objects | Nothing (leaf package) | Plain unit tests |
order-service/
├── cmd/server/main.go # Wire and start
├── internal/
│ ├── handler/ # Parse HTTP, validate, respond
│ ├── service/ # Business logic — no HTTP, no DB
│ ├── repository/ # Data access — SQL, cache, APIs
│ ├── domain/ # Shared types and errors
│ └── middleware/ # RequestID, logging, recovery, auth
└── config/
The rule: handlers call services, services call repositories, repositories call databases. A repository never calls a handler. A service never talks directly to HTTP. This inversion prevents circular imports and keeps tests simple: mock the service in handler tests, mock the repo in service tests.
Why structure matters: when the Friday afternoon outage hits and you need to trace a request through three services and a failing database, you need to know exactly which layer is responsible. Handler bugs are HTTP problems. Service bugs are logic problems. Repository bugs are data problems. Without clear boundaries, everything becomes "the API is broken" with no fast path to diagnosis.
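To make the "mock the service in handler tests" rule concrete, here is a minimal sketch that uses only the standard library. The names OrderGetter, fakeService, errOrderNotFound and statusFor are hypothetical stand-ins, and the order ID is read from a query parameter purely to keep the example free of a router dependency.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Order stands in for a type that would live in internal/domain.
type Order struct {
	ID string `json:"id"`
}

var errOrderNotFound = errors.New("order not found")

// OrderGetter is the only thing the handler knows about the service layer.
type OrderGetter interface {
	GetOrder(ctx context.Context, id string) (*Order, error)
}

type OrderHandler struct{ svc OrderGetter }

func (h *OrderHandler) GetOrder(w http.ResponseWriter, r *http.Request) {
	order, err := h.svc.GetOrder(r.Context(), r.URL.Query().Get("id"))
	if errors.Is(err, errOrderNotFound) {
		http.Error(w, "order not found", http.StatusNotFound)
		return
	}
	if err != nil {
		http.Error(w, "internal error", http.StatusInternalServerError)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(order)
}

// fakeService is a hand-written test double: no database, no network.
type fakeService struct{ orders map[string]*Order }

func (f fakeService) GetOrder(_ context.Context, id string) (*Order, error) {
	if o, ok := f.orders[id]; ok {
		return o, nil
	}
	return nil, errOrderNotFound
}

// statusFor runs one request through the handler entirely in memory.
func statusFor(id string) int {
	h := &OrderHandler{svc: fakeService{orders: map[string]*Order{"42": {ID: "42"}}}}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/orders?id="+id, nil)
	h.GetOrder(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("42"), statusFor("missing")) // prints: 200 404
}
```

Because the handler depends on the interface rather than a concrete service, the same pattern applies one layer down: the service takes a repository interface and is tested with a fake repository.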
Custom Error Types and Error Middleware
[Go 1.13 error wrapping] A generic errors.New("something failed") starts a long on-call shift. You need errors that carry context for logging, status codes for HTTP, and safe messages for clients.
type ErrorCode string
const (
ErrCodeNotFound ErrorCode = "NOT_FOUND"
ErrCodeValidation ErrorCode = "VALIDATION_ERROR"
ErrCodeUnauthorized ErrorCode = "UNAUTHORIZED"
ErrCodeExternal ErrorCode = "EXTERNAL_SERVICE_ERROR"
ErrCodeInternal ErrorCode = "INTERNAL_ERROR"
ErrCodeRateLimited ErrorCode = "RATE_LIMIT_EXCEEDED"
)
type AppError struct {
Code ErrorCode
Message string // Safe for API response
Detail string // For logs only
Cause error // Wrapped error
Fields map[string]any
}
func (e *AppError) Error() string {
if e.Cause != nil {
return fmt.Sprintf("%s: %s: %v", e.Code, e.Message, e.Cause)
}
return fmt.Sprintf("%s: %s", e.Code, e.Message)
}
// Unwrap exposes Cause so errors.Is / errors.As traverse the wrapped error.
func (e *AppError) Unwrap() error { return e.Cause }
func (e *AppError) HTTPStatus() int {
switch e.Code {
case ErrCodeNotFound:
return http.StatusNotFound
case ErrCodeValidation:
return http.StatusBadRequest
case ErrCodeUnauthorized:
return http.StatusUnauthorized
case ErrCodeRateLimited:
return http.StatusTooManyRequests
case ErrCodeExternal:
return http.StatusBadGateway
default:
return http.StatusInternalServerError
}
}
// Constructors keep call sites lean
func NotFound(resource, id string) *AppError {
return &AppError{
Code: ErrCodeNotFound,
Message: fmt.Sprintf("%s not found", resource),
Fields: map[string]any{"resource": resource, "id": id},
}
}
func ValidationError(field, reason string) *AppError {
return &AppError{
Code: ErrCodeValidation,
Message: fmt.Sprintf("invalid %s: %s", field, reason),
}
}
In your service layer, return these errors:
func (s *OrderService) GetOrder(ctx context.Context, id uuid.UUID) (*domain.Order, error) {
order, err := s.repo.GetByID(ctx, id)
if err != nil {
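// Note: matching pgx.ErrNoRows here ties the service to the driver. With
// strict layering the repository would translate it into a domain-level
// sentinel first, so the service never imports pgx.
if errors.Is(err, pgx.ErrNoRows) {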
return nil, domain.NotFound("order", id.String())
}
return nil, &domain.AppError{
Code: domain.ErrCodeInternal,
Message: "failed to retrieve order",
Cause: err,
Fields: map[string]any{"order_id": id},
}
}
return order, nil
}
Then in handlers, use a shared error helper that logs internals and returns only the safe message:
type APIResponse struct {
Data any `json:"data,omitempty"`
Error *APIError `json:"error,omitempty"`
}
type APIError struct {
Code string `json:"code"`
Message string `json:"message"`
}
func respondError(w http.ResponseWriter, r *http.Request, err error) {
logger := loggerFromContext(r.Context())
var appErr *domain.AppError
if errors.As(err, &appErr) {
// Log internal details with full context
fields := []any{"code", appErr.Code, "request_id", requestIDFromContext(r.Context())}
if appErr.Detail != "" {
fields = append(fields, "detail", appErr.Detail)
}
if appErr.Cause != nil {
fields = append(fields, "cause", appErr.Cause)
}
for k, v := range appErr.Fields {
fields = append(fields, k, v)
}
if appErr.HTTPStatus() >= 500 {
logger.Error("request failed", fields...)
} else {
logger.Warn("request failed", fields...)
}
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(appErr.HTTPStatus())
json.NewEncoder(w).Encode(APIResponse{
Error: &APIError{Code: string(appErr.Code), Message: appErr.Message},
})
return
}
// Unknown error type
logger.Error("unhandled error", "error", err)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(http.StatusInternalServerError)
json.NewEncoder(w).Encode(APIResponse{
Error: &APIError{Code: "INTERNAL_ERROR", Message: "an unexpected error occurred"},
})
}
Never expose internal error details, database table names, or stack traces to clients. Only send the user-facing message. Log the full context internally with request ID for correlation during post-mortems.
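The reason respondError uses errors.As rather than a type assertion is that an AppError is often wrapped again on its way up the stack. The sketch below uses a trimmed copy of AppError so that it compiles on its own; errRowMissing, loadOrder and classify are hypothetical names introduced for the illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// AppError is a trimmed copy of the article's type, kept only so the
// sketch is self-contained.
type AppError struct {
	Code    string
	Message string
	Cause   error
}

func (e *AppError) Error() string { return e.Code + ": " + e.Message }
func (e *AppError) Unwrap() error { return e.Cause }

// errRowMissing stands in for a driver-level sentinel such as pgx.ErrNoRows.
var errRowMissing = errors.New("no rows in result set")

// loadOrder imitates a caller that adds context with %w on the way up.
func loadOrder(id string) error {
	appErr := &AppError{Code: "NOT_FOUND", Message: "order not found", Cause: errRowMissing}
	return fmt.Errorf("load order %s: %w", id, appErr)
}

// classify finds the AppError anywhere in the chain, and also checks
// whether the original driver sentinel is still reachable through Unwrap.
func classify(err error) (code string, driverMiss bool) {
	var appErr *AppError
	if errors.As(err, &appErr) {
		code = appErr.Code
	}
	return code, errors.Is(err, errRowMissing)
}

func main() {
	code, miss := classify(loadOrder("42"))
	fmt.Println(code, miss) // prints: NOT_FOUND true
}
```

A plain type assertion on the outer error would fail here because the outermost value is the fmt.Errorf wrapper, not the AppError.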
Middleware Chain and Dependency Injection
[Go net/http] Middleware order matters. Each layer depends on the ones above it. Stack them as:
- RequestID: generates and stores X-Request-Id for all downstream logs
- Logging: captures method, path, status, duration; requires RequestID
- Recovery: converts a panic in any later layer into a logged 500 response
- RateLimiter: rejects over-limit requests before expensive work
- Auth: validates tokens; runs after RateLimiter to avoid waste
This ordering prevents resource exhaustion: rate limiting blocks bad actors before authentication burns CPU validating tokens, and Recovery wraps both so that a panic in either is logged and answered with a 500 rather than an aborted connection.
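The article wires middleware.Recovery but does not show its body, so the following is only one plausible shape for it, written against the standard library. The JSON payload mirrors the INTERNAL_ERROR response used by respondError; demoPanicStatus is a hypothetical helper for exercising it.

```go
package main

import (
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"runtime/debug"
)

// Recovery is a hypothetical stand-in for the article's middleware.Recovery.
func Recovery(logger *slog.Logger) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			defer func() {
				rec := recover()
				if rec == nil {
					return
				}
				// http.ErrAbortHandler is net/http's sentinel for a
				// deliberate abort; pass it through untouched.
				if rec == http.ErrAbortHandler {
					panic(rec)
				}
				logger.Error("panic recovered",
					"panic", rec, "path", r.URL.Path, "stack", string(debug.Stack()))
				// If the handler already wrote headers, this status is ignored.
				w.Header().Set("Content-Type", "application/json")
				w.WriteHeader(http.StatusInternalServerError)
				fmt.Fprint(w, `{"error":{"code":"INTERNAL_ERROR","message":"an unexpected error occurred"}}`)
			}()
			next.ServeHTTP(w, r)
		})
	}
}

// demoPanicStatus sends one request through a handler that always panics.
func demoPanicStatus() int {
	boom := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		panic("nil redis client")
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/v1/orders", nil)
	Recovery(slog.Default())(boom).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(demoPanicStatus()) // prints: 500
}
```

Because Recovery sits inside Logging, the logging layer still observes the 500 status and the request duration for the failed request.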
Wire in main.go:
func main() {
db, err := connectDB(os.Getenv("DATABASE_URL"))
if err != nil {
log.Fatalf("connect db: %v", err)
}
defer db.Close()
// Bottom-up wiring: database → repo → service → handler
orderRepo := repository.NewOrderRepository(db)
orderService := service.NewOrderService(orderRepo)
orderHandler := handler.NewOrderHandler(orderService)
healthHandler := handler.NewHealthHandler(db)
router := chi.NewRouter()
// Middleware in order
router.Use(middleware.RequestID)
router.Use(middleware.Logging(slog.Default()))
router.Use(middleware.Recovery(slog.Default()))
router.Use(middleware.RateLimiter(100)) // per second
// Health endpoints outside auth
router.Get("/healthz", healthHandler.Liveness)
router.Get("/readyz", healthHandler.Readiness)
// Protected routes
router.Route("/v1", func(r chi.Router) {
r.Use(middleware.Auth(os.Getenv("JWT_SECRET")))
r.Post("/orders", orderHandler.CreateOrder)
r.Get("/orders/{id}", orderHandler.GetOrder)
})
server := &http.Server{
Addr: ":8080",
Handler: router,
ReadTimeout: 10 * time.Second,
WriteTimeout: 30 * time.Second,
}
// Graceful shutdown on SIGTERM
sigChan := make(chan os.Signal, 1)
signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
shutdownDone := make(chan struct{})
go func() {
<-sigChan
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
if err := server.Shutdown(ctx); err != nil {
log.Printf("graceful shutdown: %v", err)
}
close(shutdownDone)
}()
// ListenAndServe returns ErrServerClosed the instant Shutdown is called;
// block on shutdownDone so in-flight requests finish draining before exit.
if err := server.ListenAndServe(); !errors.Is(err, http.ErrServerClosed) {
log.Fatalf("listen: %v", err)
}
<-shutdownDone
}
Validation and Health Probes
[Kubernetes docs]
graph LR
K8s[Kubernetes] -->|"every periodSeconds"| LP["/healthz<br/>liveness probe"]
K8s -->|"every periodSeconds"| RP["/readyz<br/>readiness probe"]
LP -->|always returns 200<br/>if process running| Process[Go process running?]
RP -->|checks dependencies| DB[(DB connection)]
RP -->|checks dependencies| Cache[(Cache client)]
RP -->|checks dependencies| Down[Downstream services]
Process -.->|fail → restart pod| Restart[Pod restart]
DB -.->|fail → drop from Service endpoints| Drop[No new traffic]
Cache -.->|fail → drop| Drop
Down -.->|fail → drop| Drop
style Process fill:#eef
style Drop fill:#fee
style Restart fill:#fee
The diagram shows the liveness vs readiness split: liveness asks "is the process running?", and failing it triggers a restart. Readiness asks "can this pod safely serve traffic right now?", and failing it only drops the pod from Service endpoints without restarting it. A common misconfiguration is putting dependency checks, which belong in readiness, into the liveness probe, which causes endless restart loops when a downstream service blips.
Use go-playground/validator for declarative validation. Struct tags become self-documenting API contracts:
var validate = validator.New(validator.WithRequiredStructEnabled())
type CreateOrderRequest struct {
CustomerID uuid.UUID `json:"customer_id" validate:"required"`
Items []Item `json:"items" validate:"required,min=1,dive"`
Currency string `json:"currency" validate:"required,oneof=USD EUR GBP"`
}
func decodeAndValidate[T any](r *http.Request) (T, error) {
var req T
decoder := json.NewDecoder(r.Body)
decoder.DisallowUnknownFields() // Catch typos early
if err := decoder.Decode(&req); err != nil {
return req, domain.ValidationError("body", err.Error())
}
if err := validate.Struct(req); err != nil {
return req, domain.ValidationError("body", err.Error())
}
return req, nil
}
func (h *OrderHandler) CreateOrder(w http.ResponseWriter, r *http.Request) {
req, err := decodeAndValidate[CreateOrderRequest](r)
if err != nil {
respondError(w, r, err)
return
}
order, err := h.service.CreateOrder(r.Context(), req)
if err != nil {
respondError(w, r, err)
return
}
respondOK(w, http.StatusCreated, order)
}
For Kubernetes, expose two health endpoints with different semantics:
func (h *HealthHandler) Liveness(w http.ResponseWriter, r *http.Request) {
// Returns 200 if the process is alive. No dependency checks.
// Kubernetes uses this to restart crashed pods.
w.WriteHeader(http.StatusOK)
}
func (h *HealthHandler) Readiness(w http.ResponseWriter, r *http.Request) {
// Checks dependencies: database, cache, external services.
// Kubernetes uses this to remove pods from load balancers.
ctx, cancel := context.WithTimeout(r.Context(), 5*time.Second)
defer cancel()
if err := h.db.Ping(ctx); err != nil {
w.WriteHeader(http.StatusServiceUnavailable)
return
}
w.WriteHeader(http.StatusOK)
}
Liveness probes restart crashed or deadlocked pods; readiness probes control whether a pod receives traffic. During a rolling deployment Kubernetes removes the terminating pod from Service endpoints asynchronously (concurrently with SIGTERM, not guaranteed before it), so the drain window in your shutdown handler is what actually prevents dropped requests.
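One way to act on that asynchrony is to keep serving for a short, fixed delay after SIGTERM and only then start draining. The sketch below is an assumption about how that could look, not the article's own shutdown code; gracefulStop, propagationDelay and drainTimeout are hypothetical names, and the durations are placeholders to tune per cluster.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// gracefulStop keeps the server accepting requests for propagationDelay so
// that endpoint removal has time to reach the proxies, then drains
// in-flight requests for at most drainTimeout.
func gracefulStop(srv *http.Server, propagationDelay, drainTimeout time.Duration) error {
	time.Sleep(propagationDelay)
	ctx, cancel := context.WithTimeout(context.Background(), drainTimeout)
	defer cancel()
	return srv.Shutdown(ctx)
}

// demo runs gracefulStop against a server with no open connections and
// reports whether the delay was honoured before Shutdown returned.
func demo() (waited bool, err error) {
	start := time.Now()
	err = gracefulStop(&http.Server{}, 20*time.Millisecond, time.Second)
	return time.Since(start) >= 20*time.Millisecond, err
}

func main() {
	waited, err := demo()
	fmt.Println(waited, err) // prints: true <nil>
}
```

In the main function shown earlier, this would replace the direct server.Shutdown call inside the signal goroutine. The pod's terminationGracePeriodSeconds has to cover the delay plus the drain timeout, otherwise the kubelet kills the process mid-drain.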
Production Checklist
- Handler layer parses requests and writes responses only; no business logic
- Service layer has zero HTTP or database dependencies; testable with mocked repos
- Repository layer wraps database calls; services never import sql drivers
- AppError carries code, user message, internal detail, and cause; the error helper logs the internals and returns only the safe message
- Middleware stacked in order: RequestID → Logging → Recovery → RateLimiter → Auth
- /healthz (liveness) and /readyz (readiness) probes separate and correct
- SIGTERM handler calls server.Shutdown(ctx) with timeout; no force kill
- Request validation uses validator tags; JSON decoder disallows unknown fields
- No PII (names, emails, passwords, card numbers) logged in request/response bodies
- Concurrent requests tested; no race conditions in context-local storage
Prometheus middleware that doesn't lie about latency
Most teams instrument with time.Since(start) at the wrong layer and end up with histograms that exclude middleware time. The middleware below times the full request — including downstream RateLimiter, Auth, and JSON encoding — and labels by route pattern (not raw URL) to keep cardinality bounded: [Prometheus Best Practices]
package middleware
import (
"net/http"
"strconv"
"time"
"github.com/go-chi/chi/v5"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
)
var (
httpDuration = promauto.NewHistogramVec(prometheus.HistogramOpts{
Name: "http_request_duration_seconds",
Help: "HTTP request duration by method, route pattern, and status.",
// Buckets tuned for typical p50=10ms .. p99=2s API workloads.
Buckets: []float64{0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0},
}, []string{"method", "route", "status"})
httpInFlight = promauto.NewGauge(prometheus.GaugeOpts{
Name: "http_requests_in_flight",
Help: "Currently-executing HTTP requests; sustained > pool size means saturation.",
})
)
type statusRecorder struct {
http.ResponseWriter
status int
}
func (s *statusRecorder) WriteHeader(code int) {
s.status = code
s.ResponseWriter.WriteHeader(code)
}
func Metrics(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
httpInFlight.Inc()
defer httpInFlight.Dec()
rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
next.ServeHTTP(rec, r)
// chi.RouteContext gives the *pattern* (/users/{id}), never the raw path
// (/users/123) — keeps the label-set bounded regardless of traffic shape.
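route := "unmatched"
// RouteContext is nil when the middleware runs outside a chi router; guard it.
if rctx := chi.RouteContext(r.Context()); rctx != nil && rctx.RoutePattern() != "" {
route = rctx.RoutePattern()
}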
httpDuration.
WithLabelValues(r.Method, route, strconv.Itoa(rec.status)).
Observe(time.Since(start).Seconds())
})
}
The corresponding circuit breaker for database calls wraps every *sql.DB (or pgx pool) at the repository layer, not at the handler, so that a slow primary doesn't burn the connection pool while you wait on the request budget:
package repository
import (
"context"
"database/sql"
"errors"
"time"
"github.com/sony/gobreaker"
)
type CircuitDB struct {
db *sql.DB
cb *gobreaker.CircuitBreaker
}
func NewCircuitDB(db *sql.DB) *CircuitDB {
settings := gobreaker.Settings{
Name: "primary-db",
MaxRequests: 1,
Interval: 60 * time.Second,
Timeout: 30 * time.Second,
ReadyToTrip: func(c gobreaker.Counts) bool {
// Trip once at least 20 calls have run in the current Interval and
// 60% or more of them failed.
return c.Requests >= 20 && float64(c.TotalFailures)/float64(c.Requests) >= 0.6
},
IsSuccessful: func(err error) bool {
// Treat context cancellation as success — the client gave up,
// not the database. Otherwise a slow client would trip the breaker.
return err == nil || errors.Is(err, context.Canceled)
},
}
return &CircuitDB{db: db, cb: gobreaker.NewCircuitBreaker(settings)}
}
func (c *CircuitDB) QueryRowContext(ctx context.Context, q string, args ...any) (*sql.Row, error) {
res, err := c.cb.Execute(func() (interface{}, error) {
row := c.db.QueryRowContext(ctx, q, args...)
// QueryRowContext itself never returns an error. Row.Err (Go 1.15+)
// exposes connection and query-execution failures without consuming
// the row, so report those to the breaker. Scan-time errors such as
// sql.ErrNoRows still reach the caller and are not counted: they are
// usually data-shape bugs, not degradation.
if err := row.Err(); err != nil {
return nil, err
}
return row, nil
})
if err != nil {
return nil, err
}
return res.(*sql.Row), nil
}
The IsSuccessful rule matters: without it, a flood of cancelled contexts (slow clients, noisy load test) trips the breaker on a perfectly healthy database. Cancellations are the client's failure, not the dependency's.
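The classification rule is small enough to check in isolation. This sketch repeats the predicate outside of gobreaker (countsAsSuccess and verdicts are hypothetical names) to show one consequence worth knowing: a wrapped cancellation is still treated as success, while a deadline timeout still counts as a failure.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// countsAsSuccess repeats the IsSuccessful rule from the breaker settings.
func countsAsSuccess(err error) bool {
	return err == nil || errors.Is(err, context.Canceled)
}

// verdicts classifies three representative outcomes of a database call.
func verdicts() (ok, cancelled, timedOut bool) {
	wrappedCancel := fmt.Errorf("query orders: %w", context.Canceled)
	return countsAsSuccess(nil),
		countsAsSuccess(wrappedCancel),
		countsAsSuccess(context.DeadlineExceeded)
}

func main() {
	fmt.Println(verdicts()) // prints: true true false
}
```

Keeping context.DeadlineExceeded as a failure is deliberate in this sketch: a query that exceeds its deadline is usually evidence of a slow database, which is exactly what the breaker is meant to detect.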
Always-On pprof and runtime/trace
The hardest production incidents are the ones you cannot reproduce locally. A goroutine leak that takes six hours to manifest, a heap that climbs a megabyte per minute under real traffic, a scheduler stall that only shows up when GOMAXPROCS hits the cgroup ceiling — none of these surface in unit tests, and none survive a restart. The only durable answer is to ship every Go binary with profiling endpoints permanently mounted on a private port and to sample runtime/trace on demand. The cost is negligible (an idle pprof endpoint adds nothing measurable), the value when you need it is the difference between a five-minute fix and a war room.
Mount pprof on a separate 127.0.0.1-bound listener so it never appears on the public port, then attach a guarded /debug/trace endpoint that streams an execution trace for a bounded duration. Five seconds is usually enough to capture a stall; thirty seconds is more than enough to characterise a steady-state workload:
package main
import (
"context"
"errors"
"net/http"
_ "net/http/pprof" // mounts handlers on http.DefaultServeMux
"runtime/trace"
"strconv"
"time"
)
func startDebugServer(token string) *http.Server {
mux := http.DefaultServeMux
// /debug/trace?seconds=5 — bounded execution trace download.
mux.HandleFunc("/debug/trace", func(w http.ResponseWriter, r *http.Request) {
if r.Header.Get("X-Debug-Token") != token {
http.Error(w, "forbidden", http.StatusForbidden)
return
}
secs, _ := strconv.Atoi(r.URL.Query().Get("seconds"))
if secs <= 0 || secs > 60 {
secs = 5
}
w.Header().Set("Content-Type", "application/octet-stream")
w.Header().Set("Content-Disposition", `attachment; filename="trace.out"`)
if err := trace.Start(w); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
defer trace.Stop()
select {
case <-time.After(time.Duration(secs) * time.Second):
case <-r.Context().Done():
}
})
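// Guard the whole mux, including the handlers registered by net/http/pprof,
// with the same token; the check above covers only /debug/trace.
guarded := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
if r.Header.Get("X-Debug-Token") != token {
http.Error(w, "forbidden", http.StatusForbidden)
return
}
mux.ServeHTTP(w, r)
})
srv := &http.Server{Addr: "127.0.0.1:6060", Handler: guarded}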
go func() {
if err := srv.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
panic(err)
}
}()
return srv
}
func main() {
dbg := startDebugServer("rotate-me-via-secret-manager")
defer dbg.Shutdown(context.Background())
// ... rest of main
}
The 127.0.0.1 bind plus the token header are non-negotiable: net/http/pprof registers handlers like /debug/pprof/cmdline that leak the full process argv, which often contains connection strings on platforms that pass them as flags. Anyone with kubectl access port-forwards the debug port; nobody from the internet ever should.
A Structured Logging Contract: request_id Everywhere
A logging stack is only useful if every line emitted during a single request can be joined back together. That requires three discipline points enforced by code, not by convention. First, every request gets exactly one request_id — generated at the edge if the client did not supply one, propagated downstream in headers and contexts. Second, that request_id lands on every log line emitted while handling the request, which means no goroutine spawned from a handler may use the package-level slog.Default() without first deriving a child logger from the request context. Third, the field name is fixed across services so log queries do not have to special-case dialects.
package logging
import (
"context"
"log/slog"
"net/http"
"github.com/google/uuid"
)
type ctxKey struct{}
var loggerKey = ctxKey{}
// FromContext returns a logger that always carries request_id, falling back
// to the package default if the request did not pass through Inject.
func FromContext(ctx context.Context) *slog.Logger {
if l, ok := ctx.Value(loggerKey).(*slog.Logger); ok {
return l
}
return slog.Default()
}
// Inject is the only middleware allowed to populate request_id. It coexists
// with chi/middleware.RequestID by reading the canonical X-Request-Id header.
func Inject(base *slog.Logger) func(http.Handler) http.Handler {
return func(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
reqID := r.Header.Get("X-Request-Id")
if reqID == "" {
reqID = uuid.NewString()
r.Header.Set("X-Request-Id", reqID)
}
w.Header().Set("X-Request-Id", reqID)
child := base.With(
slog.String("request_id", reqID),
slog.String("method", r.Method),
slog.String("path", r.URL.Path), // raw path; the route pattern is not known until routing has run
)
ctx := context.WithValue(r.Context(), loggerKey, child)
next.ServeHTTP(w, r.WithContext(ctx))
})
}
}
Two failure modes to guard against. A goroutine launched with go fireAndForget(req) inherits no context, so it logs without request_id and the trail dies. Fix it by handing the goroutine a context that is deliberately detached from request cancellation (context.WithoutCancel(ctx) on Go 1.21+, or a fresh context.Background()) plus an explicit logger argument. And a downstream HTTP client must inject the same X-Request-Id header into outbound calls, otherwise the next service generates a fresh ID and the request graph fragments at every hop.
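The outbound half of that contract can be handled once, in the HTTP client's transport, instead of at every call site. The sketch below assumes the request ID is also stored in the context under its own key (the Inject middleware above stores only the logger); WithRequestID, propagateID and recordingTransport are hypothetical names, and the URL is a placeholder that is never dialled.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

type requestIDKey struct{}

// WithRequestID stores the ID in the context. The inbound middleware would
// call this next to storing the logger.
func WithRequestID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, requestIDKey{}, id)
}

// propagateID is an http.RoundTripper that copies the request ID from the
// outbound request's context into its X-Request-Id header.
type propagateID struct{ next http.RoundTripper }

func (p propagateID) RoundTrip(req *http.Request) (*http.Response, error) {
	if id, ok := req.Context().Value(requestIDKey{}).(string); ok && id != "" {
		// RoundTrippers must not mutate the caller's request, so clone it.
		req = req.Clone(req.Context())
		req.Header.Set("X-Request-Id", id)
	}
	return p.next.RoundTrip(req)
}

// recordingTransport is a test double that captures the outbound header
// instead of touching the network.
type recordingTransport struct{ seen string }

func (t *recordingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	t.seen = req.Header.Get("X-Request-Id")
	return &http.Response{
		StatusCode: http.StatusOK,
		Header:     http.Header{},
		Body:       http.NoBody,
		Request:    req,
	}, nil
}

// seenDownstream returns the ID that the downstream side would receive.
func seenDownstream(id string) (string, error) {
	rec := &recordingTransport{}
	client := &http.Client{Transport: propagateID{next: rec}}
	ctx := WithRequestID(context.Background(), id)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://inventory.internal/stock", nil)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	resp.Body.Close()
	return rec.seen, nil
}

func main() {
	fmt.Println(seenDownstream("req-123")) // prints: req-123 <nil>
}
```

In production the next field would be http.DefaultTransport or a tuned *http.Transport; the recording double exists only to keep the example in memory.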
Why chi Middleware Order Is Not Negotiable
The middleware list earlier in this article (RequestID, Logging, Recovery, RateLimiter, Auth) is not aesthetic. Each layer depends on invariants established by the ones above it, and the wrong order silently breaks observability or opens denial-of-service vectors. RequestID runs first because every later log line needs it; if Logging ran first, its entries for requests that panic in later layers would lack a correlation ID and become unsearchable. Recovery sits above RateLimiter and Auth, not below, because a panic inside the rate limiter (for example, a nil Redis client during deploy) would otherwise escape to net/http. The server does recover handler panics on its own, but it only writes a stack trace to its error log and aborts the connection, so the client gets no JSON error and the panic never reaches your structured logs with a request_id. RateLimiter precedes Auth because token validation is expensive (signature verification, JWKS lookup, sometimes a database call); rejecting a credential-stuffing attacker before that work begins is the difference between a brownout and a healthy service. Auth runs last among the cross-cutting layers so that authenticated routes can read identity from context, but liveness and readiness probes are mounted outside the auth subtree so Kubernetes does not need credentials to keep your pods alive.
The only exception worth memorising: tracing middleware (OpenTelemetry, Honeycomb's Beeline) must wrap RequestID, not the other way around, because trace spans are the parent record into which the request_id becomes an attribute. Reverse that and traces orphan themselves at the gateway. Audit your router.Use block in code review; reorderings are the kind of one-line change reviewers miss.
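A quick way to convince yourself of the execution order during review is to trace it. The sketch below uses a hand-rolled chain helper rather than chi, on the assumption (which matches chi's documented behaviour) that the first middleware registered is the outermost one; chain, tag and order are hypothetical names.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type Middleware func(http.Handler) http.Handler

// chain wraps h so that the first middleware listed is the outermost one,
// which is the order in which successive router.Use calls take effect.
func chain(h http.Handler, mws ...Middleware) http.Handler {
	for i := len(mws) - 1; i >= 0; i-- {
		h = mws[i](h)
	}
	return h
}

// tag returns a middleware that records its name when a request enters it.
func tag(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name)
			next.ServeHTTP(w, r)
		})
	}
}

// order sends one request through the full stack and returns the path taken.
func order() string {
	var trace []string
	final := http.HandlerFunc(func(http.ResponseWriter, *http.Request) {
		trace = append(trace, "Handler")
	})
	h := chain(final,
		tag("RequestID", &trace),
		tag("Logging", &trace),
		tag("Recovery", &trace),
		tag("RateLimiter", &trace),
		tag("Auth", &trace),
	)
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/v1/orders", nil))
	return strings.Join(trace, " > ")
}

func main() {
	fmt.Println(order())
	// prints: RequestID > Logging > Recovery > RateLimiter > Auth > Handler
}
```

A test like this, pointed at the real router, turns an accidental reordering of the router.Use block into a failing assertion instead of a silent change.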
Frequently Asked Questions
How should you structure a production Go API?
Use handler, service, and repository layers in internal/. Handlers parse requests and respond. Services hold business logic. Repositories handle data access. Dependencies flow inward — a repository never calls a handler.
How do you implement structured error handling?
Define AppError with code, user-facing message, internal detail, cause error, and structured fields. Map codes to HTTP status. Build error middleware that logs full details and returns only the safe message to clients.
What health endpoints should you expose?
Expose /healthz (liveness probe — returns 200 if running) and /readyz (readiness probe — checks database and cache). Kubernetes uses these to manage pod restarts and traffic routing.
How do you implement graceful shutdown?
Call server.Shutdown(ctx) on SIGTERM. This stops accepting new connections and waits for in-flight requests to complete within a timeout, preventing dropped requests during pod termination.
Engineering Team
A multidisciplinary team of backend engineers, architects, and DevOps practitioners shipping deep dives into distributed systems and production infrastructure.