#go #microservices #context #concurrency #production

Go Context in Depth: Cancellation, Timeouts, and Debugging in Production

BackendBytes Engineering Team

Feb 17, 2026

Key Takeaways

→Goroutine leaks pass health checks until a traffic spike pushes accumulated stacks past the container memory limit; context cancellation is your main defence against unbounded goroutine accumulation
→Timeout budgets shrink at every layer: if the parent has 100ms left and you create a 5-second child timeout, the child still expires after 100ms. Check ctx.Deadline() before each call so a short budget fails fast instead of timing out mid-call
→sync.WaitGroup coordination is essential for graceful shutdown — http.Server.Shutdown doesn't wait for background workers, only in-flight requests
→errgroup.WithContext cancels all sibling goroutines on first error, which is perfect for fan-out API aggregation but dangerous for fire-and-forget background jobs

One leaked goroutine per request. Ten thousand requests a second. Nothing tells any of them to stop. A handler spawns a goroutine, returns 200, and it outlives the request because nobody propagated cancellation. At that rate the process leaks goroutines until OOMKilled, typically during the next traffic spike. We debugged this exact failure pattern on multiple production Go services — every variant traces back to the same root cause: a missing ctx parameter or a missing defer cancel().

Goroutine leaks are one of the most common resource-exhaustion patterns in Go microservices^{[Go Runtime GC]}. A goroutine blocked on a database or network call consumes a stack (small initial stack, grown dynamically)^{[Go context]}, holds connections in whatever pool it is waiting on, and accumulates silently. The process continues to respond to health checks until memory or file descriptor limits are hit — typically during the next traffic spike. The root cause is always the same: work that nobody cares about anymore keeps running because there is no signal to stop it.

context.Context^{[Go Language Specification]} is that signal.

TL;DR

Context carries cancellation signals through your call chain. When a parent context is cancelled — because a deadline expired, a client disconnected, or you explicitly called cancel() — all child contexts are cancelled too. Always pass context as the first parameter, always defer cancel(), and use errgroup for fan-out. This prevents goroutine leaks and ensures graceful shutdown.

Pass r.Context() from each HTTP handler down the call chain; derive child contexts with WithTimeout, WithCancel, or WithDeadline
Always defer cancel() after every context.WithTimeout, context.WithDeadline, or context.WithCancel
Use golang.org/x/sync/errgroup for fan-out; it cancels all goroutines on first error

graph TD
    Root[context.Background<br/>process root] --> Req[r.Context<br/>HTTP request]
    Req -->|cancelled when client disconnects| Req
    Req --> WT[WithTimeout 2s]
    Req --> WC[WithCancel]
    Req --> WD[WithDeadline]
    WT --> Child1[DB call goroutine]
    WC --> Child2[Stream goroutine]
    WD --> Child3[Outbound RPC goroutine]
    Cancel{cancel signal} -.->|propagates down| WT
    Cancel -.->|propagates down| WC
    Cancel -.->|propagates down| WD
    Survive[WithoutCancel<br/>Go 1.21+] -.->|fire-and-forget audit| AuditTask[Saves payment after client disconnect]
    style Cancel fill:#fee
    style Root fill:#eef
    style Survive fill:#efe

The diagram is the context tree: every derived context inherits cancel propagation from its parent, except WithoutCancel (Go 1.21+) which keeps a child alive past the parent's death — useful for fire-and-forget audit/persistence work that should outlive the request that started it.

The Quick Start: Context Propagation Patterns

Every context.Context is part of a tree. The root is context.Background(). When you call WithTimeout, WithCancel, or WithDeadline, you create a child context. When a parent is cancelled, all descendants are cancelled too — immediately.

Constructor	Use When	Pattern
`context.Background()`	Program startup, background workers, operations outliving a request	`runServer(context.Background())`
`context.WithTimeout(parent, d)`	Relative timeout from now	`ctx, cancel := context.WithTimeout(ctx, 2*time.Second); defer cancel()`
`context.WithDeadline(parent, t)`	Absolute deadline	`ctx, cancel := context.WithDeadline(ctx, time.Now().Add(2*time.Second)); defer cancel()`
`context.WithCancel(parent)`	Manual cancellation, no time limit	`ctx, cancel := context.WithCancel(ctx); defer cancel()`
`context.WithoutCancel(parent)`	Child survives parent cancellation but loses the parent's deadline (Go 1.21+)	`auditCtx := context.WithoutCancel(ctx)` for fire-and-forget cleanup

In an HTTP server, r.Context() is the entry point. It is cancelled when the client's connection closes, when the request is cancelled (HTTP/2), or when the handler returns; http.Server.Shutdown waits for in-flight requests rather than cancelling their contexts. Pass it down your call chain (and detach with WithoutCancel for any goroutine meant to outlive the handler):

func (s *PaymentService) ProcessPayment(w http.ResponseWriter, r *http.Request) {
    var req PaymentRequest
    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
        http.Error(w, "invalid request body", http.StatusBadRequest)
        return
    }
 
    // r.Context() is cancelled the instant this handler returns (below), so a
    // fire-and-forget goroutine must not use it directly. WithoutCancel (Go 1.21+)
    // detaches from that cancellation while keeping request-scoped values (trace
    // IDs); add an explicit timeout since it also drops the parent deadline.
    bgCtx := context.WithoutCancel(r.Context())
 
    go func() {
        ctx, cancel := context.WithTimeout(bgCtx, 10*time.Second)
        defer cancel()
 
        result, err := s.gateway.ChargeWithContext(ctx, req.CardToken, req.Amount)
        if err != nil {
            // Covers context.DeadlineExceeded from the 10s cap as well; log it
            // so a timed-out charge is never dropped silently.
            slog.Error("payment failed", "error", err)
            return
        }
        // Assumes SavePaymentWithContext returns an error.
        if err := s.db.SavePaymentWithContext(ctx, result); err != nil {
            slog.Error("save payment failed", "error", err)
        }
    }()
 
    w.WriteHeader(http.StatusAccepted)
    json.NewEncoder(w).Encode(map[string]string{"status": "processing"})
}

Timeout Budgets: Children Can Only Shrink

^{[Go context]}

The deadline propagation tree — every layer subtracts from the parent's budget, never extends it:

graph TD
    Client[Client request<br/>3000 ms timeout<br/>Server.ReadHeaderTimeout caps it] --> Handler[HTTP handler<br/>r.Context: 3000 ms left]
    Handler -->|WithTimeout 1500 ms| Service[Service.PlaceOrder<br/>1500 ms left<br/>middle reserves 1500 ms<br/>for downstream + slack]
    Service -->|WithTimeout 800 ms| DB[Repository.Save<br/>800 ms left]
    Service -->|WithTimeout 500 ms| Pay[Gateway.Charge<br/>500 ms left]
    Service -->|WithTimeout 400 ms| Inv[Inventory.Reserve<br/>400 ms left]
    DB --> Pool{DB pool acquire<br/>+ tx + commit}
    Pay --> HTTP{Outbound HTTP<br/>+ TLS + body read}
    Inv --> RPC{gRPC call}
    Pool -.->|exceeds 800 ms| Cancel[ctx.Err == DeadlineExceeded<br/>tx rolls back<br/>conn returned to pool]
    HTTP -.->|exceeds 500 ms| Cancel
    RPC -.->|exceeds 400 ms| Cancel
    style Cancel fill:#fdd
    style DB fill:#dfd
    style Pay fill:#dfd
    style Inv fill:#dfd

The discipline: the timeouts a layer hands to its downstream calls must fit inside its own remaining budget (their sum for sequential calls, their maximum for parallel ones), leaving slack for serialization, lock acquisition, and the layer's own response time. A child's effective deadline is the earliest of all its ancestors' deadlines.

Child contexts inherit the parent's deadline and can never extend it. If the parent has 1 second remaining and you create a child with WithTimeout(ctx, 5*time.Second), the child gets 1 second — not 5.

Always check the remaining budget before setting child timeouts:

func (s *OrderService) PlaceOrder(ctx context.Context, order Order) error {
    // Check how much time the parent has left
    if deadline, ok := ctx.Deadline(); ok {
        remaining := time.Until(deadline)
        if remaining < 100*time.Millisecond {
            return fmt.Errorf("insufficient time budget: %v remaining", remaining)
        }
    }
 
    // Each step gets its own cap; the effective deadline is the earlier of
    // this cap and the parent's deadline
    inventoryCtx, cancel := context.WithTimeout(ctx, 500*time.Millisecond)
    defer cancel()
    if err := s.inventory.Reserve(inventoryCtx, order); err != nil {
        return fmt.Errorf("reserve inventory: %w", err)
    }
 
    paymentCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
    defer cancel()
    if err := s.payment.Charge(paymentCtx, order); err != nil {
        return fmt.Errorf("charge payment: %w", err)
    }
 
    return nil
}

Missing defer cancel() Leaks Goroutines

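Forgetting defer cancel() after context.WithTimeout, context.WithDeadline, or context.WithCancel keeps the child context (and, for the timed variants, its timer) registered with the parent until the deadline fires or the parent is cancelled. Any goroutine waiting on that child's Done channel stays parked for just as long. go vet's lostcancel check flags the missing call, so pair every WithTimeout, WithDeadline, or WithCancel with an immediate defer cancel().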

Database and HTTP Calls: Always Use Context Variants

^{[Go net/http]}

Most blocking database and HTTP calls in the standard library have context-aware variants. Use QueryRowContext, ExecContext, and http.NewRequestWithContext:

func (r *UserRepository) FindByID(ctx context.Context, id string) (*User, error) {
    var user User
    err := r.db.QueryRowContext(ctx,
        "SELECT id, name, email FROM users WHERE id = $1", id,
    ).Scan(&user.ID, &user.Name, &user.Email)
 
    if err != nil {
        if errors.Is(err, context.DeadlineExceeded) {
            return nil, fmt.Errorf("database query timed out: %w", err)
        }
        if errors.Is(err, sql.ErrNoRows) {
            return nil, ErrNotFound
        }
        return nil, err
    }
    return &user, nil
}
 
func (h *UserHandler) GetUser(w http.ResponseWriter, r *http.Request) {
    // 500ms budget for the entire handler, including DB round-trip
    ctx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond)
    defer cancel()
 
    userID := r.URL.Query().Get("id")
    user, err := h.userRepo.FindByID(ctx, userID)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    json.NewEncoder(w).Encode(user)
}

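When the 500ms deadline expires, database/sql cancels the query and Scan returns context.DeadlineExceeded (possibly wrapped by the driver, which is why the code uses errors.Is). How quickly the call unblocks depends on the driver honouring cancellation.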

Fan-Out Without Data Races

A common pattern is calling several downstream services in parallel and combining the results. For long-lived streaming workloads, a worker pool with bounded concurrency is often a better fit than ad-hoc fan-out. For request-scoped fan-out, a naive implementation with a shared result value and an error channel is easy to get wrong: two goroutines writing the same field, slice, or map, or the caller reading results before every writer has finished, is a data race. Writes to distinct struct fields do not race with each other, but guarding the shared value with a mutex keeps the code safe when someone later adds a goroutine that touches an existing field.

Use golang.org/x/sync/errgroup, which manages goroutine lifecycle, cancels the group on the first error, and provides a clean model for aggregating results safely:

import "golang.org/x/sync/errgroup"
 
type Product struct {
    ID              string
    Name            string
    Price           float64
    Reviews         []Review
    Inventory       int
    Recommendations []string
}
 
func (s *ProductService) GetProductDetails(ctx context.Context, productID string) (*Product, error) {
    ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
    defer cancel()
 
    var (
        product Product
        mu      sync.Mutex // protects all writes to product
    )
 
    g, ctx := errgroup.WithContext(ctx)
 
    g.Go(func() error {
        info, err := s.productInfoService.Get(ctx, productID)
        if err != nil {
            return fmt.Errorf("product info: %w", err)
        }
        mu.Lock()
        product.ID = info.ID
        product.Name = info.Name
        product.Price = info.Price
        mu.Unlock()
        return nil
    })
 
    g.Go(func() error {
        reviews, err := s.reviewService.GetReviews(ctx, productID)
        if err != nil {
            return fmt.Errorf("reviews: %w", err)
        }
        mu.Lock()
        product.Reviews = reviews
        mu.Unlock()
        return nil
    })
 
    g.Go(func() error {
        inventory, err := s.inventoryService.GetStock(ctx, productID)
        if err != nil {
            return fmt.Errorf("inventory: %w", err)
        }
        mu.Lock()
        product.Inventory = inventory
        mu.Unlock()
        return nil
    })
 
    g.Go(func() error {
        recs, err := s.recommendationService.Get(ctx, productID)
        if err != nil {
            return fmt.Errorf("recommendations: %w", err)
        }
        mu.Lock()
        product.Recommendations = recs
        mu.Unlock()
        return nil
    })
 
    if err := g.Wait(); err != nil {
        return nil, err
    }
 
    return &product, nil
}

When errgroup is initialised with errgroup.WithContext(ctx), the first goroutine to return a non-nil error cancels the derived context — which cancels all other in-flight calls automatically. The Wait() call returns the first error encountered.

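If you want all goroutines to complete even when one fails (partial results), do not use errgroup.WithContext. Use a zero-value errgroup.Group, which does not cancel on error, or a sync.WaitGroup (or plain goroutines reporting through typed channels), and record each goroutine's result and error separately.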

Distributed Tracing with Context

^{[OpenTelemetry Sampling]}

In distributed systems, context.Context carries trace IDs across service boundaries through OpenTelemetry. When you pass ctx to an instrumented HTTP client, it automatically injects the trace ID into HTTP headers. The receiving service extracts and continues the trace — all through context propagation.

gRPC is even simpler: deadlines propagate automatically. A client-side 2-second timeout flows to the server, which can check the remaining budget:

// Client: deadline propagates to server automatically
ctx, cancel := context.WithTimeout(ctx, 2*time.Second)
defer cancel()
resp, err := client.GetUser(ctx, &pb.GetUserRequest{Id: userID})
 
// Server: access propagated deadline
func (s *server) GetUser(ctx context.Context, req *pb.GetUserRequest) (*pb.User, error) {
    deadline, ok := ctx.Deadline()
    if ok {
        remaining := time.Until(deadline)
        slog.Debug("deadline from upstream", "remaining", remaining)
    }
    return s.repo.FindByID(ctx, req.Id)
}

Beware Deadline Cascades

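When service A calls service B with 5 seconds remaining, and B then calls service C, C inherits whatever is left of A's original deadline, not 5 fresh seconds. If B has already spent 4 seconds, C gets 1 second. Monitor the rate of DeadlineExceeded responses per method (for example grpc_server_handled_total with grpc_code="DeadlineExceeded" if you use the go-grpc-prometheus interceptors) alongside the grpc_server_handling_seconds latency histogram to catch cascading timeout pressure.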

Graceful Shutdown and Long-Running Work

Context cancellation enables clean shutdown. When a signal arrives, cancel the root context — everything descending from it stops:

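func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()
 
    server := &http.Server{Addr: ":8080", Handler: NewHandler(ctx)}
    go func() {
        // ErrServerClosed is the normal result of Shutdown, not a failure
        if err := server.ListenAndServe(); err != nil && !errors.Is(err, http.ErrServerClosed) {
            log.Fatalf("listen: %v", err)
        }
    }()
 
    sigChan := make(chan os.Signal, 1)
    signal.Notify(sigChan, os.Interrupt, syscall.SIGTERM)
    <-sigChan
 
    cancel() // Signal work derived from ctx to stop
 
    // Give in-flight requests up to 30s to finish
    shutdownCtx, shutdownCancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer shutdownCancel()
    if err := server.Shutdown(shutdownCtx); err != nil {
        log.Printf("shutdown: %v", err)
    }
}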

For CPU-bound loops, poll ctx.Err() periodically (not every iteration):

func processLargeDataset(ctx context.Context, items []Item) error {
    for i, item := range items {
        if i%100 == 0 {
            if err := ctx.Err(); err != nil {
                return fmt.Errorf("cancelled at item %d: %w", i, err)
            }
        }
        if err := processItem(ctx, item); err != nil {
            return err
        }
    }
    return nil
}

Context Values: Request Metadata Only

Use context.WithValue only for request-scoped metadata: trace IDs, request IDs, user IDs. Always use unexported key types to prevent collisions:

type contextKey string
const requestIDKey contextKey = "request-id"
 
func RequestIDMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        requestID := r.Header.Get("X-Request-ID")
        if requestID == "" {
            requestID = uuid.New().String()
        }
        ctx := context.WithValue(r.Context(), requestIDKey, requestID)
        w.Header().Set("X-Request-ID", requestID)
        next.ServeHTTP(w, r.WithContext(ctx))
    })
}

Never store configuration, database connections, or service dependencies in context values: it hides dependencies and makes code harder to test. Pass those as function parameters or struct fields. Context values are immutable; every WithValue call adds a node to a linked list, and each lookup walks that chain from the innermost context outward.

Detecting and Testing Leaks

Goroutine leaks are silent until they exhaust resources. Use go.uber.org/goleak to detect them automatically:

import "go.uber.org/goleak"
 
func TestMain(m *testing.M) {
    goleak.VerifyTestMain(m)
}
 
func TestOrderProcessing(t *testing.T) {
    defer goleak.VerifyNone(t)
    ctx, cancel := context.WithCancel(context.Background())
    svc := NewOrderService(mockDeps)
    svc.ProcessOrder(ctx, testOrder)
    cancel()
    // goleak.VerifyNone will fail if any goroutines are still running
}

Test timeout paths by giving the context a deadline far shorter than the mock's delay, rather than sleeping in the test and hoping the timing works out:

func TestUserService_FetchWithTimeout(t *testing.T) {
    ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
    defer cancel()
    slowMock := &SlowUserAPI{delay: 500 * time.Millisecond}
    service := NewUserService(slowMock)
    _, err := service.FetchUser(ctx, "user-123")
    if !errors.Is(err, context.DeadlineExceeded) {
        t.Errorf("expected deadline exceeded, got: %v", err)
    }
}

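In production, expose pprof on an internal port to inspect goroutine profiles. As a rough baseline, a healthy service with 50 concurrent connections typically has fewer than 200 goroutines; what matters is sustained growth, not the absolute number. Common leak patterns: goroutines parked in chan receive without a select on ctx.Done(), missing defer cancel(), and time.After in hot loops (before Go 1.23 each call keeps its timer alive until it fires; use time.NewTimer with Reset instead).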

Production Checklist

Before shipping a Go service:

Postmortem: The Cancellation Leak That Took Out Checkout

A real production incident from a checkout service running on Kubernetes. The deployment had four pods, each capped at 512 MiB. Steady-state traffic was 800 requests per second with a p99 latency of 70 ms. The service passed every health check, every readiness probe, and every smoke test for forty minutes after a routine deploy. Then the first pod was OOMKilled. Within ninety seconds, all four pods cycled and the cart endpoint started returning 503 from the load balancer. The on-call pulled a goroutine profile from a survivor and counted forty-one thousand goroutines parked in chan receive, all rooted in the same handler.

The leaked code was a "fire-and-forget" recommendation prefetch. The handler kicked off a goroutine on every cart view to warm a downstream recommendations cache. The goroutine called the recommendation service over HTTP without using the request context, and it read the response body in a loop guarded only by a hardcoded ten-second sleep between retries. When the recommendation service degraded from 30 ms to 9 seconds per call, every prefetch goroutine parked for the full retry window. At 800 requests per second that meant 800 new goroutines per second piling up against a 60 to 90 second drain time. The container memory headroom evaporated in roughly six minutes once the upstream slowed.

The fix had three pieces, all of which should have been there from the start. First, the prefetch derived its own bounded child context from the request context using WithTimeout so an upstream slowdown could not pin the goroutine indefinitely. Second, the retry loop selected on ctx.Done() instead of sleeping unconditionally, so cancellation took effect within milliseconds. Third, the handler used a semaphore to cap concurrent prefetches per pod, so even with cancellation working correctly the service degraded gracefully under upstream pressure rather than fanning out to unbounded goroutines.

type recommendationPrefetcher struct {
    client *http.Client
    sem    chan struct{} // bounded concurrency, e.g. make(chan struct{}, 256)
}
 
func (p *recommendationPrefetcher) Prefetch(parent context.Context, userID string) {
    select {
    case p.sem <- struct{}{}:
    default:
        // Shed load: skip prefetch when at concurrency cap
        return
    }
 
    // parent must still be live after the handler returns; pass
    // context.WithoutCancel(r.Context()) if the prefetch should outlive it.
    ctx, cancel := context.WithTimeout(parent, 800*time.Millisecond)
    go func() {
        defer cancel()
        defer func() { <-p.sem }() // release the concurrency slot
 
        backoff := 50 * time.Millisecond
        for attempt := 0; attempt < 3; attempt++ {
            req, err := http.NewRequestWithContext(ctx, http.MethodGet,
                "https://recs.internal/v1/users/"+userID, nil)
            if err != nil {
                return // malformed request: retrying will not help
            }
            resp, err := p.client.Do(req)
            if err == nil {
                resp.Body.Close()
                return
            }
            if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
                return
            }
            // Cancellable backoff: never sleep unconditionally
            select {
            case <-ctx.Done():
                return
            case <-time.After(backoff):
                backoff *= 2
            }
        }
    }()
}

The lesson generalises. Any goroutine that outlives its triggering request must (a) derive a bounded context, (b) select on ctx.Done() in every wait, and (c) cap its own concurrency. Two of the three on their own do not save you when an upstream goes slow.

Testing Context-Aware Code Without Sleeps

Production context bugs hide in two places: code paths that ignore cancellation, and code paths that only fire on cancellation (cleanup handlers, retry loops, partial-result paths). Table-driven tests with explicit WithCancel give you deterministic coverage of both branches without time.Sleep calls that flake under CPU pressure.

func TestOrderService_PlaceOrder_Cancellation(t *testing.T) {
    cases := []struct {
        name         string
        setupCtx     func(t *testing.T) (context.Context, context.CancelFunc)
        repoLatency  time.Duration
        wantErr      error
        wantSaveCall bool
    }{
        {
            name: "happy path: context not cancelled",
            setupCtx: func(t *testing.T) (context.Context, context.CancelFunc) {
                return context.WithCancel(context.Background())
            },
            repoLatency:  10 * time.Millisecond,
            wantErr:      nil,
            wantSaveCall: true,
        },
        {
            name: "cancel before call: short-circuits without touching repo",
            setupCtx: func(t *testing.T) (context.Context, context.CancelFunc) {
                ctx, cancel := context.WithCancel(context.Background())
                cancel() // already cancelled
                return ctx, func() {}
            },
            wantErr:      context.Canceled,
            wantSaveCall: false,
        },
        {
            name: "deadline exceeded mid-call: returns DeadlineExceeded",
            setupCtx: func(t *testing.T) (context.Context, context.CancelFunc) {
                return context.WithTimeout(context.Background(), 5*time.Millisecond)
            },
            repoLatency:  100 * time.Millisecond,
            wantErr:      context.DeadlineExceeded,
            wantSaveCall: false,
        },
    }
 
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            defer goleak.VerifyNone(t)
            ctx, cancel := tc.setupCtx(t)
            defer cancel()
 
            repo := &fakeRepo{latency: tc.repoLatency}
            svc := NewOrderService(repo)
            err := svc.PlaceOrder(ctx, Order{ID: "o-1"})
 
            if !errors.Is(err, tc.wantErr) {
                t.Fatalf("err mismatch: got %v, want %v", err, tc.wantErr)
            }
            if got := repo.saveCalled.Load(); got != tc.wantSaveCall {
                t.Fatalf("save called: got %v, want %v", got, tc.wantSaveCall)
            }
        })
    }
}

Three properties worth highlighting. The setupCtx factory returns a fresh context per case so cases cannot leak state. The pre-cancelled case uses cancel() immediately rather than a tiny timeout, which removes scheduler dependence. And goleak.VerifyNone(t) runs at the end of every case, catching the regression where a buggy fix accepts cancellation but forgets to release a worker goroutine.

errgroup vs sync.WaitGroup: The Decision Rule

Both primitives coordinate goroutine lifetime. They are not interchangeable. Pick errgroup.WithContext when any sibling failure should abort the others — the canonical case is a fan-out aggregation where a partial result is worse than a fast failure (an order page where the price service fails, so showing reviews and inventory alone is misleading). Pick a plain sync.WaitGroup when each goroutine's outcome is independent and you want every result, even if some fail — for example, writing audit events to several sinks where a flaky logger should not block the others.

Question	errgroup.WithContext	sync.WaitGroup
Should one failure cancel the others?	Yes	No
Do you need the first error returned?	Yes (`g.Wait()` returns it)	Roll your own with a results slice
Is partial success acceptable?	No	Yes
Are the goroutines bounded by request scope?	Almost always	Sometimes (background workers)
Do you call `defer cancel()` yourself?	Only on a WithTimeout/WithCancel parent; errgroup cancels its derived ctx itself	Only if you created a context for the workers

The trap is using errgroup.WithContext for fire-and-forget background work. The first error cancels all siblings, which is exactly wrong when each sibling is meant to run independently to completion. If you find yourself returning nil from every g.Go body, or logging and swallowing errors so they do not abort the group, that is a signal to switch to sync.WaitGroup plus an explicit error slice.

Frequently Asked Questions

How does context cancellation prevent goroutine leaks in Go?

When a parent context is cancelled, all derived child contexts are cancelled too, signaling goroutines watching ctx.Done() to stop work and exit. Without this signal, goroutines blocked on I/O accumulate indefinitely until the process runs out of memory.

Should you use context.Background() or context.TODO() for background work in Go?

Use context.Background() for intentionally long-lived work that should outlive the request (e.g., saving a payment after the client disconnects), or context.WithoutCancel(ctx) on Go 1.21+ when that work still needs request-scoped values such as trace IDs. Use context.TODO() as a placeholder when you plan to add proper context propagation later.

What is the difference between context.WithTimeout and context.WithDeadline?

WithTimeout sets a relative duration from now (e.g., 5 seconds), while WithDeadline sets an absolute wall-clock time. WithTimeout is syntactic sugar — it calls WithDeadline(parent, time.Now().Add(timeout)) internally.

How do you detect goroutine leaks in a Go service?