HTTP/1.1 vs HTTP/2 vs HTTP/3: The Protocol Evolution Guide
Key Takeaways
- HTTP/2 migration made latency worse (180ms → 340ms) because the load balancer terminated HTTP/2 from clients but proxied HTTP/1.1 without keep-alive to backends, so the multiplexing benefit was lost
- HTTP/2 eliminates domain sharding and sprite sheets by multiplexing streams over one TCP connection, but TCP head-of-line blocking stalls all streams on packet loss (mobile networks suffer)
- HTTP/3 over QUIC isolates packet loss so only the affected stream blocks, not all streams; 1-RTT handshake (0-RTT when resumed) and connection migration (IP change on mobile) are bonuses
- HTTP/2 server push was deprecated and removed from Chrome in 2022; use 103 Early Hints with Link headers or browser prefetch based on HTML instead
The classic HTTP/2 misconfiguration. A team migrates a high-traffic API from HTTP/1.1 to HTTP/2, expecting the multiplexing win. Instead p99 latency moves the wrong way (in our case: 180ms → 340ms). The root cause is almost always at the load balancer: it terminates HTTP/2 from clients but proxies to backends over HTTP/1.1 with keep-alive disabled, creating a new TCP connection per request. The multiplexing benefit evaporates at the proxy, and because HTTP/2 clients send more requests concurrently [RFC 9113, 2022], the backends absorb a burst of connection setups that makes things worse. Understanding the protocol at the wire level would have caught this in design review.
HTTP/1.1 handles one request at a time per connection. HTTP/2[RFC 9113, 2022] multiplexes streams over one TCP connection but stalls all streams on packet loss (TCP head-of-line blocking). HTTP/3[RFC 9114, 2022] runs over QUIC[RFC 9000, 2021], so loss on one stream does not block the others. Upgrade order: enable HTTP/2 everywhere, add HTTP/3 via CDN, run HTTP/3 natively only for mobile-first workloads.
- HTTP/2 eliminates domain sharding and sprite sheets; TCP head-of-line blocking is the trade-off.
- HTTP/3 over QUIC fixes packet-loss isolation and adds 0-RTT resumption and connection migration.
- Disable HTTP/2 server push (deprecated); use Alt-Svc header to advertise HTTP/3 support.
The evolution: Sequential → Multiplexed → Independent
HTTP/1.1 [RFC 9110, 2022] (RFC 2068 in 1997, revised as RFC 2616 in 1999) handles one request at a time per connection, so browsers open up to 6 parallel TCP connections per host. This forced years of performance hacks (domain sharding, sprite sheets, bundling) to work around a protocol limitation. HTTP/2 (2015) multiplexes many requests over a single TCP connection using binary frames and stream IDs, eliminating these hacks. But TCP's ordered delivery means one lost packet blocks all streams. HTTP/3 (2022) replaces TCP with QUIC (over UDP), so a lost packet delays only the stream it belongs to, fixing the remaining bottleneck.
graph LR
subgraph H1["HTTP/1.1 — 6 TCP connections per host"]
C1["req1 → resp1"]
C2["req2 → resp2"]
C3["req3 → resp3"]
end
subgraph H2["HTTP/2 — 1 TCP conn, multiplexed streams"]
S1(("stream 1"))
S2(("stream 2"))
S3(("stream 3"))
S1 -. "lost packet blocks<br/>ALL streams" .- S2
S2 -. "(TCP HOL)" .- S3
end
subgraph H3["HTTP/3 — QUIC over UDP, independent streams"]
Q1(("stream 1"))
Q2(("stream 2"))
Q3(("stream 3"))
Q1 -. "loss only<br/>affects own stream" .- Q2
end
H1 --> H2 --> H3
Each generation fixes the bottleneck the previous created: HTTP/1's connection limit → HTTP/2's multiplexing → TCP's head-of-line blocking → HTTP/3's per-stream loss isolation.
The real cost of HTTP/1.1: With 6 connections per host and one request in flight on each, a page loading 60 resources needs at least 10 sequential rounds of 6 requests. Each round costs at least one round trip. Developers compensated with domain sharding (img1.example.com, img2.example.com, etc.), sprite sheets to combine dozens of images, and massive CSS/JS bundles, all workarounds for a serial protocol.
| Aspect [RFC 9113, 2022] | HTTP/1.1 | HTTP/2 | HTTP/3 |
|---|---|---|---|
| Transport | TCP (multiple connections) | TCP (single connection) | QUIC (UDP) |
| Format | Text | Binary frames | Binary frames |
| Concurrency | 6 connections per host | Multiplexed streams | Independent streams |
| Head-of-line blocking | Per-connection | All streams (TCP level) | Per-stream only |
| Header compression | None | HPACK (85-90% reduction) | QPACK |
| Handshake (new) | 2 RTT (TLS 1.3), 3 RTT (TLS 1.2) | 2 RTT (TLS 1.3), 3 RTT (TLS 1.2) | 1 RTT |
| Handshake (resumed) | 2 RTT | 2 RTT | 0 RTT |
| Connection migration | No | No | Yes |
HTTP/2: From Text to Multiplexed Frames
The evolution from HTTP/1.1 → HTTP/2 → HTTP/3 in one picture, showing what each version fixes [RFC 9113, 2022]:
graph TB
subgraph H1[HTTP/1.1 — sequential]
H1c[Client] -->|req 1| H1s[Server]
H1s -->|resp 1| H1c
H1c -->|req 2 — must wait| H1s
H1s -->|resp 2| H1c
H1c -.->|workaround:<br/>6 parallel TCP conns| H1s
end
subgraph H2[HTTP/2 — multiplexed over one TCP]
H2c[Client] -->|stream 1, 3, 5<br/>simultaneously| H2s[Server]
H2s -->|interleaved frames| H2c
H2s -.->|HOL blocking:<br/>one lost packet<br/>stalls all streams| H2lost[TCP retransmit]
end
subgraph H3[HTTP/3 — independent streams over QUIC]
H3c[Client] -->|stream 1, 3, 5<br/>over UDP| H3s[Server]
H3s -->|stream 1 lost<br/>stream 3 keeps flowing| H3c
H3s -->|0-RTT resumption<br/>connection migration| H3c
end
style H1c fill:#fdd
style H2lost fill:#fdd
style H3c fill:#dfd
HTTP/2 (2015) replaces HTTP/1.1's sequential request-response model with binary frames. Instead of opening 6 connections, the client sends multiple requests simultaneously over one TCP connection using stream IDs. The server multiplexes responses back, interleaving frames from different requests.
Request Stream 1: GET /api/users
Request Stream 3: GET /api/posts
Request Stream 5: GET /api/comments
All three requests travel as interleaved frames on the same TCP connection. Sites no longer need domain sharding, sprite sheets, or aggressive bundling. A single connection can handle hundreds of concurrent requests.
Binary framing and HPACK compression: HTTP/2 wraps headers and body data in binary frames (more efficient than text). HPACK maintains a dynamic table of previously sent headers, reducing overhead by 85-90% for API calls with repetitive headers (cookies, user-agent, authorization)[RFC 9113, 2022].
Server push (deprecated): HTTP/2's server push feature was removed from Chrome in 2022 — the server has no way to know if clients cached resources, wasting bandwidth. Use 103 Early Hints instead if you want early resource hints.
The remaining problem: TCP head-of-line blocking: HTTP/2 eliminated HTTP-level blocking, but introduced a subtler issue by funneling all streams into a single TCP connection. TCP guarantees ordered byte delivery[RFC 9293]. If a single packet is lost, the kernel buffers all subsequent packets (across all HTTP/2 streams) until the lost packet retransmits. On a lossy mobile network with 1-5% packet loss, this blocks all streams simultaneously.
On datacenter networks (near-zero packet loss), HTTP/2 is measurably faster than HTTP/1.1 (less overhead, more concurrency). On mobile networks with 1-5% packet loss, HTTP/2 can be slower than HTTP/1.1 because TCP head-of-line blocking stalls all multiplexed streams at once. This is the core motivation for HTTP/3[RFC 9114, 2022].
HTTP/3: Independent Streams Over QUIC
HTTP/3 (2022) replaces TCP with QUIC [RFC 9000, 2021], a transport layer built on UDP. QUIC reimplements TCP's reliability and congestion control, but with one critical architectural difference: ordering and loss recovery are tracked per stream, so a lost packet delays only the streams whose data it carried.
Independent packet loss recovery: In HTTP/2, if a packet belonging to stream 3 is lost, TCP buffers all subsequent packets (from all streams) until stream 3's packet retransmits. In HTTP/3, QUIC only buffers stream 3; streams 1, 2, and 4 continue receiving data without delay.
HTTP/2 (TCP) — packet loss blocks all streams:
Stream 1: [data] [data] ████ STALLED ████ [delayed]
Stream 2: [data] [data] ████ STALLED ████ [delayed]
Stream 3: [data] [LOST] ... retransmit ... [data]
Stream 4: [data] [data] ████ STALLED ████ [delayed]
HTTP/3 (QUIC) — packet loss affects only one stream:
Stream 1: [data] [data] [data] [data] ✓ unaffected
Stream 2: [data] [data] [data] [data] ✓ unaffected
Stream 3: [data] [LOST] ... retransmit ... [data]
Stream 4: [data] [data] [data] [data] ✓ unaffected
On mobile networks with 1-5% packet loss, this isolation means the median user experience improves significantly. One video stream having packet loss doesn't freeze your API requests[RFC 9114, 2022].
Faster connection establishment: TCP + TLS requires two sequential handshakes: TCP 3-way (1 RTT) + TLS 1.3 (1 RTT) = 2 RTTs before application data moves. QUIC folds the transport and TLS 1.3 handshakes into a single exchange, so a new connection is ready in 1 RTT. For resumed connections (clients reconnecting to a server they've visited), QUIC supports 0-RTT resumption: the client sends application data in its first packet using cached session credentials.
On a 100ms latency connection, this saves 100-200ms of handshake delay. On mobile networks with 200ms RTT, the difference is user-visible.
Connection migration without re-handshaking: TCP connections are identified by a 4-tuple (source IP, source port, destination IP, destination port). When a mobile device switches from Wi-Fi to cellular, its IP changes. The TCP connection dies, and a new one must be established (2 more RTTs). QUIC identifies connections by a connection ID, not by IP. When the device's IP changes, QUIC continues the connection without a new handshake or stream reset. This matters for users in elevators, subways, or cars, where the network changes mid-session.
QPACK header compression: HTTP/3 uses QPACK instead of HPACK. Both achieve similar compression ratios, but QPACK is designed for QUIC's independent-stream model. HPACK requires headers to be decoded in order (both sides must update the dynamic table in sync), which would reintroduce head-of-line blocking. QPACK uses separate encoder and decoder streams to allow out-of-order header delivery.
0-RTT data is cryptographically protected but replayable — an attacker who captures the initial packet can resend it. Servers must ensure 0-RTT requests are idempotent (safe to replay). Most implementations restrict 0-RTT to GET requests and reject mutations (POST, PUT, DELETE).
Server Configuration and Verification
Nginx HTTP/2: Set http2 on with TLS 1.2+. Critical: when proxying to backends, use proxy_http_version 1.1, clear the Connection header, and add a keepalive directive to the upstream block. All three are needed for nginx to reuse backend connections; otherwise each request creates a new connection, negating multiplexing.
upstream backend {
    server backend:8080;
    keepalive 32;  # idle connections cached per worker; 32 is an illustrative value
}
server {
    listen 443 ssl http2;  # nginx ≥ 1.25.1 prefers a separate "http2 on;" directive
    ssl_protocols TLSv1.2 TLSv1.3;
    http2_max_concurrent_streams 256;
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
Caddy: HTTP/2 and HTTP/3 automatic with zero config. Obtains TLS via Let's Encrypt and advertises HTTP/3 automatically.
Advertise HTTP/3 via Alt-Svc header (Nginx: add_header Alt-Svc 'h3=":443"; ma=86400' always;). Browsers will attempt QUIC; if blocked by firewall, they fall back to HTTP/2.
Verify: curl --http2 https://example.com or check Chrome DevTools Network tab for Protocol column (h2 or h3).
Go HTTP server: HTTP/2 and HTTP/3
Go's net/http upgrades to HTTP/2 automatically when TLS is enabled. HTTP/3 requires quic-go and runs alongside on UDP/443:
package main

import (
	"crypto/tls"
	"log"
	"net/http"

	"github.com/quic-go/quic-go/http3"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/health", func(w http.ResponseWriter, r *http.Request) {
		// Advertise HTTP/3 on the same port so clients can upgrade later.
		w.Header().Set("Alt-Svc", `h3=":443"; ma=86400`)
		w.WriteHeader(http.StatusOK)
		_, _ = w.Write([]byte("ok"))
	})

	h2 := &http.Server{
		Addr:    ":443", // TCP for HTTP/2 and HTTP/1.1
		Handler: mux,
		TLSConfig: &tls.Config{
			MinVersion: tls.VersionTLS12, // TLS 1.2+ required for HTTP/2
			NextProtos: []string{"h2", "http/1.1"},
		},
	}
	h3 := &http3.Server{
		Addr:    ":443", // UDP for QUIC
		Handler: mux,
		TLSConfig: &tls.Config{
			MinVersion: tls.VersionTLS13, // QUIC requires TLS 1.3
			NextProtos: []string{"h3"},
		},
	}

	// Serve QUIC in the background and log its error instead of dropping it.
	go func() {
		if err := h3.ListenAndServeTLS("cert.pem", "key.pem"); err != nil {
			log.Printf("http3 server stopped: %v", err)
		}
	}()
	log.Fatal(h2.ListenAndServeTLS("cert.pem", "key.pem"))
}
The Alt-Svc header is the upgrade hint: clients see HTTP/2 the first time, store the advertisement, then attempt HTTP/3 on subsequent requests.
Go HTTP client: forcing a specific version
Most diagnostic confusion comes from "I configured HTTP/2 but my client is still using HTTP/1.1." The Go client default selects via TLS ALPN; force it explicitly when measuring:
import (
	"crypto/tls"
	"net/http"

	"golang.org/x/net/http2"
)

// newH2Client speaks HTTP/2 only; requests fail if the server does not
// negotiate h2 via ALPN.
func newH2Client() *http.Client {
	return &http.Client{
		Transport: &http2.Transport{
			TLSClientConfig: &tls.Config{NextProtos: []string{"h2"}},
		},
	}
}

// newH11Client forces HTTP/1.1 (for A/B testing). Offering only
// "http/1.1" in ALPN keeps the server from selecting h2.
func newH11Client() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			ForceAttemptHTTP2: false,
			TLSClientConfig:   &tls.Config{NextProtos: []string{"http/1.1"}},
		},
	}
}
Production Checklist
Phase 1: Enable HTTP/2 everywhere
- Enable HTTP/2 on all HTTPS-enabled servers. Every major web server (Nginx, Apache, Caddy, IIS) and cloud load balancer (AWS ALB, GCP Cloud Load Balancer) supports it natively.
- Set http2_max_concurrent_streams 256 to prevent resource exhaustion from too many concurrent requests.
- When proxying to backends, use HTTP/1.1 keep-alive (proxy_http_version 1.1 + proxy_set_header Connection "", plus a keepalive directive in the upstream block). If you proxy over HTTP/1.1 without keep-alive, you lose multiplexing and actually regress performance.
- Remove HTTP/1.1 workarounds: no more domain sharding, no more sprite sheets, no more aggressive bundling. HTTP/2 multiplexing handles hundreds of small files efficiently over one connection.
Phase 2: Add HTTP/3 via CDN (easiest path)
- Deploy HTTP/3 by putting a QUIC-capable CDN in front of your origin. Cloudflare, Fastly, Google Cloud CDN, and AWS CloudFront all support HTTP/3 natively.
- The CDN terminates QUIC from clients and proxies to your origin over HTTP/2 or HTTP/1.1. Your users get latency benefits; your origin doesn't change.
- Set Alt-Svc header (Alt-Svc: h3=":443"; ma=86400) so browsers discover HTTP/3 and attempt it on the next request.
- If QUIC is blocked by a firewall (common in corporate networks), browsers automatically fall back to HTTP/2.
Phase 3: Monitor and measure
- Log the negotiated protocol per request: add $server_protocol to Nginx access logs. This tells you what protocol each client is using.
- Set up a Prometheus metric for protocol distribution: sum(rate(http_requests_total{protocol="HTTP/1.1"}[5m])) / sum(rate(http_requests_total[5m])). Alert if HTTP/1.1 traffic is unexpectedly >10% of volume (indicates misconfiguration or ALPN negotiation failure).
- Test on real mobile networks with packet loss (1-5%). Simulated loss in the lab doesn't always match carrier conditions.
- Monitor QUIC fallback rate: the percentage of clients that attempt HTTP/3 but fall back to HTTP/2. A high fallback rate (>15-20%) suggests UDP is being blocked or throttled (common in enterprise firewalls).
Phase 4: gRPC and internal APIs
- gRPC runs over HTTP/2 by default and benefits from multiplexing. Set keepalive time below your load balancer's idle timeout (60s by default for AWS ALB, 350s for AWS NLB; check your proxy's documentation for its own default). Use GRPC_ARG_KEEPALIVE_TIME_MS=30000 and enable GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS=1 to send pings even when idle.
- Without proper keepalive settings, idle connections are silently closed by load balancers, causing opaque transport errors on the next RPC.
Phase 5: Native HTTP/3 on origin (optional)
- Run HTTP/3 natively on your origin only if: clients connect directly (no CDN), you need connection migration for mobile API clients, or you're building a mobile-first service.
- Use Caddy (HTTP/3 built in, zero config) or Nginx with the nginx-quic module. Always keep HTTP/2 as a fallback.
nginx-quic origin config that advertises HTTP/3 correctly
The Alt-Svc header is what makes browsers actually try HTTP/3: without it, even an HTTP/3-capable client stays on HTTP/2. The configuration below enables HTTP/3 alongside HTTP/2, with 0-RTT resumption and upstream keep-alive, so the QUIC path holds up over real-world networks (mobile NAT churn, hotel Wi-Fi):
# nginx.conf: nginx-quic build (or nginx ≥ 1.25 with --with-http_v3_module)
http {
    # Use the QUIC-capable BoringSSL or a recent OpenSSL/quictls. Stock OpenSSL
    # ships without QUIC, so verify with `nginx -V 2>&1 | grep -E 'quic|http_v3'`
    # before deploy (nginx -V prints to stderr).
    ssl_protocols TLSv1.3;
    ssl_early_data on;  # 0-RTT for resumed sessions
    server {
        listen 443 quic reuseport;  # UDP for HTTP/3
        listen 443 ssl http2;       # TCP fallback for HTTP/2 + 1.1
        server_name api.example.com;
        ssl_certificate     /etc/ssl/api.example.com.crt;
        ssl_certificate_key /etc/ssl/api.example.com.key;
        # Tell capable clients HTTP/3 is available on the same authority.
        # Without this header browsers do not upgrade.
        add_header Alt-Svc 'h3=":443"; ma=86400' always;
        # Cap concurrent HTTP/2 streams per connection to bound per-client
        # resource use.
        http2_max_concurrent_streams 256;
        location / {
            proxy_pass http://upstream_api;  # define upstream_api with a keepalive directive
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            # Let the backend see which requests arrived as replayable 0-RTT data.
            proxy_set_header Early-Data $ssl_early_data;
            proxy_read_timeout 30s;
        }
    }
}
The diagnostic that surfaces "is HTTP/3 actually being used?" belongs in your incident playbook, so you can disambiguate "client says it's slow" between TCP and QUIC paths in seconds:
# 1. Verify the server advertises HTTP/3 via Alt-Svc.
curl -sI --http2 https://api.example.com/healthz | grep -i alt-svc
# Expected: alt-svc: h3=":443"; ma=86400
# 2. Force HTTP/3 (--http3-only needs curl ≥ 7.88 built with HTTP/3 support).
curl -sw 'protocol: %{http_version} | total: %{time_total}s | dns: %{time_namelookup}s | connect: %{time_connect}s | tls: %{time_appconnect}s\n' \
     -o /dev/null --http3-only https://api.example.com/healthz
# Expected: protocol: 3 ...
# 3. Compare the same request over HTTP/2. Expect roughly one round trip
#    saved on the handshake, with QUIC keeping the stream alive
#    across IP changes (the migration property HTTP/2 does not have).
curl -sw 'protocol: %{http_version} | total: %{time_total}s\n' \
     -o /dev/null --http2 https://api.example.com/healthz
If --http3-only returns Couldn't connect to server, the most common cause is a firewall blocking UDP/443: every hop between client and origin must allow UDP, and many corporate networks don't.
A Go client snippet that opts into HTTP/3 with quic-go's experimental transport — useful inside health-check probes that need to verify the QUIC path independently of the public CDN:
import (
	"net/http"

	"github.com/quic-go/quic-go/http3"
)

func newH3Client() *http.Client {
	return &http.Client{
		// http3.Transport (named http3.RoundTripper in older quic-go
		// releases) speaks HTTP/3 over QUIC only. It returns an error
		// rather than falling back to HTTP/2 if UDP/443 is blocked, which
		// is exactly what you want for a probe validating QUIC reachability.
		Transport: &http3.Transport{},
	}
}
Frequently Asked Questions
Why did our HTTP/2 migration make latency worse?
Migrating clients to HTTP/2 while proxying to HTTP/1.1 backends loses multiplexing. If your load balancer terminates HTTP/2 but proxies to backends over HTTP/1.1 without keep-alive, you create a new TCP connection per request — negating the single-connection benefit. Ensure end-to-end HTTP/2 or HTTP/1.1 keep-alive on backends.
Should we enable HTTP/2 server push?
No. Server push was removed from Chrome in 2022 and is effectively deprecated. The server can't know whether the client already has a resource cached, so pushes often waste bandwidth. Use 103 Early Hints with Link headers instead, or let browsers prefetch based on the HTML.
Is HTTP/3 production-ready?
Yes. Major CDNs and web platforms already serve significant traffic over HTTP/3. For the easiest rollout, deploy HTTP/3 behind a QUIC-capable CDN; the CDN terminates QUIC and proxies to your HTTP/2 origin. For origin-native HTTP/3, use Caddy or the nginx-quic module.
Why would we get different performance from HTTP/2 vs HTTP/3?
On datacenter networks with near-zero packet loss, the difference is negligible. On mobile networks with 1-5% packet loss, HTTP/3 (independent streams) can be 20-30% faster than HTTP/2 (TCP head-of-line blocking)[RFC 9114, 2022]. Test on real mobile networks; simulated loss doesn't always match production.
Do we need to support HTTP/1.1 clients?
Yes. Corporate proxies, old IoT devices, and some embedded systems are stuck on HTTP/1.1, while modern browsers and API clients are on HTTP/2+. Recommendation: make HTTP/2 the default, offer HTTP/3 via Alt-Svc, and keep HTTP/1.1 as a fallback. Measure traffic; HTTP/1.1 should be <5% of volume.
Keep Reading
- TCP vs UDP vs QUIC: Protocol Selection Under Production Load — The transport layer underneath HTTP: head-of-line blocking mechanics, QUIC internals, and protocol selection for real-time and streaming workloads
- REST vs gRPC vs GraphQL: A Production Decision Guide — How these application-layer protocols map onto HTTP/2 and HTTP/3, with production code for all three
- What Happens When You Type a URL: The Complete Production Guide — The full request lifecycle from DNS to TLS to HTTP protocol negotiation, with debugging tools for each layer
- Intro to gRPC and Protocol Buffers — gRPC sits directly on HTTP/2 multiplexing; this is the deep dive on what that buys you in production
- DNS Records Production Guide — The layer that resolves before HTTP even starts; CNAME/A/AAAA/HTTPS records and TTL strategies
Engineering Team
A multidisciplinary team of backend engineers, architects, and DevOps practitioners shipping deep dives into distributed systems and production infrastructure.