Should I use a low TTL (60s) or high TTL (3600s)?

Low TTL enables fast failover, but a 60s TTL generates up to 60x the queries of a 3600s TTL. Use 300s (5 min) for failover-critical services (load balancers, APIs). Use 3600s for stable records (mail servers, CDN aliases). Temporarily drop to 60s before planned migrations, then restore the original TTL afterward.

Can I CNAME the apex domain (example.com)?

No. The apex requires SOA and NS records, which can't coexist with CNAME. Use your provider's ALIAS/ANAME record instead (Cloudflare CNAME Flattening, Route 53 ALIAS, DNSimple ALIAS). If your provider doesn't offer ALIAS, use an A record pointing to a stable IP.

What's the difference between SPF and DMARC?

SPF declares which servers can send email for your domain. DMARC enforces policy when SPF/DKIM fail. Deploy SPF first (reduces spoofing), then add DKIM (cryptographic signing), then DMARC (policy enforcement). Start with p=none to collect reports before moving to p=quarantine or p=reject.

Without CAA records, what can an attacker do?

Without CAA, any publicly trusted Certificate Authority may issue a TLS certificate for your domain, so an attacker only needs to trick or compromise the weakest one. With a mis-issued certificate they can run a convincing fake version of your site. CAA restricts issuance to the CAs you name. Add CAA records for every domain: example.com. CAA 0 issue "letsencrypt.org".

#dns #networking #devops #security #terraform

DNS Records: The Complete Production Guide for Backend Engineers

Why do my SPF checks fail with too many lookups?

SPF has a hard limit of 10 DNS lookups per evaluation. Each include: mechanism counts as one, and so does every include nested inside it. If you exceed 10, SPF returns permerror and receivers may reject your email. Consolidate includes, use ip4:/ip6: mechanisms (which don't count), and track your lookup count with an SPF checking tool.

BackendBytes Engineering Team

Mar 2, 2026

16 min read

Key Takeaways

→Facebook disappeared for almost six hours when a BGP script withdrew routes to their DNS servers; third-party revenue impact estimates ranged into the tens of millions — DNS is critical infrastructure that most treat as a checkbox
→TTL controls propagation speed: 300s for services that might failover (enables 5-minute recovery), 3600s for stable records (reduces resolver load 12x), 60s temporarily during migrations
→Never CNAME the apex domain (example.com); use your provider's ALIAS/ANAME record instead. Without ALIAS support you fall back to A records and lose CDN integration flexibility
→CAA records lock down certificate issuance: without them, any publicly trusted CA can issue certificates for your domain (a real attack vector). SPF/DKIM/DMARC authenticate email; SPF is capped at 10 DNS lookups

On October 4, 2021, Facebook disappeared from the internet for nearly six hours. A routine BGP maintenance script withdrew the routes to their DNS nameservers^{[Meta 2021-10-04 outage]}. Engineers couldn't diagnose the problem because the remote management tools also depended on DNS. The fix required physical access to the Santa Clara data center. Three billion users lost access to Facebook, Instagram, and WhatsApp. The cascade is well-documented in Cloudflare's network-side analysis of the BGP and DNS withdrawal. And it started with DNS^{[RFC 1035]}.

Quick Take

DNS^{[RFC 1035]} resolves domains to IPs through a hierarchical cache. Every record type — A, CNAME, MX, CAA, SRV, NS — has a specific production use. TTLs control propagation speed; choose 300s for failover-critical services, 3600s for stable records. Secure DNS with CAA records, authenticate email with SPF/DKIM/DMARC, and always manage DNS as code.

A/AAAA route traffic; CNAME aliases; CAA locks down certificates
TTL 300s for services, 3600s for stable records, 60s before migrations
SPF declares senders, DKIM signs messages, DMARC enforces policy
Manage all records in Terraform — never console-click production DNS

The DNS resolution path

What actually happens when a client looks up api.example.com — the cache hierarchy, then the authoritative chain:

graph LR
    Client[Client app] --> Stub[Stub resolver<br/>OS / libc]
    Stub -->|cache hit| Done1[Return IP]
    Stub -->|cache miss| Recurse[Recursive resolver<br/>1.1.1.1, 8.8.8.8, ISP]
    Recurse -->|cache hit| Done1
    Recurse -->|cache miss| Root[Root nameservers<br/>13 anycasted]
    Root -->|.com NS| TLD[TLD nameservers<br/>Verisign for .com]
    TLD -->|example.com NS| Auth[Authoritative nameserver<br/>your DNS provider]
    Auth -->|A record| Recurse
    Recurse -->|cache + return| Stub
    Stub -->|cache + return| Client
    Auth -.->|TTL governs<br/>cache lifetime| Recurse
    style Done1 fill:#dfd
    style Auth fill:#ffd

Most lookups hit the recursive resolver's cache and never reach root. New domains and TTL-expired entries walk the full chain — adding 50-200 ms to first-byte latency. Set TTLs to balance freshness (low TTL = faster failover) against load on your authoritative servers (high TTL = fewer queries).
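
To make the load side of that trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes a single, constantly busy recursive resolver that refetches a record exactly once per TTL expiry; real resolvers prefetch, clamp TTLs, and go idle, so treat the numbers as an upper bound. The function name is illustrative.

```python
SECONDS_PER_DAY = 86_400


def refetches_per_day(ttl_seconds: int) -> int:
    """Upper bound on authoritative queries per day from ONE always-busy resolver.

    Assumption: the resolver refetches exactly once each time the TTL expires.
    """
    if ttl_seconds <= 0:
        raise ValueError("TTL must be positive")
    return SECONDS_PER_DAY // ttl_seconds


for ttl in (60, 300, 3600):
    print(f"TTL {ttl:>4}s -> {refetches_per_day(ttl):>4} refetches/day per resolver")
```

Under these assumptions a 60s TTL costs 1,440 refetches per resolver per day, 300s costs 288, and 3600s costs 24, which is where the 60x and 12x ratios in this guide come from.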

The quick start: Record types by purpose

Record Type	Purpose	TTL	Notes
A	Domain → IPv4	300s	Multiple A records for round-robin. Failover needs low TTL.
AAAA	Domain → IPv6	300s	Required for IPv6-only clients. Publish both A and AAAA.
CNAME	Domain alias	3600s	Never use at apex (`example.com`); use ALIAS/ANAME instead.
MX	Mail routing	3600s	Priority values; lower = preferred. Always 2+ for redundancy.
TXT	Text records	3600s	SPF, DKIM, DMARC, ACME validation — arbitrary text.
CAA	Cert authority	3600s	Locks down TLS issuance. No CAA = any CA can issue certs.
SRV	Service discovery	300s	Port + priority. Used by SIP, XMPP, K8s external DNS.
NS	Nameservers	86400s	Zone delegation. Set at registrar; rarely change.
DNSSEC	Signing	variable	RRSIG, DNSKEY, DS. Optional unless regulated.

A and AAAA: Domain-to-IP Mapping

An A record maps a domain to an IPv4 address; AAAA maps to IPv6. Multiple A records for the same domain enable round-robin load distribution:

api.example.com.    300    IN    A    203.0.113.10
api.example.com.    300    IN    A    203.0.113.11
api.example.com.    300    IN    AAAA 2001:db8::1

DNS round-robin distributes connections without health checking. If one IP fails, clients still hit it until TTL expires. Use it for coarse distribution across redundant load balancers — never as a replacement for proper load balancing.
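
A small simulation shows why the missing health check matters. This is an idealised sketch: it assumes clients rotate strictly through the answers, whereas real resolvers and clients shuffle or pin addresses. The function name is hypothetical.

```python
from itertools import cycle, islice


def failed_share(ips: list, dead: set, connections: int) -> float:
    """Fraction of connections that land on a dead IP under strict rotation."""
    picks = islice(cycle(ips), connections)
    failures = sum(1 for ip in picks if ip in dead)
    return failures / connections


pool = ["203.0.113.10", "203.0.113.11"]
# One of two backends is down, but DNS keeps handing out both addresses.
print(failed_share(pool, dead={"203.0.113.11"}, connections=1000))
```

With one of two addresses dead, half of all new connections fail until the record is changed and cached copies expire.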

IPv6 Is Not Optional

Major cloud providers, mobile carriers, and ISPs increasingly route over IPv6. IPv6-only clients must rely on NAT64/DNS64 translation if you omit AAAA records, adding latency and a failure point. Publish both A and AAAA for any public-facing service.

CNAME: Aliases and the Apex Restriction

^{[RFC 1035]}

A CNAME record aliases one domain to another — essential for CDN integration and PaaS hosting:

blog.example.com.       3600    IN    CNAME    d1234abcd.cloudfront.net.
staging.example.com.    300     IN    CNAME    example-app.fly.dev.

Critical constraint: A CNAME cannot coexist with other record types. Since the apex domain (example.com) requires SOA and NS records, you cannot CNAME at apex. Use your provider's ALIAS/ANAME record (Cloudflare CNAME flattening, Route 53 ALIAS, DNSimple ALIAS, NS1 linked record) or fall back to an A record pointing to a stable IP.
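
The coexistence rule is easy to lint before a zone change ships. The sketch below is a minimal check, assuming the zone is already available as (name, type) pairs; the function name is illustrative, and it only tolerates the DNSSEC types RRSIG and NSEC next to a CNAME.

```python
from collections import defaultdict


def cname_conflicts(records: list) -> dict:
    """Return names that hold a CNAME together with any other record type.

    records: (owner name, record type) pairs.
    DNSSEC types (RRSIG, NSEC) may sit next to a CNAME and are ignored.
    """
    allowed_with_cname = {"CNAME", "RRSIG", "NSEC"}
    types_by_name = defaultdict(set)
    for name, rtype in records:
        types_by_name[name.lower().rstrip(".")].add(rtype.upper())
    return {
        name: types - allowed_with_cname
        for name, types in types_by_name.items()
        if "CNAME" in types and types - allowed_with_cname
    }


zone = [
    ("example.com.", "SOA"),
    ("example.com.", "NS"),
    ("example.com.", "CNAME"),       # illegal: the apex already has SOA and NS
    ("blog.example.com.", "CNAME"),  # fine: nothing else lives at this name
]
print(cname_conflicts(zone))
```

Only the apex is flagged here; blog.example.com passes because the CNAME is the sole record at that name.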

MX: Email Routing with Priority

MX records direct incoming email to mail servers with priority values (lower = preferred). Always configure at least 2 MX records for redundancy:

example.com.    3600    IN    MX    10    mail1.example.com.
example.com.    3600    IN    MX    20    mail2.example.com.
 
; MX targets need their own A records
mail1.example.com.    3600    IN    A    203.0.113.50
mail2.example.com.    3600    IN    A    203.0.113.51

MX targets must be hostnames with A/AAAA records — never point MX at an IP or CNAME. If using Google Workspace or Microsoft 365, they provide the MX records.
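
As a rough illustration of both rules, the sketch below orders MX targets by preference and rejects IP-literal targets. It is a simplification: real senders also randomise among targets that share a preference. The function name is hypothetical.

```python
import ipaddress


def delivery_order(mx_records: list) -> list:
    """Sort MX targets by preference (lowest first); reject IP-literal targets.

    mx_records: (preference, target hostname) pairs.
    """
    for _, target in mx_records:
        try:
            ipaddress.ip_address(target.rstrip("."))
        except ValueError:
            continue  # not an IP literal, so it is a hostname as required
        raise ValueError(f"MX target must be a hostname, got IP {target!r}")
    return [target for _, target in sorted(mx_records)]


print(delivery_order([(20, "mail2.example.com."), (10, "mail1.example.com.")]))
```

A sender tries mail1.example.com. first and falls back to mail2.example.com. only if the preferred host is unreachable.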

TXT: SPF, DKIM, and DMARC for Email Authentication

TXT records store arbitrary text, primarily used for email authentication. Without SPF/DKIM/DMARC, receiving servers can't verify your domain — result: spam folder.

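When a recipient's mail server receives mail claiming to be from your domain, it looks up three kinds of TXT records: SPF, DKIM, and DMARC. DMARC passes when at least one of SPF or DKIM passes and the domain it authenticated aligns with the From: domain; when neither does, the receiver applies your published DMARC policy.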

graph LR
    Mail["Incoming message<br/>From: alice@example.com"] --> SPF{"SPF check<br/>(TXT lookup)"}
    Mail --> DKIM{"DKIM check<br/>(selector._domainkey)"}
    Mail --> DMARC{"DMARC policy<br/>(_dmarc.example.com)"}
    SPF -->|"sender IP in<br/>SPF allowlist?"| Verdict["Combined verdict"]
    DKIM -->|"signature valid<br/>vs public key?"| Verdict
    DMARC -->|"p=reject /<br/>quarantine / none"| Verdict
    Verdict -->|"aligned SPF or<br/>DKIM pass"| Inbox["✓ Inbox"]
    Verdict -->|"both fail<br/>+ p=reject"| Rejected["✗ Rejected"]
    Verdict -->|"both fail<br/>+ p=quarantine"| Spam["⚠ Spam folder"]

Misconfiguration tends to fail silently: mail lands in spam while the sender sees a successful SMTP response and no bounce.

SPF (Sender Policy Framework) declares who can send email for your domain:

example.com.    3600    IN    TXT    "v=spf1 include:_spf.google.com include:sendgrid.net -all"

DKIM (DomainKeys Identified Mail) signs outgoing messages cryptographically:

google._domainkey.example.com.    3600    IN    TXT    "v=DKIM1; k=rsa; p=MIIBIjANBgkqhki..."

DMARC enforces policy when SPF/DKIM fail:

_dmarc.example.com.    3600    IN    TXT    "v=DMARC1; p=reject; rua=mailto:dmarc@example.com; pct=100"

Deployment order: start with p=none to collect reports, then move to p=quarantine, then p=reject.
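
The way the three mechanisms combine can be sketched as a small decision function. This is a simplified model, not a full evaluator: it assumes the SPF and DKIM results and their alignment with the From: domain have already been computed, and it ignores pct, subdomain policy, and receiver-local overrides. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthResult:
    passed: bool   # the SPF or DKIM check itself succeeded
    aligned: bool  # the authenticated domain matches the From: domain


def dmarc_disposition(spf: AuthResult, dkim: AuthResult, policy: str) -> str:
    """Return what the published DMARC policy asks the receiver to do."""
    if policy not in {"none", "quarantine", "reject"}:
        raise ValueError(f"unknown DMARC policy: {policy!r}")
    # DMARC passes if EITHER mechanism passes AND is aligned.
    if (spf.passed and spf.aligned) or (dkim.passed and dkim.aligned):
        return "pass"
    return {
        "none": "no action requested (report only)",
        "quarantine": "quarantine",
        "reject": "reject",
    }[policy]


# Forwarded mail: SPF breaks, but the aligned DKIM signature survives.
print(dmarc_disposition(AuthResult(False, False), AuthResult(True, True), "reject"))
```

This is why DKIM matters even with a correct SPF record: forwarding commonly breaks SPF, and an aligned DKIM signature keeps the message passing DMARC.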

The SPF 10-Lookup Limit

SPF has a hard limit of 10 DNS lookups per evaluation. Each include:, a, mx, exists, and ptr mechanism (and the redirect modifier) triggers a lookup, and lookups inside nested includes count toward the same total. If exceeded, SPF returns permerror and receivers may reject your email. Consolidate includes and use ip4:/ip6: mechanisms (which don't count) for static IPs.
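
Counting lookups by hand gets error-prone once includes nest. The sketch below walks a set of SPF records and totals the lookup-triggering terms. It works on an in-memory dict instead of live DNS, the record contents and .example domains are made up for illustration, and it does not model the separate limit on void lookups.

```python
def count_spf_lookups(domain: str, records: dict, _path: tuple = ()) -> int:
    """Count DNS-querying SPF terms for `domain`, following includes/redirects.

    records: domain -> SPF TXT string (a stand-in for real DNS answers).
    """
    if domain in _path:  # guard against include loops
        return 0
    path = _path + (domain,)
    total = 0
    for term in records.get(domain, "v=spf1").split()[1:]:  # skip "v=spf1"
        term = term.lstrip("+-~?").lower()
        name = term.split(":", 1)[0].split("/", 1)[0]
        if term.startswith("include:"):
            total += 1 + count_spf_lookups(term[len("include:"):], records, path)
        elif term.startswith("redirect="):
            total += 1 + count_spf_lookups(term[len("redirect="):], records, path)
        elif name in {"a", "mx", "ptr", "exists"}:
            total += 1
        # ip4:, ip6:, and all cost no lookups
    return total


spf_records = {  # hypothetical records for illustration only
    "example.com": "v=spf1 include:_spf.mail.example include:send.example ip4:203.0.113.0/24 mx -all",
    "_spf.mail.example": "v=spf1 include:_a.mail.example include:_b.mail.example ~all",
    "_a.mail.example": "v=spf1 ip4:198.51.100.0/24 ~all",
    "_b.mail.example": "v=spf1 ip4:192.0.2.0/24 ~all",
    "send.example": "v=spf1 a:out.send.example ~all",
}
print(count_spf_lookups("example.com", spf_records))
```

Two top-level includes already cost six lookups here (three for the nested provider, two for the second sender, one for mx), which shows how quickly a handful of SaaS senders reaches the limit of 10.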

CAA: Lock Down Certificate Issuance

^{[RFC 1035]}

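CAA records specify which Certificate Authorities can issue TLS certificates for your domain. Without CAA, any publicly trusted CA can issue certificates for your domain once its domain-validation checks are satisfied, so your exposure is as wide as the weakest CA. That is a real attack vector.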

example.com.    3600    IN    CAA    0 issue "letsencrypt.org"
example.com.    3600    IN    CAA    0 issue "amazon.com"
example.com.    3600    IN    CAA    0 issuewild "letsencrypt.org"
example.com.    3600    IN    CAA    0 iodef "mailto:security@example.com"

Since September 2017, the CA/Browser Forum has required all publicly trusted CAs to check CAA records before issuing (the check was specified in RFC 6844, superseded by RFC 8659 in 2019). If your domain has CAA records and none of them names the requesting CA, issuance is denied.

Verify with: dig example.com CAA +short

CAA Can Block Cert Renewal

If you add CAA records after certificates are issued, ensure the issuing CA is listed. Let's Encrypt ACME clients fail renewal if letsencrypt.org is missing — a common source of 2 AM pages when certificates expire. Check CAA records any time you change DNS providers.
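
Before changing CAA records, it helps to check which CAs the new set would still allow. The sketch below is a simplified reading of the issue/issuewild rules for a single name: it assumes the relevant records are already fetched, and it skips the climb to parent domains and the critical flag that a real CA also handles. The function name is illustrative.

```python
def ca_may_issue(caa_records: list, ca_domain: str, wildcard: bool = False) -> bool:
    """Simplified CAA check for one name.

    caa_records: (tag, value) pairs, e.g. ("issue", "letsencrypt.org").
    """
    issue = [value for tag, value in caa_records if tag == "issue"]
    issuewild = [value for tag, value in caa_records if tag == "issuewild"]
    # Wildcard requests use issuewild when present, otherwise fall back to issue.
    relevant = issuewild if (wildcard and issuewild) else issue
    if not relevant:
        return True  # no applicable restriction published
    # Drop optional parameters after ';'. A bare ";" yields "" and matches no CA.
    allowed = {value.split(";", 1)[0].strip().lower() for value in relevant}
    return ca_domain.lower() in allowed


caa = [
    ("issue", "letsencrypt.org"),
    ("issue", "amazon.com"),
    ("issuewild", "letsencrypt.org"),
    ("iodef", "mailto:security@example.com"),
]
print(ca_may_issue(caa, "amazon.com"))                 # regular certificate
print(ca_may_issue(caa, "amazon.com", wildcard=True))  # wildcard certificate
```

With the record set from this section, amazon.com may issue regular certificates but not wildcards, because the issuewild record names only letsencrypt.org.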

SRV: Service Discovery with Port Numbers

SRV records advertise service locations with port numbers (A records can't do this). Used by SIP, XMPP, LDAP, and Kubernetes external DNS:

; Format: _service._protocol.domain  TTL  IN  SRV  priority  weight  port  target
_sip._tcp.example.com.    300    IN    SRV    10    60    5060    sip1.example.com.
_sip._tcp.example.com.    300    IN    SRV    10    40    5060    sip2.example.com.

Priority works like MX (lower = preferred). Weight distributes traffic proportionally. Most HTTP services use A/CNAME records with well-known ports (80/443).
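
The interaction of priority and weight can be sketched as a client-side selection routine. This is a minimal approximation: it picks one target from the lowest-priority group in proportion to weight, and it does not reproduce the exact running-sum procedure or retry ordering a full SRV client would use. Names are illustrative.

```python
import random


def pick_srv_target(records: list, rng=None) -> tuple:
    """Pick (target, port) from SRV records given as (priority, weight, port, target)."""
    rng = rng or random.Random()
    lowest = min(record[0] for record in records)
    candidates = [record for record in records if record[0] == lowest]
    weights = [record[1] for record in candidates]
    if sum(weights) == 0:
        chosen = rng.choice(candidates)  # all weights zero: pick uniformly
    else:
        chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen[3], chosen[2]


srv = [
    (10, 60, 5060, "sip1.example.com."),
    (10, 40, 5060, "sip2.example.com."),
]
print(pick_srv_target(srv))
```

Over many calls roughly 60% of picks go to sip1 and 40% to sip2; a record with a higher priority number would only be used once every lower-numbered target has failed.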

NS: Zone Delegation

NS records declare which nameservers are authoritative for a domain or subdomain. Use them for subdomain delegation when a separate team manages their own DNS infrastructure:

internal.example.com.    3600    IN    NS    ns1.internal-infra.example.com.
internal.example.com.    3600    IN    NS    ns2.internal-infra.example.com.

After delegation, your primary DNS provider no longer answers authoritatively for names under internal.example.com; it refers resolvers to the delegated nameservers instead.

DNSSEC: Optional but Recommended

DNSSEC adds cryptographic signatures to DNS responses, preventing tampering. Enable it if you're in a regulated industry, are a high-value phishing target, or your provider makes it one-click (Cloudflare, Route 53, Google Cloud DNS).

Key steps:

Enable DNSSEC at your DNS provider
Publish the DS record at your registrar (commonly missed)
Verify with: dig example.com +dnssec +short

DNSSEC Key Rotation

Keys need periodic rotation. If your provider handles it automatically, you're fine. If you manage it yourself, a missed rotation makes your zone unresolvable for DNSSEC-validating resolvers — worse than no DNSSEC.

TTL Strategy: Balancing Speed vs. Query Volume

^{[RFC 1035]}

TTL (Time to Live) controls caching duration. Choose based on change frequency:

Record Type	TTL	Rationale
A/AAAA (services)	300s	Fast failover, acceptable query load
A/AAAA (stable)	3600s	Mail servers, nameservers rarely change
CNAME	3600s	CDN targets stable; lower for blue/green deploys
MX, TXT, CAA, NS	3600s–86400s	Rarely change; planned updates only
Pre-migration	60s	Lower 24h before any IP change

Critical: if your A record has a 1-hour TTL and you change IPs, clients hit the old IP for up to an hour. Lower the TTL to 60s a full day before migration (at minimum, one full old TTL ahead, so every cache has picked up the short TTL), make the change, verify, then raise it back. Some resolvers (particularly ISP resolvers) ignore low TTLs and cache for a minimum of 5-30 minutes. Verify propagation across multiple resolvers with tools like dnschecker.org.
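
The timing can be worked out explicitly. The sketch below computes the earliest safe cutover and the point at which the old IP should stop receiving traffic, assuming every resolver honours TTLs exactly; the full-day lead time recommended above exists precisely because some do not. The function and key names are illustrative.

```python
from datetime import datetime, timedelta


def migration_schedule(ttl_lowered_at: datetime, old_ttl: int, temp_ttl: int = 60) -> dict:
    """Earliest safe cutover and drain time, assuming resolvers honour TTLs."""
    # Caches that fetched just before the TTL was lowered keep the old TTL.
    earliest_cutover = ttl_lowered_at + timedelta(seconds=old_ttl)
    # After cutover, the old IP can be served for at most one temporary TTL.
    old_ip_drained_by = earliest_cutover + timedelta(seconds=temp_ttl)
    return {
        "earliest_cutover": earliest_cutover,
        "old_ip_drained_by": old_ip_drained_by,
    }


plan = migration_schedule(datetime(2026, 3, 2, 9, 0), old_ttl=3600)
for step, when in plan.items():
    print(f"{step}: {when:%H:%M:%S}")
```

Lowering a 3600s TTL at 09:00 means the change should not happen before 10:00, and the old IP should be quiet about a minute after the cutover.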

GeoDNS and Health-Checked Failover

Geolocation routing returns different IPs based on client location — use for data residency (GDPR) or region-specific deployments. Latency-based routing measures client-to-endpoint RTT and returns the fastest — better for global APIs. Weighted routing distributes traffic by percentage — useful for canary deployments. Failover routing designates primary/secondary, returning secondary only when primary health checks fail.

Health-checked DNS adds active monitoring: the DNS provider probes endpoints and removes unhealthy ones from responses:

## Route 53: Create a health check
aws route53 create-health-check --caller-reference "api-prod-$(date +%s)" \
  --health-check-config '{
    "IPAddress": "203.0.113.10",
    "Port": 443,
    "Type": "HTTPS",
    "ResourcePath": "/healthz",
    "RequestInterval": 10,
    "FailureThreshold": 3,
    "EnableSNI": true,
    "FullyQualifiedDomainName": "api.example.com"
  }'

Design your /healthz endpoint to verify real dependencies (database, cache, etc.), not just return 200. Set failure threshold high (3+ failures) to survive transient issues.
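
The health check settings and the record TTL together bound how long clients keep hitting a dead endpoint. A rough estimate, assuming detection takes interval times threshold and that resolvers honour the TTL (it ignores how the provider aggregates results from multiple checkers), looks like this; the function name is illustrative.

```python
def worst_case_failover_seconds(request_interval: int, failure_threshold: int, record_ttl: int) -> int:
    """Rough upper bound on time from failure to clients receiving the new answer."""
    detection = request_interval * failure_threshold  # consecutive failed probes
    return detection + record_ttl                     # plus cached answers expiring


# Settings from the health check above, with a 60s and a 300s record TTL.
print(worst_case_failover_seconds(10, 3, 60))
print(worst_case_failover_seconds(10, 3, 300))
```

With a 10-second interval and a threshold of 3, a 60s TTL gives roughly 90 seconds of exposure, while a 300s TTL stretches it to about five and a half minutes.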

Provider support:

Type	Route 53	Cloudflare	Google DNS	NS1
Geolocation	Yes	Yes	Yes	Yes
Latency	Yes	Yes	No	Yes
Weighted	Yes	Yes	Yes	Yes
Failover	Yes (health checks)	Yes	Yes (failover routing policy)	Yes

DNS as Code: Terraform

Making DNS changes by hand in a web console invites outages: no review, no history, no quick rollback. DNS belongs in version control.

Route 53 example:

resource "aws_route53_zone" "primary" {
  name = "example.com"
}
 
resource "aws_route53_record" "apex" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "example.com"
  type    = "A"
  ttl     = 300
  records = ["203.0.113.10"]
}
 
resource "aws_route53_record" "api_primary" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "api.example.com"
  type    = "A"
  ttl     = 60
  failover_routing_policy {
    type = "PRIMARY"
  }
  set_identifier  = "api-primary"
  records         = ["203.0.113.20"]
  health_check_id = aws_route53_health_check.api_primary.id # health check resource defined elsewhere
}
 
resource "aws_route53_record" "mx" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "example.com"
  type    = "MX"
  ttl     = 3600
  records = ["10 mail1.example.com", "20 mail2.example.com"]
}
 
resource "aws_route53_record" "caa" {
  zone_id = aws_route53_zone.primary.zone_id
  name    = "example.com"
  type    = "CAA"
  ttl     = 3600
  records = [
    "0 issue \"letsencrypt.org\"",
    "0 iodef \"mailto:security@example.com\"",
  ]
}

With DNS in Terraform: peer-reviewed changes, rollback via git revert + terraform apply, audit trail, environment parity. Always use terraform import to bring existing records under management before applying changes.

Production Checklist

DNS During Incidents: First Five Minutes

When a DNS-rooted outage hits, the temptation is to start mutating records. Don't. The Facebook 2021 cascade lasted nearly six hours partly because the same DNS withdrawal also broke the tooling engineers used to diagnose the problem^{[Meta 2021-10-04 outage]}. Triage in this order: confirm authoritative reachability before touching records, isolate the resolver path before assuming a record is wrong, and only then consider rollback.

## 1. Are your authoritative nameservers reachable at all?
dig +trace example.com NS
dig @ns-1234.awsdns-12.com example.com SOA
dig @ns-1234.awsdns-12.com example.com SOA +tcp +nsid
 
## 2. Do recursive resolvers see the same answer?
for resolver in 1.1.1.1 8.8.8.8 9.9.9.9 208.67.222.222; do
  echo "=== $resolver ==="
  dig @$resolver api.example.com A +short +tries=1 +time=2
done
 
## 3. What is each resolver actually caching, and for how long?
dig @8.8.8.8 api.example.com A | awk '/^api\.example\.com\./ {print $2, $4, $5}'
 
## 4. Is a DNSSEC chain break the cause? Ask a validating resolver both ways.
dig @1.1.1.1 api.example.com +dnssec +cd   # +cd disables validation
dig @1.1.1.1 api.example.com +dnssec       # without +cd: SERVFAIL = chain break

If dig +trace stalls at the TLD step, your delegation is broken — fix the registrar's NS records, not the zone. If +trace succeeds but resolvers disagree, you have a propagation issue and need to wait for TTL expiry, not push more changes. If +cd returns answers but the validating query returns SERVFAIL, your DNSSEC signatures or DS record are out of sync — disable DNSSEC at the registrar before mutating anything else. Maintain at least one out-of-band channel (status page on a different domain, mobile-only Slack workspace, paper runbook with phone numbers) so the team can coordinate when the primary domain itself is the failure.
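
The triage order above can be written down as a small decision function, which is handy for a runbook or a chat-ops bot. This sketch assumes the four observations have already been collected from the dig commands; the function name and the final fallback message are additions for illustration, not part of the procedure above.

```python
def triage(trace_reaches_authoritative: bool, resolvers_agree: bool,
           cd_answers: bool, validating_servfail: bool) -> str:
    """Map dig observations to the first action to take, in triage order."""
    if not trace_reaches_authoritative:
        return "delegation broken: fix NS records at the registrar, not the zone"
    if not resolvers_agree:
        return "propagation in progress: wait for TTL expiry, push no more changes"
    if cd_answers and validating_servfail:
        return "DNSSEC out of sync: fix DNSSEC at the registrar before anything else"
    return "DNS looks consistent: investigate beyond DNS"  # assumed fallback


print(triage(trace_reaches_authoritative=True, resolvers_agree=True,
             cd_answers=True, validating_servfail=True))
```

Encoding the order matters more than the code itself: the function refuses to suggest record changes until delegation and resolver agreement have been ruled out.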

Multi-Region DNS: Pick the Right Routing Policy

Routing policies are not interchangeable. Picking the wrong one is how teams end up serving European traffic from us-east-1 because the resolver IP geolocates incorrectly.

Workload	Routing Policy	Why
Stateless API, global users	Latency-based	Resolver-to-region RTT is the right proxy for user latency; ignores legal boundaries.
Regulated data (GDPR, PIPL)	Geolocation (continent)	Hard residency boundary; latency-based may route a Frankfurt user to Virginia at 3 AM.
Active-passive DR	Failover with health checks	Health check pulls primary on three consecutive failures; secondary stays cold otherwise.
Canary or blue-green release	Weighted (1/99, then 50/50)	Shift traffic incrementally; combine with health checks for automatic rollback.
Single-region with anycast in front	Simple A/AAAA	The CDN handles geographic distribution; DNS just points at the anycast VIP.

A common anti-pattern: latency-based routing for a workload that writes to a region-pinned database. Users get routed to the geographically nearest read replica, then writes fail or replicate slowly because the primary is in another region. Either pin writes through a separate hostname (writes.api.example.com) with simple routing, or use geolocation routing aligned to your data residency.

resource "aws_route53_record" "api_eu" {
  zone_id        = aws_route53_zone.primary.zone_id
  name           = "api.example.com"
  type           = "A"
  set_identifier = "eu-west-1"
  geolocation_routing_policy { continent = "EU" }
  alias {
    name                   = aws_lb.eu_west_1.dns_name
    zone_id                = aws_lb.eu_west_1.zone_id
    evaluate_target_health = true
  }
}

Always configure a default record (country = "*" in Route 53). Without it, requests from unmapped locations get a NODATA (no answer) response and your service is invisible to those users.

DNSSEC Done Right: KSK Rotation Without Outages

DNSSEC has two key types: the Key Signing Key (KSK) signs the DNSKEY RRset and is referenced by the DS record at the parent zone; the Zone Signing Key (ZSK) signs everything else. Rotating the ZSK is mostly automatic; rotating the KSK requires coordinating with your registrar. Get this wrong and the entire zone goes SERVFAIL for validating resolvers, which today serve a large share of internet users^{[RFC 1035]}.

The safe sequence is a double-DS rollover: publish the new DS alongside the old one, wait for the parent's DS TTL plus a safety margin, switch signing to the new KSK, then remove the old DS and old DNSKEY once cached copies of the old keys have expired.

## Generate a new KSK (BIND example; cloud providers automate this)
dnssec-keygen -a ECDSAP256SHA256 -f KSK -n ZONE example.com
 
## Generate the DS record to publish at the registrar
dnssec-dsfromkey -2 Kexample.com.+013+12345.key
 
## Verify the chain after publishing the new DS
dig example.com DNSKEY +dnssec +short
dig example.com DS @a.gtld-servers.net. +short
delv @1.1.1.1 example.com A +rtrace   # validates end-to-end
 
## Watch for SERVFAIL during the rollover window
dig @1.1.1.1 example.com +dnssec | grep -E 'status|flags'

Algorithm choice matters: prefer ECDSA P-256 (algorithm 13) over RSA-2048 for smaller signatures, smaller responses, and a lower amplification factor. Rotate the KSK at most once a year; ZSK rotation every 30-90 days is reasonable. Most teams should let the DNS provider (Route 53, Cloudflare, NS1) handle this automatically; only roll your own DNSSEC key management if you have a compliance requirement that forbids the provider holding key material. ^{[Terraform Docs]}

Frequently Asked Questions

Why can't I CNAME at apex?

The apex domain (example.com) must have SOA and NS records. A CNAME cannot coexist with other record types on the same name. Use ALIAS/ANAME records (Cloudflare CNAME flattening, Route 53 ALIAS, DNSimple ALIAS) or fall back to A records pointing to a stable IP.

How do I know if my DNS change has propagated?

Use dig api.example.com to see the TTL remaining on cached records. Lower TTLs propagate faster, but some ISP resolvers ignore low TTLs and cache for 5-30 minutes anyway. Verify across multiple public resolvers: dig @8.8.8.8, dig @1.1.1.1, dig @9.9.9.9.

What happens if I exceed the SPF 10-lookup limit?

SPF evaluation returns permerror and receivers may reject or defer your email entirely. Count lookups with dig TXT _spf.google.com to see nested includes. Consolidate include directives and use ip4:/ip6: for static IPs (which don't count as lookups).

Can I change my DNS provider without breaking anything?

Yes, but verify that every record carried over, CAA included. If the migrated CAA set omits a CA you rely on (including any CA the new provider uses for managed certificates), certificate renewal will fail, often unnoticed until expiry. Always check: dig example.com CAA +short.

Why should health checks check the database?

Health checks should verify real dependencies. If your healthz endpoint always returns 200 but your database is down, DNS will keep returning the IP and traffic still fails. Check database connectivity with a timeout; set failure threshold high (3 failures) to survive transient issues.

Keep Reading

What Happens When You Type a URL: The Complete Production Guide — DNS is layer one of the request lifecycle; this walks through every subsequent layer from TCP handshake to CDN edge logic
Terraform in Production: Modules, State Management, and CI/CD Patterns — The Terraform patterns used in this article's DNS-as-code section, plus state locking, module design, and CI/CD integration
HTTP/1.1 vs HTTP/2 vs HTTP/3: The Protocol Evolution Guide — After DNS resolves the IP, the protocol negotiation determines how fast data moves; covers HTTP/2 multiplexing and QUIC's 0-RTT handshake
Kubernetes Networking Deep Dive — How CoreDNS handles in-cluster service resolution and the ndots:5 tax that surprises every team migrating to Kubernetes
TCP vs UDP vs QUIC: Protocol Selection Under Production Load — DNS over UDP (port 53), DoT/DoH transport, and the resolution failover semantics

BackendBytes Engineering Team

Engineering Team

A multidisciplinary team of backend engineers, architects, and DevOps practitioners shipping deep dives into distributed systems and production infrastructure.

Read Next

HTTP/1.1 vs HTTP/2 vs HTTP/3: The Protocol Evolution Guide

OAuth2 and OpenID Connect: Production Security Patterns

Consistent Hashing: The Algorithm Behind Every Scalable Distributed System
