CDN / WAF Concepts
What a CDN is for
A CDN sits between users and your origin to cache content closer to clients, absorb traffic spikes, terminate TLS at the edge, and optionally enforce WAF or bot controls before the request reaches your infrastructure. The win is latency and origin offload, but only if you are deliberate about what is cacheable and what must always reach origin.
Not everything belongs in cache. Static assets, product pages, and API responses with explicit freshness rules are good candidates. Authenticated HTML, admin paths, and anything that depends on per-user state should normally bypass. If the origin is nginx or Apache, the edge should complement your origin config, not replace the need to understand it; see Nginx Reverse Proxy and Apache Basics for origin-side patterns.
Typical split
- Cache aggressively: /assets/*, versioned JS/CSS, images, docs, public API responses with explicit TTL
- Cache carefully: public HTML with short TTL + stale-while-revalidate
- Do not cache by default: /login, /admin, /account, cart/checkout, authenticated APIs
Remember that browser cache and CDN cache are separate layers. A response might be cacheable at the edge but revalidated by browsers on every request, or the reverse. Make that distinction explicit in your headers and in your incident notes.
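One way to keep the two layers distinct is to emit separate freshness directives: max-age for browsers and s-maxage for shared caches such as the CDN. The sketch below maps the typical split to Cache-Control values. The policy table, the TTLs, and the function name are illustrative assumptions, not recommendations for any specific provider.

```python
from fnmatch import fnmatch

# Hypothetical policy table: (path pattern, Cache-Control value).
# max-age applies to browsers; s-maxage overrides it for shared caches (the CDN).
POLICIES = [
    ("/assets/*", "public, max-age=31536000, immutable"),
    ("/login", "no-store"),
    ("/admin*", "no-store"),
    ("/account*", "no-store"),
]

# Public HTML: browsers revalidate every time, the edge may serve it for 60 s.
DEFAULT_PUBLIC_HTML = "public, max-age=0, s-maxage=60, stale-while-revalidate=30"


def cache_control_for(path: str, authenticated: bool) -> str:
    """Return the Cache-Control header value for a response."""
    if authenticated:
        # Per-user responses never belong in a shared cache.
        return "private, no-store"
    for pattern, value in POLICIES:
        if fnmatch(path, pattern):
            return value
    return DEFAULT_PUBLIC_HTML
```

Keeping this mapping in one place makes it easy to see, during an incident, which layer was allowed to hold a given response and for how long.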
Cache keys and variant control
The cache key decides which requests share the same object. If the key is too coarse and leaves out something the response depends on, users receive the wrong content. If it is too fine and includes values that do not change the response, your hit rate collapses and the CDN becomes an expensive pass-through.
# Conceptual cache key policy
key = scheme + host + normalized_path + selected_query_params
ignore_query = utm_* fbclid gclid
vary_headers = Accept-Encoding
bypass_on = Authorization sessionid csrftoken
Three practical rules:
- Normalize marketing junk out of the key. ?utm_source=... should not create a new cache object.
- Do not vary on cookies or headers unless the origin response truly changes for that value.
- Never let authenticated traffic share a cache object with anonymous traffic.
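The conceptual policy and the three rules can be sketched as a small key-building function. This is a minimal illustration, not how any particular CDN computes keys: the function name, the gzip/identity split for Accept-Encoding, and the sorted query order are assumptions made for the example.

```python
from fnmatch import fnmatch
from urllib.parse import parse_qsl, urlencode, urlsplit

IGNORED_QUERY = ("utm_*", "fbclid", "gclid")  # taken from the conceptual policy
BYPASS_COOKIES = ("sessionid", "csrftoken")


def cache_key(url, headers=None, cookies=None):
    """Return a cache key string, or None when the request must bypass cache."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    cookies = cookies or {}

    # Authenticated traffic never shares an object with anonymous traffic.
    if "authorization" in headers or any(c in cookies for c in BYPASS_COOKIES):
        return None

    parts = urlsplit(url)
    # Drop tracking parameters and sort the rest so parameter order is irrelevant.
    kept = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if not any(fnmatch(k, pattern) for pattern in IGNORED_QUERY)
    )
    key = f"{parts.scheme}://{parts.netloc.lower()}{parts.path or '/'}"
    if kept:
        key += "?" + urlencode(kept)

    # Accept-Encoding is the only header allowed to create variants here.
    encoding = "gzip" if "gzip" in headers.get("accept-encoding", "") else "identity"
    return f"{key}|enc={encoding}"
```

With this sketch, a request for https://example.com/p?id=7&utm_source=news and one for https://example.com/p?id=7 resolve to the same key, while any request carrying an Authorization header or a session cookie gets no key at all.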
At the origin, keep the same intent visible so bypass logic is not hidden inside the CDN UI alone:
map $http_cookie $skip_cache {
default 0;
~*(session|csrftoken|auth_token)= 1;
}
proxy_cache_bypass $http_authorization $skip_cache;
proxy_no_cache $http_authorization $skip_cache;
That origin-side pattern mirrors the same logic described in Nginx Reverse Proxy. If you cannot explain the cache key in one sentence, it is probably too complicated.
Origin protection and real client IP
The edge only matters if the origin is protected from direct access. Otherwise an attacker can hit the origin IP directly and skip the CDN, the WAF, and any rate limit enforced there. Lock down the origin to trusted edge addresses and make sure the origin rewrites the client IP correctly.
server {
listen 443 ssl http2;
server_name origin.example.com;
allow 198.51.100.0/24;
allow 2001:db8:1234::/48;
deny all;
set_real_ip_from 198.51.100.0/24;
set_real_ip_from 2001:db8:1234::/48;
real_ip_header X-Forwarded-For;
real_ip_recursive on;
}
Important nuance: X-Forwarded-For is only trustworthy when it comes from a trusted proxy. From the open internet, it is user input. If your provider injects a dedicated single-IP header such as True-Client-IP or CF-Connecting-IP, prefer that and trust only the provider ranges that can set it.
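To make the trust rule concrete, here is a rough sketch of recursive client-IP resolution: the header is ignored unless the TCP peer is a trusted proxy, and the chain is walked from right to left until the first untrusted hop. The ranges are the documentation examples from the config above, and the function names are assumptions; this approximates the idea and is not a reimplementation of any server's real-IP module.

```python
import ipaddress

# Example ranges reused from the origin config; substitute your provider's ranges.
TRUSTED_PROXIES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8:1234::/48"),
]


def is_trusted(addr: str) -> bool:
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in TRUSTED_PROXIES)


def client_ip(peer_addr: str, x_forwarded_for: str = "") -> str:
    """Resolve the client address from the peer and an X-Forwarded-For value."""
    if not is_trusted(peer_addr):
        # From the open internet the header is user input: ignore it.
        return peer_addr

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    candidate = peer_addr
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            break  # garbage entry: keep the last address that parsed
        candidate = hop
        if not is_trusted(hop):
            break  # first untrusted hop from the right is the client
    return candidate
```

Entries to the left of the first untrusted hop are never consulted, which is why a client that prepends a fake address to the header cannot change the result.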
On Apache, the same concept is handled by mod_remoteip. If you want the origin host firewall to enforce the same boundary, pair the webserver allowlist with firewalld Rich Rules or upstream security group rules.
Cache poisoning pitfalls
Cache poisoning happens when user-controlled input influences the cached response in a way the key does not capture. The classic outcome is that one malicious request stores the wrong object, then normal users receive it.
Common causes:
- Origin behavior changes based on Host, X-Forwarded-Host, or similar headers, but the cache key does not.
- Marketing or debug query parameters change the response and are accidentally cacheable.
- Redirects, 404 pages, or error bodies echo user input and then get cached.
- Authenticated and anonymous responses share a key because cookies or auth headers were ignored.
# Bad: user-controlled forwarded host changes the response
proxy_set_header X-Forwarded-Host $http_x_forwarded_host;
# Better: force forwarded host to the validated request host
proxy_set_header X-Forwarded-Host $host;
proxy_set_header Host $host;
Poisoning smell test
- Can a header or query string change the HTML body, redirects, or asset URLs?
- Is that same value present in the cache key?
- If not, assume the object is poisonable until proven otherwise.
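The smell test reduces to a set difference: inputs that shape the response minus inputs present in the key. A small sketch of that audit follows. The route inventory is made up for illustration, and in practice you would build the two lists by reading the origin code and the edge configuration.

```python
def unkeyed_inputs(response_depends_on, cache_key_includes):
    """Return request inputs that shape the response but are absent from the key."""
    def normalize(names):
        return {n.strip().lower() for n in names}

    return sorted(normalize(response_depends_on) - normalize(cache_key_includes))


# Hypothetical audit of one route: the origin builds asset URLs from
# X-Forwarded-Host, but the edge keys only on host, path, and one query param.
risky = unkeyed_inputs(
    response_depends_on=["Host", "X-Forwarded-Host", "lang"],
    cache_key_includes=["host", "path", "lang"],
)
print(risky)  # ['x-forwarded-host']
```

Anything the function returns should either be added to the key, stripped or overwritten before it reaches the origin, or made to disable caching for that response.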
WAF rule classes
WAF rules are best understood by class rather than vendor brand. Most managed rule sets are trying to detect the same families of bad input and bad behavior:
- Protocol violations: malformed HTTP, illegal encodings, suspicious header structure.
- Injection attacks: SQLi, XSS, command injection, template injection.
- Traversal and file abuse: path traversal, LFI/RFI, upload abuse.
- Credential attacks: login brute force, account enumeration, credential stuffing.
- Bot and scraper control: abnormal request rates, known bad ASNs, poor browser fingerprints.
The WAF is not a substitute for validation in the application. It is a compensating control and an early filter. If the app trusts spoofable headers, reflects unsanitized input, or exposes a dangerous debug route, the WAF may slow the attack down but it will not make the design safe.
Managed rules, custom rules, and CRS paranoia
Managed rules are vendor-maintained signatures and heuristics. They are the fastest way to cover broad exploit classes, but they do not know your application's normal traffic shape. Custom rules handle your actual risk: geo restrictions, protecting admin paths, blocking known abusive patterns, or creating allowlists for endpoints that intentionally look "weird".
If you use ModSecurity with the OWASP Core Rule Set (CRS), the nearest concept to CDN WAF "sensitivity" is the combination of paranoia level and anomaly scoring.
SecRuleEngine On
IncludeOptional /etc/modsecurity.d/owasp-crs/crs-setup.conf
# Start conservative in block mode. Set these before the rules are included so
# CRS initialization sees them; a threshold of 10 is looser than the CRS default of 5.
SecAction "id:900000,phase:1,nolog,pass,t:none,setvar:tx.paranoia_level=1"
SecAction "id:900110,phase:1,nolog,pass,t:none,setvar:tx.inbound_anomaly_score_threshold=10"
IncludeOptional /etc/modsecurity.d/owasp-crs/rules/*.conf
Operationally:
- PL1 is the sane starting point for most public apps.
- PL2 catches more edge cases but raises false-positive risk.
- PL3/PL4 are usually for very controlled traffic profiles, not random internet user traffic.
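Paranoia level controls which rules run; anomaly scoring controls when their matches add up to a block. A worked sketch of the arithmetic, using the per-severity scores CRS assigns by default (critical 5, error 4, warning 3, notice 2), is below. The function name is an assumption, and the sketch models only the inbound sum and comparison.

```python
# Default per-match scores used by CRS anomaly scoring, by rule severity.
SEVERITY_SCORE = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}


def inbound_decision(matched_severities, threshold):
    """Sum the scores of matched rules and compare against the threshold."""
    score = sum(SEVERITY_SCORE[s] for s in matched_severities)
    return score, ("block" if score >= threshold else "pass")


# One critical match is not enough at threshold 10, but it is at threshold 5.
print(inbound_decision(["CRITICAL"], threshold=10))              # (5, 'pass')
print(inbound_decision(["CRITICAL"], threshold=5))               # (5, 'block')
print(inbound_decision(["CRITICAL", "CRITICAL"], threshold=10))  # (10, 'block')
```

This is why a higher threshold is a reasonable way to start in block mode: a single noisy rule cannot block a request on its own, while stacked matches still do.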
When a single endpoint is noisy, tune precisely instead of disabling half the ruleset:
# Example: remove a specific CRS rule for one upload endpoint
SecRule REQUEST_URI "@beginsWith /api/upload" \
"id:100100,phase:1,pass,nolog,ctl:ruleRemoveById=941100"
Managed rules give you breadth. Custom rules and targeted exclusions give you survivability.
False positives and rate limits
Every new WAF deployment has false positives. The goal is not zero noise on day one; it is fast triage, narrow exclusions, and keeping protection active while you learn real traffic.
Good rollout pattern:
- Enable new rules in detect or log mode where possible.
- Capture request IDs, rule IDs, action, host, path, and client IP.
- Tune only the rule or endpoint that is noisy.
- Promote to blocking once you have evidence.
Rate limiting has the same dependency: it only works once the real client IP has been restored. A limit keyed on the CDN egress IP is useless; a limit keyed on the true client is valuable. Be careful with NAT-heavy populations such as offices, schools, mobile carriers, and corporate VPNs, where many users share one address.
limit_req_zone $binary_remote_addr zone=login_zone:10m rate=5r/m;
server {
location = /login {
limit_req zone=login_zone burst=10 nodelay;
proxy_pass http://app_backend;
}
}
Safer rate-limit targets
- POST /login or /token, not the whole site
- Per client IP after real-ip restoration
- Optionally combine with username, API key, or session identity if the edge product supports it
- Exempt health checks and synthetic monitoring explicitly
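The effect of combining client IP with an identity signal can be sketched with a simplified token bucket. This is an illustration of the keying choice only: it is not the algorithm any specific server or edge product uses, and the class name, the rate, and the (ip, username) key are assumptions.

```python
class TokenBucket:
    """Simplified limiter: `rate` tokens per second, at most `burst` stored per key."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.state = {}  # key -> (tokens, timestamp of last request)

    def allow(self, key, now):
        tokens, last = self.state.get(key, (self.burst, now))
        # Refill for the time elapsed since the last request, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1
        self.state[key] = (tokens - 1 if allowed else tokens, now)
        return allowed


# Five requests per minute with a burst of ten, keyed on IP plus username so
# that one office NAT address does not exhaust the budget for every user.
limiter = TokenBucket(rate=5 / 60, burst=10)
first_ok = limiter.allow(("203.0.113.50", "alice"), now=0.0)
```

Keying on the pair means two users behind the same egress address get independent budgets, while a single client hammering one account is still throttled.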
If your app serves APIs to partners, publish rate-limit behavior clearly. Silent blocking looks like packet loss to the caller and creates awful troubleshooting loops.
Edge logging and incident evidence
If the CDN or WAF blocks something important and you cannot answer "which rule, which request, which client, which origin status," you do not have usable edge logging. Treat edge logs as part of your observability stack, not just a security add-on.
At minimum, keep:
- timestamp, host, method, path, query, response status
- edge action: allow, challenge, block, rate-limited, cache hit or miss
- request ID or trace ID that can be correlated with origin logs
- real client IP, geo, ASN, and TLS details if the platform provides them
- WAF rule ID or managed rule category when a request is challenged or blocked
log_format edge_origin
'$remote_addr real=$http_x_forwarded_for req_id=$http_x_request_id '
'host=$host status=$status upstream_status=$upstream_status '
'"$request"';
access_log /var/log/nginx/access.log edge_origin;
curl -I https://www.example.com/assets/app.js
curl -I https://www.example.com/login
From those responses, inspect headers such as Cache-Control, Age, Via, provider cache status headers, and any request ID propagated by the edge. Pair that with the wider guidance on Observability Overview and Troubleshooting when you are building an incident trail.
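Once the origin writes the edge_origin format, correlating a line with edge logs is a matter of extracting the request ID. The parser below is a minimal sketch matched to that format; the sample line is invented for illustration and is not real log output.

```python
import re

# Mirrors the edge_origin log_format field by field.
LINE = re.compile(
    r'^(?P<peer>\S+) real=(?P<xff>.*?) req_id=(?P<req_id>\S+) '
    r'host=(?P<host>\S+) status=(?P<status>\d{3}) '
    r'upstream_status=(?P<upstream_status>\S+) "(?P<request>[^"]*)"$'
)


def parse_line(line):
    """Return the fields of one edge_origin log line, or None if it does not match."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


# Invented sample line in the edge_origin format.
sample = (
    '198.51.100.7 real=10.0.0.1, 203.0.113.50 req_id=abc123 '
    'host=www.example.com status=200 upstream_status=200 '
    '"GET /assets/app.js HTTP/1.1"'
)
fields = parse_line(sample)
print(fields["req_id"], fields["status"])  # abc123 200
```

The forwarded-for field is matched lazily because it can contain several comma-separated addresses with spaces between them.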
mTLS origin pull
Mutual TLS on the origin connection lets the CDN prove its identity to your origin with a client certificate. This is stronger than IP allowlists alone because the origin validates a certificate chain, not just the source address. In practice you often use both: CDN IP allowlists for reachability and mTLS for authenticated origin pulls.
server {
listen 443 ssl http2;
server_name origin.example.com;
ssl_certificate /etc/pki/tls/certs/origin.example.com/fullchain.pem;
ssl_certificate_key /etc/pki/tls/private/origin.example.com/privkey.pem;
ssl_client_certificate /etc/pki/tls/certs/cdn-origin-pull-ca.pem;
ssl_verify_client on;
ssl_verify_depth 2;
location / {
proxy_pass http://app_backend;
include /etc/nginx/snippets/proxy-headers.conf;
}
}
The origin still needs a normal server certificate and hostname validation, which is why certificate automation remains relevant even when a CDN fronts the site; see ACME & Certbot. If you enforce mTLS, monitor expiry on both sides: the origin server certificate and the client CA or client certificate used by the CDN.
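Expiry monitoring on either side comes down to comparing a certificate's notAfter value with the current time. A small sketch using the standard library follows. It assumes you already have the notAfter string in the format Python's ssl module reports for peer certificates, and the function name is hypothetical.

```python
import ssl


def days_until_expiry(not_after: str, now: float) -> float:
    """Days between `now` (epoch seconds) and a certificate's notAfter time.

    `not_after` uses the "Jan  5 09:34:43 2018 GMT" format that the ssl
    module reports for peer certificates.
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


# Example with fixed timestamps so the arithmetic is easy to follow.
now = ssl.cert_time_to_seconds("Dec 26 09:34:43 2017 GMT")
print(days_until_expiry("Jan  5 09:34:43 2018 GMT", now))  # 10.0
```

Run the same check against the origin server certificate and the CDN's client certificate or CA, and alert well before the value approaches zero.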
Troubleshooting
When debugging edge behavior, test both paths: through the CDN and, where safe and authorized, directly to origin. The question is always "did the edge change the request, did the origin trust the wrong thing, or did the cache key group the wrong traffic together?"
| Symptom | Likely cause | Fix |
|---|---|---|
| Cache hit ratio is terrible | Key includes junk query strings or cache bypass is triggered by broad cookies. | Normalize query params, narrow bypass conditions, and confirm what really varies. |
| Logged-in pages are being cached | Auth cookies or Authorization were not excluded from cache logic. | Bypass cache on auth headers and session cookies at both edge and origin. |
| Origin logs only show CDN egress IPs | real_ip or mod_remoteip is not configured correctly. | Trust only the CDN ranges and restore the client IP from the correct header. |
| Attack traffic reaches origin directly | Origin IP is public and not restricted. | Enforce allowlists at the webserver, firewall, or security group level; consider mTLS origin pull. |
| Users receive the wrong redirect, locale, or asset URL from cache | Cache poisoning via host or forwarded-host handling, or query params changed the response without changing the key. | Normalize host headers, review reflected input, and align cache key with response variants. |
| WAF blocks legitimate JSON or GraphQL requests | Managed rules or CRS paranoia are too aggressive for that endpoint. | Start at PL1 or detection mode, capture the rule ID, and create a narrow exclusion for that route only. |
| Rate limits hit whole offices or VPN users | The key is only client IP and many users share one egress IP. | Scope rate limits to risky endpoints and combine with identity signals where supported. |
| mTLS origin pull handshakes fail | Wrong client CA bundle, wrong SNI or hostname, or expired origin-pull cert material. | Verify the trusted client CA, confirm the origin certificate and name, and inspect both edge and origin TLS logs. |
| Incidents are hard to reconstruct | Edge logs lack rule IDs, request IDs, or client identity fields. | Add correlation fields and keep edge logs where operations teams can search them during an incident. |