PKI Design
- Two tiers. Offline root signs an intermediate; the intermediate signs everything else. Never issue end-entity certs directly from the root.
- The root's private key lives on a machine that has never seen a network. Ideally an HSM or a smartcard kept in a safe; at minimum an air-gapped laptop you only boot to sign.
- Subject Alternative Name is where the names live. CN is decorative. Modern clients ignore CN entirely.
- Short-lived leaves beat revocation. A 24-hour certificate has no CRL problem.
- Put Extended Key Usage on everything. A cert without EKU is a cert that can do anything.
Why two tiers
A single self-signed CA that issues every certificate on your network is tempting — it is one key, one file, one command. It is also unrecoverable. The day that key is compromised (a backup leaks, an admin's laptop is lost, a contractor walks off with a USB) every certificate you have ever issued is burnt, and you have no way to tell the fleet "trust this new one instead" without reconfiguring every client by hand.
The industry answer is a two-tier hierarchy:
- Root CA. Long-lived (10-20 years), small (issues only intermediates), offline. Its job is to vouch for intermediates. It signs a handful of certificates in its lifetime.
- Intermediate (issuing) CA. Medium-lived (3-5 years), online, automated. This is what your ACME server, smallstep step-ca, Vault PKI mount, or Microsoft ADCS enterprise CA actually is. It issues the end-entity certificates.
When the intermediate is compromised you revoke it with the root and stand up a new one. Clients still trust the root, so they accept the new intermediate's signatures with no change on their end. The rotation is bounded and safe.
Key material hygiene
The root's private key is the most valuable secret in your estate. It deserves more than a file on a shared drive.
In order of decreasing safety
- Hardware Security Module (HSM), ideally validated to FIPS 140-2 Level 3: for example YubiHSM 2, Thales Luna, or AWS CloudHSM. The key never leaves the HSM; you issue sign operations against it. SoftHSM does not count: it is a software emulation, useful for testing only.
- Smartcard / YubiKey PIV slot. Good enough for a small estate. Keep two, stored apart and enrolled identically as "root key A" and "root key B". Only one is the canonical signer; the other is the disaster-recovery spare in a safe in a different building.
- Air-gapped laptop. A dedicated machine, never networked, booted from a live USB or a fully disk-encrypted install. Key lives in a LUKS volume on the machine. Acceptable for most internal PKIs.
- Root on the same server as the intermediate. Do not do this. Then you have one tier pretending to be two.
Ceremony
"Key ceremony" is not a silly word. When you generate the root key, and whenever you use it (to sign a new intermediate, to publish a CRL), run it as a logged, two-person procedure:
- Two people present, one operating, one witnessing.
- Written (paper) checklist of the exact commands.
- Photos or a video of each screen.
- The signed artefact and its hash written to a log.
- The machine shut down, the key media returned to the safe, the safe logged.
This sounds like theatre, and it is — but the theatre is the audit trail. When you are asked in 2030 "who signed this intermediate?" the answer exists.
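The "signed artefact and its hash written to a log" step can be as small as the sketch below. It is an illustration only: the file names, the operator and witness names, and the one-line log format are hypothetical, and sha256sum assumes GNU coreutils on the ceremony machine.

```shell
#!/bin/sh
# Sketch: append one ceremony log line per signed artefact.
# All names below (files, operator, witness, log format) are hypothetical.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

artefact="$workdir/intermediate.cert.pem"   # stand-in for the real signed cert
log="$workdir/ceremony.log"                 # in practice: kept with the safe log
printf 'placeholder for the signed intermediate\n' > "$artefact"

# Hash the artefact so the log pins exactly which bytes were signed.
digest=$(sha256sum "$artefact" | cut -d' ' -f1)

# One UTC-timestamped line: who operated, who witnessed, what, and its hash.
printf '%s operator=%s witness=%s file=%s sha256=%s\n' \
  "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "alice" "bob" \
  "$(basename "$artefact")" "$digest" >> "$log"

cat "$log"
```

Copying the same line onto the paper checklist gives the witness something to countersign.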
Subjects, SAN, CN
Every X.509 certificate has a Subject Distinguished Name (DN) and zero-or-more Subject Alternative Names (SAN). For nearly two decades the CN inside the DN was also used to identify hosts, so people got into the habit of putting CN=www.example.com on web server certs. RFC 2818 deprecated that in 2000 and Chrome stopped honouring CN entirely in 2017. Put names in the SAN.
Subject: CN = web-01.prod.example.internal, O = Example, C = GB
SAN:     DNS:web-01.prod.example.internal, DNS:web.prod.example.internal,
         IP:10.2.3.4
The CN can be anything human-readable — it is effectively a label. Modern tooling will not look at it for hostname matching. Putting the primary hostname in the CN is a convention that helps humans scanning a cert with openssl x509 -subject; it does not affect TLS behaviour.
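To see the CN being ignored, the sketch below builds a throwaway self-signed certificate whose CN is a plain label and whose hostnames sit in the SAN, then asks openssl to match names against it. The CN value label-only, the file names, and the temp directory are assumptions for illustration; the -ext option assumes OpenSSL 1.1.1 or later.

```shell
#!/bin/sh
# Sketch: a cert whose CN is only a label; hostnames live in the SAN.
# Throwaway key and self-signed cert in a temp dir; names are examples.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Minimal request config so the run does not depend on a system openssl.cnf.
cat > req.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = label-only
EOF

# SANs go in an extension file applied at signing time.
cat > san.ext <<'EOF'
subjectAltName = DNS:web-01.prod.example.internal, DNS:web.prod.example.internal
EOF

openssl ecparam -name prime256v1 -genkey -noout -out demo.key
openssl req -new -config req.cnf -key demo.key -out demo.csr
openssl x509 -req -in demo.csr -signkey demo.key -days 1 \
  -extfile san.ext -out demo.pem 2>/dev/null

# Show the SAN extension as stored in the cert.
openssl x509 -in demo.pem -noout -ext subjectAltName

# With DNS SANs present, hostname checks use them and skip the CN.
san_check=$(openssl x509 -in demo.pem -noout -checkhost web-01.prod.example.internal)
cn_check=$(openssl x509 -in demo.pem -noout -checkhost label-only)
echo "$san_check"
echo "$cn_check"
```

The first check should report that the hostname does match the certificate and the second that it does NOT, even though label-only is exactly what the CN says.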
Key Usage and Extended Key Usage
A certificate without Extended Key Usage (EKU) is a certificate that a naive verifier will accept for anything: TLS server, TLS client, code signing, email S/MIME. Set EKU on every leaf.
| Purpose | Key Usage | Extended Key Usage (OID) |
|---|---|---|
| Web server | digitalSignature, keyEncipherment | serverAuth (1.3.6.1.5.5.7.3.1) |
| Web/mTLS client | digitalSignature | clientAuth (1.3.6.1.5.5.7.3.2) |
| Code signing | digitalSignature | codeSigning (1.3.6.1.5.5.7.3.3) |
| S/MIME | digitalSignature, keyEncipherment | emailProtection (1.3.6.1.5.5.7.3.4) |
| OCSP responder | digitalSignature | OCSPSigning (1.3.6.1.5.5.7.3.9) |
| CA | keyCertSign, cRLSign | (usually omitted on root/intermediate; use basicConstraints: CA:TRUE) |
mTLS authentication is the commonest place this matters. A cert issued with serverAuth only should be rejected by a server verifying client certs — and in OpenSSL 1.1.1+ it is. If you want a cert to do both (some internal LB setups), list both EKUs explicitly.
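The effect of a serverAuth-only EKU can be checked offline with openssl verify and its -purpose option. The following is a minimal sketch with a throwaway CA and leaf; the subject names, file names, and one-day lifetimes are assumptions chosen for the demo, not part of any real profile.

```shell
#!/bin/sh
# Sketch: a serverAuth-only leaf passes a server-purpose check and fails a
# client-purpose check. Throwaway CA and leaf; every name here is an example.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

cat > req.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = placeholder
EOF
cat > ca.ext <<'EOF'
basicConstraints = critical, CA:TRUE
keyUsage = critical, keyCertSign, cRLSign
EOF
cat > leaf.ext <<'EOF'
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature
extendedKeyUsage = serverAuth
EOF

# Throwaway self-signed CA.
openssl ecparam -name prime256v1 -genkey -noout -out ca.key
openssl req -new -config req.cnf -subj "/CN=Demo CA" -key ca.key -out ca.csr
openssl x509 -req -in ca.csr -signkey ca.key -days 1 \
  -extfile ca.ext -out ca.pem 2>/dev/null

# Leaf with serverAuth as its only EKU.
openssl ecparam -name prime256v1 -genkey -noout -out leaf.key
openssl req -new -config req.cnf -subj "/CN=web-01" -key leaf.key -out leaf.csr
openssl x509 -req -in leaf.csr -CA ca.pem -CAkey ca.key -CAcreateserial \
  -days 1 -extfile leaf.ext -out leaf.pem 2>/dev/null

# Same chain, two purposes: only the server purpose should verify.
if openssl verify -CAfile ca.pem -purpose sslserver leaf.pem >/dev/null 2>&1
then as_server=accepted; else as_server=rejected; fi
if openssl verify -CAfile ca.pem -purpose sslclient leaf.pem >/dev/null 2>&1
then as_client=accepted; else as_client=rejected; fi

echo "as TLS server: $as_server"
echo "as TLS client: $as_client"
```

Adding clientAuth to the extendedKeyUsage line in leaf.ext is the change that would make the second check pass.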
Chain of trust
A TLS server presents its own cert plus any intermediates up to but not including the root. The client already has the root; it uses the provided intermediates to build a path to one of its trusted roots.
               ┌─────────────────────────┐
               │         ROOT CA         │  self-signed, offline
               │  "Example Root CA G1"   │  20-year lifetime
               └───────────┬─────────────┘
                           │ signs
               ┌───────────▼─────────────┐
               │     INTERMEDIATE CA     │  online, automated
               │ "Example Issuing CA G1" │  5-year lifetime
               └───────────┬─────────────┘
                           │ signs
        ┌──────────────────┼───────────────────┐
        ▼                  ▼                   ▼
┌────────────────┐ ┌────────────────┐ ┌────────────────┐
│ web-01.example │ │ db-01.example  │ │ user@example   │
│ serverAuth     │ │ serverAuth,    │ │ clientAuth,    │
│ 90 days        │ │ clientAuth 90d │ │ emailProt 1y   │
└────────────────┘ └────────────────┘ └────────────────┘
What to put in each store:
- Clients' trust store: root only.
- Server's certificate file: leaf + intermediate(s), in that order. Nginx, Apache, HAProxy, and most servers expect the leaf first. Omitting the intermediate is the single commonest misconfiguration: your browser might work (it caches intermediates) but curl, openssl s_client, and load-balancer health checks will fail.
- Server's key file: private key, never chained.
# Verify a chain end-to-end
openssl verify -CAfile root.crt -untrusted intermediate.crt leaf.crt
# leaf.crt: OK
# What your server actually sends
openssl s_client -connect web-01:443 -showcerts < /dev/null \
| awk '/BEGIN CERT/,/END CERT/'
Revocation: CRL, OCSP, stapling
A compromised cert is live until it expires unless something tells the client not to trust it anymore. Three mechanisms, each with tradeoffs.
| Mechanism | How it works | Tradeoffs |
|---|---|---|
| CRL (Certificate Revocation List) | CA publishes a signed list of revoked serials at a URL in the cert's cRLDistributionPoints. Clients fetch and cache it. | Simple, cacheable, offline-tolerant. Grows with every revocation. Clients may keep using a cached CRL until its validity window (usually 7 days) ends. Costs bandwidth when large. |
| OCSP (Online Certificate Status Protocol) | Client asks a CA-run responder "is serial X revoked?" over HTTP. Answer is a signed response valid for minutes-to-days. | Real-time freshness; the responder sees who is visiting what (privacy leak); load on the responder; latency on the TLS handshake; "soft-fail" is the default — if the responder is down the client proceeds. |
| OCSP stapling | Server periodically fetches the OCSP response for its own cert and staples it to the TLS handshake. | No extra RTT, no client privacy leak, no load on the responder per-connection. Needs server support and a working responder at fetch time. The Must-Staple extension (OID 1.3.6.1.5.5.7.1.24) forces clients to hard-fail if no staple. |
Rotation and lifetimes
Rotation frequency is a function of blast radius. The shorter the lifetime, the smaller the window of abuse when a key leaks.
| Tier | Typical lifetime | Reason |
|---|---|---|
| Root CA | 10-20 years | Distributing a new root to every client is expensive. You generate it once, you keep it offline, you hope. |
| Intermediate CA | 3-5 years, with a successor generated and distributed at least 6 months before expiry | Gives clients time to pick up the new chain before old leaves are re-issued against it. |
| TLS server / client (long-lived) | 90 days | Matches WebPKI norms (Let's Encrypt). Forces automation. |
| Workload identity (mTLS in a mesh) | 1-24 hours | SPIFFE/SPIRE, Vault dynamic secrets, Istio, step-ca. At this lifetime, revocation is moot. |
| Code signing | 1-3 years | Timestamped signatures remain valid past cert expiry. |
Rollover pattern for an intermediate
- At T − 12 months (from expiry): sign a new intermediate from the offline root. Keep the new one alongside the old in every server's chain file.
- At T − 6 months: start issuing new leaves from the new intermediate only.
- At T − 0: old leaves have expired. Remove the old intermediate from server chain files.
- Revoke the old intermediate at the root and publish the root CRL.
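Which step of the rollover applies can be read off the intermediate's notAfter with openssl x509 -checkend. The sketch below is one way to wire that up, under stated assumptions: the 200-day throwaway certificate stands in for a real intermediate, 182 days approximates six months, and the phase messages are illustrative wording.

```shell
#!/bin/sh
# Sketch: decide which rollover phase an intermediate is in from its notAfter.
# The 200-day throwaway cert stands in for a real intermediate.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

cat > req.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = Demo Issuing CA
EOF
openssl ecparam -name prime256v1 -genkey -noout -out int.key
openssl req -new -config req.cnf -key int.key -out int.csr
openssl x509 -req -in int.csr -signkey int.key -days 200 -out int.pem 2>/dev/null

day=86400
# -checkend N exits 0 if the cert is still valid N seconds from now, 1 if not.
if openssl x509 -in int.pem -noout -checkend $((365 * day)) >/dev/null; then
  phase="more than 12 months left: nothing to do"
elif openssl x509 -in int.pem -noout -checkend $((182 * day)) >/dev/null; then
  phase="inside T-12 months: sign and distribute the successor"
else
  phase="inside T-6 months: issue leaves from the successor only"
fi
echo "$phase"
```

Run from a scheduled job against the real intermediate, a check like this turns the rollover calendar into an alert rather than a diary entry.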
Certificate Transparency
For the public WebPKI, Certificate Transparency logs are a world-readable, append-only record of every certificate a public CA has issued. Chrome requires at least two CT log signatures (SCTs) on publicly trusted certs issued after April 2018; without them it rejects the certificate with a full-page error.
For a private PKI this does not apply — your internal names are not (and should not be) in public CT logs. But two operational points:
- Use internal-only TLDs (.internal, .corp, .lan, or a subdomain of a real domain you own) for private-CA-issued certs. Public names in a private CA tempt staff into bypassing the public PKI.
- If you also have public-CA certs, check crt.sh periodically for certs issued for your domains that you did not request: a signal of either shadow IT or an attacker.
Worked example: build a 2-tier CA with openssl
Not production-hardened — production goes to step-ca, Vault PKI, or ADCS — but a useful exercise for understanding what the tools do under the hood.
Layout
pki/
├── root/
│ ├── openssl.cnf
│ ├── private/ca.key.pem # offline, chmod 400
│ ├── certs/ca.cert.pem
│ ├── newcerts/
│ ├── crl/
│ ├── index.txt # empty initially
│ └── serial # contains: 1000
└── intermediate/
    ├── openssl.cnf
    ├── private/intermediate.key.pem
    ├── certs/intermediate.cert.pem
    ├── csr/intermediate.csr.pem
    ├── newcerts/
    ├── crl/
    ├── index.txt
    └── serial
Step 1 — Root key and self-signed root cert
cd pki/root
# 4096-bit RSA; for new deployments prefer ECDSA P-384:
# openssl ecparam -name secp384r1 -genkey -out private/ca.key.pem
openssl genrsa -aes256 -out private/ca.key.pem 4096
chmod 400 private/ca.key.pem
# Self-sign the root with -extensions v3_ca (set CA:TRUE, keyUsage)
openssl req -config openssl.cnf \
-key private/ca.key.pem \
-new -x509 -days 7300 -sha384 \
-extensions v3_ca \
-out certs/ca.cert.pem
openssl x509 -in certs/ca.cert.pem -noout -text | head
Step 2 — Intermediate key and CSR
cd ../intermediate
openssl genrsa -aes256 -out private/intermediate.key.pem 4096
chmod 400 private/intermediate.key.pem
openssl req -config openssl.cnf -new -sha384 \
-key private/intermediate.key.pem \
-out csr/intermediate.csr.pem
Step 3 — Root signs the intermediate
cd ../root
openssl ca -config openssl.cnf -extensions v3_intermediate_ca \
-days 1825 -notext -md sha384 \
-in ../intermediate/csr/intermediate.csr.pem \
-out ../intermediate/certs/intermediate.cert.pem
# Verify it chains
openssl verify -CAfile certs/ca.cert.pem \
../intermediate/certs/intermediate.cert.pem
# intermediate.cert.pem: OK
Relevant bits of the root's openssl.cnf:
[ v3_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = critical, CA:true
keyUsage = critical, digitalSignature, cRLSign, keyCertSign
[ v3_intermediate_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = critical, CA:true, pathlen:0
keyUsage = critical, digitalSignature, cRLSign, keyCertSign
pathlen:0 on the intermediate forbids it from signing further CAs — it can only sign leaves. This is almost always what you want.
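The pathlen:0 constraint is enforced by the verifier, not the signer: the intermediate's key can still physically sign a CA certificate, but chains that pass through it are rejected. The sketch below shows this with a throwaway hierarchy; the names root, int, sub, leaf, and leaf2 and the one-day lifetimes are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: pathlen:0 lets an intermediate sign leaves but not further CAs.
# Throwaway hierarchy; every name is an example.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

cat > req.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no
[dn]
CN = placeholder
EOF
printf 'basicConstraints = critical, CA:TRUE\nkeyUsage = critical, keyCertSign, cRLSign\n' > ca.ext
printf 'basicConstraints = critical, CA:TRUE, pathlen:0\nkeyUsage = critical, keyCertSign, cRLSign\n' > int.ext
printf 'basicConstraints = CA:FALSE\n' > leaf.ext

# csr NAME: make NAME.key and NAME.csr with subject CN=NAME.
csr() {
  openssl ecparam -name prime256v1 -genkey -noout -out "$1.key"
  openssl req -new -config req.cnf -subj "/CN=$1" -key "$1.key" -out "$1.csr"
}
# sign NAME ISSUER EXTFILE: issue NAME.pem from NAME.csr under ISSUER.
sign() {
  openssl x509 -req -in "$1.csr" -CA "$2.pem" -CAkey "$2.key" -CAcreateserial \
    -days 1 -extfile "$3" -out "$1.pem" 2>/dev/null
}

csr root
openssl x509 -req -in root.csr -signkey root.key -days 1 \
  -extfile ca.ext -out root.pem 2>/dev/null
csr int;   sign int   root int.ext    # intermediate with pathlen:0
csr leaf;  sign leaf  int  leaf.ext   # leaf directly under the intermediate
csr sub;   sign sub   int  ca.ext     # a further CA below the intermediate
csr leaf2; sign leaf2 sub  leaf.ext   # leaf under that further CA

cat int.pem sub.pem > untrusted.pem
if openssl verify -CAfile root.pem -untrusted int.pem leaf.pem >/dev/null 2>&1
then direct=accepted; else direct=rejected; fi
if openssl verify -CAfile root.pem -untrusted untrusted.pem leaf2.pem >/dev/null 2>&1
then nested=accepted; else nested=rejected; fi

echo "leaf under intermediate: $direct"
echo "leaf under sub-CA:       $nested"
```

The first chain should be accepted and the second rejected, which is the bound on damage that pathlen:0 buys if the intermediate key is ever misused.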
Step 4 — Issue a server certificate
cd ../intermediate
# Leaf key (ECDSA this time — faster, smaller)
openssl ecparam -name secp384r1 -genkey -out private/web-01.key.pem
chmod 400 private/web-01.key.pem
# CSR carrying the SANs (-addext needs OpenSSL 1.1.1 or later)
openssl req -new -sha384 \
  -key private/web-01.key.pem \
  -subj "/C=GB/O=Example/CN=web-01.prod.example.internal" \
  -addext "subjectAltName=DNS:web-01.prod.example.internal,DNS:web.prod.example.internal" \
  -out csr/web-01.csr.pem
# Sign with the server extensions profile. openssl ca ignores extensions
# requested in the CSR unless copy_extensions is set, so the server_cert
# profile reads the SANs from the SAN environment variable instead.
SAN="DNS:web-01.prod.example.internal,DNS:web.prod.example.internal" \
openssl ca -config openssl.cnf -extensions server_cert \
  -days 90 -notext -md sha384 \
  -in csr/web-01.csr.pem \
  -out certs/web-01.cert.pem
# Build a chain file for the web server
cat certs/web-01.cert.pem certs/intermediate.cert.pem > certs/web-01.fullchain.pem
Leaf profile in openssl.cnf:
[ server_cert ]
basicConstraints = CA:FALSE
nsCertType = server
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
subjectAltName = ${ENV::SAN}
crlDistributionPoints = URI:http://pki.example.internal/crl/intermediate.crl
authorityInfoAccess = OCSP;URI:http://pki.example.internal/ocsp
Step 5 — Publish a CRL
openssl ca -config openssl.cnf \
-gencrl -out crl/intermediate.crl.pem
# DER form for HTTP distribution
openssl crl -in crl/intermediate.crl.pem -outform DER \
-out /var/www/pki/crl/intermediate.crl
Regenerate on a cron (daily; sooner after a revocation). The CRL's nextUpdate is a hard ceiling — clients stop trusting a stale CRL, and if they have crlCheck = strict they stop trusting the issuer's certs entirely.
Common pitfalls
- No intermediate sent by the server. Browsers hide it (they cache intermediates). Non-browser clients do not. Always ship the chain file.
- SAN missing primary name. Only the CN was set. Chrome/Firefox/curl refuse.
- Mixed signature algorithms. Root is SHA-1, intermediate is SHA-256, leaf is SHA-384. Some clients complain about the weakest link in the chain. Pick SHA-256 or SHA-384 everywhere.
- Clock skew. A leaf's notBefore is in the future because the issuing CA's clock is fast. All verifications fail for the first few minutes. Run chrony on the CA host.
- Root buried in a forgotten password manager. If the two humans who know the root key passphrase leave, your CA is effectively dead. Document recovery; test it.
- Revoking but forgetting to publish. The CRL is only honoured if clients fetch it. Check that openssl crl -inform DER -in intermediate.crl -noout -text shows the expected revoked serials, then check the CDP URL actually serves the new file.
- Wildcards instead of SANs. A *.prod.example.internal cert is convenient and makes the blast radius of a compromise enormous. Prefer explicit SAN lists; automation makes them cheap.
- RSA 2048 "because it's fine". It is fine, but new deployments should default to ECDSA P-256 or P-384: smaller, faster, and well-supported everywhere except some ancient embedded devices.
See also: Certificates (index), SSH Certificate Authorities, Keycloak + LDAPS.