PKI Design

How to lay out a two-tier private PKI that survives contact with humans: offline root, online issuing CA, key material hygiene, SAN/EKU discipline, revocation tradeoffs, rotation, and worked openssl examples.

If you only remember five things
  • Two tiers. Offline root signs an intermediate; the intermediate signs everything else. Never issue end-entity certs directly from the root.
  • The root's private key lives on a machine that has never seen a network. Ideally an HSM or a smartcard kept in a safe; at minimum an air-gapped laptop you only boot to sign.
  • Subject Alternative Name is where the names live. CN is decorative. Modern clients ignore CN entirely.
  • Short-lived leaves beat revocation. A 24-hour certificate has no CRL problem.
  • Put Extended Key Usage on everything. A cert without EKU is a cert that can do anything.

Why two tiers

A single self-signed CA that issues every certificate on your network is tempting — it is one key, one file, one command. It is also unrecoverable. The day that key is compromised (a backup leaks, an admin's laptop is lost, a contractor walks off with a USB) every certificate you have ever issued is burnt, and you have no way to tell the fleet "trust this new one instead" without reconfiguring every client by hand.

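The industry answer is a two-tier hierarchy:

  • Root CA. Offline and self-signed. It signs only intermediate CA certificates and its own CRL.
  • Intermediate (issuing) CA. Online and automated. It signs every end-entity certificate.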

When the intermediate is compromised you revoke it with the root and stand up a new one. Clients still trust the root, so they accept the new intermediate's signatures with no change on their end. The rotation is bounded and safe.

Three tiers? You rarely need them outside of a WebPKI context or a very large organisation that wants a separate policy-CA per business unit. Two is the pragmatic default.

Key material hygiene

The root's private key is the most valuable secret in your estate. It deserves more than a file on a shared drive.

In order of decreasing safety

  1. Hardware Security Module (HSM) validated to FIPS 140-2 Level 3. Examples: YubiHSM 2, Thales Luna, AWS CloudHSM. The key never leaves the HSM; you issue sign operations against it. SoftHSM does not qualify: it is a software emulator, fine for rehearsing the procedure but not for holding the real key.
  2. Smartcard / YubiKey PIV slot. Good enough for a small estate. Keep two, stored apart and enrolled identically as "root key A" and "root key B". Only one is the canonical signer; the other is the disaster-recovery spare, kept in a safe in a different building.
  3. Air-gapped laptop. A dedicated machine, never networked, booted from a live USB or a fully disk-encrypted install. Key lives in a LUKS volume on the machine. Acceptable for most internal PKIs.
  4. Root on the same server as the intermediate. Do not do this. Then you have one tier pretending to be two.
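
For option 3, the property worth checking by hand is that the key file is encrypted at rest and unreadable to other users. A minimal sketch, assuming OpenSSL 1.1.1 or later; the file name demo-root.key.pem and the throwaway passphrase are placeholders, and a real ceremony would type the passphrase at the prompt rather than pass it on the command line.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
umask 077                      # new files are readable by the owner only

# Generate a P-384 key, encrypted with AES-256 under a passphrase.
# "pass:..." is for the demo only; omit -pass to be prompted interactively.
openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 \
  -aes-256-cbc -pass pass:demo-only-passphrase \
  -out demo-root.key.pem
chmod 400 demo-root.key.pem

# An encrypted PKCS#8 key announces itself in the PEM header.
header=$(head -n 1 demo-root.key.pem)
echo "$header"

# Reading the key with the wrong passphrase fails.
if openssl pkey -in demo-root.key.pem -passin pass:wrong -noout 2>/dev/null; then
  wrong_pass=accepted
else
  wrong_pass=rejected
fi
echo "wrong passphrase: $wrong_pass"
```

If the header reads BEGIN PRIVATE KEY or BEGIN EC PRIVATE KEY instead, the key is sitting on disk in the clear.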

Ceremony

"Key ceremony" is not a silly word. When you generate the root key, and whenever you use it (to sign a new intermediate, to publish a CRL), run it as a logged, two-person procedure:

This sounds like theatre, and it is — but the theatre is the audit trail. When you are asked in 2030 "who signed this intermediate?" the answer exists.
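
One way to make that audit trail concrete is an append-only log line per ceremony action. The sketch below is an assumption about what such a log could look like, not a prescribed format: the function name log_ceremony, the file ceremony.log, and the operator names alice and bob are all hypothetical, and a throwaway self-signed certificate stands in for the real artefact.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

# Minimal config so the demo does not depend on the system openssl.cnf.
cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = Ceremony Demo Intermediate
EOF

# Stand-in for the certificate signed during the ceremony.
openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout demo.key -out demo.crt 2>/dev/null

# One line per action: when (UTC), what, who, who witnessed, which artefact.
log_ceremony() {
  action=$1; operator=$2; witness=$3; cert=$4
  fp=$(openssl x509 -in "$cert" -noout -fingerprint -sha256 | cut -d= -f2)
  printf '%s action=%s operator=%s witness=%s sha256=%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$action" "$operator" "$witness" "$fp" \
    >> ceremony.log
}

log_ceremony sign-intermediate alice bob demo.crt
cat ceremony.log
```

Recording the SHA-256 fingerprint ties the log entry to one exact certificate, so the question "who signed this intermediate?" can be answered by matching fingerprints.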

Subjects, SAN, CN

Every X.509 certificate has a Subject Distinguished Name (DN) and zero-or-more Subject Alternative Names (SAN). For nearly two decades the CN inside the DN was also used to identify hosts, so people got into the habit of putting CN=www.example.com on web server certs. RFC 2818 deprecated that in 2000 and Chrome stopped honouring CN entirely in 2017. Put names in the SAN.

Subject: CN = web-01.prod.example.internal, O = Example, C = GB
SAN:     DNS:web-01.prod.example.internal, DNS:web.prod.example.internal,
         IP:10.2.3.4

The CN can be anything human-readable — it is effectively a label. Modern tooling will not look at it for hostname matching. Putting the primary hostname in the CN is a convention that helps humans scanning a cert with openssl x509 -subject; it does not affect TLS behaviour.

IP SANs are real but awkward. They work; they are not deprecated; but they are not rotated when you re-IP, and they defeat most service-mesh identity schemes. Prefer a DNS SAN that resolves to the IP.
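
To see which names a certificate really carries, print the SAN extension and ask OpenSSL to match a hostname. A small sketch, assuming OpenSSL 1.1.1 or later (for -ext); the certificate is a throwaway self-signed one, and the CN value just-a-label is a deliberately non-matching placeholder.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
x509_extensions = v3_leaf
[ dn ]
CN = just-a-label
[ v3_leaf ]
subjectAltName = DNS:web-01.prod.example.internal, DNS:web.prod.example.internal, IP:10.2.3.4
EOF

openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -keyout demo.key -out demo.crt 2>/dev/null

# Print only the SAN extension.
san=$(openssl x509 -in demo.crt -noout -ext subjectAltName)
echo "$san"

# Hostname matching consults the DNS SANs; the CN is not used as a name here.
by_san=$(openssl x509 -in demo.crt -noout -checkhost web.prod.example.internal)
by_cn=$(openssl x509 -in demo.crt -noout -checkhost just-a-label)
echo "$by_san"
echo "$by_cn"
```

The second -checkhost call reports a mismatch even though the string is in the Subject, which is the "CN is a label" point in practice.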

Key Usage and Extended Key Usage

A certificate without Extended Key Usage (EKU) is a certificate that a naive verifier will accept for anything: TLS server, TLS client, code signing, email S/MIME. Set EKU on every leaf.

Purpose          Key Usage                          Extended Key Usage (OID)
Web server       digitalSignature, keyEncipherment  serverAuth (1.3.6.1.5.5.7.3.1)
Web/mTLS client  digitalSignature                   clientAuth (1.3.6.1.5.5.7.3.2)
Code signing     digitalSignature                   codeSigning (1.3.6.1.5.5.7.3.3)
S/MIME           digitalSignature, keyEncipherment  emailProtection (1.3.6.1.5.5.7.3.4)
OCSP responder   digitalSignature                   OCSPSigning (1.3.6.1.5.5.7.3.9)
CA               keyCertSign, cRLSign               (usually omitted on root/intermediate; use basicConstraints: CA:TRUE)

mTLS authentication is the commonest place this matters. A cert issued with serverAuth only should be rejected by a server verifying client certs — and in OpenSSL 1.1.1+ it is. If you want a cert to do both (some internal LB setups), list both EKUs explicitly.
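
The EKU check can be reproduced offline with openssl verify -purpose. This is a sketch under stated assumptions: the CA and leaf are throwaway demo certificates, the section name server_only is made up for the example, and the verdict shown is OpenSSL's own verifier, which may differ from other TLS stacks.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = placeholder
[ v3_ca ]
basicConstraints = critical, CA:true
keyUsage = critical, keyCertSign, cRLSign
[ server_only ]
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
EOF

# Throwaway CA and a leaf that carries serverAuth only.
openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=Demo CA" -extensions v3_ca -keyout ca.key -out ca.crt 2>/dev/null
openssl req -config demo.cnf -new -newkey rsa:2048 -nodes \
  -subj "/CN=demo-leaf" -keyout leaf.key -out leaf.csr 2>/dev/null
openssl x509 -req -in leaf.csr -CA ca.crt -CAkey ca.key -CAcreateserial \
  -days 1 -extfile demo.cnf -extensions server_only -out leaf.crt 2>/dev/null

# Ask the verifier whether the leaf is acceptable for a given purpose.
check_purpose() {
  if openssl verify -purpose "$1" -CAfile ca.crt leaf.crt >/dev/null 2>&1; then
    echo accepted
  else
    echo rejected
  fi
}

as_server=$(check_purpose sslserver)
as_client=$(check_purpose sslclient)
echo "as TLS server: $as_server"
echo "as TLS client: $as_client"
```

Changing the profile to extendedKeyUsage = serverAuth, clientAuth makes both checks pass, which is the "list both EKUs explicitly" case.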

Chain of trust

A TLS server presents its own cert plus any intermediates up to but not including the root. The client already has the root; it uses the provided intermediates to build a path to one of its trusted roots.

                     ┌─────────────────────────┐
                     │        ROOT CA          │  self-signed, offline
                     │   "Example Root CA G1"  │  20-year lifetime
                     └───────────┬─────────────┘
                                 │ signs
                     ┌───────────▼─────────────┐
                     │    INTERMEDIATE CA      │  online, automated
                     │ "Example Issuing CA G1" │  5-year lifetime
                     └───────────┬─────────────┘
                                 │ signs
              ┌──────────────────┼───────────────────┐
              ▼                  ▼                   ▼
     ┌────────────────┐ ┌────────────────┐  ┌────────────────┐
     │ web-01.example │ │ db-01.example  │  │ user@example   │
     │ serverAuth     │ │ serverAuth,    │  │ clientAuth,    │
     │ 90 days        │ │ clientAuth 90d │  │ emailProt 1y   │
     └────────────────┘ └────────────────┘  └────────────────┘

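What to put in each store:

  • Server chain file: the leaf first, then the intermediate(s). Leave the root out.
  • Client trust store: the root only.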

# Verify a chain end-to-end
openssl verify -CAfile root.crt -untrusted intermediate.crt leaf.crt
# leaf.crt: OK

# What your server actually sends
openssl s_client -connect web-01:443 -showcerts < /dev/null \
  | awk '/BEGIN CERT/,/END CERT/'

Revocation: CRL, OCSP, stapling

A compromised cert is live until it expires unless something tells the client not to trust it anymore. Three mechanisms, each with tradeoffs.

CRL (Certificate Revocation List)
  How it works: The CA publishes a signed list of revoked serials at a URL in the cert's cRLDistributionPoints. Clients fetch and cache it.
  Tradeoffs: Simple, cacheable, offline-tolerant. Grows with every revocation. Clients may act on a stale cached CRL (validity window usually 7 days). Costs bandwidth if large.

OCSP (Online Certificate Status Protocol)
  How it works: The client asks a CA-run responder "is serial X revoked?" over HTTP. The answer is a signed response valid for minutes to days.
  Tradeoffs: Real-time freshness. The responder sees who is visiting what (privacy leak), carries load for every lookup, and adds latency to the TLS handshake. "Soft-fail" is the default: if the responder is down, the client proceeds.

OCSP stapling
  How it works: The server periodically fetches the OCSP response for its own cert and staples it to the TLS handshake.
  Tradeoffs: No extra RTT, no client privacy leak, no per-connection load on the responder. Needs server support and a working responder at fetch time. The Must-Staple extension (OID 1.3.6.1.5.5.7.1.24) tells clients that support it to hard-fail if no staple is present.

For internal PKI: publish a CRL at a well-known URL that every client can reach, run an OCSP responder if you like, and turn on stapling at the load balancer. For very short-lived leaves (24h via ACME), revocation is effectively a non-problem: just let them expire.
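
The CRL path can be exercised end to end on one machine. The following is a sketch with a throwaway CA, assuming OpenSSL 1.1.1 or later; every file name and the section names demo_ca, policy_any and leaf are placeholders chosen for the demo, not part of the layout described in this document.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir newcerts && : > index.txt && echo 1000 > serial && echo 1000 > crlnumber

cat > demo.cnf <<'EOF'
[ ca ]
default_ca = demo_ca
[ demo_ca ]
database = index.txt
new_certs_dir = newcerts
serial = serial
crlnumber = crlnumber
certificate = ca.crt
private_key = ca.key
default_md = sha256
default_days = 1
default_crl_days = 7
policy = policy_any
[ policy_any ]
commonName = supplied
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = placeholder
[ v3_ca ]
basicConstraints = critical, CA:true
keyUsage = critical, keyCertSign, cRLSign
[ leaf ]
basicConstraints = CA:FALSE
EOF

openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=Demo CA" -extensions v3_ca -keyout ca.key -out ca.crt 2>/dev/null
openssl req -config demo.cnf -new -newkey rsa:2048 -nodes \
  -subj "/CN=demo-leaf" -keyout leaf.key -out leaf.csr 2>/dev/null
openssl ca -config demo.cnf -batch -notext -extensions leaf \
  -in leaf.csr -out leaf.crt 2>/dev/null

# Publish a fresh CRL, then verify the leaf against it.
crl_verdict() {
  openssl ca -config demo.cnf -gencrl -out ca.crl 2>/dev/null
  if openssl verify -crl_check -CAfile ca.crt -CRLfile ca.crl leaf.crt >/dev/null 2>&1
  then echo accepted; else echo rejected; fi
}

before=$(crl_verdict)
openssl ca -config demo.cnf -revoke leaf.crt 2>/dev/null
after=$(crl_verdict)
echo "before revocation: $before"
echo "after revocation:  $after"
```

Note that the verdict only flips because a new CRL is generated after the revoke; a client holding the earlier CRL would keep accepting the leaf until it refetches.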

Rotation and lifetimes

Rotation frequency is a function of blast radius. The shorter the lifetime, the smaller the window of abuse when a key leaks.

Root CA
  Typical lifetime: 10-20 years
  Reason: Distributing a new root to every client is expensive. You generate it once, you keep it offline, you hope.

Intermediate CA
  Typical lifetime: 3-5 years, with a successor generated and distributed at least 6 months before expiry
  Reason: Gives clients time to pick up the new chain before old leaves are re-issued against it.

TLS server / client (long-lived)
  Typical lifetime: 90 days
  Reason: Matches WebPKI norms (Let's Encrypt). Forces automation.

Workload identity (mTLS in a mesh)
  Typical lifetime: 1-24 hours
  Reason: SPIFFE/SPIRE, Vault dynamic secrets, Istio, step-ca. At this lifetime, revocation is moot.

Code signing
  Typical lifetime: 1-3 years
  Reason: Timestamped signatures remain valid past cert expiry.
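
Short lifetimes only work if something notices an approaching expiry. A minimal sketch of a renewal check built on openssl x509 -checkend; the helper name expires_within_days and the 30-day and 120-day thresholds are illustrative assumptions, and the certificate is a throwaway 90-day self-signed one.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = expiry-demo
EOF

# Throwaway 90-day certificate standing in for a long-lived TLS leaf.
openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 90 \
  -keyout demo.key -out demo.crt 2>/dev/null
openssl x509 -in demo.crt -noout -enddate

# -checkend N exits 0 if the cert is still valid N seconds from now, 1 if not.
# This helper inverts that: it succeeds when <cert> expires within <days>.
expires_within_days() {
  ! openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

if expires_within_days demo.crt 30;  then at_30=renew;  else at_30=ok;  fi
if expires_within_days demo.crt 120; then at_120=renew; else at_120=ok; fi
echo "30-day threshold:  $at_30"
echo "120-day threshold: $at_120"
```

Run from cron against each deployed certificate, a check like this turns the lifetime column above into an alert before the certificate actually lapses.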

Rollover pattern for an intermediate

  1. At T − 12 months (from expiry): sign a new intermediate from the offline root. Keep the new one alongside the old in every server's chain file.
  2. At T − 6 months: start issuing new leaves from the new intermediate only.
  3. At T − 0: old leaves have expired. Remove the old intermediate from server chain files.
  4. Revoke the old intermediate at the root and publish the root CRL.
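
The overlap window in steps 1 to 3 can be rehearsed with throwaway certificates: while both intermediates sit in the chain file, leaves from either one verify against the unchanged root. A sketch, assuming OpenSSL 1.1.1 or later; the name "Example Issuing CA G2" for the successor and the helper functions new_key_and_csr and sign are assumptions made for the demo.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = placeholder
[ v3_ca ]
basicConstraints = critical, CA:true
keyUsage = critical, keyCertSign, cRLSign
[ v3_intermediate_ca ]
basicConstraints = critical, CA:true, pathlen:0
keyUsage = critical, keyCertSign, cRLSign
[ leaf ]
basicConstraints = CA:FALSE
EOF

# new_key_and_csr <name> <CN>: writes <name>.key and <name>.csr
new_key_and_csr() {
  openssl req -config demo.cnf -new -newkey rsa:2048 -nodes \
    -subj "/CN=$2" -keyout "$1.key" -out "$1.csr" 2>/dev/null
}
# sign <name> <issuer> <extension-section>: writes <name>.crt
sign() {
  openssl x509 -req -in "$1.csr" -CA "$2.crt" -CAkey "$2.key" -CAcreateserial \
    -days 1 -extfile demo.cnf -extensions "$3" -out "$1.crt" 2>/dev/null
}

openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=Example Root CA G1" -extensions v3_ca \
  -keyout root.key -out root.crt 2>/dev/null

new_key_and_csr int-old "Example Issuing CA G1"; sign int-old root v3_intermediate_ca
new_key_and_csr int-new "Example Issuing CA G2"; sign int-new root v3_intermediate_ca
new_key_and_csr leaf-old "old-leaf.example.internal"; sign leaf-old int-old leaf
new_key_and_csr leaf-new "new-leaf.example.internal"; sign leaf-new int-new leaf

# During the overlap, every server ships both intermediates.
cat int-new.crt int-old.crt > intermediates.pem

result=$(openssl verify -CAfile root.crt -untrusted intermediates.pem \
  leaf-old.crt leaf-new.crt)
echo "$result"
```

Dropping int-old.crt from intermediates.pem before the old leaves have expired makes leaf-old.crt fail to verify, which is why step 3 waits until T − 0.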

Certificate Transparency

For the public WebPKI, Certificate Transparency logs are a world-readable, append-only record of every certificate a public CA has issued. Chrome requires publicly trusted certs issued after 30 April 2018 to carry signed certificate timestamps (SCTs) from at least two CT logs; a cert without them is rejected with a full-page certificate error, not a mere warning.

For a private PKI this does not apply — your internal names are not (and should not be) in public CT logs. But two operational points:

Worked example: build a 2-tier CA with openssl

Not production-hardened — production goes to step-ca, Vault PKI, or ADCS — but a useful exercise for understanding what the tools do under the hood.

Layout

pki/
├── root/
│   ├── openssl.cnf
│   ├── private/ca.key.pem      # offline, chmod 400
│   ├── certs/ca.cert.pem
│   ├── newcerts/
│   ├── crl/
│   ├── index.txt               # empty initially
│   └── serial                  # contains: 1000
└── intermediate/
    ├── openssl.cnf
    ├── private/intermediate.key.pem
    ├── certs/intermediate.cert.pem
    ├── csr/intermediate.csr.pem
    ├── newcerts/
    ├── crl/
    ├── index.txt
    └── serial

Step 1 — Root key and self-signed root cert

cd pki/root

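# 4096-bit RSA, passphrase-encrypted. For new deployments prefer ECDSA P-384,
# also encrypted (plain "ecparam -genkey" would write the key unencrypted):
#   openssl genpkey -algorithm EC -pkeyopt ec_paramgen_curve:secp384r1 \
#     -aes-256-cbc -out private/ca.key.pem
openssl genrsa -aes256 -out private/ca.key.pem 4096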
chmod 400 private/ca.key.pem

# Self-sign the root with -extensions v3_ca (set CA:TRUE, keyUsage)
openssl req -config openssl.cnf \
  -key private/ca.key.pem \
  -new -x509 -days 7300 -sha384 \
  -extensions v3_ca \
  -out certs/ca.cert.pem

openssl x509 -in certs/ca.cert.pem -noout -text | head

Step 2 — Intermediate key and CSR

cd ../intermediate

openssl genrsa -aes256 -out private/intermediate.key.pem 4096
chmod 400 private/intermediate.key.pem

openssl req -config openssl.cnf -new -sha384 \
  -key private/intermediate.key.pem \
  -out csr/intermediate.csr.pem

Step 3 — Root signs the intermediate

cd ../root

openssl ca -config openssl.cnf -extensions v3_intermediate_ca \
  -days 1825 -notext -md sha384 \
  -in ../intermediate/csr/intermediate.csr.pem \
  -out ../intermediate/certs/intermediate.cert.pem

# Verify it chains
openssl verify -CAfile certs/ca.cert.pem \
  ../intermediate/certs/intermediate.cert.pem
# intermediate.cert.pem: OK

Relevant bits of the root's openssl.cnf:

[ v3_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = critical, CA:true
keyUsage = critical, digitalSignature, cRLSign, keyCertSign

[ v3_intermediate_ca ]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer
basicConstraints = critical, CA:true, pathlen:0
keyUsage = critical, digitalSignature, cRLSign, keyCertSign

pathlen:0 on the intermediate forbids it from signing further CAs — it can only sign leaves. This is almost always what you want.
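
The effect of pathlen:0 shows up only at verification time, so it is worth seeing once. In this sketch a hypothetical "Rogue Sub-CA" is signed by the intermediate anyway; the leaf beneath it then fails path validation, while a leaf signed directly by the intermediate passes. All names and the helpers new_key_and_csr, sign and verdict are assumptions made for the demo.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = placeholder
[ v3_ca ]
basicConstraints = critical, CA:true
keyUsage = critical, keyCertSign, cRLSign
[ v3_intermediate_ca ]
basicConstraints = critical, CA:true, pathlen:0
keyUsage = critical, keyCertSign, cRLSign
[ leaf ]
basicConstraints = CA:FALSE
EOF

new_key_and_csr() {   # <name> <CN>
  openssl req -config demo.cnf -new -newkey rsa:2048 -nodes \
    -subj "/CN=$2" -keyout "$1.key" -out "$1.csr" 2>/dev/null
}
sign() {              # <name> <issuer> <extension-section>
  openssl x509 -req -in "$1.csr" -CA "$2.crt" -CAkey "$2.key" -CAcreateserial \
    -days 1 -extfile demo.cnf -extensions "$3" -out "$1.crt" 2>/dev/null
}

openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=Example Root CA G1" -extensions v3_ca \
  -keyout root.key -out root.crt 2>/dev/null

new_key_and_csr int "Example Issuing CA G1"; sign int root v3_intermediate_ca
new_key_and_csr rogue "Rogue Sub-CA";        sign rogue int v3_ca
new_key_and_csr direct "direct-leaf";        sign direct int leaf
new_key_and_csr nested "nested-leaf";        sign nested rogue leaf

cat int.crt rogue.crt > untrusted.pem
verdict() {
  if openssl verify -CAfile root.crt -untrusted untrusted.pem "$1" >/dev/null 2>&1
  then echo accepted; else echo rejected; fi
}

direct_result=$(verdict direct.crt)
nested_result=$(verdict nested.crt)
echo "leaf under intermediate: $direct_result"
echo "leaf under sub-CA:       $nested_result"
```

Nothing stops the intermediate's key from producing the sub-CA signature; the constraint is enforced by verifiers when they walk the chain.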

Step 4 — Issue a server certificate

cd ../intermediate

# Leaf key (ECDSA this time: faster, smaller). -noout drops the redundant
# EC PARAMETERS block so the file holds only the private key.
openssl ecparam -name secp384r1 -genkey -noout -out private/web-01.key.pem
chmod 400 private/web-01.key.pem

# SANs, defined once. The server_cert profile reads them from the environment
# via ${ENV::SAN}, so the variable must be exported before openssl ca runs.
export SAN="DNS:web-01.prod.example.internal,DNS:web.prod.example.internal"

# CSR carrying the same SANs (-addext needs OpenSSL 1.1.1+)
openssl req -new -sha384 \
  -key private/web-01.key.pem \
  -subj "/C=GB/O=Example/CN=web-01.prod.example.internal" \
  -addext "subjectAltName=$SAN" \
  -out csr/web-01.csr.pem

# Sign with the server extensions profile
openssl ca -config openssl.cnf -extensions server_cert \
  -days 90 -notext -md sha384 \
  -in csr/web-01.csr.pem \
  -out certs/web-01.cert.pem

# Build a chain file for the web server
cat certs/web-01.cert.pem certs/intermediate.cert.pem > certs/web-01.fullchain.pem

Leaf profile in openssl.cnf:

[ server_cert ]
basicConstraints = CA:FALSE
nsCertType = server
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
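# Expanded when the file is parsed: SAN must be set in the environment for
# every command that loads this config (including -gencrl), or parsing fails.
subjectAltName = ${ENV::SAN}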
crlDistributionPoints = URI:http://pki.example.internal/crl/intermediate.crl
authorityInfoAccess = OCSP;URI:http://pki.example.internal/ocsp
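
The ${ENV::SAN} indirection is easy to get wrong, so here is a small sketch of how it behaves. It assumes OpenSSL 1.1.1 or later and uses a cut-down copy of the profile in a throwaway demo.cnf, self-signing the leaf purely to keep the demo short.

```shell
set -eu
work=$(mktemp -d) && cd "$work"

# Quoted heredoc: ${ENV::SAN} reaches the file literally, for openssl to expand.
cat > demo.cnf <<'EOF'
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = web-01.prod.example.internal
[ server_cert ]
basicConstraints = CA:FALSE
keyUsage = critical, digitalSignature, keyEncipherment
extendedKeyUsage = serverAuth
subjectAltName = ${ENV::SAN}
EOF

# With SAN exported, the profile picks the names up from the environment.
export SAN="DNS:web-01.prod.example.internal,DNS:web.prod.example.internal"
openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -extensions server_cert -keyout demo.key -out demo.crt 2>/dev/null
san=$(openssl x509 -in demo.crt -noout -ext subjectAltName)
echo "$san"

# With SAN unset, the same config no longer parses, even for a command
# that never touches the server_cert section.
unset SAN
if openssl req -config demo.cnf -new -key demo.key -out demo.csr 2>/dev/null; then
  without_san="config loaded"
else
  without_san="config rejected"
fi
echo "SAN unset: $without_san"
```

A common workaround is to export a harmless placeholder value for SAN in the shell that runs CA maintenance commands.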

Step 5 — Publish a CRL

openssl ca -config openssl.cnf \
  -gencrl -out crl/intermediate.crl.pem

# DER form for HTTP distribution
openssl crl -in crl/intermediate.crl.pem -outform DER \
  -out /var/www/pki/crl/intermediate.crl

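Regenerate from cron (daily, and immediately after a revocation). The CRL's nextUpdate is a hard ceiling: clients that enforce CRL checking stop trusting a CRL once it is past nextUpdate, and clients set to strict checking (crlCheck = strict, where the client has such a setting) then stop trusting the issuer's certs entirely.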
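
Before publishing, it is worth confirming the CRL's validity window and signature. A sketch with a throwaway CA, assuming OpenSSL 1.1.1 or later; the one-day window set by -crldays 1 and all file names are choices made for the demo.

```shell
set -eu
work=$(mktemp -d) && cd "$work"
mkdir newcerts && : > index.txt && echo 1000 > serial && echo 1000 > crlnumber

cat > demo.cnf <<'EOF'
[ ca ]
default_ca = demo_ca
[ demo_ca ]
database = index.txt
new_certs_dir = newcerts
serial = serial
crlnumber = crlnumber
certificate = ca.crt
private_key = ca.key
default_md = sha256
default_crl_days = 7
[ req ]
distinguished_name = dn
prompt = no
[ dn ]
CN = Demo Issuing CA
[ v3_ca ]
basicConstraints = critical, CA:true
keyUsage = critical, keyCertSign, cRLSign
EOF

openssl req -config demo.cnf -x509 -newkey rsa:2048 -nodes -days 1 \
  -extensions v3_ca -keyout ca.key -out ca.crt 2>/dev/null

# Generate a CRL valid for one day, then convert it to DER for HTTP serving.
openssl ca -config demo.cnf -gencrl -crldays 1 -out ca.crl.pem 2>/dev/null
openssl crl -in ca.crl.pem -outform DER -out ca.crl

# Read the DER file back: validity window, then signature check against the CA.
window=$(openssl crl -in ca.crl -inform DER -noout -lastupdate -nextupdate)
sig=$(openssl crl -in ca.crl -inform DER -CAfile ca.crt -noout 2>&1)
echo "$window"
echo "$sig"
```

The gap between the cron interval and nextUpdate is the margin for a failed run: a daily job with a 7-day window tolerates several missed runs before clients see a stale CRL.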

Common pitfalls

See also: Certificates (index), SSH Certificate Authorities, Keycloak + LDAPS.