HashiCorp Vault

Vault gives you one control plane for secrets, encryption, and short-lived credentials. The hard part is not enabling it; the hard part is using the right auth path and the right TTL.

Operator priorities
  • Humans and machines should use different auth methods. Do not share one long-lived token everywhere.
  • KV v2 is for stored secrets; Transit is for encryption without storing the plaintext.
  • The root token is for bootstrap and disaster recovery, not for CI, apps, or day-to-day admin.
  • Short TTLs, narrow policies, and audit logs matter more than fancy secret engines.
  • Once applications depend on Vault, it is part of your production dependency chain: a sealed or unreachable Vault is an outage for them. Plan for that explicitly.

Auth methods

Pick auth by caller type, not by personal preference. A workable split looks like this:

| Auth method | Best for | Operational note |
|---|---|---|
| userpass or ldap / oidc | Human operators | Use MFA or upstream identity controls where possible. Do not turn this into a machine auth path. |
| approle | VMs, CI jobs, automation that cannot present a cloud-native identity | Good default for non-interactive callers. Protect the SecretID like a real secret. |
| jwt / oidc | GitLab CI and other identity-bearing workloads | Strong pattern when you already trust the issuer; see GitLab Secrets & OIDC. |
| kubernetes | Pods inside a cluster | Good when the cluster identity plane is healthy and tightly scoped. |

Whatever you choose, tie it to policies that only expose the paths that caller truly needs. Most Vault mistakes are not "Vault broke"; they are "the role or policy was too broad" or "the token lived too long".

KV v2 secrets

KV v2 is the standard key/value engine with versioning. It is the right place for passwords, API tokens, and deploy-time configuration secrets.

vault secrets enable -path=kv kv-v2

vault kv put kv/prod/app \
  db_password="redacted" \
  api_token="redacted"

vault kv get kv/prod/app
vault kv metadata get kv/prod/app

Important path nuance:

# API-style read path
curl -sS \
  -H "X-Vault-Token: $VAULT_TOKEN" \
  "$VAULT_ADDR/v1/kv/data/prod/app"
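
Classic KV v2 mistake: a policy grants kv/prod/app, but the API request goes to kv/data/prod/app (the vault kv CLI inserts data/ for you), so the read fails with permission denied even though the secret "exists". Write KV v2 policies against kv/data/... (and kv/metadata/... for metadata and listing), and match the engine version and path format.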
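The path translation can be sketched as a pair of small shell helpers. This is only an illustration: kv2_data_path and kv2_metadata_path are hypothetical names, not Vault commands, and the mount name kv is taken from the example above.

```shell
# Hypothetical helpers (not part of the Vault CLI): map a KV v2 mount and a
# logical secret path to the API paths that policies and raw HTTP calls use.
kv2_data_path() {
  # $1 = mount (e.g. kv), $2 = secret path (e.g. prod/app)
  printf '%s/data/%s\n' "${1%/}" "${2#/}"
}

kv2_metadata_path() {
  # Same arguments; metadata and version history live under metadata/.
  printf '%s/metadata/%s\n' "${1%/}" "${2#/}"
}

# Usage: print the path a read policy must cover for `vault kv get kv/prod/app`.
# kv2_data_path kv prod/app
```

Comparing the helper output with the path in the policy file is a quick way to spot a missing data/ segment.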

Transit encryption

Transit is for "encrypt this value for me" or "decrypt this ciphertext for me" without storing the plaintext in Vault. It is useful for apps that need encryption as a service.

vault secrets enable transit
vault write -f transit/keys/payments

# Encrypt: plaintext must be base64-encoded
vault write transit/encrypt/payments \
  plaintext="$(printf '4111111111111111' | base64)"

# Decrypt: the returned plaintext is base64-encoded, so decode it on the client
vault write transit/decrypt/payments \
  ciphertext="vault:v1:..."

Transit is not a generic secret store. If the application needs to fetch a password later, use KV. If it needs Vault to hold the key and perform crypto operations, use Transit.
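A round trip through Transit might be wrapped as follows. This is a minimal sketch, assuming the transit mount and key from the example above; transit_encrypt and transit_decrypt are hypothetical helper names. It relies on the CLI's -field flag to print a single response field.

```shell
# Sketch only: wrappers around the transit endpoints shown above.
# transit_encrypt / transit_decrypt are hypothetical names.
transit_encrypt() {
  # $1 = key name, $2 = plaintext string
  # Vault expects base64 input; tr removes any line wrapping added by base64.
  vault write -field=ciphertext "transit/encrypt/$1" \
    plaintext="$(printf '%s' "$2" | base64 | tr -d '\n')"
}

transit_decrypt() {
  # $1 = key name, $2 = ciphertext (vault:v1:...)
  # Vault returns base64; decode it to recover the original bytes.
  vault write -field=plaintext "transit/decrypt/$1" ciphertext="$2" \
    | base64 -d
}

# Usage:
# ct="$(transit_encrypt payments '4111111111111111')"
# transit_decrypt payments "$ct"
```

Values passed as command-line arguments can be visible in the process list, so an application handling real card data would more likely call the HTTP API directly.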

Policies and AppRole

Policies are the real access control layer. AppRole is just one way to obtain a token that carries those policies.

# ansible-prod.hcl
path "kv/data/prod/ansible/*" {
  capabilities = ["read"]
}

path "transit/encrypt/ansible" {
  capabilities = ["update"]
}

vault policy write ansible-prod ansible-prod.hcl

vault auth enable approle
vault write auth/approle/role/ansible-prod \
  token_policies="ansible-prod" \
  token_ttl="20m" \
  token_max_ttl="1h" \
  secret_id_ttl="30m" \
  secret_id_num_uses=1

vault read auth/approle/role/ansible-prod/role-id
vault write -f auth/approle/role/ansible-prod/secret-id
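The caller then exchanges the RoleID and SecretID for a token. A minimal sketch, assuming the default approle mount path used above; approle_login is a hypothetical helper name, and ROLE_ID and SECRET_ID are assumed variable names.

```shell
# Sketch only: exchange a RoleID and SecretID for a short-lived token.
# approle_login is a hypothetical helper name.
approle_login() {
  # $1 = RoleID, $2 = SecretID
  vault write -field=token auth/approle/login role_id="$1" secret_id="$2"
}

# Usage (ROLE_ID and SECRET_ID are assumed to reach the job through
# separate delivery channels):
# VAULT_TOKEN="$(approle_login "$ROLE_ID" "$SECRET_ID")" && export VAULT_TOKEN
```

With secret_id_num_uses=1, a second login with the same SecretID should fail, which is the point: a leaked SecretID that has already been used is worthless.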

Operationally:

Tokens and leases

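Vault has two related lifetime concepts: the TTL of the token itself, and the lease on each dynamic secret issued through that token. Revoking a token also revokes the leases created with it.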

vault token lookup
vault token renew
vault token revoke -self

# If using a dynamic secret engine
vault lease lookup database/creds/app/abcd-1234
vault lease revoke database/creds/app/abcd-1234

When in doubt, shorter is better, as long as your job can finish. Expired short-lived credentials are noise; leaked long-lived credentials are incidents.
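For long-running jobs, one option is to renew the token only when its remaining TTL gets low. The sketch below assumes the token is renewable and still under its max TTL; renew_if_low is a hypothetical helper name and the 300-second floor is an arbitrary default.

```shell
# Sketch only: renew the current token when its remaining TTL drops below a floor.
# renew_if_low is a hypothetical helper name; 300 seconds is an arbitrary default.
renew_if_low() {
  local min_ttl ttl
  min_ttl="${1:-300}"
  # auth/token/lookup-self reports the remaining TTL in seconds (0 = no expiry).
  ttl="$(vault read -field=ttl auth/token/lookup-self)" || return 1
  if [ "$ttl" -eq 0 ]; then
    echo "no-expiry"
  elif [ "$ttl" -lt "$min_ttl" ]; then
    vault token renew >/dev/null || return 1
    echo "renewed"
  else
    echo "ok"
  fi
}

# Usage: call between long deploy steps.
# renew_if_low 600
```

If renewal fails, treat that as a signal to log in again rather than to raise the TTL.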

Unseal and HA basics

Vault must be unsealed before it can serve traffic. In small labs this is often manual Shamir unseal. In production, auto-unseal with KMS or HSM is usually the sane choice.

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault01"
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/vault.d/tls/fullchain.pem"
  tls_key_file  = "/etc/vault.d/tls/privkey.pem"
}

api_addr     = "https://vault01.example.com:8200"
cluster_addr = "https://vault01.example.com:8201"
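ui           = true

vault status                    # sealed/initialized state and HA mode
vault operator init             # once per cluster; prints the unseal (or recovery) keys and the initial root token
vault operator unseal           # repeat with different key shares until the threshold is met
vault operator raft list-peers  # confirm raft membership and the current leader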

HA basics worth remembering:

Pair this with your normal DR plan in Backup & Restore.
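A load balancer or monitoring check can tell active, standby, and sealed nodes apart through the unauthenticated sys/health endpoint. The sketch below maps Vault's default health status codes to a state name; the function names are hypothetical and the node URL is taken from the config example above.

```shell
# Sketch only: map the HTTP status of /v1/sys/health to a node state.
# The codes are Vault's default responses; the function names are hypothetical.
vault_health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

vault_node_state() {
  # $1 = base URL of one node, e.g. https://vault01.example.com:8200
  local code
  code="$(curl -sS -o /dev/null -w '%{http_code}' "$1/v1/sys/health")" || code="000"
  vault_health_state "$code"
}

# Usage:
# vault_node_state https://vault01.example.com:8200
```

Pointing the load balancer health check at this endpoint and accepting only 200 keeps traffic on the active node.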

Ansible integration

Ansible usually reads from Vault at deploy time. That keeps secrets out of Git and out of the image build.

---
- name: Read DB password from Vault
  hosts: app
  gather_facts: false
  tasks:
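    - name: Fetch DB password
      ansible.builtin.set_fact:
        # return_format=raw returns the full API response, so the KV v2
        # payload sits under data.data.
        db_password: >-
          {{
            lookup(
              'community.hashi_vault.hashi_vault',
              'secret=kv/data/prod/app return_format=raw '
              ~ 'url=' ~ lookup('env', 'VAULT_ADDR') ~ ' '
              ~ 'token=' ~ lookup('env', 'VAULT_TOKEN')
            ).data.data.db_password
          }}
      no_log: true  # keep the secret out of task output and logs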

    - name: Render config
      ansible.builtin.template:
        src: app.conf.j2
        dest: /etc/myapp/app.conf
        mode: "0600"

For CI-driven Ansible, get VAULT_TOKEN from AppRole or JWT/OIDC, not from a shared static admin token. The broader CI patterns are covered in CI for Ansible and GitLab Secrets & OIDC.
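One way to scope the token to a single CI job is a wrapper that logs in, runs the playbook, and revokes the token afterwards. This is a sketch under assumptions: the JWT auth method is mounted at auth/jwt, the role ansible-ci and the variable VAULT_ID_TOKEN are made-up names, and with_vault_jwt is a hypothetical helper.

```shell
# Sketch only: log in with a CI-issued JWT, run one command with the resulting
# token in VAULT_TOKEN, then revoke the token. with_vault_jwt is a hypothetical name.
with_vault_jwt() {
  # $1 = Vault JWT role, $2 = JWT from the CI system, rest = command to run
  local role jwt token status
  role="$1"; jwt="$2"; shift 2
  token="$(vault write -field=token auth/jwt/login role="$role" jwt="$jwt")" || return 1
  VAULT_TOKEN="$token" "$@"
  status=$?
  # Best-effort cleanup so the token does not outlive the job.
  VAULT_TOKEN="$token" vault token revoke -self >/dev/null 2>&1 || true
  return "$status"
}

# Usage (assumed names: role ansible-ci, token variable VAULT_ID_TOKEN):
# with_vault_jwt ansible-ci "$VAULT_ID_TOKEN" ansible-playbook site.yml
```

The wrapper returns the playbook's exit status, so a failed deploy still fails the CI job after cleanup.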

Troubleshooting and failure modes

| Symptom | Likely cause | What to do |
|---|---|---|
| vault is sealed | Node restarted or unseal failed. | Check vault status, then unseal or restore the auto-unseal dependency first. |
| permission denied | Policy path does not match the engine path actually being read. | Confirm KV v1 vs KV v2 and compare the requested path with the policy exactly. |
| no handler for route | Wrong mount path or wrong secret engine. | List mounts with vault secrets list and fix the client path. |
| Token expires mid-deploy | TTL is shorter than the job or the token is non-renewable. | Adjust role TTLs or renew the token, but keep them short enough to remain safe. |
| JWT auth works in dev but not prod | Bound claims on branch, environment, or audience do not match. | Decode the JWT claims and compare them to the Vault role configuration. |
| TLS trust errors | Vault CA is missing from the client trust store. | Install the CA and verify the hostname and certificate chain. |
| Writes fail intermittently behind a load balancer | Requests are hitting standby nodes or health checks are too weak. | Ensure the load balancer only routes write traffic to the active node. |
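For permission denied, it often helps to ask Vault what the current token may do on the exact API path. A minimal sketch using vault token capabilities; can_read is a hypothetical helper name, and the parsing assumes the CLI prints a comma-separated capability list.

```shell
# Sketch only: check whether the current token may read a given API path.
# can_read is a hypothetical helper name.
can_read() {
  # $1 = full API path, e.g. kv/data/prod/app for KV v2
  local caps
  caps="$(vault token capabilities "$1")" || return 2
  # Assumed output: a comma-separated list such as "read" or "deny".
  case ",$(printf '%s' "$caps" | tr -d ' \n')," in
    *,read,*|*,root,*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
# can_read kv/data/prod/app && echo "policy covers the KV v2 data path"
```

Checking both kv/prod/app and kv/data/prod/app this way shows quickly whether the policy was written against the wrong form of the path.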

Related pages: GitLab Secrets & OIDC, Ansible, CI for Ansible, Backup & Restore.