CI for Ansible
lint → syntax → test → check → apply

- Everything before apply must be green before a human can press the button.
- lint and syntax are cheap and parallel; run them on every push.
- test uses Molecule and a distro matrix. Slow, but the only thing that actually exercises your roles.
- check is ansible-playbook --check --diff against dev. It is not a test; it is a preview of what will change.
- apply is manual, environment-scoped, and serialised with a resource_group.
- Secrets come from Vault via OIDC at job time, not from CI variables at rest.
First-day checklist
| Task | Where |
|---|---|
| Create a ci folder with pinned requirements.txt, requirements.yml, and the Molecule driver you'll use | repo root |
| Register a tagged, protected deploy runner (deploy, shell-prod) | infra host; see GitLab Runner Setup |
| Protect main and the prod-* tag pattern | Settings → Repository → Protected branches/tags |
| Create the production and dev environments, mark production as protected | Operations → Environments |
| Configure GitLab as a JWT issuer in Vault, create a role bound to the project's sub claim | Vault; see GitLab Secrets and OIDC |
| Add masked file variable ANSIBLE_VAULT_PASSWORD (or use Vault lookup) | Settings → CI/CD → Variables |
| Add the SSH private key for the CI user as a file variable, not a regular one | Settings → CI/CD → Variables |
| Write a minimal .gitlab-ci.yml with just lint and let it go green before adding more stages | repo root |
Full .gitlab-ci.yml skeleton
default:
  image: python:3.12-slim
  tags: [docker, linux]
  interruptible: true

stages:
  - lint
  - syntax
  - test
  - check
  - apply

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
  ANSIBLE_FORCE_COLOR: "1"
  ANSIBLE_CALLBACK_RESULT_FORMAT: "yaml"
  ANSIBLE_HOST_KEY_CHECKING: "False"
  # Pin collection versions; see requirements.yml
  ANSIBLE_COLLECTIONS_PATH: "$CI_PROJECT_DIR/.ansible/collections"

cache:
  key:
    files:
      - ci/requirements.txt
      - requirements.yml
  paths:
    - .cache/pip/
    - .ansible/collections/

.python-setup: &python-setup
  before_script:
    - apt-get update && apt-get install -y --no-install-recommends git openssh-client rsync
    - pip install --disable-pip-version-check -r ci/requirements.txt
    - ansible-galaxy collection install -r requirements.yml

include:
  - local: ci/lint.yml
  - local: ci/syntax.yml
  - local: ci/test.yml
  - local: ci/check.yml
  - local: ci/apply.yml
Each ci/*.yml file contains the jobs for that stage. Splitting keeps the root file legible and makes per-stage diffs easy to review.
ci/requirements.txt
ansible-core==2.17.*
ansible-lint==24.*
yamllint==1.35.*
molecule==24.*
molecule-plugins[docker]==23.*
jmespath==1.0.*
junit-xml==1.9
Stage: lint
# ci/lint.yml
# YAML anchors do not cross include: boundaries, so reuse .python-setup via extends
yamllint:
  stage: lint
  extends: .python-setup
  script:
    - yamllint -s .
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

ansible-lint:
  stage: lint
  extends: .python-setup
  script:
    # Write the codeclimate format that the codequality report expects
    - ansible-lint --profile production --format codeclimate > ansible-lint-report.json
  artifacts:
    when: always
    reports:
      codequality: ansible-lint-report.json
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
Config lives in the repo, not in the CI file:
# .yamllint
extends: default
rules:
  line-length:
    max: 140
    level: warning
  truthy:
    allowed-values: ["true", "false"]
  comments:
    min-spaces-from-content: 1
  braces: { max-spaces-inside: 1 }
  indentation: { spaces: 2, indent-sequences: true }
ignore: |
  .cache/
  .ansible/

# .ansible-lint
profile: production
exclude_paths:
  - .cache/
  - .ansible/
  - molecule/*/cache/
skip_list:
  - fqcn[action]  # only if you have not migrated fully to FQCNs
Stage: syntax
# ci/syntax.yml
syntax-check:
  stage: syntax
  extends: .python-setup   # anchors are not visible across included files
  script:
    - ansible-playbook --syntax-check site.yml -i inventories/dev
    - ansible-playbook --syntax-check site.yml -i inventories/prod
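Running --syntax-check against both inventories catches parse errors in the playbooks and in each inventory before a human clicks apply. It does not evaluate variables, so a "this var is undefined in prod" mistake only surfaces at check or apply time.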
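If the number of inventories grows, the per-inventory invocation can be wrapped in a loop. The following is a minimal sketch, assuming every environment is a directory directly under inventories/; the helper name for_each_inventory is hypothetical and not part of Ansible or GitLab.

```shell
#!/usr/bin/env bash
# Run one command per inventory directory. Keep going after a failure so a
# single run reports every broken inventory, then return non-zero if any failed.
for_each_inventory() {
  local root="$1" inv failed=0
  shift
  for inv in "$root"/*/; do
    [ -d "$inv" ] || continue      # the glob matched nothing
    inv="${inv%/}"                 # drop the trailing slash
    if ! "$@" -i "$inv"; then
      echo "FAILED: $inv" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example (not executed here):
#   for_each_inventory inventories ansible-playbook --syntax-check site.yml
```

The command to run is passed as arguments, so the same loop also works for ansible-inventory --list or any other per-inventory check.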
Stage: test (Molecule matrix)
# ci/test.yml
.molecule-base:
  stage: test
  extends: .python-setup   # anchors are not visible across included files
  image: python:3.12
  services:
    - name: docker:27-dind
      command: ["--tls=false"]
  variables:
    DOCKER_HOST: "tcp://docker:2375"
    DOCKER_TLS_CERTDIR: ""
  script:
    - cd roles/$ROLE
    - molecule test --scenario-name $SCENARIO
  artifacts:
    when: always
    reports:
      # The junit callback names its files <playbook>-<timestamp>.xml, so match by glob
      junit: roles/$ROLE/molecule/$SCENARIO/*.xml
    paths:
      - roles/$ROLE/molecule/$SCENARIO/
    expire_in: 1 week

molecule:nginx:rocky9:
  extends: .molecule-base
  variables: { ROLE: nginx, SCENARIO: rocky9 }

molecule:nginx:ubuntu2404:
  extends: .molecule-base
  variables: { ROLE: nginx, SCENARIO: ubuntu2404 }

molecule:postgresql:rocky9:
  extends: .molecule-base
  variables: { ROLE: postgresql, SCENARIO: rocky9 }
Each scenario directory has its own molecule.yml, converge.yml, and verify.yml. Use the ansible verifier so your "tests" are asserts in a playbook, readable by anyone who reads Ansible. With the junit callback enabled, each run also writes JUnit XML that GitLab can ingest.
# roles/nginx/molecule/rocky9/molecule.yml
dependency: { name: galaxy }
driver: { name: docker }
platforms:
  - name: instance
    image: "geerlingguy/docker-rockylinux9-ansible:latest"
    pre_build_image: true
    privileged: true
    cgroupns_mode: host
    volumes: ["/sys/fs/cgroup:/sys/fs/cgroup:rw"]
provisioner:
  name: ansible
  env:
    # junit is an aggregate callback, not a stdout callback: enable it, do not select it
    ANSIBLE_CALLBACKS_ENABLED: junit
    JUNIT_OUTPUT_DIR: "${MOLECULE_SCENARIO_DIRECTORY}"
verifier:
  name: ansible
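Because every scenario is expected to carry the same three files, a quick structural check can catch a half-created scenario before Molecule runs. This is a sketch under the assumption that roles live in roles/<role>/molecule/<scenario>/ as in the jobs above; the function name missing_scenario_files is hypothetical.

```shell
#!/usr/bin/env bash
# Print every molecule.yml, converge.yml or verify.yml that is missing from a
# scenario directory. Returns non-zero when at least one file is absent.
missing_scenario_files() {
  local roles_dir="$1" scenario file status=0
  for scenario in "$roles_dir"/*/molecule/*/; do
    [ -d "$scenario" ] || continue        # the glob matched nothing
    for file in molecule.yml converge.yml verify.yml; do
      if [ ! -f "${scenario}${file}" ]; then
        echo "missing: ${scenario}${file}"
        status=1
      fi
    done
  done
  return "$status"
}

# Example (not executed here):
#   missing_scenario_files roles
```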
Stage: check (dry-run against dev)
# ci/check.yml
check:dev:
  stage: check
  environment:
    name: dev
    deployment_tier: development
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  before_script:
    - !reference [.python-setup, before_script]
    # Assumes the vault CLI is already present in the job image
    - export VAULT_ADDR=https://vault.example.com
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=ansible-dev jwt="$VAULT_ID_TOKEN")"
    - install -d -m 0700 ~/.ssh
    - install -m 0600 /dev/stdin ~/.ssh/id_ed25519 <<< "$(vault kv get -field=key secret/ci/ssh/dev)"
    - ssh-keyscan -H $(awk '/ansible_host=/{sub(/.*ansible_host=/,"");print $1}' inventories/dev/hosts) >> ~/.ssh/known_hosts
  script:
    # Keep <(...) on the command that reads it; bash closes the descriptor once that command ends
    - >
      ansible-playbook -i inventories/dev site.yml --check --diff
      --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/dev)
      | tee check-dev.log
  artifacts:
    when: always
    paths: [check-dev.log]
    expire_in: 2 weeks
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'
Review check-dev.log before approving: an MR that claims to be a no-op should produce exactly zero changes. CI cannot decide "this diff is correct"; you can.
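The "zero changes" expectation for a no-op MR can be checked mechanically, even though judging a non-empty diff stays a human task. A minimal sketch, assuming the log ends with Ansible's standard PLAY RECAP block; the function name count_changed is hypothetical.

```shell
#!/usr/bin/env bash
# Sum the changed=N counters from the PLAY RECAP section of an Ansible log.
# Lines before the recap are ignored so diff content cannot skew the total.
count_changed() {
  awk '
    /PLAY RECAP/ { in_recap = 1; next }
    in_recap && match($0, /changed=[0-9]+/) {
      # RSTART points at "changed="; skip its 8 characters to reach the digits
      total += substr($0, RSTART + 8, RLENGTH - 8)
    }
    END { print total + 0 }
  ' "$1"
}

# Example gate for an MR declared as a no-op (not executed here):
#   [ "$(count_changed check-dev.log)" -eq 0 ] || { echo "unexpected changes" >&2; exit 1; }
```

The pattern matches the counter even when the log contains colour escape codes, which matters here because the pipeline sets ANSIBLE_FORCE_COLOR.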
Stage: apply (manual, protected)
# ci/apply.yml
apply:prod:
  stage: apply
  environment:
    name: production
    deployment_tier: production
  resource_group: prod          # serialise: one apply at a time
  tags: [shell-prod, deploy]    # protected runner only
  interruptible: false          # a newer pipeline must never cancel a running apply
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  before_script:
    - export VAULT_ADDR=https://vault.example.com
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=ansible-prod jwt="$VAULT_ID_TOKEN")"
    # Revoke when this shell exits; after_script runs in a fresh shell that has no VAULT_TOKEN
    - trap 'vault token revoke -self || true' EXIT
    - install -d -m 0700 ~/.ssh
    - install -m 0600 /dev/stdin ~/.ssh/id_ed25519 <<< "$(vault kv get -field=key secret/ci/ssh/prod)"
  script:
    # Keep <(...) on the command that reads it; bash closes the descriptor once that command ends
    - >
      ansible-playbook -i inventories/prod site.yml --diff
      --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/prod)
      | tee ansible-apply-prod.log
  artifacts:
    when: always
    # Artifact paths must be relative to the project directory
    paths: [ansible-apply-prod.log]
Keep the resource_group. Two simultaneous apply:prod jobs will fight over serial:, handlers, and external APIs. GitLab's resource_group queues them into a single-file line.
Secrets with OIDC to Vault
CI variables have a place, but only for non-rotatable, non-secret hints. Real credentials should be short-lived and fetched at job time. The id_tokens keyword gives each job that declares it an OIDC-signed JWT that Vault, AWS, or GCP can exchange for a credential.
# Vault admin once, to wire GitLab as a JWT provider
vault auth enable jwt
vault write auth/jwt/config \
  oidc_discovery_url="https://gitlab.example.com" \
  bound_issuer="https://gitlab.example.com"
vault write auth/jwt/role/ansible-prod \
  role_type="jwt" \
  user_claim="user_email" \
  bound_audiences="https://vault.example.com" \
  bound_claims_type="glob" \
  bound_claims='{"project_path":"infra/ansible","ref":"main","ref_type":"branch","ref_protected":"true"}' \
  policies="ansible-prod" \
  ttl=30m
Bind on ref_protected=true so a rogue MR branch cannot assume the prod role even if someone forgets to tag the runner. See GitLab Secrets and OIDC for the full claim reference.
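When a login is rejected, it helps to see which claims the job's token actually carries and compare them with the role's bound_claims. The sketch below decodes only the payload segment of a JWT; the function name jwt_payload is hypothetical, and it assumes a base64 that accepts -d (GNU coreutils and recent macOS). Print claims only, never the full token, in a job log.

```shell
#!/usr/bin/env bash
# Decode the payload (second dot-separated segment) of a JWT to JSON.
# The signature is NOT verified: this is a debugging aid, not validation.
jwt_payload() {
  local payload
  # Extract the payload and map the base64url alphabet back to standard base64
  payload="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # Restore the padding that base64url encoding strips
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Example inside a job that declares id_tokens (not executed here):
#   jwt_payload "$VAULT_ID_TOKEN"
```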
The Ansible vault password, via Vault
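# At job time: no password ever lands on disk unencrypted
ansible-playbook -i inventories/prod site.yml --diff \
  --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/prod)
Process substitution (<(...)) hands Ansible a /dev/fd path backed by a pipe, so the password never hits the filesystem. Keep the substitution on the ansible-playbook command line itself: bash closes the descriptor when the command that received it finishes, so assigning it to ANSIBLE_VAULT_PASSWORD_FILE on an earlier line leaves a dangling path. The password is only in memory for the lifetime of that command.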
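The mechanism is easy to observe without Vault or Ansible. In this bash-only sketch, fetch_secret is a hypothetical stand-in for the vault kv get call and show_path_and_content stands in for any program that takes a password file path; the password value is made up.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: vault kv get -field=password <path>
fetch_secret() {
  printf 'example-password\n'
}

# Hypothetical stand-in for a consumer that is handed a file path
show_path_and_content() {
  echo "path=$1"
  echo "content=$(cat "$1")"
}

# The consumer receives a /dev/fd/N style path connected to fetch_secret's
# output; no regular file is created.
show_path_and_content <(fetch_secret)
```

The path printed is a file-descriptor path rather than a regular file, which is why the substitution has to appear on the command that consumes it.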
Artifacts and caching
| What | Where | Why |
|---|---|---|
| pip cache, collection cache | cache: | Shared across pipelines, keyed on requirements files — re-used until you bump a version. |
| ansible-lint JSON report | artifacts.reports.codequality | Renders inline on MR widgets. |
| Molecule junit.xml | artifacts.reports.junit | Rich test pane in the MR. |
| check-dev.log | artifacts.paths | Keep for two weeks so reviewers can compare MR diffs. |
| Apply logs | artifacts.paths, when: always | Post-mortem evidence even on failure. |
Never cache .ansible/tmp/ or ~/.ssh/. Either is a credential-leak channel. Cache is visible across branches and shared runners.
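A cheap guard against the leak described above is to scan whatever is about to be cached or archived for private key headers. This is a sketch of a hypothetical pre-publish check, not a GitLab feature; it only recognises PEM-style key headers and will not catch passwords or tokens.

```shell
#!/usr/bin/env bash
# Fail when any file under the given paths contains a PEM private key header
# (OpenSSH, RSA, EC, PKCS#8 and encrypted variants share this shape).
assert_no_private_keys() {
  local hits
  hits="$(grep -rlE -e '-----BEGIN ([A-Z]+ )*PRIVATE KEY-----' "$@" 2>/dev/null || true)"
  if [ -n "$hits" ]; then
    echo "private key material found in:" >&2
    echo "$hits" >&2
    return 1
  fi
}

# Example before uploading cache or artifact directories (not executed here):
#   assert_no_private_keys .cache .ansible
```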
Verification
- [ ] A fresh clone runs yamllint -s . and ansible-lint cleanly
- [ ] --syntax-check passes against every inventory
- [ ] molecule test passes locally for at least one role, on the same Python and Ansible pins as CI
- [ ] An MR that touches a template shows a non-empty diff in check-dev.log
- [ ] The apply:prod job is manual, resource_group: prod, and tagged deploy
- [ ] A feature-branch MR cannot run apply:prod; rules and protected environments both deny
- [ ] Vault role is bound on ref_protected=true and the correct project_path
- [ ] No Ansible Vault password, no SSH key, and no API token is stored as a plain CI variable
See also: Ansible Testing, GitLab CI/CD, GitLab Runner Setup, GitLab Secrets and OIDC.