CI for Ansible

A pipeline that lints, syntax-checks, tests with Molecule, dry-runs against dev, and deploys to prod behind a manual gate — with secrets brokered via OIDC, not stored in variables.

The pipeline shape

lint → syntax → test → check → apply. Everything before apply must be green before a human can press the button.
lint and syntax are cheap and parallel; run them on every push.
test uses Molecule and a distro matrix. Slow but the only thing that actually exercises your roles.
check is ansible-playbook --check --diff against dev. It is not a test — it is a preview of what will change.
apply is manual, environment-scoped, and serialised with a resource_group.
Secrets come from Vault via OIDC at job time, not from CI variables at rest.

On this page

First-day checklist
Full .gitlab-ci.yml skeleton
Stage: lint
Stage: syntax
Stage: test (Molecule)
Stage: check (dry-run against dev)
Stage: apply (manual, protected)
Secrets with OIDC to Vault
Artifacts and caching
Verification

First-day checklist

Task	Where
Create a `ci` folder with pinned `requirements.txt`, `requirements.yml`, and the Molecule driver you'll use	repo root
Register a tagged, protected deploy runner (`deploy`, `shell-prod`)	infra host; see GitLab Runner Setup
Protect `main` and the `prod-*` tag pattern	Settings → Repository → Protected branches/tags
Create the `production` and `dev` environments, mark `production` as protected	Operations → Environments
Configure GitLab as a JWT issuer in Vault, create a role bound to the project's `sub` claim	Vault; see GitLab Secrets and OIDC
Add masked file variable `ANSIBLE_VAULT_PASSWORD` (or use Vault lookup)	Settings → CI/CD → Variables
Add the SSH private key for the CI user as a file variable, not a regular one	Settings → CI/CD → Variables
Write a minimal `.gitlab-ci.yml` with just `lint` and let it go green before adding more stages	repo root

Full `.gitlab-ci.yml` skeleton

default:
  image: python:3.12-slim
  tags: [docker, linux]
  interruptible: true

stages:
  - lint
  - syntax
  - test
  - check
  - apply

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
  ANSIBLE_FORCE_COLOR: "1"
  ANSIBLE_CALLBACK_RESULT_FORMAT: "yaml"
  ANSIBLE_HOST_KEY_CHECKING: "False"
  # Pin collection versions; see requirements.yml
  ANSIBLE_COLLECTIONS_PATH: "$CI_PROJECT_DIR/.ansible/collections"

cache:
  key:
    files:
      - ci/requirements.txt
      - requirements.yml
  paths:
    - .cache/pip/
    - .ansible/collections/

.python-setup: &python-setup
  before_script:
    - apt-get update && apt-get install -y --no-install-recommends git openssh-client rsync
    - pip install --disable-pip-version-check -r ci/requirements.txt
    - ansible-galaxy collection install -r requirements.yml

include:
  - local: ci/lint.yml
  - local: ci/syntax.yml
  - local: ci/test.yml
  - local: ci/check.yml
  - local: ci/apply.yml

Each ci/*.yml file contains the jobs for that stage. Splitting keeps the root file legible and makes per-stage diffs easy to review.

`ci/requirements.txt`

ansible-core==2.17.*
ansible-lint==24.*
yamllint==1.35.*
molecule==24.*
molecule-plugins[docker]==23.*
jmespath==1.0.*
junit-xml==1.9
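
Because the whole setup leans on pinned versions, a small guard can flag any requirement that slipped in without a pin. This is a minimal sketch, not part of pip or Ansible: the function name `unpinned_requirements` is hypothetical, and it assumes every entry is expected to use `==` (wildcards such as `2.17.*` count as pinned).

```shell
# Print requirement lines that carry no == pin.
# Returns 1 if any such line exists, 0 otherwise.
unpinned_requirements() {
  # Drop blank lines and comments first, then keep whatever lacks "==".
  if grep -Ev '^[[:space:]]*(#|$)' "$1" | grep -v '=='; then
    return 1
  fi
  return 0
}
```

Called as `unpinned_requirements ci/requirements.txt`, it prints the offending lines, so it can run as an extra line in a lint job.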

Stage: lint

# ci/lint.yml
yamllint:
  stage: lint
  # YAML anchors do not cross include: boundaries; extends does
  extends: .python-setup
  script:
    - yamllint -s .
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

ansible-lint:
  stage: lint
  extends: .python-setup
  script:
    # codeclimate is the format artifacts:reports:codequality expects
    - ansible-lint --profile production -f codeclimate > ansible-lint-report.json
  artifacts:
    when: always
    reports:
      codequality: ansible-lint-report.json
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

Config lives in the repo, not in the CI file:

# .yamllint
extends: default
rules:
  line-length:
    max: 140
    level: warning
  truthy:
    allowed-values: ["true", "false"]
  comments:
    min-spaces-from-content: 1
  braces: { max-spaces-inside: 1 }
  indentation: { spaces: 2, indent-sequences: true }
ignore: |
  .cache/
  .ansible/

# .ansible-lint
profile: production
exclude_paths:
  - .cache/
  - .ansible/
  - molecule/*/cache/
skip_list:
  - fqcn[action]     # only if you have not migrated fully to FQCNs

Stage: syntax

# ci/syntax.yml
syntax-check:
  stage: syntax
  extends: .python-setup
  script:
    - ansible-playbook --syntax-check site.yml -i inventories/dev
    - ansible-playbook --syntax-check site.yml -i inventories/prod

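Running syntax-check against both inventories confirms that the playbook, its roles, and its static imports parse with each environment's inventory loaded. It does not evaluate variables, so "this var is undefined in prod" is not caught here; that class of error only surfaces when tasks actually run, for example in Molecule or in the check stage.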
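
The same check can be looped over every inventory directory so that a newly added environment is not forgotten. A minimal sketch, assuming inventories live under `inventories/<name>/` as on this page; the function name `syntax_check_all` and the `ANSIBLE_PLAYBOOK` override are hypothetical helpers, not Ansible features.

```shell
# Run --syntax-check once per inventory directory and report overall success.
# ANSIBLE_PLAYBOOK is a hypothetical override (useful for dry runs);
# it defaults to the real ansible-playbook CLI.
syntax_check_all() {
  local playbook="${1:-site.yml}"
  local inv_root="${2:-inventories}"
  local rc=0
  local inv
  for inv in "$inv_root"/*/; do
    [ -d "$inv" ] || continue          # the glob matched nothing
    inv="${inv%/}"
    echo "==> syntax-check $playbook against $inv"
    # Keep going after a failure so every inventory gets reported.
    "${ANSIBLE_PLAYBOOK:-ansible-playbook}" --syntax-check "$playbook" -i "$inv" || rc=1
  done
  return "$rc"
}
```

In the job, the two explicit lines could then become a single `syntax_check_all site.yml inventories` call.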

Stage: test (Molecule matrix)

# ci/test.yml
.molecule-base:
  stage: test
  image: python:3.12
  services:
    - name: docker:27-dind
      command: ["--tls=false"]
  variables:
    DOCKER_HOST: "tcp://docker:2375"
    DOCKER_TLS_CERTDIR: ""
  # YAML anchors do not cross include: boundaries; extends does
  extends: .python-setup
  script:
    - cd roles/$ROLE
    - molecule test --scenario-name $SCENARIO
  artifacts:
    when: always
    reports:
      # the junit callback writes <playbook>-<timestamp>.xml, so match by glob
      junit: roles/$ROLE/molecule/$SCENARIO/*.xml
    paths:
      - roles/$ROLE/molecule/$SCENARIO/
    expire_in: 1 week

molecule:nginx:rocky9:
  extends: .molecule-base
  variables: { ROLE: nginx, SCENARIO: rocky9 }

molecule:nginx:ubuntu2404:
  extends: .molecule-base
  variables: { ROLE: nginx, SCENARIO: ubuntu2404 }

molecule:postgresql:rocky9:
  extends: .molecule-base
  variables: { ROLE: postgresql, SCENARIO: rocky9 }

Each scenario directory has its own molecule.yml, converge.yml, and verify.yml. Use the ansible verifier so your "tests" are asserts in a playbook: readable by anyone who reads Ansible, and reportable as JUnit XML once the junit callback is enabled and JUNIT_OUTPUT_DIR points at the scenario directory.

# roles/nginx/molecule/rocky9/molecule.yml
dependency: { name: galaxy }
driver: { name: docker }
platforms:
  - name: instance
    image: "geerlingguy/docker-rockylinux9-ansible:latest"
    pre_build_image: true
    privileged: true
    cgroupns_mode: host
    volumes: ["/sys/fs/cgroup:/sys/fs/cgroup:rw"]
provisioner:
  name: ansible
  env:
    # junit is an aggregate callback: enable it rather than making it the stdout callback
    ANSIBLE_CALLBACKS_ENABLED: junit
    JUNIT_OUTPUT_DIR: "${MOLECULE_SCENARIO_DIRECTORY}"
verifier:
  name: ansible

Stage: check (dry-run against dev)

# ci/check.yml
check:dev:
  stage: check
  environment:
    name: dev
    deployment_tier: development
  # setup is pulled in below with !reference, which works across include: files (YAML anchors do not)
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  before_script:
    - !reference [.python-setup, before_script]
    - export VAULT_ADDR=https://vault.example.com
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=ansible-dev jwt=$VAULT_ID_TOKEN)"
    # assumes the vault CLI is available in the job image
    - install -d -m 0700 ~/.ssh
    - install -m 0600 /dev/stdin ~/.ssh/id_ed25519 <<< "$(vault kv get -field=key secret/ci/ssh/dev)"
    - ssh-keyscan -H $(awk '/ansible_host=/{sub(/.*ansible_host=/,"");print $1}' inventories/dev/hosts) >> ~/.ssh/known_hosts
  script:
    # bash closes a <(...) descriptor when its command ends, so attach it to ansible-playbook itself
    - ansible-playbook -i inventories/dev site.yml --check --diff --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/dev) | tee check-dev.log
  artifacts:
    when: always
    paths: [check-dev.log]
    expire_in: 2 weeks
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

Review the diff as a human. An MR that changes a template should show a non-empty diff in check-dev.log; an MR that claims to be a no-op should report changed=0 for every host. CI cannot decide "this diff is correct"; you can.
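
To make the "claims to be a no-op" case quick to spot, the changed counters in the recap can be summed from the log. A minimal sketch, assuming the default stdout callback's `PLAY RECAP` format (`host : ok=N changed=N ...`); the function name `recap_changed_total` is a hypothetical helper.

```shell
# Sum the changed= counters from the PLAY RECAP section of an ansible-playbook log.
# Lines before the recap are ignored, so a "changed=" inside task output is not counted.
recap_changed_total() {
  awk '
    /PLAY RECAP/ { in_recap = 1; next }
    in_recap && match($0, /changed=[0-9]+/) {
      # skip the 8 characters of "changed=" and keep the digits
      total += substr($0, RSTART + 8, RLENGTH - 8)
    }
    END { print total + 0 }
  ' "$1"
}
```

For example, `[ "$(recap_changed_total check-dev.log)" -eq 0 ] || echo "check mode reports changes; read the diff"` surfaces the number without deciding whether the diff is correct.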

Stage: apply (manual, protected)

# ci/apply.yml
apply:prod:
  stage: apply
  environment:
    name: production
    deployment_tier: production
  resource_group: prod                 # serialise: one apply at a time
  interruptible: false                 # override the default so a newer pipeline cannot cancel a prod run mid-play
  tags: [shell-prod, deploy]           # protected runner only
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
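  before_script:
    # assumes the vault CLI is installed on the deploy runner
    - export VAULT_ADDR=https://vault.example.com
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=ansible-prod jwt=$VAULT_ID_TOKEN)"
    # after_script runs in a separate shell without VAULT_TOKEN, so revoke from an EXIT trap instead
    - trap 'vault token revoke -self || true' EXIT
    - install -d -m 0700 ~/.ssh
    - install -m 0600 /dev/stdin ~/.ssh/id_ed25519 <<< "$(vault kv get -field=key secret/ci/ssh/prod)"
  script:
    # bash closes a <(...) descriptor when its command ends, so attach it to ansible-playbook itself
    - ansible-playbook -i inventories/prod site.yml --diff --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/prod) | tee ansible-apply-prod.log
  artifacts:
    when: always
    # artifact paths must be inside the project directory
    paths: [ansible-apply-prod.log]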

Do not skip resource_group. Two simultaneous apply:prod jobs will fight over serial:, handlers, and external APIs. GitLab's resource_group queues them into a single-file line.

Secrets with OIDC to Vault

CI variables have a place: non-rotatable, non-secret hints. Real credentials should be short-lived and fetched at job time. The id_tokens keyword gives each job that declares it a JWT signed by GitLab, which Vault, AWS, or GCP can verify via OIDC and exchange for a credential.

# Vault admin once, to wire GitLab as a JWT provider
vault auth enable jwt
vault write auth/jwt/config \
  oidc_discovery_url="https://gitlab.example.com" \
  bound_issuer="https://gitlab.example.com"

vault write auth/jwt/role/ansible-prod \
  role_type="jwt" \
  user_claim="user_email" \
  bound_audiences="https://vault.example.com" \
  bound_claims_type="glob" \
  bound_claims='{"project_path":"infra/ansible","ref":"main","ref_type":"branch","ref_protected":"true"}' \
  policies="ansible-prod" \
  ttl=30m

Bind on ref_protected=true so a rogue MR branch cannot assume the prod role even if someone forgets to tag the runner. See GitLab Secrets and OIDC for the full claim reference.
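
When a login is rejected, it helps to see which claims the job's token actually carries and compare them with bound_claims. A minimal debugging sketch, assuming a standard three-part JWT with a base64url payload; the function name `jwt_claims` is hypothetical, and it does not verify the signature.

```shell
# Print the (unverified) claims of a JWT for comparison with bound_claims.
# Debugging aid only: never echo the token itself into a job log.
jwt_claims() {
  local payload
  # Take the middle segment and map base64url characters back to base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops the padding; restore it before decoding.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}
```

Running `jwt_claims "$VAULT_ID_TOKEN"` in a throwaway job prints the JSON payload, where fields such as project_path, ref, and ref_protected can be checked against the role.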

The Ansible vault password, via Vault

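# At job time: the password is read through a pipe, never from a file on disk
ansible-playbook -i inventories/prod site.yml --diff \
  --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/prod)

Process substitution (<(...)) is a bash feature that hands Ansible a /dev/fd path backed by a pipe, so the password never hits the filesystem. Bash closes that descriptor as soon as the command it is attached to finishes, which is why it is passed directly to ansible-playbook rather than exported on an earlier line. The password is only in memory for the lifetime of that command.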

Artifacts and caching

What	Where	Why
pip cache, collection cache	`cache:`	Shared across pipelines, keyed on requirements files — re-used until you bump a version.
ansible-lint JSON report	`artifacts.reports.codequality`	Renders inline on MR widgets.
Molecule `junit.xml`	`artifacts.reports.junit`	Rich test pane in the MR.
`check-dev.log`	`artifacts.paths`	Keep for two weeks so reviewers can compare MR diffs.
Apply logs	`artifacts.paths`, `when: always`	Post-mortem evidence even on failure.

Do not cache .ansible/tmp/ or ~/.ssh/. Either is a credential-leak channel. Cache is visible across branches and shared runners.

Verification

[ ] A fresh clone runs yamllint -s . and ansible-lint cleanly
[ ] --syntax-check passes against every inventory
[ ] molecule test passes locally for at least one role, on the same Python and Ansible pins as CI
[ ] An MR that touches a template shows a non-empty check-dev.log
[ ] The apply:prod job is manual, resource_group: prod, and tagged deploy
[ ] A feature-branch MR cannot run apply:prod — rules and protected environments both deny
[ ] Vault role is bound on ref_protected=true and the correct project_path
[ ] No Ansible Vault password, no SSH key, and no API token is stored as a plain CI variable
