CI for Ansible

A pipeline that lints, syntax-checks, tests with Molecule, dry-runs against dev, and deploys to prod behind a manual gate — with secrets brokered via OIDC, not stored in variables.

The pipeline shape
  • lint → syntax → test → check → apply. Everything before apply must be green before a human can press the button.
  • lint and syntax are cheap and parallel; run them on every push.
  • test uses Molecule and a distro matrix. Slow but the only thing that actually exercises your roles.
  • check is ansible-playbook --check --diff against dev. It is not a test — it is a preview of what will change.
  • apply is manual, environment-scoped, and serialised with a resource_group.
  • Secrets come from Vault via OIDC at job time, not from CI variables at rest.

First-day checklist

| Task | Where |
|---|---|
| Create a ci folder with a pinned requirements.txt, add a pinned requirements.yml next to it at the repo root, and pick the Molecule driver you'll use | repo root |
| Register a tagged, protected deploy runner (deploy, shell-prod) | infra host; see GitLab Runner Setup |
| Protect main and the prod-* tag pattern | Settings → Repository → Protected branches/tags |
| Create the production and dev environments, mark production as protected | Operate → Environments; Settings → CI/CD → Protected environments |
| Configure GitLab as a JWT issuer in Vault, create a role bound to the project's sub claim | Vault; see GitLab Secrets and OIDC |
| Add masked file variable ANSIBLE_VAULT_PASSWORD (or use Vault lookup) | Settings → CI/CD → Variables |
| Add the SSH private key for the CI user as a file variable, not a regular one | Settings → CI/CD → Variables |
| Write a minimal .gitlab-ci.yml with just lint and let it go green before adding more stages | repo root |
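
As a rough illustration of the pinned requirements.yml from the first task, the sketch below holds two collections to a major-version range. The collection names and version ranges are assumptions chosen for the example, not taken from this project; substitute the collections your roles actually use.

```yaml
# requirements.yml (illustrative; collection names and ranges are assumptions)
collections:
  - name: community.general
    version: ">=9.0.0,<10.0.0"
  - name: ansible.posix
    version: ">=1.5.0,<2.0.0"
```

Because the cache key is derived from this file, changing a range here also invalidates the cached collections on the next pipeline.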

Full .gitlab-ci.yml skeleton

default:
  image: python:3.12-slim
  tags: [docker, linux]
  interruptible: true

stages:
  - lint
  - syntax
  - test
  - check
  - apply

variables:
  PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
  ANSIBLE_FORCE_COLOR: "1"
  ANSIBLE_CALLBACK_RESULT_FORMAT: "yaml"
  ANSIBLE_HOST_KEY_CHECKING: "False"
  # Pin collection versions; see requirements.yml
  ANSIBLE_COLLECTIONS_PATH: "$CI_PROJECT_DIR/.ansible/collections"

cache:
  key:
    files:
      - ci/requirements.txt
      - requirements.yml
  paths:
    - .cache/pip/
    - .ansible/collections/

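# Hidden job. Files pulled in with include: cannot see YAML anchors defined here,
# so they reuse this block with extends: or !reference instead.
.python-setup: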
  before_script:
    - apt-get update && apt-get install -y --no-install-recommends git openssh-client rsync
    - pip install --disable-pip-version-check -r ci/requirements.txt
    - ansible-galaxy collection install -r requirements.yml

include:
  - local: ci/lint.yml
  - local: ci/syntax.yml
  - local: ci/test.yml
  - local: ci/check.yml
  - local: ci/apply.yml

Each ci/*.yml file contains the jobs for that stage. Splitting keeps the root file legible and makes per-stage diffs easy to review.

ci/requirements.txt

ansible-core==2.17.*
ansible-lint==24.*
yamllint==1.35.*
molecule==24.*
molecule-plugins[docker]==23.*
jmespath==1.0.*
junit-xml==1.9

Stage: lint

# ci/lint.yml
yamllint:
  stage: lint
  extends: .python-setup
  script:
    - yamllint -s .
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

ansible-lint:
  stage: lint
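  extends: .python-setup
  script:
    # Write the Code Climate JSON that the codequality report below expects.
    - ansible-lint --profile production -f codeclimate > ansible-lint-report.json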
  artifacts:
    when: always
    reports:
      codequality: ansible-lint-report.json
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

Config lives in the repo, not in the CI file:

# .yamllint
extends: default
rules:
  line-length:
    max: 140
    level: warning
  truthy:
    allowed-values: ["true", "false"]
  comments:
    min-spaces-from-content: 1
  braces: { max-spaces-inside: 1 }
  indentation: { spaces: 2, indent-sequences: true }
ignore: |
  .cache/
  .ansible/

# .ansible-lint
profile: production
exclude_paths:
  - .cache/
  - .ansible/
  - molecule/*/cache/
skip_list:
  - fqcn[action]     # only if you have not migrated fully to FQCNs

Stage: syntax

# ci/syntax.yml
syntax-check:
  stage: syntax
  extends: .python-setup
  script:
    - ansible-playbook --syntax-check site.yml -i inventories/dev
    - ansible-playbook --syntax-check site.yml -i inventories/prod

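Running syntax-check with both inventories confirms that site.yml and everything it statically imports still parse under each one. It does not evaluate variables or templates, so "this var is undefined in prod" is not caught here; undefined variables only surface when tasks are actually templated, in check mode or in a real run.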

Stage: test (Molecule matrix)

# ci/test.yml
.molecule-base:
  stage: test
  image: python:3.12
  services:
    - name: docker:27-dind
      command: ["--tls=false"]
  variables:
    DOCKER_HOST: "tcp://docker:2375"
    DOCKER_TLS_CERTDIR: ""
  extends: .python-setup
  script:
    - cd roles/$ROLE
    - molecule test --scenario-name $SCENARIO
  artifacts:
    when: always
    reports:
      junit: roles/$ROLE/molecule/$SCENARIO/*.xml   # the junit callback names files <playbook>-<timestamp>.xml
    paths:
      - roles/$ROLE/molecule/$SCENARIO/
    expire_in: 1 week

molecule:nginx:rocky9:
  extends: .molecule-base
  variables: { ROLE: nginx, SCENARIO: rocky9 }

molecule:nginx:ubuntu2404:
  extends: .molecule-base
  variables: { ROLE: nginx, SCENARIO: ubuntu2404 }

molecule:postgresql:rocky9:
  extends: .molecule-base
  variables: { ROLE: postgresql, SCENARIO: rocky9 }
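
If the per-role jobs above start to multiply, one possible alternative is GitLab's parallel:matrix keyword, which generates one job per ROLE/SCENARIO combination from a single definition. This is a sketch of the same three jobs, assuming .molecule-base stays as defined above; the job name molecule is arbitrary.

```yaml
# ci/test.yml (alternative sketch; replaces the three explicit jobs)
molecule:
  extends: .molecule-base
  parallel:
    matrix:
      # Each entry expands to one job per listed SCENARIO value.
      - ROLE: nginx
        SCENARIO: [rocky9, ubuntu2404]
      - ROLE: postgresql
        SCENARIO: [rocky9]
```

The explicit jobs are easier to retry and reference by name; the matrix form is shorter to maintain once there are more than a handful of roles.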

Each scenario directory has its own molecule.yml, converge.yml, and verify.yml. Use the ansible verifier so your "tests" are asserts in a playbook, readable by anyone who reads Ansible. With the junit callback enabled, each playbook run also writes a JUnit XML file into JUNIT_OUTPUT_DIR, which is what the junit report in ci/test.yml collects.

# roles/nginx/molecule/rocky9/molecule.yml
dependency: { name: galaxy }
driver: { name: docker }
platforms:
  - name: instance
    image: "geerlingguy/docker-rockylinux9-ansible:latest"
    pre_build_image: true
    privileged: true
    cgroupns_mode: host
    volumes: ["/sys/fs/cgroup:/sys/fs/cgroup:rw"]
provisioner:
  name: ansible
  env:
    ANSIBLE_CALLBACKS_ENABLED: junit   # junit is an aggregate callback, not a stdout callback
    JUNIT_OUTPUT_DIR: "${MOLECULE_SCENARIO_DIRECTORY}"
verifier:
  name: ansible
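
To make the "asserts in a playbook" idea concrete, here is a minimal sketch of what a verify.yml for this scenario could contain. It assumes the nginx role installs a systemd unit called nginx.service and serves HTTP on port 80 inside the container; both are assumptions about the role, not something the configuration above guarantees.

```yaml
# roles/nginx/molecule/rocky9/verify.yml (hypothetical contents)
- name: Verify nginx role
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect service state from the instance
      ansible.builtin.service_facts:

    - name: Assert that nginx is installed and running
      ansible.builtin.assert:
        that:
          - "'nginx.service' in ansible_facts.services"
          - ansible_facts.services['nginx.service'].state == 'running'
        fail_msg: nginx.service is missing or not running

    - name: Request the default page and expect HTTP 200
      ansible.builtin.uri:
        url: http://127.0.0.1/
        status_code: 200
```

Each task becomes one test case in the JUnit output, so descriptive task names pay off in the MR test pane.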

Stage: check (dry-run against dev)

# ci/check.yml
check:dev:
  stage: check
  environment:
    name: dev
    deployment_tier: development
  # Shared setup is pulled in with !reference below; YAML anchors do not cross include boundaries.
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
  before_script:
    - !reference [.python-setup, before_script]
    # Assumes the vault CLI is present in the job image; python:3.12-slim does not ship it.
    - export VAULT_ADDR=https://vault.example.com
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=ansible-dev jwt="$VAULT_ID_TOKEN")"
    - install -d -m 0700 ~/.ssh
    - install -m 0600 /dev/stdin ~/.ssh/id_ed25519 <<< "$(vault kv get -field=key secret/ci/ssh/dev)"
    - ssh-keyscan -H $(awk '/ansible_host=/{sub(/.*ansible_host=/,"");print $1}' inventories/dev/hosts) >> ~/.ssh/known_hosts
  script:
    # Process substitution only lives for the command it appears on, so pass it here rather than exporting it.
    - ansible-playbook -i inventories/dev site.yml --check --diff --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/dev) | tee check-dev.log
  artifacts:
    when: always
    paths: [check-dev.log]
    expire_in: 2 weeks
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH'

Review the diff as a human. An MR that changes a template should show that change as a diff in check-dev.log; an MR that claims to be a no-op should report zero changed tasks. CI cannot decide "this diff is correct"; you can.

Stage: apply (manual, protected)

# ci/apply.yml
apply:prod:
  stage: apply
  environment:
    name: production
    deployment_tier: production
  resource_group: prod                 # serialise: one apply at a time
  interruptible: false                 # override the default; never auto-cancel a running apply
  tags: [shell-prod, deploy]           # protected runner only
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'
      when: manual
  id_tokens:
    VAULT_ID_TOKEN:
      aud: https://vault.example.com
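  before_script:
    - export VAULT_ADDR=https://vault.example.com
    - export VAULT_TOKEN="$(vault write -field=token auth/jwt/login role=ansible-prod jwt="$VAULT_ID_TOKEN")"
    - install -d -m 0700 ~/.ssh
    - install -m 0600 /dev/stdin ~/.ssh/id_ed25519 <<< "$(vault kv get -field=key secret/ci/ssh/prod)"
  script:
    # Record the exit code instead of stopping, so the token is revoked even when the play fails.
    - set -o pipefail; rc=0
    # The <(...) descriptor only exists for this one command, so it is passed inline.
    - ansible-playbook -i inventories/prod site.yml --diff --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/prod) | tee apply-prod.log || rc=$?
    # after_script runs in a fresh shell that never sees VAULT_TOKEN, so revoke here.
    - vault token revoke -self || true
    - exit "$rc"
  artifacts:
    when: always
    paths: [apply-prod.log]            # artifact paths must be inside the project directory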

Do not skip resource_group. Two simultaneous apply:prod jobs will fight over serial: batches, handlers, and external APIs. With a resource_group, GitLab runs them one at a time; the second job waits until the first has finished.

Secrets with OIDC to Vault

CI variables have a place: non-secret settings and hints that never need rotating. Real credentials should be short-lived and fetched at job time. The id_tokens keyword gives each job that declares it a signed OIDC JWT, which Vault, AWS, or GCP can exchange for a short-lived credential.

# Vault admin once, to wire GitLab as a JWT provider
vault auth enable jwt
vault write auth/jwt/config \
  oidc_discovery_url="https://gitlab.example.com" \
  bound_issuer="https://gitlab.example.com"

# bound_claims is a map, so the role is sent as a JSON document on stdin
# rather than as key=value arguments.
vault write auth/jwt/role/ansible-prod - <<'EOF'
{
  "role_type": "jwt",
  "user_claim": "user_email",
  "bound_audiences": ["https://vault.example.com"],
  "bound_claims_type": "glob",
  "bound_claims": {
    "project_path": "infra/ansible",
    "ref": "main",
    "ref_type": "branch",
    "ref_protected": "true"
  },
  "policies": ["ansible-prod"],
  "ttl": "30m"
}
EOF

Bind on ref_protected=true so a rogue MR branch cannot assume the prod role even if someone forgets to tag the runner. See GitLab Secrets and OIDC for the full claim reference.

The Ansible vault password, via Vault

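# At job time: the password is never written to a regular file
ansible-playbook -i inventories/prod site.yml --diff \
  --vault-password-file <(vault kv get -field=password secret/ci/ansible-vault/prod)

Process substitution (<(...)) hands Ansible a /dev/fd path backed by a pipe, so the password never lands on the filesystem. The descriptor exists only for the command it is written on, which is why it belongs on the ansible-playbook line itself; exporting it into ANSIBLE_VAULT_PASSWORD_FILE in an earlier step leaves a path that is already closed by the time Ansible runs. It is a bash feature, so confirm that the runner executes jobs with bash.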
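
The checklist also mentions a Vault lookup as an alternative to carrying an Ansible Vault password at all. A minimal sketch of that approach follows, assuming the community.hashi_vault collection and the hvac Python package are installed, and that VAULT_ADDR and VAULT_TOKEN are exported in the job as shown earlier. The file name, the variable db_password, and the secret path ci/db are hypothetical.

```yaml
# inventories/prod/group_vars/all/secrets.yml (hypothetical file and variable names)
# The lookup reads a KV v2 secret at template time; nothing is stored in the repo.
# Assumption: the secret lives at secret/ci/db and has a "password" field.
db_password: "{{ lookup('community.hashi_vault.vault_kv2_get', 'ci/db', engine_mount_point='secret').secret.password }}"
```

The trade-off is that every play run then needs a reachable Vault and a valid token, including local runs by developers.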

Artifacts and caching

| What | Where | Why |
|---|---|---|
| pip cache, collection cache | cache: | Shared across pipelines, keyed on requirements files; re-used until you bump a version. |
| ansible-lint JSON report | artifacts.reports.codequality | Renders inline on MR widgets. |
| Molecule junit.xml | artifacts.reports.junit | Rich test pane in the MR. |
| check-dev.log | artifacts.paths | Keep for two weeks so reviewers can compare MR diffs. |
| Apply logs | artifacts.paths, when: always | Post-mortem evidence even on failure. |

Do not cache .ansible/tmp/ or ~/.ssh/. Either is a credential-leak channel. Cache is visible across branches and shared runners.
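
One way to limit who can write to that shared cache is a pull-only policy on jobs that handle credentials, so they read the dependency cache but never upload anything back. The sketch below is a hypothetical hidden job (the name .cache-pull-only is an assumption); a job-level cache replaces the global one, which is why the key and paths are restated.

```yaml
# Hypothetical hidden job; extend it from jobs that should read the cache but never write it.
.cache-pull-only:
  cache:
    key:
      files:
        - ci/requirements.txt
        - requirements.yml
    paths:
      - .cache/pip/
      - .ansible/collections/
    policy: pull
```

A job such as check:dev could list this under extends: alongside its other settings; jobs on a shell runner that install nothing can simply set cache: [] instead.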

Verification

See also: Ansible Testing, GitLab CI/CD, GitLab Runner Setup, GitLab Secrets and OIDC.