Learn Ansible — hands-on tutorial

Eight progressive labs that build a real, multi-role Ansible project from scratch. Not another reference — every lab is "do this, see this, fix this."

Looking for the quickest possible taste? Read Quickstart (For Dummies) first — 10 minutes to a green playbook. Come here when you're ready to build something real.

How to do the labs
  • Spin up two Linux hosts. Anything works: two cloud VMs, two LXC containers, two VMs in Vagrant/libvirt. They need SSH and python3.
  • On each target host: a user with passwordless sudo and your SSH public key in ~/.ssh/authorized_keys.
  • On your workstation: python3 -m pip install --user ansible ansible-lint yamllint 'molecule[ansible]' mitogen (quote the extras so shells such as zsh do not treat the brackets as a glob; mitogen is optional, used once in Lab 6).
  • Work in a git repository from Lab 1. Every lab is a commit. You will want to diff.
  • After each lab there is a What could go wrong? box listing the most common real-world failures. Read it even if nothing went wrong for you.

Lab 1 — One-file playbook

Goal: install nginx on one target host and serve a single page from it.

Project layout (this is all you need):

learn-ansible/
├── ansible.cfg
├── inventory.ini
└── site.yml
# ansible.cfg
[defaults]
inventory = ./inventory.ini
host_key_checking = False
stdout_callback = yaml
forks = 10
# inventory.ini
web1 ansible_host=10.0.0.11 ansible_user=ansible
# site.yml
- name: Serve a single page from web1
  hosts: web1
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present

    - name: Drop our index page
      ansible.builtin.copy:
        dest: /usr/share/nginx/html/index.html
        content: "Hello from {{ inventory_hostname }}\n"
        mode: '0644'

    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

Run it and verify:

ansible-playbook site.yml
curl http://10.0.0.11
# Hello from web1

Commit it: git add . && git commit -m "lab 1: single-host nginx".

What could go wrong?
  • Failed to connect to the host via ssh — your inventory user is wrong, the key is not installed on the target, or ansible_host is unreachable. Test with plain ssh ansible@10.0.0.11 before blaming Ansible.
  • sudo: a password is required — become needs passwordless sudo, or you need to pass --ask-become-pass.
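  • /usr/bin/python3: not found on minimal or very old targets — add ansible_python_interpreter=/usr/libexec/platform-python (RHEL 8) or install python3 out-of-band.
  • curl returns the stock "Welcome to nginx" page instead of your text. Debian and Ubuntu packages serve /var/www/html by default, not /usr/share/nginx/html. Point dest at the docroot your distribution uses.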
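
The failures above can be checked in one pass before running site.yml. The following is a minimal sketch, assuming a hypothetical file name preflight.yml and a hypothetical register name preflight_whoami; neither is part of the lab project.

```yaml
# preflight.yml (hypothetical helper, not part of the lab layout)
- name: Check SSH, Python and sudo on every inventory host
  hosts: all
  gather_facts: false   # skip fact gathering so each failure points at one cause
  tasks:
    - name: Verify SSH login and a usable Python interpreter
      ansible.builtin.ping:

    - name: Verify that become works without a password prompt
      ansible.builtin.command: whoami
      become: true
      register: preflight_whoami
      changed_when: false   # read-only command, never a change
```

Run it with ansible-playbook preflight.yml. A host that fails the first task has an SSH or interpreter problem; a host that fails only the second has a sudo problem.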

Lab 2 — Inventory with groups and group_vars

Add a second host, group them, and pull values out of the playbook into per-group variables.

learn-ansible/
├── ansible.cfg
├── inventory/
│   ├── hosts.ini
│   └── group_vars/
│       ├── all.yml
│       └── web.yml
└── site.yml
# ansible.cfg  (updated)
[defaults]
inventory = ./inventory
host_key_checking = False
stdout_callback = yaml
forks = 10
# inventory/hosts.ini
[web]
web1 ansible_host=10.0.0.11
web2 ansible_host=10.0.0.12

[web:vars]
ansible_user=ansible
# inventory/group_vars/all.yml
site_operator: "ops@example.com"

# inventory/group_vars/web.yml
http_port: 80
index_body: |
  Hello from {{ inventory_hostname }} (operator: {{ site_operator }})
# site.yml  (now uses vars)
- name: Serve a page from the web group
  hosts: web
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
    - name: Drop our index page
      ansible.builtin.copy:
        dest: /usr/share/nginx/html/index.html
        content: "{{ index_body }}"
        mode: '0644'
    - name: Ensure nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true

ansible-inventory --graph
# @all:
#   |--@ungrouped:
#   |--@web:
#   |  |--web1
#   |  |--web2
ansible-playbook site.yml
curl http://10.0.0.11 http://10.0.0.12

Rule of thumb for group_vars. Anything a human should be able to change without editing a role goes here. Anything internal to the role goes in roles/<name>/defaults/main.yml (next lab).

What could go wrong?
  • Putting vars in the [web:vars] section of inventory/hosts.ini instead of group_vars/web.yml is fine for ONE thing (ansible_user above), but it becomes a mess as the list grows. Keep the ini file for hostnames and YAML for vars.
  • You named the file group_vars/web/main.yml with no web.yml. Ansible supports a directory per group too; both styles are fine but pick one.
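
As an illustration of where a value should live, the sketch below overrides index_body for a single host. The file inventory/host_vars/web2.yml is hypothetical (the lab does not create it); it relies on the standard rule that host_vars take precedence over group_vars.

```yaml
# inventory/host_vars/web2.yml (hypothetical, not part of the lab)
# Same variable name as in group_vars/web.yml; the host-level value wins for web2 only.
index_body: |
  Hello from {{ inventory_hostname }} (canary, operator: {{ site_operator }})
```

ansible-inventory --host web2 prints the variables that web2 ends up with, which is a quick way to confirm the override was picked up.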

See also: Project Structure, Variable Precedence.

Lab 3 — Refactor into a role

Time to split things up. Roles are the unit of reuse.

learn-ansible/
├── ansible.cfg
├── inventory/ ...
├── roles/
│   └── web/
│       ├── defaults/main.yml
│       ├── handlers/main.yml
│       ├── meta/main.yml
│       ├── tasks/main.yml
│       ├── templates/index.html.j2
│       └── vars/main.yml
└── site.yml
# roles/web/defaults/main.yml
# Safe-to-override defaults. These are the role's public interface.
web_http_port: 80
web_index_body: "Hello from {{ inventory_hostname }}"
web_server_tokens: "off"
# roles/web/vars/main.yml
# Internal, high-precedence vars the user should not override.
_web_package_name: "nginx"
_web_service_name: "nginx"
_web_docroot: "/usr/share/nginx/html"
# roles/web/tasks/main.yml
- name: Install nginx
  ansible.builtin.package:
    name: "{{ _web_package_name }}"
    state: present

- name: Render the index page
  ansible.builtin.template:
    src: index.html.j2
    dest: "{{ _web_docroot }}/index.html"
    mode: '0644'
  notify: reload nginx

- name: Ensure nginx is running
  ansible.builtin.service:
    name: "{{ _web_service_name }}"
    state: started
    enabled: true
# roles/web/handlers/main.yml
- name: reload nginx
  ansible.builtin.service:
    name: "{{ _web_service_name }}"
    state: reloaded
# roles/web/templates/index.html.j2
<h1>{{ web_index_body }}</h1>
<p>Operator: {{ site_operator | default('ops@example.com') }}</p>
<p>Port: {{ web_http_port }}</p>
# roles/web/meta/main.yml
galaxy_info:
  author: you
  description: Minimal nginx with a single index page.
  license: MIT
  min_ansible_version: "2.14"
dependencies: []
# site.yml
- name: Configure the web servers
  hosts: web
  become: true
  roles:
    - web
ansible-playbook site.yml --check --diff
ansible-playbook site.yml

Naming: prefix role-private vars with _ (_web_package_name) so a reader instantly knows "do not override this from outside". Public vars: web_*. This convention is covered in Best Practices & Refactoring.
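
One way to make the public interface explicit is a role argument spec, which ansible-core validates when the role starts (available since ansible-core 2.11, so within the min_ansible_version declared above). This is an optional sketch, not something the lab requires; the descriptions are illustrative wording.

```yaml
# roles/web/meta/argument_specs.yml (optional addition)
argument_specs:
  main:
    short_description: Minimal nginx with a single index page.
    options:
      web_http_port:
        type: int
        description: Port printed on the index page.
      web_index_body:
        type: str
        description: Text rendered inside the h1 element of the index page.
      web_server_tokens:
        type: str
        choices: ["on", "off"]
        description: Reserved for the nginx server_tokens directive (unused by the tasks shown here).
```

With this in place, a value such as web_http_port: "eighty" fails validation at the start of the role rather than surfacing later in a template. The private _web_* vars are deliberately left out of the spec.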

Lab 4 — Handlers and templates

Goal: change the page body, observe the handler firing, and learn the flush_handlers escape hatch.

Edit roles/web/defaults/main.yml:

web_index_body: "Hello, version 2"

ansible-playbook site.yml --diff
# TASK [web : Render the index page] ********
# --- /usr/share/nginx/html/index.html
# +++ /usr/share/nginx/html/index.html
# - <h1>Hello from web1</h1>
# + <h1>Hello, version 2</h1>
# RUNNING HANDLER [web : reload nginx] ********

Now add a second task that must run after the reload but within the same play. Use meta: flush_handlers:

# roles/web/tasks/main.yml (add at end)
- name: Flush handlers so the reload happens before the smoke test
  ansible.builtin.meta: flush_handlers

- name: Smoke test the page
  ansible.builtin.uri:
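    # Use ansible_host: the request runs on your workstation, where the alias web1 may not resolve
    url: "http://{{ ansible_host | default(inventory_hostname) }}:{{ web_http_port }}/"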
    return_content: true
  register: smoke
  delegate_to: localhost
  become: false

- name: Show what the server returned
  ansible.builtin.debug:
    msg: "{{ smoke.content | regex_search('<h1>(.+)</h1>', '\\1') }}"

What could go wrong?
  • Two handlers with the same name do not both run: the one defined last shadows the others. Give them distinct names and use listen: to group them under one topic.
  • A handler on a dead service ("service not found") errors — make sure the install task ran before the notify.
  • Multi-role plays: if a task in role B needs role A's handler to have run already, put meta: flush_handlers between them. force_handlers: true at play level does not change ordering; it only makes already-notified handlers run even if a later task fails.
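
The listen: keyword mentioned above lets several handlers subscribe to one topic. A minimal sketch, assuming a hypothetical topic name "web config changed" and an added validation handler that the lab does not include:

```yaml
# roles/web/handlers/main.yml (sketch with a hypothetical extra handler)
- name: Validate nginx configuration
  ansible.builtin.command: nginx -t
  changed_when: false
  listen: "web config changed"

- name: reload nginx
  ansible.builtin.service:
    name: "{{ _web_service_name }}"
    state: reloaded
  listen: "web config changed"
```

A task would then use notify: "web config changed". Handlers run in the order they are defined in this file, so the validation runs before the reload, and a failed validation stops the reload on that host. Tasks that still notify reload nginx by name keep working.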

See also: Handlers & Templates, Jinja2.

Lab 5 — Secrets with ansible-vault

Add a secret (a fake API key) and make the role write it into a file.

mkdir -p inventory/group_vars/web
ansible-vault create inventory/group_vars/web/vault.yml
# editor opens; paste:
# ---
# web_api_key: "supersekret-abc123"

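Move the existing inventory/group_vars/web.yml to inventory/group_vars/web/main.yml (git mv) so the group's vars live in one directory, then reference the vaulted value from that plain-text file: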

# inventory/group_vars/web/main.yml
http_port: 80
# Point at the vaulted value via a simple-name indirection:
api_key: "{{ web_api_key }}"
# roles/web/tasks/main.yml (add)
- name: Install the API key for nginx
  ansible.builtin.copy:
    dest: /etc/nginx/api.key
    content: "{{ api_key }}"
    owner: root
    group: root
    mode: '0600'
  no_log: true
ansible-playbook site.yml --ask-vault-pass
# or
echo 'mypass' > ~/.vault-pass && chmod 600 ~/.vault-pass
ansible-playbook site.yml --vault-password-file ~/.vault-pass

Split per environment. Real projects keep one vault.yml per environment, for example inventories/<env>/group_vars/web/vault.yml (the layout used in Lab 8), with a different password per env. Prod vault password lives somewhere a CI runner can read; dev vault password can live on disk.
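
A mistyped or missing variable in vault.yml otherwise surfaces only when the copy task templates api_key. A small guard placed before that task can fail earlier with a clearer message. This is a sketch; the fail_msg wording is illustrative.

```yaml
# roles/web/tasks/main.yml (optional guard, placed before the copy task)
- name: Check that the vaulted API key was loaded
  ansible.builtin.assert:
    that:
      - web_api_key is defined
      - web_api_key | length > 0
    fail_msg: "web_api_key is empty or undefined; check inventory/group_vars/web/vault.yml"
    quiet: true
```

The assertion reports only whether each condition held, not the value itself, so the secret stays out of the output.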

Lab 6 — Idempotency lab

We will deliberately write a non-idempotent task and then fix it. Add this to roles/web/tasks/main.yml:

- name: (BAD) append a line to nginx.conf every run
  ansible.builtin.shell: echo "# run marker" >> /etc/nginx/nginx.conf
ansible-playbook site.yml   # changed
ansible-playbook site.yml   # STILL CHANGED — this is the bug
ssh web1 "grep -c '# run marker' /etc/nginx/nginx.conf"   # grows each run

Three idiomatic fixes. Pick the right one for the situation:

# Fix 1: use the real module. 90% of the time this is the answer.
- name: Ensure marker is present exactly once
  ansible.builtin.lineinfile:
    path: /etc/nginx/nginx.conf
    line: "# run marker"
    state: present

# Fix 2: if you genuinely must shell out, make idempotency explicit
- name: Ensure marker is present exactly once (shell)
  ansible.builtin.shell: |
    grep -qxF "# run marker" /etc/nginx/nginx.conf || {
      echo "# run marker" >> /etc/nginx/nginx.conf
      echo "marker added"
    }
  register: marker
  # Only the append branch prints "marker added", so that is the change signal
  changed_when: "'marker added' in marker.stdout"
  # ^ this gets fiddly — that's exactly why fix 1 exists

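# Fix 3: manage a dedicated file instead of appending to a shared one
# (for raw commands, the creates: / removes: arguments give a similar guard)
- name: Write marker file once
  ansible.builtin.copy:
    dest: /etc/nginx/conf.d/marker
    content: "# run marker\n"
    mode: '0644'
  # copy compares checksums, so it reports changed only when the content differs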

Remove the BAD task, then run ansible-playbook site.yml twice. The second run (and every run after it) should report changed=0 for every host. That is the definition of idempotent and the goal for every task you write.

What could go wrong?
  • Using shell/command where a real module exists is the #1 source of non-idempotence. ansible-lint (next lab) catches this.
  • register + changed_when: false "fixes" the lint but leaves the actual bug. Fix the task, not the report.
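  • Running --check --diff right after a real run is a fast way to spot non-idempotent module tasks: an idempotent task reports ok with no diff. command and shell tasks are skipped in check mode, so they still need a real second run.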
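
When no module exists for a one-off command, the creates: argument of ansible.builtin.command gives a file-based guard: the command is skipped once the named file exists. A sketch, assuming a hypothetical need for a Diffie-Hellman parameter file; the path and key size are illustrative and not part of the lab.

```yaml
# Hypothetical example of a guarded one-off command
- name: Generate DH parameters once
  ansible.builtin.command:
    cmd: openssl dhparam -out /etc/nginx/dhparam.pem 2048
    creates: /etc/nginx/dhparam.pem   # skip the command when this file already exists
```

The first run reports changed; later runs report ok because the file is already there. Unlike lineinfile or copy, this guards only on existence, not on content.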

See also: Best Practices — Idempotency.

Lab 7 — CI with ansible-lint and check-mode

Add a .gitlab-ci.yml that runs yamllint → ansible-lint → syntax-check on every MR and offers check-mode against a dev environment as a manual job.

# .gitlab-ci.yml
stages: [lint, check, dryrun]

image: python:3.12-slim

before_script:
  - pip install --quiet ansible==9.* ansible-lint==24.* yamllint==1.*

yamllint:
  stage: lint
  script:
    - yamllint .

ansible-lint:
  stage: lint
  script:
    - ansible-lint

syntax-check:
  stage: check
  script:
    - ansible-playbook site.yml --syntax-check -i inventory/hosts.ini

check-mode:
  stage: dryrun
  when: manual
  rules:
    - if: $CI_MERGE_REQUEST_IID
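  script:
    # python:3.12-slim ships without an SSH client; ssh-keyscan and Ansible's ssh connection both need one
    - apt-get update -qq && apt-get install -y -qq --no-install-recommends openssh-client
    - mkdir -p ~/.ssh && echo "$DEV_SSH_KEY" > ~/.ssh/id_ed25519 && chmod 600 ~/.ssh/id_ed25519
    - ssh-keyscan 10.0.0.11 10.0.0.12 >> ~/.ssh/known_hosts
    - echo "$VAULT_PASS" > vp && chmod 600 vp
    - ansible-playbook site.yml --check --diff --vault-password-file vp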
# .yamllint (repo root)
extends: default
rules:
  line-length: {max: 160}
  truthy: {allowed-values: ['true', 'false']}
# .ansible-lint (repo root)
exclude_paths:
  - .git/
  - .venv/
skip_list:
  - yaml[line-length]

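Add $DEV_SSH_KEY and $VAULT_PASS in GitLab → Settings → CI/CD → Variables and mask them. GitLab masks only single-line values, so a multi-line private key cannot be masked as-is; a common workaround is to store it base64-encoded and decode it in the job. Now every MR gets linted and syntax-checked automatically, and anyone can click "check-mode" to dry-run against dev.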
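
One caveat, hedged because it depends on project settings: in GitLab, jobs without rules: normally run in branch pipelines, while the check-mode job above is limited to merge request pipelines, so the two sets of jobs can end up in different pipelines. A common way to put everything into one merge request pipeline is a top-level workflow: block. This is a sketch using GitLab's predefined variables, not part of the original lab file.

```yaml
# .gitlab-ci.yml (optional addition at the top level)
workflow:
  rules:
    # Run one pipeline per merge request...
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
    # ...and one for commits that land on the default branch.
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```

With this block, the lint and syntax-check jobs run inside the merge request pipeline alongside the manual check-mode job.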

See also: CI for Ansible, GitLab CI/CD Pipelines, Ansible Testing.

Lab 8 — Capstone: multi-role stack

Build a tiny three-tier app: nginx reverse-proxying to gunicorn, backed by postgresql. Tags let us apply just one layer at a time. Environments (dev, prod) share roles but differ in inventory/vars.

learn-ansible/
├── ansible.cfg
├── site.yml
├── inventories/
│   ├── dev/
│   │   ├── hosts.ini
│   │   └── group_vars/...
│   └── prod/
│       ├── hosts.ini
│       └── group_vars/...
├── roles/
│   ├── base/           (timezone, ntp, sshd hardening)
│   ├── db/             (postgresql)
│   ├── app/            (gunicorn + systemd unit)
│   └── proxy/          (nginx as the edge)
└── .gitlab-ci.yml
# site.yml
- name: Apply the base role everywhere
  hosts: all
  become: true
  roles:
    - role: base
      tags: ['base']

- name: Configure the database tier
  hosts: db
  become: true
  roles:
    - role: db
      tags: ['db']

- name: Configure the application tier
  hosts: app
  become: true
  roles:
    - role: app
      tags: ['app']

- name: Configure the edge proxy
  hosts: proxy
  become: true
  roles:
    - role: proxy
      tags: ['proxy', 'edge']

# Apply everything to dev
ansible-playbook -i inventories/dev site.yml

# Only roll out the app layer
ansible-playbook -i inventories/dev site.yml --tags app

# Only roll out the edge
ansible-playbook -i inventories/prod site.yml --tags edge

# Single host (e.g. prod 2 of 5)
ansible-playbook -i inventories/prod site.yml --limit proxy2

Gate prod with serial: for rolling changes:

# one play from site.yml, prod-specific
- name: Roll out the app layer in batches
  hosts: app
  become: true
  serial: "25%"            # update 25% of app hosts at a time
  max_fail_percentage: 10  # abort the play if more than 10% of a batch fails
  roles:
    - role: app
      tags: ['app']

Promote dev → prod in one MR. The diff should be purely inventory/vars, not playbooks/roles. When a change to roles/app/ changes dev behaviour, you know prod is next in line. Keep inventories thin.
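
serial: only limits how many hosts change at once; it does not check that a batch is healthy before the next one starts. One way to add that gate is a post_tasks check that must pass on each batch. This is a sketch: port 8000 and the /healthz path are hypothetical placeholders for whatever the app role actually exposes, and app_health is an illustrative register name.

```yaml
# Variant of the serial play with a per-batch health gate (sketch)
- name: Roll out the app layer in batches with a health gate
  hosts: app
  become: true
  serial: "25%"
  max_fail_percentage: 10
  roles:
    - role: app
      tags: ['app']
  post_tasks:
    - name: Wait for the app to answer before the next batch starts
      ansible.builtin.uri:
        url: "http://{{ ansible_host | default(inventory_hostname) }}:8000/healthz"  # hypothetical port and path
        status_code: 200
      register: app_health
      until: app_health is succeeded
      retries: 10
      delay: 3
      delegate_to: localhost
      become: false
      tags: ['app']
```

A host that never becomes healthy fails its batch, and max_fail_percentage then decides whether the rollout continues.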

Where to go next