Ansible Inventory Patterns

How to shape Ansible inventory so plays target the right hosts, vars live in the right file, and "dev vs prod" is just a different directory. INI, YAML, dynamic plugins, constructed inventories, and validation.

Inventory rules of thumb
  • One inventory directory per environment. Environment is a path, not a group name.
  • Group hosts by role (web, db), not by environment (prod-web).
  • group_vars/<group>/ as a directory of topic files; never one giant YAML.
  • Prefer YAML inventory for new projects. INI is fine for ultra-simple cases.
  • Dynamic inventory plugins > custom scripts > scraping a CMDB by hand.
  • Always run ansible-inventory --graph after a change. It's the smoke test.

INI vs YAML side-by-side

Ansible supports several inventory formats. The two that matter are INI and YAML.

INI

; inventories/prod/hosts.ini
[web]
web01.example.com ansible_host=10.0.1.11
web02.example.com ansible_host=10.0.1.12

[db]
db01.example.com

[edge]
edge[01:04].example.com   # range expansion

[eu:children]
web
db

[all:vars]
ansible_user=ansible

YAML (same inventory)

# inventories/prod/hosts.yml
all:
  vars:
    ansible_user: ansible
  children:
    web:
      hosts:
        web01.example.com:
          ansible_host: 10.0.1.11
        web02.example.com:
          ansible_host: 10.0.1.12
    db:
      hosts:
        db01.example.com: {}
    edge:
      hosts:
        "edge[01:04].example.com": {}
    eu:
      children:
        web: {}
        db: {}

Dimension                  | INI                                                | YAML
Readability on small files | Wins                                               | Verbose
Readability on big files   | Loses: inline hostvars get unreadable              | Wins
Nested hostvars            | Limited (Python literals inline; strings in :vars) | Native (dicts, lists)
Group-of-groups            | [parent:children] section                          | children: key
Range expansion            | web[01:10]                                         | Same syntax in a quoted key
Comments                   | ; or # (inline: # only)                            | #
Tool support / diffs       | Line-based, diffs well                             | Diffs well; YAML tools can reformat and make diffs noisy

Recommendation: INI for <30 hosts of obvious shape; YAML once you start wanting structured vars inline. Either way, keep vars out of hosts.* where possible and put them in group_vars/ and host_vars/.
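
Range expansion behaves the same in both formats. As a rough illustration (not Ansible's own implementation), the sketch below expands numeric ranges the way the edge[01:04] entry above is read: both ends are inclusive and the zero padding of the start value is kept. The function name expand_range is hypothetical, and alphabetic ranges such as [a:f] are deliberately left out.

```python
import re

# Matches one numeric range such as [01:04] or [1:10:2] (start:end[:step]).
_RANGE = re.compile(r"\[(\d+):(\d+)(?::(\d+))?\]")


def expand_range(pattern: str) -> list[str]:
    """Expand numeric [start:end] ranges in a host pattern (illustration only)."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]
    start, end, step = match.group(1), match.group(2), match.group(3)
    # Keep zero padding only when the start value is written with a leading zero.
    width = len(start) if start.startswith("0") else 0
    hosts = []
    for n in range(int(start), int(end) + 1, int(step or 1)):
        name = pattern[: match.start()] + str(n).zfill(width) + pattern[match.end():]
        # Recurse so that patterns containing several ranges expand fully.
        hosts.extend(expand_range(name))
    return hosts


print(expand_range("edge[01:04].example.com"))
```

For the real expansion, ansible-inventory --graph remains the source of truth.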

group_vars and host_vars as directories

Ansible loads group_vars/<group>.yml or group_vars/<group>/*.yml. Same for host_vars. Use the directory form: it splits variables by topic and keeps vaulted secrets in their own files, as in this layout:

inventories/prod/
├── hosts.yml
├── group_vars/
│   ├── all/
│   │   ├── main.yml          # site-wide defaults (timezone, SMTP relay)
│   │   └── vault.yml         # vaulted: API tokens that apply everywhere
│   ├── web/
│   │   ├── main.yml          # web defaults
│   │   ├── tls.yml           # cert paths, ciphers
│   │   ├── performance.yml   # workers, keepalive
│   │   └── vault.yml         # vaulted: per-env web secrets
│   └── db/
│       ├── main.yml
│       └── vault.yml
└── host_vars/
    ├── web01.example.com.yml
    └── db01.example.com/
        ├── main.yml
        └── vault.yml
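
To make the directory form concrete, here is a minimal sketch of how the topic files of one group directory end up as a single set of variables. It assumes the files are read in sorted name order and that a later file replaces a top-level key from an earlier one (no deep merge, which is the default behaviour). The function name and all variable names are hypothetical, and plain dicts stand in for parsed YAML.

```python
def merge_group_vars(files: dict[str, dict]) -> dict:
    """Merge the per-topic files of one group_vars/<group>/ directory.

    `files` maps a file name to its already-parsed content. Files are
    applied in sorted name order; later files win on key collisions.
    """
    merged: dict = {}
    for name in sorted(files):
        merged.update(files[name])
    return merged


# Hypothetical contents of inventories/prod/group_vars/web/
web_dir = {
    "main.yml": {"web_port": 443, "web_user": "www-data"},
    "tls.yml": {"tls_cert_path": "/etc/ssl/web.pem"},
    "performance.yml": {"web_workers": 8},
    "vault.yml": {"vault_web_api_token": "<encrypted>"},
}

web_vars = merge_group_vars(web_dir)
print(sorted(web_vars))
```

Because collisions are resolved by file name, it is safest to keep each variable in exactly one topic file.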

Var precedence (short version)

Ansible's documented rule is long; the practical summary:

  1. roles/<role>/defaults/main.yml (lowest; the role's documented interface)
  2. inventory group vars (inventory file first, then group_vars/all, then group_vars/<group>)
  3. inventory host vars (inventory file first, then host_vars/<host>)
  4. play vars / vars_files
  5. roles/<role>/vars/main.yml (high; beats inventory and play vars)
  6. include_vars, then set_fact and registered vars at runtime
  7. --extra-vars on the command line (highest; use only for hotfixes)

See Ansible Variables for the full ladder.
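
The idea of "highest layer that defines the variable wins" can be sketched with a ChainMap. This models only five of the layers (extra vars, play vars, host vars, group vars, role defaults) and is an illustration of the lookup, not of Ansible's internals. All variable names and values are hypothetical.

```python
from collections import ChainMap

# Layers listed from highest to lowest precedence; ChainMap returns the
# first layer that defines a key.
extra_vars = {"app_version": "1.4.2-hotfix"}
play_vars = {"app_port": 8443}
host_vars = {"app_workers": 16}
group_vars = {"app_workers": 4, "app_port": 8080}
role_defaults = {"app_workers": 2, "app_port": 8000, "app_version": "1.4.1"}

resolved = ChainMap(extra_vars, play_vars, host_vars, group_vars, role_defaults)

for name in ("app_workers", "app_port", "app_version"):
    print(name, "=", resolved[name])
```

Here app_workers comes from host vars, app_port from play vars, and app_version from the command line; the role defaults only show through where nothing above them sets a value.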

Nested groups, children, all

Every host is implicitly in all. Define other groupings by composition.

all:
  children:
    eu:
      children:
        eu_web: {}
        eu_db:  {}
    us:
      children:
        us_web: {}
        us_db:  {}
    web:
      children:
        eu_web: {}
        us_web: {}
    db:
      children:
        eu_db: {}
        us_db: {}
    eu_web:
      hosts:
        web-eu-01.example.com: {}
    us_web:
      hosts:
        web-us-01.example.com: {}
    eu_db:
      hosts:
        db-eu-01.example.com: {}
    us_db:
      hosts:
        db-us-01.example.com: {}

Now web means "all web hosts in any region", eu means "all hosts in EU", eu_web is the intersection, and a play can target any of them. Vars can live at any level — group_vars/eu/ for region-wide, group_vars/web/ for role-wide.

Patterns and selectors

The positional argument to ansible, the hosts: line of a play, and the value of --limit are all patterns. Patterns are a small DSL:

Pattern           | Meaning
all               | Every host.
web               | All hosts in the web group (transitively via children).
web:db            | Union of web and db.
web:&eu           | Intersection: hosts in both web and eu.
web:!canary       | All of web except the canary group.
web:&eu:!db       | Web-in-EU minus any DB overlap.
web[0:2]          | First three hosts of web (0-indexed slice).
*.example.com     | Glob match on hostnames.
~web[0-9]+        | Regex match (prefix with ~).
web01.example.com | Single host.

ansible -i inventories/prod 'web:&eu:!canary' -m ping
ansible-playbook -i inventories/prod --limit 'db:&us' site.yml

Shell quoting. Several of these characters (&, !, *) are special to your shell. Always quote the pattern: 'web:&eu:!canary'.
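
The set operators can be read as plain set arithmetic. The sketch below evaluates only the group-name subset of the DSL (union, &intersection, !exclusion) against hypothetical group contents. It assumes unions are applied first, then intersections, then exclusions, and that a pattern with no plain term starts from all. Globs, regexes, and slices are not handled.

```python
def resolve(pattern: str, groups: dict[str, set[str]]) -> set[str]:
    """Evaluate a colon-separated pattern against group -> hosts sets."""
    union, intersect, exclude = [], [], []
    for term in pattern.split(":"):
        if term.startswith("&"):
            intersect.append(term[1:])
        elif term.startswith("!"):
            exclude.append(term[1:])
        elif term:
            union.append(term)
    if not union:
        union = ["all"]  # only &/! terms given: start from every host

    hosts: set[str] = set()
    for name in union:
        hosts |= groups.get(name, set())
    for name in intersect:
        hosts &= groups.get(name, set())
    for name in exclude:
        hosts -= groups.get(name, set())
    return hosts


# Hypothetical group contents for illustration.
GROUP_HOSTS = {
    "web": {"web-eu-01", "web-eu-02", "web-us-01"},
    "db": {"db-eu-01"},
    "eu": {"web-eu-01", "web-eu-02", "db-eu-01"},
    "canary": {"web-eu-02"},
}
GROUP_HOSTS["all"] = set().union(*GROUP_HOSTS.values())

print(sorted(resolve("web:&eu:!canary", GROUP_HOSTS)))
```

Use --list-hosts against the real inventory to confirm what a pattern actually selects.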

Inventory-per-environment layout

Give each environment its own directory. The code is identical across environments; the only thing that differs is the inventory directory you point at.

repo/
├── ansible.cfg
├── requirements.yml
├── site.yml
├── roles/
│   └── ...
└── inventories/
    ├── dev/
    │   ├── hosts.yml
    │   ├── group_vars/
    │   └── host_vars/
    ├── stage/
    │   ├── hosts.yml
    │   ├── group_vars/
    │   └── host_vars/
    └── prod/
        ├── hosts.yml
        ├── group_vars/
        └── host_vars/

Each run: ansible-playbook -i inventories/prod site.yml. A CI pipeline does dev → stage → prod by changing the -i argument, not the branch.

Do not do "branch-per-environment". That model requires cherry-picks between branches, so drift is inevitable. Keep the same code and vary only the inventory: the differences between environments then live in the inventory tree and are reviewed on every MR.

Dynamic inventory plugins

For anything cloud-native, the hostlist is not yours to hand-maintain — it's the API of the cloud provider. Ansible's inventory plugins query that API at run-time.

Plugin file convention: inventories/<env>/<name>.aws_ec2.yml (or .proxmox.yml, .gcp_compute.yml). Ansible picks the file up if its name ends in a suffix the plugin accepts.

AWS EC2

# inventories/prod/aws.aws_ec2.yml
---
plugin: amazon.aws.aws_ec2
regions:
  - eu-west-1
  - us-east-1

filters:
  tag:Environment: prod
  instance-state-name: running

keyed_groups:
  - key: tags.Role
    prefix: role
  - key: placement.region
    prefix: region
  - key: instance_type
    prefix: itype

hostnames:
  - tag:Name
  - private-dns-name

compose:
  ansible_host: private_ip_address

Run it with: ansible-inventory -i inventories/prod/aws.aws_ec2.yml --graph. It needs the amazon.aws collection, the boto3 and botocore Python libraries, and AWS credentials in the environment.
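
As a rough mental model of keyed_groups, each rule looks up a key in the host's variables and builds a group name from the prefix and the value. The sketch below assumes an underscore separator and replaces characters outside letters, digits, and underscore with an underscore. That is a simplification: the exact names depend on the plugin's separator and sanitization settings, so check them with --graph. The function name and the sample host are hypothetical.

```python
import re


def keyed_group_names(hostvars: dict, keyed_groups: list[dict]) -> list[str]:
    """Derive group names for one host from keyed_groups-style rules."""
    names = []
    for rule in keyed_groups:
        value = hostvars
        for part in rule["key"].split("."):  # follow dotted keys like tags.Role
            value = value.get(part) if isinstance(value, dict) else None
        if value is None:
            continue  # host lacks this attribute: the rule yields no group
        raw = f"{rule['prefix']}_{value}"
        names.append(re.sub(r"[^A-Za-z0-9_]", "_", raw))
    return names


RULES = [
    {"key": "tags.Role", "prefix": "role"},
    {"key": "placement.region", "prefix": "region"},
    {"key": "instance_type", "prefix": "itype"},
]

# Hypothetical variables for one EC2 instance.
host = {
    "tags": {"Role": "web"},
    "placement": {"region": "eu-west-1"},
    "instance_type": "t3.medium",
}

print(keyed_group_names(host, RULES))
```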

Proxmox

# inventories/homelab/proxmox.yml
---
plugin: community.general.proxmox
url: https://pve.example.lan:8006
user: ansible@pve
token_id: ansible
token_secret: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  3433646637...

validate_certs: false
want_facts: true
group_prefix: pve_

keyed_groups:
  - key: proxmox_tags_parsed
    prefix: tag
  - key: proxmox_status
    prefix: status

compose:
  ansible_host: proxmox_ipconfig0.ip | default(proxmox_net0.ip, true) | regex_replace('/.*$', '')

Google Cloud

# inventories/prod/gcp.gcp_compute.yml
---
plugin: google.cloud.gcp_compute
projects:
  - my-prod-project
auth_kind: serviceaccount
service_account_file: /etc/ansible/gcp.json
zones:
  - europe-west1-b
  - us-central1-a
filters:
  - status = RUNNING
keyed_groups:
  - key: labels.role
    prefix: role
  - key: zone
    prefix: zone
hostnames:
  - name
compose:
  ansible_host: networkInterfaces[0].networkIP

Mix static and dynamic. When -i points at a directory such as inventories/prod/, Ansible loads every inventory source in it. You can have hosts.yml (a handful of static bastions) and aws.aws_ec2.yml (the fleet) side-by-side; patterns match across the union.

Custom dynamic inventory scripts

When no plugin exists for your source (internal CMDB, a spreadsheet, a weird API), write an executable script. Ansible calls it with --list (return everything) or --host <name> (return that host's vars).

The contract

--list must print JSON shaped like this:

{
  "web": {
    "hosts": ["web01.example.com", "web02.example.com"],
    "vars":  { "app_env": "prod" }
  },
  "db": {
    "hosts": ["db01.example.com"]
  },
  "_meta": {
    "hostvars": {
      "web01.example.com": { "ansible_host": "10.0.1.11", "role": "frontend" },
      "web02.example.com": { "ansible_host": "10.0.1.12", "role": "frontend" },
      "db01.example.com":  { "ansible_host": "10.0.2.11", "role": "primary"  }
    }
  }
}

If _meta.hostvars is provided, Ansible will not call --host per host. Use it — it saves a per-host round trip and scales to thousands of hosts.

--host <name> returns {} or that host's vars dict; only needed if you didn't populate _meta.
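
Before wiring a script into Ansible, it can help to check its --list output against the shape described above. The following is a minimal sketch of such a check, assuming the object-per-group form shown here; the function name check_list_output and the messages are hypothetical, and it is not a replacement for running ansible-inventory against the script.

```python
import json

ALLOWED_GROUP_KEYS = {"hosts", "vars", "children"}


def check_list_output(text: str) -> list[str]:
    """Return a list of problems found in a script's --list output."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]

    problems = []
    meta = doc.get("_meta")
    hostvars = meta.get("hostvars") if isinstance(meta, dict) else None
    if not isinstance(hostvars, dict):
        problems.append("_meta.hostvars missing: Ansible falls back to --host per host")

    for group, body in doc.items():
        if group == "_meta":
            continue
        if not isinstance(body, dict):
            problems.append(f"group {group!r} is not an object")
            continue
        for key in sorted(set(body) - ALLOWED_GROUP_KEYS):
            problems.append(f"group {group!r} has unexpected key {key!r}")
        hosts = body.get("hosts", [])
        if not isinstance(hosts, list) or not all(isinstance(h, str) for h in hosts):
            problems.append(f"group {group!r}: hosts must be a list of strings")
    return problems


sample = json.dumps({
    "web": {"hosts": ["web01.example.com"], "vars": {"app_env": "prod"}},
    "_meta": {"hostvars": {"web01.example.com": {"ansible_host": "10.0.1.11"}}},
})
print(check_list_output(sample))
```

An empty list means the output matches the expected shape.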

Minimal Python example

#!/usr/bin/env python3
"""cmdb_inventory.py — tiny dynamic inventory from a JSON CMDB dump."""
import argparse
import json
import sys
from pathlib import Path

CMDB = Path("/var/lib/cmdb/hosts.json")


def build():
    raw = json.loads(CMDB.read_text())
    inv = {"_meta": {"hostvars": {}}}
    for host in raw:
        fqdn = host["fqdn"]
        role = host.get("role", "unclassified")
        env  = host.get("env",  "unknown")
        for group in (role, env, f"{role}_{env}"):
            inv.setdefault(group, {"hosts": []})["hosts"].append(fqdn)
        inv["_meta"]["hostvars"][fqdn] = {
            "ansible_host": host["mgmt_ip"],
            "role": role,
            "env":  env,
        }
    return inv


def main():
    p = argparse.ArgumentParser()
    g = p.add_mutually_exclusive_group(required=True)
    g.add_argument("--list", action="store_true")
    g.add_argument("--host")
    args = p.parse_args()

    inv = build()
    if args.list:
        json.dump(inv, sys.stdout)
    else:
        json.dump(inv["_meta"]["hostvars"].get(args.host, {}), sys.stdout)


if __name__ == "__main__":
    main()

Make it executable (chmod +x cmdb_inventory.py) and point Ansible at the file: -i inventories/prod/cmdb_inventory.py.

Prefer plugins over scripts. Scripts are a legacy interface. If the source is anything common, there is probably a plugin (community.general, the cloud collections). Scripts bypass Ansible's caching, error reporting, and schema validation.

Constructed / composed inventories

The constructed plugin takes an existing inventory and layers computed groups on top. Use it when your dynamic source has the right hosts but not the right groups.

# inventories/prod/01-aws.aws_ec2.yml   (dynamic source)
plugin: amazon.aws.aws_ec2
regions: [eu-west-1]
filters: { tag:Environment: prod }

# inventories/prod/02-constructed.yml   (layered on top)
plugin: ansible.builtin.constructed

groups:
  canary: instance_type == 't3.medium' and 'canary' in tags.get('Role', '')
  eu_web_prod: tags.get('Role') == 'web' and tags.get('Environment') == 'prod'

keyed_groups:
  - key: tags.Team
    prefix: team

compose:
  ansible_host: private_ip_address
  app_version: tags.Version | default('unknown')

Load order matters. Ansible sorts inventory files alphabetically; prefix them (01-, 02-) so the dynamic source loads first and constructed sees the hosts it's supposed to layer on.
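
Conceptually, each entry under groups: is a condition evaluated against every host's variables, and hosts for which it is true join the group. The sketch below models the two conditions above with Python predicates standing in for the Jinja2 expressions; the host names and variables are hypothetical, and real evaluation is done by the constructed plugin, not by code like this.

```python
# Python predicates standing in for the Jinja2 expressions under groups:.
RULES = {
    "canary": lambda v: v.get("instance_type") == "t3.medium"
    and "canary" in v.get("tags", {}).get("Role", ""),
    "eu_web_prod": lambda v: v.get("tags", {}).get("Role") == "web"
    and v.get("tags", {}).get("Environment") == "prod",
}


def constructed_groups(hostvars_by_host: dict, rules: dict = RULES) -> dict[str, list[str]]:
    """Place each host in every group whose condition holds for its variables."""
    groups: dict[str, list[str]] = {name: [] for name in rules}
    for host, variables in sorted(hostvars_by_host.items()):
        for name, condition in rules.items():
            if condition(variables):
                groups[name].append(host)
    return groups


# Hypothetical hosts as the dynamic source might describe them.
HOSTS = {
    "web-a": {"instance_type": "t3.large", "tags": {"Role": "web", "Environment": "prod"}},
    "web-b": {"instance_type": "t3.medium", "tags": {"Role": "web-canary", "Environment": "prod"}},
}

print(constructed_groups(HOSTS))
```

Note that web-b lands in canary but not in eu_web_prod, because the second condition compares the Role tag for equality rather than substring.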

Lab vs prod

Labs differ from prod in several ways that belong in inventory, not in playbooks:

Axis          | Lab                                  | Prod
Hosts         | One or two per role; maybe colocated | Many; spread across AZs
Credentials   | Cheap, scoped                        | Rotated, in Vault
become_method | Often plain sudo                     | Maybe su, Kerberos-backed sudo
TLS           | Self-signed / letsencrypt-staging    | Real certs; pinned issuer
Retention     | 7 days                               | 90 days
Alerting      | Silenced                             | Paged

Every one of those is a variable. Put it in inventories/lab/group_vars/all/ vs inventories/prod/group_vars/all/. A playbook that says "use production defaults unless overridden" is wrong — the defaults should be safe (lab-like) and prod explicitly opts into the risky bits.

Validation with ansible-inventory

ansible-inventory is your friend. It reads the same inventory Ansible does, applies all plugins, and prints the result.

# Human-readable tree
ansible-inventory -i inventories/prod --graph

# Just one group
ansible-inventory -i inventories/prod --graph web

# Full hostvars for one host
ansible-inventory -i inventories/prod --host web01.example.com

# Machine-readable
ansible-inventory -i inventories/prod --list | jq '.web.hosts'

# Limit-resolution sanity check
ansible -i inventories/prod 'web:&eu:!canary' --list-hosts

CI snippet: guard against inventory drift

#!/usr/bin/env bash
set -euo pipefail

for env in dev stage prod; do
  echo "== $env =="
  ansible-inventory -i "inventories/$env" --graph > "/tmp/graph-$env.txt"
  diff -u "tests/expected-graph-$env.txt" "/tmp/graph-$env.txt"
done

If someone's inventory change moves a host into prod that shouldn't be there, the diff fails in CI. Pair with the checks in Ansible Testing.

Quick greps

# Is web01 actually in the 'eu' group in prod?
ansible-inventory -i inventories/prod --graph eu | grep web01

# List all groups a host is in
ansible -i inventories/prod web01.example.com -m ansible.builtin.debug -a 'var=group_names'

# How many hosts match a pattern?
ansible -i inventories/prod 'web:&eu' --list-hosts | tail -n +2 | wc -l
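
When grep gets fragile, the JSON from ansible-inventory --list can be inspected directly. The sketch below answers "which groups is this host in?" from an already-loaded --list document, assuming the usual shape where each group carries hosts and/or children. The function name groups_of and the sample document are hypothetical.

```python
def groups_of(host: str, inv: dict) -> list[str]:
    """Return every group that contains `host`, directly or via children."""
    groups = {
        name: body
        for name, body in inv.items()
        if name != "_meta" and isinstance(body, dict)
    }
    # Start with groups that list the host directly, then climb to parents.
    found = {name for name, body in groups.items() if host in body.get("hosts", [])}
    frontier = set(found)
    while frontier:
        parents = {
            name
            for name, body in groups.items()
            if frontier & set(body.get("children", []))
        }
        frontier = parents - found
        found |= frontier
    return sorted(found)


# Hypothetical --list output for a small inventory.
SAMPLE = {
    "_meta": {"hostvars": {}},
    "all": {"children": ["ungrouped", "eu", "web", "db"]},
    "eu": {"children": ["web", "db"]},
    "web": {"hosts": ["web01.example.com", "web02.example.com"]},
    "db": {"hosts": ["db01.example.com"]},
}

print(groups_of("web01.example.com", SAMPLE))
```

In practice the document would come from json.load on the output of ansible-inventory -i inventories/prod --list rather than from a literal.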

Related reading: Ansible Variables, Project Structure, Best Practices, Ansible Testing.