YAML Pitfalls
- Quote any value that looks like a version, a country code, a phone number, an IP, or a date.
- Never start a number with 0 unless you want octal. Quote it, or write '0123'.
- Use two-space indents, never tabs, never mix.
- If the parser is YAML 1.1 (and Ansible's is), yes, no, on, off, y, n are booleans. Quote them to keep them strings.
The Norway problem
A famous bug: a list of ISO country codes written as bare tokens silently turns one entry into a boolean.
countries:
- GB
- FR
- DE
- NO # <-- becomes false in YAML 1.1
- ES
NO (along with no, No) is a YAML 1.1 boolean meaning false. The list now contains one boolean among its strings. The symptom: your code that does country.startswith(...) crashes on Norway. Quote it:
countries: ['GB', 'FR', 'DE', 'NO', 'ES']
Or switch to a YAML 1.2 parser where only true/false are booleans. Ansible is stuck on 1.1 for backward compatibility.
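A minimal sketch of the bug, assuming PyYAML (a YAML 1.1 parser) is installed. The country list is the one above; `load_countries` is a hypothetical helper name used only for illustration.

```python
import yaml  # PyYAML, which follows the YAML 1.1 boolean rules

BARE = "countries: [GB, FR, DE, NO, ES]"
QUOTED = "countries: ['GB', 'FR', 'DE', 'NO', 'ES']"


def load_countries(text):
    """Return the parsed country list from a small YAML snippet."""
    return yaml.safe_load(text)["countries"]


bare = load_countries(BARE)
quoted = load_countries(QUOTED)

# Collect every entry that did not survive as a string.
non_strings = [item for item in bare if not isinstance(item, str)]

print(bare)         # the NO entry comes back as a boolean
print(non_strings)
print(quoted)       # all five entries are strings
```

A check like `non_strings` is cheap to run on any list that is supposed to hold only codes or names.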
Number coercion: octal and version strings
Leading zeros are octal
file_mode: 0644 # NOT 644 — parsed as octal 0o644 = decimal 420, which some modules reject
file_mode: '0644' # the string "0644" — exactly what file/copy expect
file_mode: 0o644 # explicit octal literal in YAML 1.2; a YAML 1.1 parser reads it as a string
For Ansible's file/copy/template modules, always quote mode strings. The modules want a string like "0644", not an integer.
backup_hour: 0930 # not valid octal (contains a 9): decimal 930 in YAML 1.2, parser-dependent in 1.1
backup_hour: '0930' # correct: preserves the leading zero
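The arithmetic behind the octal reading can be checked with plain Python. This is only a sketch of the number conversion, not of any parser; `yaml11_octal` is a hypothetical name.

```python
def yaml11_octal(token):
    """Interpret a leading-zero token as base 8, the way 0644 is read in YAML 1.1."""
    return int(token, 8)


mode_as_int = yaml11_octal("0644")
print(mode_as_int)                  # 420
print(oct(mode_as_int))             # 0o644
print(format(mode_as_int, "04o"))   # 0644

# A token containing 8 or 9 is not valid octal, so the same rule cannot apply.
try:
    yaml11_octal("0930")
    valid_octal = True
except ValueError:
    valid_octal = False
print(valid_octal)                  # False
```

Seeing 420 where you expected 644 is the usual sign that a mode went through the parser unquoted.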
Numeric tails get trimmed
version: 1.0 # float 1.0 -- prints as "1.0"
version: 1.10 # float 1.1 -- "1.10" is lost forever
version: "1.10" # string "1.10" -- what you almost certainly wanted
The hidden classic: you bump your app's version from 1.9 to 1.10 and a deploy emits myapp-1.1.tar.gz. Version numbers are always strings. So are phone numbers, postal codes, account IDs, and anything with leading or trailing zeros.
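The float round trip is easy to reproduce without a YAML parser. The sketch below uses only the standard library; `version_key` is a hypothetical helper that compares versions as tuples of integers, assuming purely numeric dotted versions.

```python
def version_key(version):
    """Split a dotted version string into a tuple of integers for comparison."""
    return tuple(int(part) for part in version.split("."))


as_float = float("1.10")             # what an unquoted 1.10 turns into
print(as_float)                      # 1.1
print(str(as_float) == "1.10")       # False: the trailing zero is gone

# Compared as floats, 1.10 sorts below 1.9.
float_order_wrong = float("1.10") < float("1.9")
# Compared as parsed strings, 1.10 is the later release.
tuple_order_right = version_key("1.10") > version_key("1.9")
print(float_order_wrong, tuple_order_right)   # True True
```

Keeping the version a string until you explicitly parse it avoids both the lost zero and the wrong ordering.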
Big numbers
account_id: 0012345678901234 # parsed, loses leading zeros and may lose precision
account_id: '0012345678901234' # exact
Hex, sexagesimal, and scientific
perms: 0x1a3 # hex integer 419
time: 12:34:56 # YAML 1.1 sexagesimal: 45296 (!) -- YAML 1.2 leaves it a string
scale: 1e3 # float 1000.0 in YAML 1.2; PyYAML only treats forms like 1.0e+3 as floats and reads 1e3 as a string
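The sexagesimal trap is infamous: to a YAML 1.1 parser, an all-digit value such as mac: 52:54:00:12:34:56 is not a MAC address, it's a base-60 number. A MAC that contains hex letters or a group above 59 stays a string, which makes the bug intermittent. Quote MACs, timestamps, and IP-style dotted strings if there is any doubt.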
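To see which tokens are at risk, here is a standard-library sketch of the base-60 integer rule. The regular expression follows the integer pattern in the YAML 1.1 type definitions as far as I know; real parsers may differ in detail, and `sexagesimal_value` is a hypothetical name.

```python
import re

# Base-60 integer pattern: a digit run starting with 1-9,
# then one or more ":NN" groups where NN is 0-59.
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")


def sexagesimal_value(token):
    """Return the base-60 integer for token, or None if it does not match."""
    if not SEXAGESIMAL_INT.match(token):
        return None
    sign = -1 if token.startswith("-") else 1
    digits = token.lstrip("+-").replace("_", "")
    total = 0
    for group in digits.split(":"):
        total = total * 60 + int(group)
    return sign * total


print(sexagesimal_value("12:34:56"))           # 45296
print(sexagesimal_value("1:30"))               # 90
print(sexagesimal_value("52:54:00:12:34:56"))  # an integer, not a MAC
print(sexagesimal_value("de:ad:be:ef:00:01"))  # None: letters never match
```

Any value for which this returns a number is worth quoting.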
Multi-doc streams: --- and ...
A single YAML stream can carry multiple documents, separated by ---. Kubernetes relies on this extensively:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  ENV: prod
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: app
          image: myapp:1.0
...
--- opens a document; ... closes one (usually omitted — end of file is fine). Readers like kubectl apply -f loop and submit each document.
In Python:
import yaml

with open("deploy.yaml") as fh:
    for doc in yaml.safe_load_all(fh):  # note _all, not load()
        print(doc["kind"])
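Common mistake: calling yaml.safe_load(fh) on a multi-doc file. PyYAML raises ComposerError ("expected a single document in the stream"); some parsers in other languages may instead return only the first document, silently.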
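A self-contained sketch of both calls on an in-memory stream, assuming PyYAML. The two-document `STREAM` is made up for illustration, and the behaviour recorded for `safe_load` is what PyYAML is expected to do; other libraries may behave differently.

```python
import yaml  # PyYAML

STREAM = """\
---
kind: ConfigMap
---
kind: Deployment
"""

# safe_load_all yields one Python object per document in the stream.
kinds = [doc["kind"] for doc in yaml.safe_load_all(STREAM)]
print(kinds)

# safe_load expects exactly one document; record what it does with two.
try:
    yaml.safe_load(STREAM)
    single_load_error = None
except yaml.YAMLError as exc:
    single_load_error = type(exc).__name__
print(single_load_error)
```

If a file might ever grow a second document, looping over `safe_load_all` from the start costs nothing.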
Anchors, aliases, and merge keys
YAML lets you name a node with an anchor (&name) and refer to it later with an alias (*name). The merge key (<<:) pulls all the keys of another mapping into the current one.
defaults: &defaults
  image: myapp:1.0
  replicas: 2
  env:
    LOG_LEVEL: info
prod:
  <<: *defaults # pull in all keys from defaults
  replicas: 6 # override just this one
stage:
  <<: *defaults
  env: # WARNING: replaces, does not deep-merge
    LOG_LEVEL: debug
- <<: is a YAML 1.1 feature that was removed in YAML 1.2. Some parsers still accept it (PyYAML does; ruamel.yaml does with a flag); others don't. CloudFormation and some newer tools reject it.
- Merge is shallow. The nested env dict in stage above does not merge with defaults.env; it replaces it.
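If you do need a deep merge, it has to happen after parsing. A minimal sketch in plain Python, assuming the data has already been loaded into dicts; `deep_merge` is a hypothetical helper, and the REGION key is invented here to make the difference visible.

```python
import copy


def deep_merge(base, override):
    """Return a new dict: override's keys win, nested dicts merge recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


defaults = {
    "image": "myapp:1.0",
    "replicas": 2,
    "env": {"LOG_LEVEL": "info", "REGION": "eu"},  # REGION is illustrative
}
stage_overrides = {"env": {"LOG_LEVEL": "debug"}}

shallow = {**defaults, **stage_overrides}   # same effect as the merge key
deep = deep_merge(defaults, stage_overrides)

print(shallow["env"])   # {'LOG_LEVEL': 'debug'}
print(deep["env"])      # {'LOG_LEVEL': 'debug', 'REGION': 'eu'}
```

Lists are replaced rather than concatenated in this sketch; whether that is right depends on your schema.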
Reuse a value
common_timeout: &t 30
services:
  api:
    timeout: *t
  web:
    timeout: *t
This is fine in any YAML version; no merge involved.
Quoting: plain, single, double, folded, literal
| Style | Example | Behaviour |
|---|---|---|
| Plain | name: alice | Parser infers type; boolean/number coercion applies; subset of characters allowed. |
| Single-quoted | name: 'al''ice' | No escapes except '' for a single quote. Everything else literal. |
| Double-quoted | name: "line1\nline2" | C-style escapes: \n, \t, \x41, \". |
| Folded (>) | multiline, newlines → spaces | Good for long prose. Blank lines preserved. |
| Literal (\|) | multiline, newlines preserved | Good for scripts, configs, PEM blobs. |
hostname: alice.example.com # plain — fine
version: "1.10" # double — keeps trailing zero
answer: '*' # single: a plain * would start an alias
notice: >
  This is a
  long line of
  prose folded to
  one line.
script: |
  #!/bin/bash
  set -euo pipefail
  echo hi
When plain breaks
Plain scalars cannot start with any of ! & * [ ] { } , # | > ' " % @ ` and cannot contain a colon followed by a space or a # preceded by a space. Inside [ ] or { } they also cannot contain , [ ] { }. If your value does any of these, quote it.
password: P@ss: word # colon-space inside triggers a parse error
password: "P@ss: word" # fine
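When generating YAML by hand or from templates, a rough pre-check can flag values that are unsafe as plain scalars. The function below is a heuristic sketch, not a full implementation of the YAML grammar. It looks only at syntax, so type coercion (NO, 1.10) still needs separate handling, and `needs_quoting` is a hypothetical name.

```python
INDICATORS = set("!&*[]{},#|>'\"%@`")


def needs_quoting(value):
    """Heuristic: True if value is risky to write as a plain YAML scalar."""
    if value == "" or value != value.strip():
        return True    # empty, or leading/trailing whitespace would be lost
    if value[0] in INDICATORS:
        return True    # starts with an indicator character
    if value[0] in "-?:" and (len(value) == 1 or value[1] == " "):
        return True    # "- ", "? " and ": " open structures
    if ": " in value or " #" in value or value.endswith(":"):
        return True    # would split into key/value or start a comment
    return False


for candidate in ["P@ss: word", "P@ss:word", "*", "hello #note", "alice.example.com"]:
    print(repr(candidate), needs_quoting(candidate))
```

In practice it is simpler to let a YAML library emit the value, since its dumper applies the real rules.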
Indentation and the tab ban
- Spaces only. YAML forbids tabs at any position where indentation is meaningful. Most editors with "insert tab as spaces" solve it; editors without do not.
- Consistent indent width. Two spaces is the universal convention. Pick two; stick to it.
- Siblings align. Keys at the same level must start at the same column. Nested structures increase by exactly one indent.
- Block lists: the - is part of the indentation. "- name: foo" is fine; "-name: foo" (no space) is not a list item, it parses as a key named -name; mixing - column positions between siblings is a parse error in strict parsers.
# BAD (tabs — invisible in most viewers; will not parse)
services:
	app:
		image: foo
# BAD (mixed indent: api is 4 spaces, db is 2)
services:
    api:
      replicas: 1
  db:
    replicas: 1
# GOOD
services:
  api:
    replicas: 1
  db:
    replicas: 1
To find stray tabs, run grep -P '\t' file.yaml or cat -A file.yaml | grep '\^I'. Teach your editor to highlight them.
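The same check is easy to script for CI when grep -P is not available. A standard-library sketch; `tab_indented_lines` is a hypothetical name, and it reports only tabs in leading whitespace, where YAML forbids them.

```python
def tab_indented_lines(text):
    """Return 1-based numbers of lines whose leading whitespace contains a tab."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(number)
    return hits


sample = (
    "services:\n"
    "\tapp:\n"
    "\t\timage: foo\n"
    "  db:\n"
    "    image: bar # a\ttab inside a comment is ignored\n"
)
print(tab_indented_lines(sample))   # [2, 3]
```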
Booleans: the 1.1 vs 1.2 split
YAML 1.1 recognises 22 boolean spellings. YAML 1.2 cut the list to true and false (in three capitalisations each). Ansible, Kubernetes (depending on version), and most Python tooling with PyYAML default to 1.1.
| Token | YAML 1.1 | YAML 1.2 |
|---|---|---|
| true, false, True, False, TRUE, FALSE | Boolean | Boolean |
| yes, Yes, YES | Boolean (true) | String |
| no, No, NO | Boolean (false) | String |
| on, On, ON | Boolean (true) | String |
| off, Off, OFF | Boolean (false) | String |
| y, Y, n, N | Boolean per the 1.1 spec (PyYAML keeps them as strings) | String |
The fix is quoting. Or, if you are writing a new schema, say "true/false only" in the docs and lint for it.
tls:
  enabled: yes # YAML 1.1: boolean true. YAML 1.2: string "yes". Surprise.
tls:
  enabled: true # unambiguous across versions
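The table can be turned into a small lookup that tells you which tokens change meaning between versions. This is a standard-library sketch built from the spec lists, not from any particular parser (individual parsers may implement a subset); `resolve_plain` and `version_sensitive` are hypothetical names.

```python
YAML11_TRUE = {"y", "Y", "yes", "Yes", "YES", "true", "True", "TRUE", "on", "On", "ON"}
YAML11_FALSE = {"n", "N", "no", "No", "NO", "false", "False", "FALSE", "off", "Off", "OFF"}
YAML12_TRUE = {"true", "True", "TRUE"}
YAML12_FALSE = {"false", "False", "FALSE"}


def resolve_plain(token, version="1.1"):
    """Return True/False if token is a boolean in that YAML version, else the token."""
    if version == "1.1":
        true_set, false_set = YAML11_TRUE, YAML11_FALSE
    else:
        true_set, false_set = YAML12_TRUE, YAML12_FALSE
    if token in true_set:
        return True
    if token in false_set:
        return False
    return token


def version_sensitive(tokens):
    """List tokens whose meaning differs between YAML 1.1 and 1.2."""
    return [t for t in tokens if resolve_plain(t, "1.1") != resolve_plain(t, "1.2")]


print(version_sensitive(["yes", "true", "NO", "off", "GB"]))   # ['yes', 'NO', 'off']
```

Anything this flags should be either quoted or rewritten as true/false.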
Type tags (!!str, !!binary, %TAG)
When you want to force a type, prepend a tag:
pi: !!str 3.14159 # string, not float
pi: !!float "3.14159" # explicit float
b: !!binary |
  R0lGODlhEAAQAMQfAJnG...
zero: !!int 0
empty_map: !!map {}
%TAG directives let you define short prefixes for custom tag URIs. Mostly you don't need them; they show up in libraries that emit explicit !python/object tags. Be careful with yaml.load() and an unsafe loader (yaml.Loader or yaml.UnsafeLoader) vs yaml.safe_load(): the former can instantiate arbitrary Python objects from tags and is a remote-code-execution hazard when loading untrusted YAML.
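A short sketch of both sides, assuming PyYAML: a standard tag forcing a type, and safe_load refusing a Python-specific tag. The HOSTILE document is made up for illustration and deliberately names a harmless function.

```python
import yaml  # PyYAML

# A standard tag forces the scalar's type.
forced = yaml.safe_load("pi: !!str 3.14159")
print(forced)   # {'pi': '3.14159'}

# A Python-specific tag asks the loader to call a function while parsing.
HOSTILE = "!!python/object/apply:os.getcwd []"

try:
    yaml.safe_load(HOSTILE)
    rejected = False
    error_name = None
except yaml.YAMLError as exc:
    rejected = True
    error_name = type(exc).__name__
print(rejected, error_name)
```

The same document passed to an unsafe loader would run the call, which is why untrusted input should only ever go through safe_load or safe_load_all.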
Bad / good reference table
| Bad | Good | Why |
|---|---|---|
| country: NO | country: 'NO' | NO is a YAML 1.1 boolean. |
| version: 1.10 | version: "1.10" | Float truncation: 1.10 → 1.1. |
| mode: 644 | mode: '0644' | Ansible file mode wants a string. |
| port: 0080 | port: 80 | Leading zero makes the type parser-dependent; 0100 would be read as octal 64. |
| mac: 52:54:00:12:34:56 | mac: "52:54:00:12:34:56" | YAML 1.1 sexagesimal: an all-digit MAC becomes a number. |
| tls: yes | tls: true | Depends on parser version. |
| phone: 07123456789 | phone: "07123456789" | Leading zero drop. |
| date: 2026-04-23 | date: "2026-04-23" | 1.1 parses it as a date object; the 1.2 core schema leaves it a string. If you want a string, quote. |
| mixed: [GB, FR, no, ES] | mixed: ['GB', 'FR', 'no', 'ES'] | Third item becomes false. |
| Tabs for indentation | Two spaces, consistent | Tabs forbidden at indent positions. |
| - key:value | - key: value | Missing space after colon: parse error or string. |
| name: hello#note | name: hello # note | A comment needs a space before #; without it, # is part of the value. |
| <<: *defaults (deep merge expected) | Merge at the leaf level explicitly, or use a templating step | Merge is shallow; removed in YAML 1.2. |
| Single yaml.safe_load on --- streams | yaml.safe_load_all | safe_load handles one document: PyYAML raises ComposerError, other parsers may return only the first doc. |
| Password with :, *, or # plain | Double-quoted, or stored elsewhere | Parse errors or silent truncation. |
yamllint catches most of these, and configuring truthy: {allowed-values: ['true', 'false']} will flag every yes, on, and similar token. Run it in CI. Most shops get bitten once; the ones that get bitten twice do not have yamllint.
See also: YAML Basics, Jinja2 Advanced, Ansible Variables.