YAML Pitfalls

The Norway problem, octal surprises, "1.10" becoming 1.1, multi-doc streams, anchors and merge keys, quoting rules, the tab ban, and every booleanish string YAML 1.1 thinks is a boolean.

Four rules that avoid 90% of YAML bugs
  • Quote any value that looks like a version, a country code, a phone number, an IP, or a date.
  • Never start a number with 0 unless you want octal. To keep the digits as written, quote it: '0123'.
  • Use two-space indents, never tabs, never mix.
  • If the parser is YAML 1.1 (and Ansible's is), yes, no, on, off are booleans. The 1.1 spec also lists y and n, though PyYAML leaves those two as strings. Quote them to keep them strings.

The Norway problem

A famous bug: a list of ISO country codes written as bare tokens silently turns one entry into a boolean.

countries:
  - GB
  - FR
  - DE
  - NO        # <-- becomes false in YAML 1.1
  - ES

NO (along with no, No) is a YAML 1.1 boolean meaning false. The list now contains one boolean among its strings. The symptom: your code that does country.startswith(...) crashes on Norway. Quote it:

countries: ['GB', 'FR', 'DE', 'NO', 'ES']

Or switch to a YAML 1.2 parser where only true/false are booleans. Ansible is stuck on 1.1 for backward compatibility.
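
Quoting fixes the file, but a cheap check after loading catches the next unquoted NO. A minimal sketch, assuming the list has already been loaded by a YAML 1.1 parser; non_string_items is a hypothetical helper, and the loaded list is written out by hand to mirror what such a parser returns for the unquoted example:

```python
def non_string_items(items):
    """Return (index, value) pairs for entries that are not strings."""
    return [(i, v) for i, v in enumerate(items) if not isinstance(v, str)]


# Hand-written stand-in for the loaded list: NO has already become False.
loaded = ["GB", "FR", "DE", False, "ES"]

bad = non_string_items(loaded)
if bad:
    print(f"non-string country codes: {bad}")  # non-string country codes: [(3, False)]
```

Failing fast at load time is easier to debug than an AttributeError deep inside the code that consumes the list.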

Number coercion: octal and version strings

Leading zeros are octal

file_mode: 0644       # NOT 644 — parsed as octal 0o644 = decimal 420, which some modules reject
file_mode: '0644'     # the string "0644" — exactly what file/copy expect
file_mode: 0o644      # explicit octal literal (YAML 1.2 only; a 1.1 parser such as PyYAML reads the string "0o644")

For Ansible's file/copy/template modules, always quote mode strings. The modules want a string like "0644", not an integer.
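
The arithmetic behind the two failure modes can be checked with the standard library alone. This is only an illustration of the number conversions, not of what any Ansible module does internally:

```python
import stat

# A YAML 1.1 parser reads the bare token 0644 as octal.
from_octal = int("0644", 8)
print(from_octal)                                    # 420
print(stat.filemode(stat.S_IFREG | from_octal))      # -rw-r--r--

# Dropping the zero (mode: 644) yields decimal 644, a different bit pattern.
from_decimal = 644
print(oct(from_decimal))                             # 0o1204
```

Neither integer is the string "0644", which is why quoting is the safe habit.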

backup_hour: 0930       # parser-dependent: YAML 1.2 gives decimal 930; PyYAML (1.1) gives the string '0930' because 9 is not an octal digit
backup_hour: '0930'     # correct: preserves the leading zero

Numeric tails get trimmed

version: 1.0         # float 1.0 -- prints as "1.0"
version: 1.10        # float 1.1 -- "1.10" is lost forever
version: "1.10"      # string "1.10" -- what you almost certainly wanted

The hidden classic: you bump your app's version from 1.9 to 1.10 and a deploy emits myapp-1.1.tar.gz. Version numbers are always strings. So are phone numbers, postal codes, account IDs, and anything with leading or trailing zeros.
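
A short sketch of why the float is unrecoverable and why string versions compare correctly once split into parts. version_key is a hypothetical helper that assumes purely numeric dotted versions:

```python
def version_key(version):
    """Split a dotted version string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


as_float = float("1.10")
print(as_float)                                   # 1.1
print(f"myapp-{as_float}.tar.gz")                 # myapp-1.1.tar.gz
print(float("1.10") > float("1.9"))               # False
print(version_key("1.10") > version_key("1.9"))   # True
```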

Big numbers

account_id: 0012345678901234   # where parsed as a number: loses leading zeros, and loses precision past 2^53 if stored as a double
account_id: '0012345678901234' # exact
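
Both losses are easy to reproduce in plain Python. The sketch below assumes a consumer that converts the ID to an integer and, in the second half, one that stores numbers as IEEE 754 doubles (as JSON tooling often does):

```python
raw = "0012345678901234"

as_int = int(raw)
print(as_int)                 # 12345678901234
print(str(as_int) == raw)     # False: the leading zeros are gone

# Doubles represent every integer exactly only up to 2**53.
big = 2**53 + 1
print(float(big) == float(2**53))   # True: two different IDs collapse into one
```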

Hex, sexagesimal, and scientific

perms: 0x1a3        # hex integer 419
time:  12:34:56     # YAML 1.1 sexagesimal: 45296 (!)  -- YAML 1.2 leaves it a string
scale: 1e3          # YAML 1.2: float 1000.0 -- PyYAML (1.1) leaves it the string "1e3"; 1.0e+3 is a float in both

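The sexagesimal trap is infamous: a MAC address whose groups all happen to be decimal digits below 60, such as mac: 52:54:00:12:34:56, is not a MAC address to a YAML 1.1 parser; it is one large base-60 integer. A MAC containing hex letters, such as 01:23:45:67:89:ab, stays a string, which makes the bug intermittent. Quote MACs, timestamps, IP-style dotted strings if there is any doubt.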
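
The base-60 conversion itself is simple. The sketch below reproduces the arithmetic; the regular expression is modeled on the base-60 branch of PyYAML's integer resolver, so treat the exact pattern as an assumption to verify against your parser. sexagesimal_to_int is a hypothetical helper:

```python
import re

# Assumed pattern: a leading digit 1-9, then one or more ":NN" groups with NN below 60.
SEXAGESIMAL_INT = re.compile(r"^[-+]?[1-9][0-9_]*(?::[0-5]?[0-9])+$")


def sexagesimal_to_int(text):
    """Convert a colon-separated base-60 token to an integer."""
    sign = -1 if text.startswith("-") else 1
    digits = text.lstrip("+-").replace("_", "")
    value = 0
    for part in digits.split(":"):
        value = value * 60 + int(part)
    return sign * value


print(sexagesimal_to_int("12:34:56"))                    # 45296
print(bool(SEXAGESIMAL_INT.match("52:54:00:12:34:56")))  # True: an all-digit MAC matches
```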

Multi-doc streams: --- and ...

A single YAML stream can carry multiple documents, separated by ---. Kubernetes relies on this extensively:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  ENV: prod
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: app
          image: myapp:1.0
...

--- opens a document; ... closes one (usually omitted — end of file is fine). Readers like kubectl apply -f loop and submit each document.

In Python:

import yaml
with open("deploy.yaml") as fh:
    for doc in yaml.safe_load_all(fh):   # note _all, not load()
        print(doc["kind"])

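Common mistake: calling yaml.safe_load(fh) on a multi-doc file. PyYAML raises ComposerError ("expected a single document in the stream"); loaders in some other languages return only the first document, silently.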

Anchors, aliases, and merge keys

YAML lets you name a node with an anchor (&name) and refer to it later with an alias (*name). The merge key (<<:) pulls all the keys of another mapping into the current one.

defaults: &defaults
  image: myapp:1.0
  replicas: 2
  env:
    LOG_LEVEL: info

prod:
  <<: *defaults            # pull in all keys from defaults
  replicas: 6              # override just this one

stage:
  <<: *defaults
  env:                     # WARNING: replaces, does not deep-merge
    LOG_LEVEL: debug

Two big caveats.
  1. <<: comes from the YAML 1.1 type repository and is not part of YAML 1.2. Some parsers still accept it (PyYAML does; ruamel.yaml does with a flag); others don't. CloudFormation and some newer tools reject it.
  2. Merge is shallow. The nested env dict in stage above does not merge with defaults.env — it replaces it.
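
If a deep merge is what you need, it has to happen after loading. A minimal sketch in Python, assuming the mappings are already plain dicts; deep_merge is a hypothetical helper, and the TZ key is added to the defaults purely to make the difference visible:

```python
import copy


def deep_merge(base, override):
    """Return a new dict: override wins, nested dicts are merged recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# TZ is a hypothetical extra key, not part of the example above.
defaults = {"image": "myapp:1.0", "replicas": 2, "env": {"LOG_LEVEL": "info", "TZ": "UTC"}}
stage_overrides = {"env": {"LOG_LEVEL": "debug"}}

shallow = {**defaults, **stage_overrides}   # what <<: effectively does
deep = deep_merge(defaults, stage_overrides)

print(shallow["env"])   # {'LOG_LEVEL': 'debug'}
print(deep["env"])      # {'LOG_LEVEL': 'debug', 'TZ': 'UTC'}
```

The shallow result loses TZ because the whole env mapping is replaced; the deep version keeps it.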

Reuse a value

common_timeout: &t 30

services:
  api:
    timeout: *t
  web:
    timeout: *t

This is fine in any YAML version; no merge involved.

Quoting: plain, single, double, folded, literal

Style          Example                        Behaviour
Plain          name: alice                    Parser infers type; boolean/number coercion applies; subset of characters allowed.
Single-quoted  name: 'al''ice'                No escapes except '' for a single quote. Everything else literal.
Double-quoted  name: "line1\nline2"           C-style escapes: \n, \t, \x41, \".
Folded (>)     multiline, newlines → spaces   Good for long prose. Blank lines preserved.
Literal (|)    multiline, newlines preserved  Good for scripts, configs, PEM blobs.

hostname: alice.example.com            # plain — fine
version:  "1.10"                       # double — keeps trailing zero
answer:   '*'                          # single -- a plain * would start an alias

notice: >
  This is a
  long line of
  prose folded to
  one line.

script: |
  #!/bin/bash
  set -euo pipefail
  echo hi

When plain breaks

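Plain scalars cannot start with any of ! & * [ ] { } , # | > ' " % @ ` nor with - ? : followed by a space. They also cannot contain a colon followed by a space or a space followed by #, and inside flow collections ([...] and {...}) they cannot contain [ ] { } or a comma. If your value hits any of these, quote it.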

password: P@ss: word         # colon-space inside triggers a parse error
password: "P@ss: word"       # fine
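
When values are generated by a script, a conservative pre-check can decide whether to emit quotes. The sketch below is a heuristic, not a full implementation of the YAML grammar: needs_quoting is a hypothetical helper, it only covers block-context syntax, and it says nothing about type coercion (a bare NO or 1.10 passes it):

```python
LEADING_INDICATORS = set("!&*[]{},#|>'\"%@`")


def needs_quoting(value):
    """Heuristic: True if value is unsafe as a plain block-context scalar."""
    if not value or value != value.strip():
        return True                      # empty, or leading/trailing whitespace
    if value[0] in LEADING_INDICATORS:
        return True                      # starts with an indicator character
    if value[0] in "-?:" and (len(value) == 1 or value[1] == " "):
        return True                      # "- x", "? x", ": x" look like structure
    return ": " in value or " #" in value or value.endswith(":")


for candidate in ["alice.example.com", "*", "@home", "P@ss: word", "note #1", "-1"]:
    print(f"{candidate!r:22} quote it: {needs_quoting(candidate)}")
```

Erring toward quoting is harmless: a quoted string is always a string.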

Indentation and the tab ban

# BAD (tabs — invisible in most viewers; will not parse)
services:
	app:
		image: foo

# BAD (mixed indent: api is 4 spaces, db is 2)
services:
    api:
      replicas: 1
  db:
    replicas: 1

# GOOD
services:
  api:
    replicas: 1
  db:
    replicas: 1

Find hidden tabs with: grep -P '\t' file.yaml or cat -A file.yaml | grep '\^I'. Teach your editor to highlight them.
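
Where grep -P is not available, the same check is a few lines of Python. A sketch that reports only tabs in leading whitespace; lines_with_tab_indent is a hypothetical helper, and it will also flag tab-indented content inside block scalars, so read its output as a hint:

```python
import re

# A tab anywhere in the leading whitespace, possibly after some spaces.
LEADING_TAB = re.compile(r"^[ ]*\t")


def lines_with_tab_indent(text):
    """Return 1-based line numbers whose indentation contains a tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if LEADING_TAB.match(line)
    ]


sample = "services:\n\tapp:\n\t\timage: foo\n  db:\n    replicas: 1\n"
print(lines_with_tab_indent(sample))   # [2, 3]
```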

Booleans: the 1.1 vs 1.2 split

YAML 1.1 recognises 22 boolean spellings. YAML 1.2 keeps only true and false (in their three capitalisations). Ansible, Kubernetes (depending on version), and most Python tooling with PyYAML default to 1.1.

Token                                  YAML 1.1         YAML 1.2
true, false, True, False, TRUE, FALSE  Boolean          Boolean
yes, Yes, YES                          Boolean (true)   String
no, No, NO                             Boolean (false)  String
on, On, ON                             Boolean (true)   String
off, Off, OFF                          Boolean (false)  String
y, Y, n, N                             Boolean          String

The last row follows the 1.1 spec; PyYAML does not resolve y, Y, n, N and loads them as strings.

The fix is quoting. Or, if you are writing a new schema, say "true/false only" in the docs and lint for it.

tls:
  enabled: yes      # YAML 1.1: boolean true. YAML 1.2: string "yes". Surprise.

tls:
  enabled: true     # unambiguous across versions
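
To find the tokens whose type changes between versions, the 1.1 boolean list can be turned into a small check. A sketch: the pattern transcribes the YAML 1.1 boolean spellings from the table above, and changes_between_versions is a hypothetical helper:

```python
import re

# All 22 spellings that YAML 1.1 treats as booleans.
YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)

# Spellings that are booleans in both 1.1 and 1.2.
STABLE = {"true", "True", "TRUE", "false", "False", "FALSE"}


def changes_between_versions(token):
    """True if a bare token is a boolean in YAML 1.1 but a string in 1.2."""
    return bool(YAML11_BOOL.match(token)) and token not in STABLE


for token in ["yes", "NO", "off", "true", "enabled"]:
    print(f"{token:8} changes type: {changes_between_versions(token)}")
```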

Type tags (!!str, !!binary, %TAG)

When you want to force a type, prepend a tag:

pi: !!str 3.14159       # string, not float
pi: !!float "3.14159"   # explicit float
b:  !!binary |
  R0lGODlhEAAQAMQfAJnG...

zero: !!int 0
empty_map: !!map {}

%TAG directives let you define short prefixes for custom tag URIs. Mostly you don't need them; they show up in libraries that emit explicit !python/object tags. Be careful with yaml.load() given an unsafe loader (yaml.Loader or yaml.UnsafeLoader) vs yaml.safe_load(): the former can instantiate arbitrary Python objects from tags and is a remote-code-execution hazard when loading untrusted YAML.

Bad / good reference table

Bad                                   Good                                                          Why
country: NO                           country: 'NO'                                                 NO is a YAML 1.1 boolean.
version: 1.10                         version: "1.10"                                               Float truncation: 1.10 → 1.1.
mode: 644                             mode: '0644'                                                  Ansible file mode wants a string.
port: 0080                            port: 80                                                      Leading zero is parser-dependent: 1.1 tries octal (8 is not an octal digit, so PyYAML returns the string "0080"); 1.2 returns 80.
mac: 52:54:00:12:34:56                mac: "52:54:00:12:34:56"                                      YAML 1.1 sexagesimal: all-digit groups parse as one base-60 integer.
tls: yes                              tls: true                                                     Depends on parser version.
phone: 07123456789                    phone: "07123456789"                                          Read as an integer where the parser allows it; the leading zero is dropped.
date: 2026-04-23                      date: "2026-04-23"                                            1.1 parsers return a date object; the 1.2 core schema returns a string. If you want a string, quote.
mixed: [GB, FR, no, ES]               mixed: ['GB', 'FR', 'no', 'ES']                               Third item becomes false.
Tabs for indentation                  Two spaces, consistent                                        Tabs forbidden at indent positions.
- key:value                           - key: value                                                  Missing space after colon: parse error or string.
name: hello#note                      name: hello # note                                            A comment needs whitespace before #; without it the # stays in the value.
<<: *defaults (deep merge expected)   Merge at the leaf level explicitly, or use a templating step  Merge is shallow; not part of YAML 1.2.
Single yaml.safe_load on --- streams  yaml.safe_load_all                                            PyYAML raises ComposerError; some other loaders silently return only the first doc.
Password with :, *, or # plain        Double-quoted, or stored elsewhere                            Parse errors or silent truncation.

Tooling: yamllint catches most of these, and configuring truthy: {allowed-values: ['true', 'false']} will flag every yes, on, and similar token. Run it in CI. Most shops get bitten once; the ones bitten twice are not running yamllint.

See also: YAML Basics, Jinja2 Advanced, Ansible Variables.