Terraform Basics

Providers, resources, data sources, variables, outputs, state, modules, drift, and the lifecycle block. A short guide to when Terraform is the right tool — and when it very much isn't.

If you only remember six things
  • State is the source of truth. If you lose it, Terraform forgets every resource it manages — plan accordingly (remote backend, versioning, lock).
  • Pin your providers in required_providers and your Terraform version in required_version. Floating versions have broken production before and will again.
  • terraform plan is read-only. Run it in CI on every MR, even without apply. Drift shows up there.
  • A module is just a directory. Don't over-engineer — inline resources until a second caller needs them.
  • lifecycle { prevent_destroy = true } is the cheapest way to avoid a rm-rf-prod moment on RDS / S3 buckets.
  • Terraform is for long-lived infrastructure. App deploys, queue messages, short-lived resources: use the right tool instead.

Anatomy of a config

A Terraform project is one or more .tf files in a directory. Canonical split:

project/
├── versions.tf        # terraform {} block: required_version, required_providers
├── providers.tf       # provider "aws" {} etc.
├── variables.tf       # variable "foo" {}
├── main.tf            # resources
├── outputs.tf         # output "bar" {}
└── terraform.tfvars   # values for variables (gitignored if sensitive)

The filenames are convention — Terraform concatenates every .tf in the directory before parsing. Order within a file is irrelevant; Terraform builds a dependency graph from references.

Providers

# versions.tf
terraform {
  required_version = "~> 1.9"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.40"
    }
  }
}

# providers.tf
provider "aws" {
  region = "eu-west-1"
  # Auth comes from the environment: AWS_PROFILE or AWS_ACCESS_KEY_ID/…
  # Never put static keys in code.
}

provider "aws" {
  alias  = "useast1"
  region = "us-east-1"     # Second provider instance (e.g. for ACM + CloudFront)
}
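
A resource uses the default (unaliased) provider unless it names another one with the provider meta-argument. A minimal sketch of the ACM + CloudFront case mentioned in the comment above, assuming a hypothetical domain:

```hcl
# CloudFront only accepts ACM certificates issued in us-east-1,
# so this resource is routed through the aliased provider instance.
resource "aws_acm_certificate" "cdn" {
  provider          = aws.useast1
  domain_name       = "cdn.example.com" # hypothetical domain
  validation_method = "DNS"
}
```

Everything without a provider argument keeps using the eu-west-1 instance.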

Resources and data sources

A resource is something Terraform creates and manages. A data source is something Terraform reads but does not own.

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.small"
  tags = {
    Name      = "web-01"
    ManagedBy = "terraform"
  }
}

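Reference syntax:

  • Resource attribute: aws_instance.web.id (<type>.<name>.<attribute>)
  • Data source attribute: data.aws_ami.ubuntu.id (data.<type>.<name>.<attribute>)
  • Input variable: var.region
  • Child module output: module.vpc.vpc_id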

Always tag (or label) everything with ManagedBy = "terraform". The first time someone changes an untagged resource in the console, it silently diverges from state, and the next terraform apply has to spend minutes undoing the change.
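
With the AWS provider, one way to avoid repeating the tag on every resource is the provider-level default_tags block, which merges its tags into most taggable resources that provider instance creates. A sketch extending the provider block shown earlier; the Project tag is a hypothetical addition:

```hcl
provider "aws" {
  region = "eu-west-1"

  default_tags {
    tags = {
      ManagedBy = "terraform"
      Project   = "platform" # hypothetical extra tag
    }
  }
}
```

A tag set on an individual resource with the same key overrides the default for that resource.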

Variables and precedence

variable "region" {
  type        = string
  description = "AWS region for primary resources."
  default     = "eu-west-1"
}

variable "instance_count" {
  type    = number
  default = 1

  validation {
    condition     = var.instance_count >= 1 && var.instance_count <= 10
    error_message = "instance_count must be 1..10."
  }
}

variable "db_password" {
  type      = string
  sensitive = true     # redacted in plan/apply output; outputs that use it must also set sensitive = true
}

Variable precedence, lowest to highest (higher wins):

  1. default in the variable block
  2. Environment variables: TF_VAR_foo=bar
  3. terraform.tfvars
  4. terraform.tfvars.json
  5. *.auto.tfvars and *.auto.tfvars.json (lexical order of filename)
  6. -var 'foo=bar' and -var-file=… on the CLI, in the order given (a later flag overrides an earlier one)
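
As a small illustration of the file-based layers, assume two hypothetical files next to the config, both setting instance_count from the variable block above:

```hcl
# terraform.tfvars (loaded automatically)
instance_count = 2

# prod.auto.tfvars (hypothetical file; loaded after terraform.tfvars, so its value wins)
instance_count = 3
```

A plain terraform plan would use 3. Running terraform plan -var="instance_count=5" overrides both files, and a value outside 1..10 is rejected by the validation rule.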

Gitignore any .tfvars file containing secrets, or keep secrets out of .tfvars entirely and use TF_VAR_db_password from a CI secret store.

Outputs

output "web_ip" {
  value       = aws_instance.web.public_ip
  description = "Public IPv4 of the web instance."
}

output "db_password" {
  value     = aws_db_instance.main.password
  sensitive = true      # redacts in CLI output; still plaintext in state
}

Outputs are the public interface of a root module or a child module. They are also how you move values between state files (terraform_remote_state data source, or — better — published to a central parameter store).
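
A sketch of the terraform_remote_state route, reading from the S3 backend layout used later in this guide. It assumes the network config declares an output named vpc_id; that output name and the security group are hypothetical:

```hcl
# Read another root module's outputs straight from its state file.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "mycorp-tfstate-prod"
    key    = "platform/network.tfstate"
    region = "eu-west-1"
  }
}

resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id # assumed output
}
```

The reader needs read access to the whole state file, including any secrets in it, which is one reason to prefer a parameter store for widely shared values.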

State and backends

State is a JSON file mapping Terraform's view of the world to the real resources. Local state (terraform.tfstate) is fine for learning; for anything real you want a remote backend that offers three things:

  1. Durability — the file lives somewhere versioned and backed up.
  2. Locking — only one apply at a time against the same state.
  3. Access control — who can read (and therefore see secrets in) state.

AWS: S3 + DynamoDB

terraform {
  backend "s3" {
    bucket         = "mycorp-tfstate-prod"
    key            = "platform/network.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "tf-lock"          # provides state locking
    encrypt        = true
  }
}

Turn on bucket versioning, encrypt with SSE-KMS, and restrict IAM read access to your CI role and the platform team. The DynamoDB table needs a single string partition key named LockID.
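
The bucket and lock table have to exist before terraform init can use them, so they are usually created once from a separate bootstrap config (or by hand). A sketch of such a bootstrap, assuming AWS provider 5.x and the names from the backend block above; the resource labels are arbitrary:

```hcl
resource "aws_s3_bucket" "tfstate" {
  bucket = "mycorp-tfstate-prod"

  lifecycle {
    prevent_destroy = true # losing this bucket means losing every state file in it
  }
}

# Versioning lets you roll back to an earlier state object.
resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"
  }
}

# SSE-KMS; with no key specified, S3 uses the AWS-managed KMS key.
resource "aws_s3_bucket_server_side_encryption_configuration" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Lock table: the S3 backend expects a string partition key named LockID.
resource "aws_dynamodb_table" "tf_lock" {
  name         = "tf-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```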

GCS

terraform {
  backend "gcs" {
    bucket = "mycorp-tfstate-prod"
    prefix = "platform/network"        # object-level lock is native in GCS
  }
}

Terraform Cloud / HCP Terraform

terraform {
  cloud {
    organization = "mycorp"
    workspaces { name = "platform-network-prod" }
  }
}

Gives you state, locking, remote runs, policy checks (Sentinel/OPA), and a UI. The free tier covers small teams.

GitLab-managed HTTP backend

terraform {
  backend "http" {
    address        = "https://gitlab.example.com/api/v4/projects/42/terraform/state/prod"
    lock_address   = "https://gitlab.example.com/api/v4/projects/42/terraform/state/prod/lock"
    unlock_address = "https://gitlab.example.com/api/v4/projects/42/terraform/state/prod/lock"
    lock_method    = "POST"
    unlock_method  = "DELETE"
    retry_wait_min = 5
  }
}

Auth via TF_HTTP_USERNAME and TF_HTTP_PASSWORD (a personal access token or a project job token). GitLab stores state per project with native locking.

Modules

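A module is any directory with .tf files. There are two kinds by role: the root module, which is the directory where you run terraform, and child modules, which the root (or another module) calls through a module block: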

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.8"               # registry, semver-pinned

  name = "prod"
  cidr = "10.20.0.0/16"
  azs  = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  private_subnets = ["10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"]
  public_subnets  = ["10.20.101.0/24", "10.20.102.0/24", "10.20.103.0/24"]
}

module "app" {
  source = "git::ssh://git@gitlab.example.com/infra/modules/app.git//terraform?ref=v1.4.2"
  vpc_id = module.vpc.vpc_id
}

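Sources:

  • Local path: "./modules/app" (no version argument; it changes with the calling repo)
  • Terraform Registry: "terraform-aws-modules/vpc/aws", pinned with the version argument
  • Git: "git::ssh://host/repo.git//subdir?ref=v1.4.2", pinned with ?ref= (tag, branch, or commit SHA)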

Don't build a module until you have two callers. A "module" with one consumer is just an abstraction tax. Inline first, copy/paste once, and refactor into a module when you find yourself making the same edit in two places.
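
When the second caller does arrive, the extraction can be small. A sketch, assuming a hypothetical local module at modules/app that wraps a single security group. The moved block (Terraform 1.1 and later) tells Terraform the existing resource has a new address, so the refactor plans as a move rather than a destroy and recreate:

```hcl
# modules/app/variables.tf
variable "vpc_id" {
  type        = string
  description = "VPC to place the app's security group in."
}

# modules/app/main.tf
resource "aws_security_group" "app" {
  name_prefix = "app-"
  vpc_id      = var.vpc_id
}

# modules/app/outputs.tf
output "security_group_id" {
  value = aws_security_group.app.id
}

# main.tf in the root module (the caller)
module "app" {
  source = "./modules/app"
  vpc_id = module.vpc.vpc_id
}

# Root module: record that the formerly inline resource now lives in the module.
moved {
  from = aws_security_group.app
  to   = module.app.aws_security_group.app
}
```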

Plan, apply, drift

terraform init          # download providers, set up backend
terraform plan -out=tfplan
terraform apply tfplan  # apply exactly what was planned, no surprises
terraform destroy       # tear down everything in this state

Drift is when the real world differs from state. The canonical check is terraform plan on an unchanged config:

terraform plan -detailed-exitcode
# 0 = no diff; 1 = error; 2 = non-empty diff (drift or a pending change)

Run this on a schedule in CI and alert on exit code 2. The most common causes of drift are humans clicking in the console, auto-scaling changes that the provider's resource doesn't track well, and IAM bits that other roles mutate.

fmt, validate, workspaces

terraform fmt -recursive          # canonical whitespace; run as a pre-commit hook
terraform validate                # syntactic + basic semantic checks
terraform providers               # show which providers each module requires
terraform graph | dot -Tpng > graph.png

Workspaces let one backend store multiple states:

terraform workspace new staging
terraform workspace select staging
# In HCL, the current workspace name is available as terraform.workspace
# (inside a string: "${terraform.workspace}").

Workspaces ≠ environments. Workspaces share the same code and provider config. For prod vs dev, prefer separate root modules (envs/prod/, envs/dev/) with separate state files. Workspaces are good for ephemeral per-PR stacks.
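
For the per-PR use case, the workspace name usually ends up in resource names so that parallel stacks don't collide. A sketch, assuming a hypothetical artifacts bucket:

```hcl
locals {
  # Evaluates to "default" when no other workspace has been selected.
  stack = terraform.workspace
}

resource "aws_s3_bucket" "artifacts" {
  bucket = "mycorp-artifacts-${local.stack}" # hypothetical bucket name

  tags = {
    Workspace = local.stack
  }
}
```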

Lifecycle

resource "aws_s3_bucket" "logs" {
  bucket = "mycorp-logs-prod"

  lifecycle {
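    # All four arguments are shown together for reference; a real resource rarely needs more than one or two.
    prevent_destroy       = true   # plan fails if it would destroy this resource
    create_before_destroy = true   # on replacement, create the new object before destroying the old one
    ignore_changes        = [tags["LastTouchedBy"]]  # stop chasing a field another system owns
    replace_triggered_by  = [aws_launch_template.web.latest_version]  # force replacement when this value changes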
  }
}
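
In practice the lifecycle arguments tend to land on different resources. A sketch of ignore_changes and create_before_destroy on an auto-scaling group, where the scaler owns desired_capacity at runtime; the variables and sizes are hypothetical:

```hcl
variable "private_subnet_ids" {
  type = list(string) # hypothetical input
}

variable "ami_id" {
  type = string # hypothetical input
}

resource "aws_launch_template" "web" {
  name_prefix   = "web-"
  image_id      = var.ami_id
  instance_type = "t3.small"
}

resource "aws_autoscaling_group" "web" {
  name_prefix         = "web-"
  min_size            = 2
  max_size            = 10
  desired_capacity    = 2
  vpc_zone_identifier = var.private_subnet_ids

  launch_template {
    id      = aws_launch_template.web.id
    version = aws_launch_template.web.latest_version
  }

  lifecycle {
    create_before_destroy = true               # name_prefix avoids a name clash during replacement
    ignore_changes        = [desired_capacity] # scaling policies change this at runtime
  }
}
```

Without ignore_changes, every plan after a scale-out would propose resetting the group to 2 instances.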

When NOT to use Terraform

Scenario | Why Terraform is wrong | What to use
App deployments (Docker image rollouts) | Terraform wants to own a resource's lifecycle; frequent redeploys thrash state and locks. | CD tool (ArgoCD, Flux, GitLab CI docker push + restart).
Short-lived resources (per-PR envs of hundreds of objects) | State grows faster than you can clean it up. | A thin script + the cloud CLI, or ephemeral workspaces torn down on PR close.
In-container config (install packages, copy files) | Terraform is about resources, not OS state. | Ansible (best practices), Packer (golden images).
Data mutation (DB rows, queue messages) | Terraform diff-and-apply against row-level data is a bad fit and dangerous. | Migrations (Flyway, sqitch, schema-diff tools).
Cluster-managed objects (Kubernetes Pods, HPAs that auto-scale) | Terraform will fight the controller constantly. | Manifests via GitOps; let Kubernetes own Kubernetes.

Next: Terraform + Cloudflare for a concrete real project, and Packer for the OS-image side of IaC.