Terraform Basics

Providers, resources, data sources, variables, outputs, state, modules, drift, and the lifecycle block. A short guide to when Terraform is the right tool — and when it very much isn't.

If you only remember six things

State is the source of truth. If you lose it, Terraform forgets every resource it manages — plan accordingly (remote backend, versioning, lock).
Pin the Terraform version with required_version and each provider in required_providers. Floating versions have broken production before and will again.
terraform plan is read-only. Run it in CI on every MR, even without apply. Drift shows up there.
A module is just a directory. Don't over-engineer — inline resources until a second caller needs them.
lifecycle { prevent_destroy = true } is the cheapest way to avoid an rm-rf-prod moment on RDS instances and S3 buckets.
Terraform is for long-lived infrastructure. App deploys, queue messages, short-lived resources: use the right tool instead.

On this page

Anatomy of a config
Providers
Resources and data sources
Variables and precedence
Outputs
State and backends
Modules
Plan, apply, drift
fmt, validate, workspaces
Lifecycle
When NOT to use Terraform

Anatomy of a config

A Terraform project is one or more .tf files in a directory. Canonical split:

project/
├── versions.tf        # terraform {} block: required_version, required_providers
├── providers.tf       # provider "aws" {} etc.
├── variables.tf       # variable "foo" {}
├── main.tf            # resources
├── outputs.tf         # output "bar" {}
└── terraform.tfvars   # values for variables (gitignored if sensitive)

The filenames are convention — Terraform concatenates every .tf in the directory before parsing. Order within a file is irrelevant; Terraform builds a dependency graph from references.

Providers

# versions.tf
terraform {
  required_version = "~> 1.9"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.60"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.40"
    }
  }
}

# providers.tf
provider "aws" {
  region = "eu-west-1"
  # Auth comes from the environment: AWS_PROFILE or AWS_ACCESS_KEY_ID/…
  # Never put static keys in code.
}

provider "aws" {
  alias  = "useast1"
  region = "us-east-1"     # Second provider instance (e.g. for ACM + CloudFront)
}

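Constrain with ~> on the minor (~> 5.60 means >= 5.60.0, < 6.0.0): new minor releases are allowed, the next major is not. For patch-only updates write ~> 5.60.0 (>= 5.60.0, < 5.61.0). Leaving the version open across majors is asking for a breaking change on a Tuesday. Commit .terraform.lock.hcl so every run resolves the same provider build.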
Auth via environment. Static credentials in provider blocks end up in git history, in saved plan files, and (when passed with -var) in the operator's shell history.
Aliases let you target multiple regions/accounts from one config: provider = aws.useast1 on the resource.
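
As a minimal sketch of the alias in use, the resource below requests an ACM certificate through the us-east-1 provider instance, which is where CloudFront expects its certificates. The resource name cdn, the domain cdn.example.com, and the ./modules/edge path are placeholders, not part of the original config.

```hcl
# Resource-level selection: this certificate is created in us-east-1,
# everything without a provider argument still uses the default (eu-west-1).
resource "aws_acm_certificate" "cdn" {
  provider          = aws.useast1
  domain_name       = "cdn.example.com"   # placeholder domain
  validation_method = "DNS"
}

# Module-level selection: hand the aliased instance to a child module,
# which then sees it as its default "aws" provider.
module "edge" {
  source = "./modules/edge"               # hypothetical local module

  providers = {
    aws = aws.useast1
  }
}
```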

Resources and data sources

A resource is something Terraform creates and manages. A data source is something Terraform reads but does not own.

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical
  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.small"
  tags = {
    Name      = "web-01"
    ManagedBy = "terraform"
  }
}

Reference syntax:

resource_type.name.attr for a resource (aws_instance.web.id)
data.resource_type.name.attr for a data source
var.foo, local.bar, module.baz.out for variables, locals, module outputs
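
A short sketch of those reference forms working together, building on the aws_instance.web and data.aws_ami.ubuntu blocks above. The variable environment, the local name_prefix, and the aws_eip resource are illustrative additions, not part of the original config.

```hcl
variable "environment" {
  type    = string
  default = "dev"
}

locals {
  # var.* feeds a local; locals are the place for derived values.
  name_prefix = "web-${var.environment}"
}

resource "aws_eip" "web" {
  # Referencing aws_instance.web.id creates an implicit dependency:
  # Terraform creates the instance first, with no depends_on needed.
  instance = aws_instance.web.id
  domain   = "vpc"

  tags = {
    Name      = "${local.name_prefix}-eip"
    ManagedBy = "terraform"
  }
}

output "ami_name" {
  # data.<type>.<name>.<attr> reads from the data source.
  value = data.aws_ami.ubuntu.name
}
```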

Always tag (or label) everything with ManagedBy = "terraform". The tag tells anyone in the console that the resource is owned by code. A manual edit to a managed resource silently diverges from state, and the next terraform apply either reverts it or leaves you reconciling the difference by hand.
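
One way to tag everything without repeating the tag on each resource is the AWS provider's default_tags block. This is a sketch that would take the place of the default provider "aws" block shown earlier; the Repo tag and its value are hypothetical.

```hcl
provider "aws" {
  region = "eu-west-1"

  # Applied to every taggable resource created through this provider instance.
  # Tags set on a resource itself are merged in and win on conflicting keys.
  default_tags {
    tags = {
      ManagedBy = "terraform"
      Repo      = "infra/platform"   # hypothetical: where the code lives
    }
  }
}
```

Aliased provider instances do not inherit this block, so each one needs its own default_tags.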

Variables and precedence

variable "region" {
  type        = string
  description = "AWS region for primary resources."
  default     = "eu-west-1"
}

variable "instance_count" {
  type    = number
  default = 1

  validation {
    condition     = var.instance_count >= 1 && var.instance_count <= 10
    error_message = "instance_count must be 1..10."
  }
}

variable "db_password" {
  type      = string
  sensitive = true     # redacts in plan/apply output and outputs
}

Variable precedence, lowest to highest (later entries win):

default in the variable block
Environment variables: TF_VAR_foo=bar
terraform.tfvars
terraform.tfvars.json
*.auto.tfvars and *.auto.tfvars.json (lexical order of filenames)
-var=foo=bar and -var-file=… on the CLI, in the order given

Gitignore any .tfvars file containing secrets, or keep secrets out of .tfvars entirely and use TF_VAR_db_password from a CI secret store.
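
To make the file-based part of that ordering concrete, here is a sketch with two variable files in the same directory. The filename prod.auto.tfvars and the values are made up for illustration.

```hcl
# --- terraform.tfvars (loaded automatically) ---
region         = "eu-west-1"
instance_count = 2

# --- prod.auto.tfvars (loaded automatically, after terraform.tfvars) ---
instance_count = 3

# Effective values for a plain `terraform plan`:
#   region         = "eu-west-1"   (only set in terraform.tfvars)
#   instance_count = 3             (prod.auto.tfvars overrides terraform.tfvars)
#
# With `terraform plan -var="instance_count=5"` the CLI value, 5, wins.
```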

Outputs

output "web_ip" {
  value       = aws_instance.web.public_ip
  description = "Public IPv4 of the web instance."
}

output "db_password" {
  value     = aws_db_instance.main.password
  sensitive = true      # redacts in CLI output; still plaintext in state
}

Outputs are the public interface of a root module or a child module. They are also how you move values between state files (terraform_remote_state data source, or — better — published to a central parameter store).
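
A minimal sketch of reading another state's outputs with terraform_remote_state. It borrows the bucket and key from the S3 backend example on this page and assumes that the network config declares an output named vpc_id; the security group is purely illustrative.

```hcl
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "mycorp-tfstate-prod"
    key    = "platform/network.tfstate"
    region = "eu-west-1"
  }
}

resource "aws_security_group" "app" {
  name = "app"

  # Only root-module outputs of the other state are visible here.
  # Assumes the network config has: output "vpc_id" { ... }
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

The reader needs read access to the whole remote state file, secrets included, which is one reason a parameter store can be the better interface.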

State and backends

State is a JSON file mapping Terraform's view of the world to the real resources. Local state (terraform.tfstate) is fine for learning; for anything real you want a remote backend that offers three things:

Durability — the file lives somewhere versioned and backed up.
Locking — only one apply at a time against the same state.
Access control — who can read (and therefore see secrets in) state.

AWS: S3 + DynamoDB

terraform {
  backend "s3" {
    bucket         = "mycorp-tfstate-prod"
    key            = "platform/network.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "tf-lock"          # provides state locking
    encrypt        = true
  }
}

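Turn on bucket versioning and SSE-KMS, and restrict IAM read access to your CI role and the platform team. The DynamoDB table needs a single partition key named LockID of type String. Terraform 1.10 and later can also lock natively in S3 with use_lockfile = true, which makes the DynamoDB table optional.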
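
The backend's own bucket and lock table have to exist before terraform init can use them. A sketch of a small bootstrap config that creates them, assuming it lives in its own root module; the resource names tfstate and tf_lock are arbitrary.

```hcl
resource "aws_s3_bucket" "tfstate" {
  bucket = "mycorp-tfstate-prod"

  lifecycle {
    prevent_destroy = true   # losing this bucket means losing every state file
  }
}

resource "aws_s3_bucket_versioning" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  versioning_configuration {
    status = "Enabled"       # lets you roll back a corrupted state file
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "tfstate" {
  bucket = aws_s3_bucket.tfstate.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

resource "aws_dynamodb_table" "tf_lock" {
  name         = "tf-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"    # the S3 backend expects exactly this key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```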

GCS

terraform {
  backend "gcs" {
    bucket = "mycorp-tfstate-prod"
    prefix = "platform/network"        # object-level lock is native in GCS
  }
}

Terraform Cloud / HCP Terraform

terraform {
  cloud {
    organization = "mycorp"
    workspaces { name = "platform-network-prod" }
  }
}

Gives you state, locking, remote runs, policy checks (Sentinel/OPA), and a UI. The free tier covers small teams.

GitLab-managed HTTP backend

terraform {
  backend "http" {
    address        = "https://gitlab.example.com/api/v4/projects/42/terraform/state/prod"
    lock_address   = "https://gitlab.example.com/api/v4/projects/42/terraform/state/prod/lock"
    unlock_address = "https://gitlab.example.com/api/v4/projects/42/terraform/state/prod/lock"
    lock_method    = "POST"
    unlock_method  = "DELETE"
    retry_wait_min = "5"
  }
}

Auth via TF_HTTP_USERNAME and TF_HTTP_PASSWORD (a personal access token or a project job token). GitLab stores state per project with native locking.

Modules

A module is any directory with .tf files. There are two kinds by role:

Root module — the directory you run terraform apply in.
Child module — called from another module via module "x" { source = "..." }.

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.8"               # registry, semver-pinned

  name = "prod"
  cidr = "10.20.0.0/16"
  azs  = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
  private_subnets = ["10.20.1.0/24", "10.20.2.0/24", "10.20.3.0/24"]
  public_subnets  = ["10.20.101.0/24", "10.20.102.0/24", "10.20.103.0/24"]
}

module "app" {
  source = "git::ssh://git@gitlab.example.com/infra/modules/app.git//terraform?ref=v1.4.2"
  vpc_id = module.vpc.vpc_id
}

Sources:

Terraform Registry (namespace/name/provider) — public or private. Always pin version.
Git (git::ssh://… or git::https://…). Pin with ?ref=v1.4.2 (a tag, not a branch).
Local path (./modules/foo) for modules in the same repo.

Don't build a module until you have two callers. A "module" with one consumer is just an abstraction tax. Inline first, copy/paste if you must, and extract a module the moment you find yourself making the same edit in two places.
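
For scale, this is roughly what "a module is just a directory" looks like once a second caller shows up. The path ./modules/log_bucket, the bucket names, and the output names are hypothetical.

```hcl
# --- modules/log_bucket/main.tf (hypothetical child module) ---
variable "name" {
  type        = string
  description = "Globally unique bucket name."
}

resource "aws_s3_bucket" "this" {
  bucket = var.name
  tags   = { ManagedBy = "terraform" }
}

output "arn" {
  value = aws_s3_bucket.this.arn
}

# --- main.tf in the root module: two callers justify the module ---
module "app_logs" {
  source = "./modules/log_bucket"
  name   = "mycorp-app-logs-prod"
}

module "audit_logs" {
  source = "./modules/log_bucket"
  name   = "mycorp-audit-logs-prod"
}

output "log_bucket_arns" {
  value = [module.app_logs.arn, module.audit_logs.arn]
}
```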

Plan, apply, drift

terraform init          # download providers, set up backend
terraform plan -out=tfplan
terraform apply tfplan  # apply exactly what was planned, no surprises
terraform destroy       # tear down everything in this state

Drift is when the real world differs from state. The canonical check is terraform plan on an unchanged config:

terraform plan -detailed-exitcode
# 0 = no diff; 1 = error; 2 = non-empty diff (drift or a pending change)

Run this on a schedule in CI and alert on exit code 2. The most common causes of drift are humans clicking in the console, auto-scaling changes the provider's resource doesn't track well, and IAM bits that other roles mutate.

fmt, validate, workspaces

terraform fmt -recursive          # canonical whitespace; run as a pre-commit hook
terraform validate                # syntactic + basic semantic checks
terraform providers               # show which providers each module requires
terraform graph | dot -Tpng > graph.png

Workspaces let one backend store multiple states:

terraform workspace new staging
terraform workspace select staging
# In HCL, terraform.workspace evaluates to the current workspace name,
# e.g. "app-${terraform.workspace}" inside a string.

Workspaces ≠ environments. Workspaces share the same code and provider config. For prod vs dev, prefer separate root modules (envs/prod/, envs/dev/) with separate state files. Workspaces are good for ephemeral per-PR stacks.
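
A sketch of the per-PR pattern, assuming workspaces are named after the merge request (for example pr-123). The bucket name and the preview- prefix are made up.

```hcl
locals {
  # One code base, one backend, one state per workspace.
  stack = "preview-${terraform.workspace}"
}

resource "aws_s3_bucket" "assets" {
  bucket = "mycorp-${local.stack}-assets"

  # Ephemeral stack: let terraform destroy remove the bucket even if it
  # still contains objects. Do not copy this to long-lived buckets.
  force_destroy = true

  tags = {
    ManagedBy = "terraform"
    Stack     = local.stack
  }
}
```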

Lifecycle

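resource "aws_s3_bucket" "logs" {
  bucket = "mycorp-logs-prod"

  # All four arguments are shown together for reference only. In practice,
  # create_before_destroy cannot work with a fixed bucket name, and
  # replace_triggered_by belongs on a resource that depends on the launch template.
  lifecycle {
    prevent_destroy       = true   # any plan that would destroy this fails with an error
    create_before_destroy = true   # on replacement, create the new object before destroying the old
    ignore_changes        = [tags["LastTouchedBy"]]  # stop chasing a field another system owns
    replace_triggered_by  = [aws_launch_template.web.latest_version]
  }
}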

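prevent_destroy: belt-and-braces for databases, buckets, DNS zones. Any plan that would destroy the resource fails, so to actually delete it you edit the code first. It does not help if the whole resource block is removed from the config, because the lifecycle setting goes with it.
create_before_destroy: for replacements that must not leave a gap (launch templates, IAM roles that other resources reference). Terraform creates the replacement, updates references, then deletes the old object. Both exist for a moment, so a fixed unique name collides with itself; use name_prefix or a generated name.
ignore_changes: for attributes a different system writes (autoscaling-driven counts, labels from an operator). Without this, every apply fights the other system.
replace_triggered_by: force a replace when a dependency changes (e.g. bumping an AMI forces new instances). Requires Terraform 1.2 or later.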
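
A more realistic pairing of two of these arguments, sketched under the assumption that the data.aws_ami.ubuntu and module.vpc blocks from earlier on this page are in scope. The sizes and names are illustrative.

```hcl
resource "aws_launch_template" "web" {
  name_prefix   = "web-"               # prefix, not a fixed name, so old and new can coexist
  image_id      = data.aws_ami.ubuntu.id
  instance_type = "t3.small"

  lifecycle {
    create_before_destroy = true       # the ASG below never points at a deleted template
  }
}

resource "aws_autoscaling_group" "web" {
  name_prefix         = "web-"
  min_size            = 1
  max_size            = 4
  desired_capacity    = 1              # initial value only; see ignore_changes
  vpc_zone_identifier = module.vpc.private_subnets

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  lifecycle {
    # Scaling policies change desired_capacity at runtime; without this,
    # every apply would reset the group to 1.
    ignore_changes = [desired_capacity]
  }
}
```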

When NOT to use Terraform

Scenario	Why Terraform is wrong	What to use
App deployments (Docker image rollouts)	Terraform wants to own a resource's lifecycle; frequent redeploys thrash state and locks.	CD tool (ArgoCD, Flux, GitLab CI `docker push` + restart).
Short-lived resources (per-PR envs of hundreds of objects)	State grows faster than you can clean it up.	A thin script + the cloud CLI, or ephemeral workspaces torn down on PR close.
In-container config (install packages, copy files)	Terraform is about resources, not OS state.	Ansible (best practices), Packer (golden images).
Data mutation (DB rows, queue messages)	Terraform diff-and-apply against row-level data is a bad fit and dangerous.	Migrations (Flyway, sqitch, schema-diff tools).
Cluster-managed objects (Kubernetes Pods, HPAs that auto-scale)	Terraform will fight the controller constantly.	Manifests via GitOps; let Kubernetes own Kubernetes.

Next: Terraform + Cloudflare for a concrete real project, and Packer for the OS-image side of IaC.