Redis Ops

Running Redis safely: install, tune redis.conf, choose RDB vs AOF, secure with ACLs, replicate, and pick between Sentinel and Cluster. Plus the single most common production mistake.

If you only remember six things

Set maxmemory. Set maxmemory-policy. Redis without a cap is a time bomb.
Never run KEYS * in production. Use SCAN. Every time.
RDB is a snapshot; AOF is a log. Different recovery characteristics — know which you have.
Sentinel = HA for a single shard. Cluster = sharded + HA. They are not interchangeable.
Authentication: use ACLs (Redis 6+). Never rely on requirepass alone, and never share one password across applications.
Pipelining beats round-trips. Without it, every command pays a full network round trip and throughput is capped by latency.

On this page

Install
redis.conf: the knobs that matter
Persistence: RDB vs AOF
Authentication & ACLs
Pub/Sub, Streams, and everyday data types
Replication with replicaof
Sentinel: HA for a single shard
Cluster: sharding with hash slots
Eviction
Monitoring & diagnostics
Common pitfalls

Install

# EL9 — ships a recent Redis in AppStream:
sudo dnf install -y redis
sudo systemctl enable --now redis

# Verify:
redis-cli ping       # PONG
redis-cli info server | head

On Debian/Ubuntu both the package and the service are named redis-server. On EL the service is redis and the config lives at /etc/redis/redis.conf (or /etc/redis.conf on older builds).

redis.conf: the knobs that matter

# /etc/redis/redis.conf (excerpt)
bind 127.0.0.1 10.0.0.7                   # never 0.0.0.0 without auth AND a firewall
port 6379
protected-mode yes                        # with no password set, only loopback connections are accepted

# --- Memory ---
maxmemory 12gb
maxmemory-policy allkeys-lru              # see the eviction table below

# --- Persistence ---
save 3600 1
save 300 100
save 60 10000
appendonly yes
appendfsync everysec                      # good default
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# --- Replication ---
replica-read-only yes
repl-backlog-size 128mb
replica-serve-stale-data yes

# --- Security ---
requirepass redacted                      # legacy; prefer ACL users below
aclfile /etc/redis/users.acl
rename-command FLUSHDB ""                 # disable destructive commands entirely
rename-command FLUSHALL ""
rename-command CONFIG ""                  # or rename to an obscure string; "" also disables CONFIG SET at runtime

# --- Clients ---
timeout 0
tcp-keepalive 60
maxclients 10000

# --- Logging ---
loglevel notice
logfile /var/log/redis/redis.log

Defense in depth. protected-mode yes, bind to specific interfaces, require auth, firewall to specific subnets. Redis is a remote code execution engine pretending to be a cache — DEBUG, CONFIG, and MODULE LOAD are all dangerous if an attacker reaches the port.

Persistence: RDB vs AOF

	RDB	AOF
What it is	Point-in-time binary snapshot	Append-only log of every write
Restore time	Fast (load one file)	Slower (replay all commands)
Worst-case data loss	Minutes (whatever's between snapshots)	≤ 1 second with `everysec`
Disk footprint	Small	Grows, then rewrites when large
Fork cost	Occasional big fork	Occasional rewrite fork
Good for	Cold backups, warm-up after restart	Durable writes you can't afford to lose

Production answer: run both. RDB snapshots go to backup storage; AOF bounds live data-loss to one second. Tune appendfsync:

always — fsync on every write. Safe, slow.
everysec — default. Lose at most one second on OS crash.
no — let the OS decide. Fast, unsafe.

# Trigger an RDB snapshot right now, non-blocking:
redis-cli BGSAVE
# AOF rewrite on demand (compacts the log):
redis-cli BGREWRITEAOF
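
The three save lines in the redis.conf excerpt are independent triggers: a background snapshot starts once any single rule has both its time window elapsed and its change count reached. A minimal Python sketch of that check follows; SAVE_RULES and snapshot_due are illustrative names, not Redis APIs, and the model ignores everything else the server does.

```python
# Toy model of how `save <seconds> <changes>` rules are evaluated.
# SAVE_RULES and snapshot_due are illustrative names, not Redis APIs.
SAVE_RULES = [(3600, 1), (300, 100), (60, 10000)]  # from the redis.conf excerpt


def snapshot_due(seconds_since_last_save, changes_since_last_save, rules=SAVE_RULES):
    """Return True if any rule has both its window elapsed and its change count met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in rules
    )


print(snapshot_due(90, 20000))  # busy instance: the 60 s / 10000 rule fires
print(snapshot_due(290, 500))   # 500 writes, but the 300 s window has not elapsed
print(snapshot_due(4000, 1))    # a single write is still saved once the hour is up
```

Under these rules a lone write can sit unsaved for up to an hour, which is one more reason to pair RDB with AOF.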

Authentication & ACLs

Redis 6 introduced ACLs: per-user passwords, command whitelists, and key-pattern restrictions. Use them.

# Inside redis-cli, as a user allowed to run ACL commands (e.g. an admin):
ACL SETUSER app_rw on >redacted ~appcache:* &* +@read +@write -@dangerous
ACL SETUSER app_ro on >redacted ~appcache:* +@read
ACL SETUSER admin  on >redacted ~* &* +@all
ACL SAVE             # persists to aclfile

# From a shell:
redis-cli --user app_rw --pass 'redacted' SET appcache:key value   # key must match ~appcache:* or the write is denied

~pattern — which keys are allowed.
&pattern — which pub/sub channels are allowed.
+@category / -@category — allow/deny whole categories (@read, @write, @dangerous, @admin).
nopass only inside a hard-firewalled network, and even then, no.

Pub/Sub, Streams, and everyday data types

# Strings
SET user:42:name "Jane"
INCR pageviews
EXPIRE user:42:name 3600

# Hashes (small objects, field-level access)
HSET user:42 name "Jane" email "jane@example.com" tier "pro"
HGET user:42 email

# Lists (queues, recent-X feeds)
LPUSH jobs '{"id":1,"op":"thumb"}'
BRPOP jobs 30                             # blocking pop

# Sets / sorted sets
SADD followers:42 17 29 31
ZADD leaderboard 2500 player:1 2450 player:2
ZRANGEBYSCORE leaderboard 2000 +inf WITHSCORES

# Pub/Sub (fire-and-forget; no persistence)
SUBSCRIBE events
PUBLISH events '{"user":42,"type":"login"}'

# Streams — persistent, consumer groups, like Kafka-lite
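XGROUP CREATE orders workers '$' MKSTREAM # create the group first: '$' delivers only entries added afterwards
XADD orders '*' id 7 total 42.50
XREADGROUP GROUP workers worker1 COUNT 10 BLOCK 5000 STREAMS orders '>'
XACK orders workers 1700000000-0          # use the entry ID that XREADGROUP returned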

Pub/Sub has no persistence, no delivery guarantees, no consumer groups. If a subscriber is offline, it misses messages. Use Streams when you need either persistence or work-queue semantics.

Replication with replicaof

# On the replica:
replicaof primary.redis 6379
masterauth redacted                       # required if primary has requirepass
replica-read-only yes

redis-cli INFO replication
# On a primary:
#   role:master
#   connected_slaves:2
#   slave0:ip=10.0.0.8,port=6379,state=online,offset=12345,lag=0
# On a replica:
#   role:slave
#   master_link_status:up
#   master_last_io_seconds_ago:0
#   master_sync_in_progress:0
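
The replica-side fields above are the ones worth alerting on. A rough Python sketch, assuming the INFO output has already been captured as text; parse_info and replica_healthy are hypothetical helpers, and the 10-second idle threshold is an assumption to tune, not a Redis default.

```python
# Parse `redis-cli INFO replication` text and run a basic replica health check.
# parse_info and replica_healthy are illustrative helpers, not part of any client.

def parse_info(text):
    """Turn 'key:value' lines into a dict, skipping blanks and '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replica_healthy(fields, max_idle_seconds=10):
    """Healthy = link up, no full sync running, and recent traffic from the primary."""
    idle = int(fields.get("master_last_io_seconds_ago", "-1"))
    return (
        fields.get("role") == "slave"
        and fields.get("master_link_status") == "up"
        and fields.get("master_sync_in_progress") == "0"
        and 0 <= idle <= max_idle_seconds
    )


sample = """# Replication
role:slave
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
"""
print(replica_healthy(parse_info(sample)))  # True
```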

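Replication is asynchronous. A primary can acknowledge a write before any replica has received it, so writes in that window are lost if the primary crashes. WAIT numreplicas timeout narrows the window: it blocks until that many replicas confirm or the timeout expires, and returns how many did, so check the result.

# WAIT only covers writes sent on the same connection, so send both commands in one session:
printf 'SET important "value"\nWAIT 1 500\n' | redis-cli    # wait up to 500ms for 1 replica to confirm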

Sentinel: HA for a single shard

Sentinel is a companion process that watches a primary + its replicas. When a quorum agrees the primary is down, Sentinel promotes a replica and tells clients the new address.

# /etc/redis/sentinel.conf (on each of 3+ Sentinel nodes)
port 26379
sentinel monitor mymaster 10.0.0.7 6379 2     # quorum = 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel auth-pass mymaster redacted

redis-sentinel /etc/redis/sentinel.conf
# Ask Sentinel for the current primary:
redis-cli -p 26379 SENTINEL get-master-addr-by-name mymaster

Clients must be Sentinel-aware (most language clients are). They query Sentinel for the current primary, not a hard-coded host.
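
Quorum only decides when the primary is declared down. Actually running the failover also needs votes from a majority of all Sentinels, which is why the config above is meant for three or more nodes. A simplified Python model of that arithmetic follows; failover_possible is an illustrative helper, not a Sentinel API.

```python
# Simplified model of Sentinel failover arithmetic.
# failover_possible is an illustrative helper, not a Sentinel API.

def failover_possible(total_sentinels, agreeing_sentinels, quorum):
    """agreeing_sentinels: Sentinels that can reach each other and see the primary as down."""
    majority = total_sentinels // 2 + 1
    detected = agreeing_sentinels >= quorum                   # enough to declare the primary down
    authorized = agreeing_sentinels >= max(majority, quorum)  # enough votes to elect a failover leader
    return detected and authorized


print(failover_possible(3, 2, quorum=2))  # one Sentinel lost: failover still proceeds
print(failover_possible(3, 1, quorum=2))  # two lost: no quorum, no failover
print(failover_possible(2, 1, quorum=1))  # quorum met, but 1 of 2 is not a majority
```

The last case is the classic two-Sentinel trap: lowering the quorum does not help, because authorization still requires a majority.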

Cluster: sharding with hash slots

Redis Cluster partitions the keyspace into 16384 hash slots. Each key's slot is CRC16(key) mod 16384. Each node owns a range of slots; replicas shadow each primary.

# On every node:
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 15000
appendonly yes

# Bootstrap a 6-node cluster (3 primaries, 3 replicas):
redis-cli --cluster create \
  10.0.0.1:6379 10.0.0.2:6379 10.0.0.3:6379 \
  10.0.0.4:6379 10.0.0.5:6379 10.0.0.6:6379 \
  --cluster-replicas 1

# Inspect:
redis-cli -c -h 10.0.0.1 CLUSTER NODES
redis-cli -c -h 10.0.0.1 CLUSTER INFO

Multi-key commands across slots fail. MGET k1 k2 only works if both keys land in the same slot. Force co-location with a hash tag: the substring between { and } is the only part hashed, so user:{42}:name and user:{42}:email share a slot.
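
To check whether two keys will share a slot before deploying, the slot calculation can be reproduced client-side. The sketch below is a plain-Python rendering of the rule described above (CRC16 in its XMODEM variant, mod 16384, with hash-tag extraction); crc16_xmodem and key_slot are names chosen here for illustration. On a live cluster, CLUSTER KEYSLOT gives the authoritative answer.

```python
# Client-side sketch of Redis Cluster slot assignment.
# crc16_xmodem and key_slot are illustrative names, not a client library API.

def crc16_xmodem(data):
    """CRC16 with polynomial 0x1021 and initial value 0, computed bit by bit."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Hash only the text between the first '{' and the next '}', if it is non-empty."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end > start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(key_slot("user:{42}:name") == key_slot("user:{42}:email"))  # True: same tag, same slot
print(key_slot("foo"))                                            # 12182
```

An empty tag such as user:{}:name falls back to hashing the whole key, so it does not co-locate anything.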

Eviction

When Redis hits maxmemory it applies the eviction policy to make room. If the policy is noeviction (default!) writes start failing with OOM errors.

Policy	Behaviour	When to use
`noeviction`	Return errors on write when full	When Redis is the system of record, never a cache
`allkeys-lru`	Evict least-recently-used key, any key	General-purpose cache. Probably what you want.
`allkeys-lfu`	Evict least-frequently-used key	Hot-set workloads where recency lies
`volatile-lru`	LRU among keys with a TTL	Mixed: persistent keys (no TTL) and cache keys (TTL)
`allkeys-random`	Evict a random key	Rarely
`volatile-ttl`	Among keys with a TTL, evict the one expiring soonest	Caches where a shorter TTL already marks a key as more disposable
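
The difference between noeviction and allkeys-lru is easiest to see in a toy model. The Python sketch below is deliberately simplified: capacity counts keys rather than bytes, and the LRU is exact, whereas Redis approximates LRU by sampling. ToyCache is a hypothetical class for illustration only.

```python
from collections import OrderedDict


class ToyCache:
    """Toy model: capacity counts keys, not bytes; LRU is exact, whereas Redis samples."""

    def __init__(self, capacity, policy="noeviction"):
        if policy not in ("noeviction", "allkeys-lru"):
            raise ValueError("this toy only models noeviction and allkeys-lru")
        self.capacity = capacity
        self.policy = policy
        self.data = OrderedDict()  # least recently used first

    def get(self, key):
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
            return self.data[key]
        return None

    def set(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            if self.policy == "noeviction":
                raise MemoryError("write rejected: cache is full")
            self.data.popitem(last=False)  # allkeys-lru: drop the least recently used key
        self.data[key] = value
        self.data.move_to_end(key)


cache = ToyCache(capacity=2, policy="allkeys-lru")
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")             # "a" is now more recent than "b"
cache.set("c", 3)          # full, so "b" is evicted
print(sorted(cache.data))  # ['a', 'c']
```

With policy="noeviction" the third set raises instead, which is the toy equivalent of the OOM errors described above.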

Monitoring & diagnostics

# Top-level:
redis-cli INFO                           # sectioned; pipe to grep
redis-cli INFO memory
redis-cli INFO replication
redis-cli INFO stats

# Real-time:
redis-cli --stat                         # per-second summary
redis-cli --latency                      # ping latency to the server
redis-cli --latency-history -i 5         # long-run latency

# Big-key hunt (never in peak hours):
redis-cli --bigkeys                      # walks the keyspace with SCAN; reports only the biggest key per type
redis-cli --memkeys                      # same walk, but sized by memory usage

# Per-key memory:
redis-cli MEMORY USAGE appcache:user:42 SAMPLES 0

# Slow log:
redis-cli CONFIG SET slowlog-log-slower-than 10000   # microseconds, so 10ms
redis-cli SLOWLOG GET 20
redis-cli SLOWLOG RESET

# Trace what's happening RIGHT NOW (expensive!):
redis-cli MONITOR                        # prints every command; use for seconds, not minutes

Common pitfalls

Anti-pattern	Why it bites	Better
`KEYS *` in production	O(N) scan of every key, blocks the single thread	`SCAN 0 MATCH pattern COUNT 500`, in a loop
Storing 5 MB values	One big key blocks the event loop during serialise/replicate	Chunk the data; Redis likes millions of small values
No `maxmemory` cap	OS OOM kills Redis	Set it to ~80% of available RAM
Trusting `EXPIRE` for precise TTL	Expiry is lazy + sampled, not real-time	Fine for caches; don't use for scheduling
Using pub/sub as a durable bus	Offline subscriber = lost messages	Use Streams with consumer groups
Cluster + multi-key ops without hash tags	CROSSSLOT errors	Use `{tag}` to co-locate
No TLS on cross-subnet traffic	Plaintext auth over the network	`tls-port`, `tls-cert-file`, `tls-key-file`, `tls-ca-cert-file`
Running `DEBUG SLEEP` or `SAVE` on a live primary	Blocks every client for the duration	`BGSAVE`, and rename `DEBUG` out of existence
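
The "SCAN in a loop" advice from the first row can be sketched in Python as follows. The loop assumes a client whose scan(cursor, match, count) method returns a (next_cursor, keys) pair, which is how redis-py exposes it; scan_keys and FakeClient are hypothetical names, and the fake exists only so the sketch runs without a server.

```python
import fnmatch


def scan_keys(client, pattern, count=500):
    """Yield matching keys by following the SCAN cursor until it returns to 0."""
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        yield from keys  # a page may be empty even though the scan is not finished
        if cursor == 0:
            break


class FakeClient:
    """Stand-in for a real client: pages through a fixed key list."""

    def __init__(self, keys, page_size=2):
        self._keys = list(keys)
        self._page_size = page_size

    def scan(self, cursor=0, match=None, count=None):
        page = self._keys[cursor:cursor + self._page_size]
        next_cursor = cursor + self._page_size
        if next_cursor >= len(self._keys):
            next_cursor = 0  # 0 signals that the iteration is complete
        return next_cursor, [k for k in page if fnmatch.fnmatchcase(k, match or "*")]


fake = FakeClient(["appcache:1", "session:9", "appcache:2", "other", "appcache:3"])
print(list(scan_keys(fake, "appcache:*")))  # ['appcache:1', 'appcache:2', 'appcache:3']
```

Against a real server the same key can be returned more than once during a scan, so deduplicate if that matters. COUNT is a hint for work per call, not a page size.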