Rsyslog Forwarding

Designing reliable central log pipelines with relays, TLS, queueing, JSON templates, journald input, and delivery troubleshooting.

Forwarding heuristics
  • Use RELP when you need application-level acknowledgement of delivery, plain TCP when you want a good default, and UDP only for best-effort devices or very old senders.
  • Put a relay tier between edge hosts and the final log platform when links are slow, sites are remote, or the downstream stack is expensive to restart.
  • Turn on action queues and disk assist before production. Most "rsyslog lost logs" incidents are really "no queue when the receiver disappeared".
  • Treat log forwarding certificates like any other service identity: use a private CA, verify names, and rotate them deliberately. See PKI Design.
  • Monitor queue depth and suspended actions continuously. A forwarding pipeline that is silently backing up is already degraded.

Central logging architecture

Think in three layers: senders generate messages, relays buffer and normalize them, and a final destination stores or indexes them. Rsyslog is very good at the first two jobs. It does not need to be the search UI to be useful.

Pattern | Use it when | Trade-off
Host -> central collector | One site, low latency, modest volume | Simplest design, but every host depends directly on the collector
Host -> site relay -> central collector | Branch offices, VPN links, unreliable WANs | More moving parts, but queues stay local to the site
Host -> local file + remote forward | You want both local incident data and central search | Consumes more disk locally, but gives you a fallback during outages

A common production shape is: applications write to journald, rsyslog reads from journald, rsyslog forwards to a regional relay with disk-assisted queues, then the relay sends onward to the SIEM or log warehouse.

Separate buffering from indexing. Elasticsearch, Loki, Splunk, and friends are storage/search systems. Rsyslog relays are transport/buffering systems. Keep that boundary clear and outages are easier to survive.

UDP vs TCP vs RELP

Protocol | Default port | Why use it | Main drawback
UDP syslog | 514/udp | Works with legacy network gear; lowest overhead | No session, no backpressure, no delivery guarantee
TCP syslog | 514/tcp | Connection-oriented, easier to firewall and debug | Connection loss can still drop in-flight messages
Syslog over TLS | 6514/tcp | Encrypted transport with peer identity checks | Certificate lifecycle and name validation matter
RELP | 2514/tcp | Application-level acknowledgements and clean resume after failure | Both ends need RELP modules; fewer non-rsyslog senders support it

# Legacy shorthand
# UDP: single @
*.* @loghub01.example.com:514

# TCP: double @@
*.* @@loghub01.example.com:514

# Preferred RainerScript form for TCP
action(
  type="omfwd"
  target="loghub01.example.com"
  port="514"
  protocol="tcp"
)

# RELP sender
module(load="omrelp")
action(
  type="omrelp"
  target="loghub01.example.com"
  port="2514"
)

Do not cargo-cult UDP. It is acceptable for switches, PDUs, or gear that cannot do better. For Linux servers under your control, default to TCP or RELP.

Receiver design

The receiver should listen on explicit ports, store remote logs away from local system logs, and validate its config before restart. If you want per-host files, use a dynamic file template.

# /etc/rsyslog.d/10-receiver.conf
module(load="imtcp")
module(load="imrelp")

template(name="RemotePerHost" type="string"
  string="/var/log/remote/%HOSTNAME%/%PROGRAMNAME%.log")

ruleset(name="remote_store") {
  action(
    type="omfile"
    dynaFile="RemotePerHost"
    createDirs="on"
    dirCreateMode="0750"
    fileCreateMode="0640"
  )
}

input(type="imtcp" port="514" ruleset="remote_store")
input(type="imrelp" port="2514" ruleset="remote_store")

# Receiver host setup (run as root)
dnf install -y rsyslog rsyslog-relp policycoreutils-python-utils
mkdir -p /var/log/remote

firewall-cmd --permanent --add-port=514/tcp
firewall-cmd --permanent --add-port=2514/tcp
firewall-cmd --reload

# Non-default syslog ports need SELinux labelling
semanage port -a -t syslogd_port_t -p tcp 2514

rsyslogd -N1
systemctl restart rsyslog
ss -tlnp | grep -E ':514|:2514'

For relay nodes, keep the storage ruleset independent from the forwarding ruleset. That way local disk writes still work if the upstream destination is down.

TLS forwarding

TLS gives you confidentiality and peer verification. Use a private CA, issue a server certificate to the receiver and a client certificate to each sender, and verify the receiver name from the sender with StreamDriverPermittedPeers.

# /etc/rsyslog.d/20-tls-receiver.conf
global(
  defaultNetstreamDriver="gtls"
  defaultNetstreamDriverCAFile="/etc/pki/rsyslog/ca.crt"
  defaultNetstreamDriverCertFile="/etc/pki/rsyslog/server.crt"
  defaultNetstreamDriverKeyFile="/etc/pki/rsyslog/server.key"
)

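# x509/name needs a list of acceptable client names. The wildcard below is an
# assumption; list the names your CA actually issues to senders.
module(load="imtcp"
  StreamDriver.Name="gtls"
  StreamDriver.Mode="1"
  StreamDriver.AuthMode="x509/name"
  PermittedPeer=["*.example.com"])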

input(type="imtcp" port="6514" ruleset="remote_store")

# /etc/rsyslog.d/20-tls-sender.conf
global(
  defaultNetstreamDriver="gtls"
  defaultNetstreamDriverCAFile="/etc/pki/rsyslog/ca.crt"
  defaultNetstreamDriverCertFile="/etc/pki/rsyslog/client.crt"
  defaultNetstreamDriverKeyFile="/etc/pki/rsyslog/client.key"
)

action(
  type="omfwd"
  target="loghub01.example.com"
  port="6514"
  protocol="tcp"
  StreamDriver="gtls"
  StreamDriverMode="1"
  StreamDriverAuthMode="x509/name"
  StreamDriverPermittedPeers="loghub01.example.com"
  action.resumeRetryCount="-1"
  queue.type="LinkedList"
  queue.filename="tls-forward"
  queue.saveOnShutdown="on"
)

firewall-cmd --permanent --add-port=6514/tcp
firewall-cmd --reload
semanage port -a -t syslogd_port_t -p tcp 6514

rsyslogd -N1
openssl s_client -connect loghub01.example.com:6514 \
  -CAfile /etc/pki/rsyslog/ca.crt \
  -verify_hostname loghub01.example.com
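
Peer names must match. If the sender expects loghub01.example.com but the certificate only contains logs.example.com, the TLS handshake succeeds but rsyslog still rejects the peer as unauthorized. Wildcard entries such as *.example.com are accepted in the permitted-peer list when one rule should cover several hosts.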
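
Before blaming rsyslog for a rejected peer, it helps to confirm which names the certificate really carries. The sketch below is one way to do that in shell. `san_has_name` is a hypothetical helper, not an rsyslog or OpenSSL tool; it assumes the subjectAltName text arrives on stdin in the comma-separated `DNS:name` form that OpenSSL prints, and it does not expand wildcard entries.

```shell
#!/bin/sh
# san_has_name (hypothetical helper): read subjectAltName text on stdin and
# succeed only if it contains an exact DNS entry for the wanted name.
san_has_name() {
  want=$1
  # Split on commas, trim surrounding whitespace, then require a whole-line,
  # fixed-string match so "xloghub01.example.com" does not pass by accident.
  tr ',' '\n' |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
    grep -Fxq "DNS:$want"
}

# Demo with literal SAN text; on a real host the input would come from openssl.
printf 'DNS:logs.example.com, DNS:loghub01.example.com\n' |
  san_has_name loghub01.example.com && echo "name present"
```

On the receiver, the input could come from `openssl x509 -in /etc/pki/rsyslog/server.crt -noout -ext subjectAltName`, assuming an OpenSSL release recent enough to support `-ext`. A non-zero exit status means the name the sender expects is not in the certificate.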

Queues and disk assist

A forwarding action without a queue is fragile. The queue gives rsyslog somewhere to hold messages while the receiver is restarting or the link is flapping. Disk assist keeps the queue alive when memory pressure or a long outage would otherwise drop events.

# /etc/rsyslog.d/30-forwarding.conf
global(workDirectory="/var/lib/rsyslog")

action(
  type="omfwd"
  target="logrelay01.example.com"
  port="6514"
  protocol="tcp"
  StreamDriver="gtls"
  StreamDriverMode="1"
  StreamDriverAuthMode="x509/name"
  StreamDriverPermittedPeers="logrelay01.example.com"
  action.resumeRetryCount="-1"
  action.reportSuspension="on"
  action.reportSuspensionContinuation="on"
  queue.type="LinkedList"
  queue.size="100000"
  queue.dequeueBatchSize="1000"
  queue.highWatermark="80000"
  queue.lowWatermark="20000"
  queue.filename="fwd-main"
  queue.maxDiskSpace="10g"
  queue.saveOnShutdown="on"
)

Size the disk intentionally. Ten gigabytes of queue space sounds large until a chatty app starts logging stack traces at 20 MB/s, at which point it buys less than nine minutes of buffering. Work backwards from outage tolerance and log rate, not gut feel.
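
The sizing arithmetic is simple enough to keep in a script. This is a rough sketch: `queue_hold_seconds` is a hypothetical helper, the 10 GB budget is taken as 10240 MB, and the result is an upper bound because it ignores any overhead in how queue files are stored on disk.

```shell
#!/bin/sh
# queue_hold_seconds (hypothetical helper): estimate how long a disk queue
# can absorb an outage.
# Arguments: queue disk budget in MB, sustained log rate in MB/s.
queue_hold_seconds() {
  disk_mb=$1
  rate_mb_per_s=$2
  # Integer division rounds down, which errs on the cautious side.
  echo $(( disk_mb / rate_mb_per_s ))
}

# 10 GB of queue space (taken as 10240 MB) filling at 20 MB/s
queue_hold_seconds 10240 20   # prints 512, roughly eight and a half minutes
```

Run it the other way round when planning: pick the outage you want to survive, multiply by the measured peak rate, and set queue.maxDiskSpace from the result.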

Templates and JSON output

Structured output makes downstream parsing cheaper. A string template with JSON escaping is the safe baseline. Emit one JSON object per line so tools like Loki, Logstash, Fluent Bit, and jq can consume the stream directly.

template(name="jsonLine" type="string"
  string="{\"@timestamp\":\"%timereported:::date-rfc3339%\",\
\"host\":\"%hostname%\",\
\"fromhost_ip\":\"%fromhost-ip%\",\
\"program\":\"%programname%\",\
\"facility\":\"%syslogfacility-text%\",\
\"severity\":\"%syslogseverity-text%\",\
\"msg\":\"%msg:::json%\"}\n")

action(
  type="omfile"
  file="/var/log/remote/normalized.json"
  template="jsonLine"
)

# Forward the same JSON template onward instead of raw RFC3164/RFC5424
action(
  type="omfwd"
  target="logstash01.example.com"
  port="10514"
  protocol="tcp"
  template="jsonLine"
  queue.type="LinkedList"
  queue.filename="json-out"
)

If you need richer metadata, pull fields from journald or message parsing modules first, then emit them in the template. Keep the transport reliable and the schema boring.

journald integration

On modern systemd hosts, journald sees more structure than classic syslog sockets do. Loading imjournal lets rsyslog forward fields like _SYSTEMD_UNIT and _PID instead of just the flattened message text.

# /etc/rsyslog.d/05-journal.conf
module(load="imjournal"
  StateFile="imjournal.state"
  PersistStateInterval="1000")

# Legacy imuxsock directive: stop reading the local log socket so that events
# already picked up from the journal are not ingested a second time
$OmitLocalLogging on

if $!_SYSTEMD_UNIT == "sshd.service" then {
  action(
    type="omfwd"
    target="loghub01.example.com"
    port="6514"
    protocol="tcp"
    StreamDriver="gtls"
    StreamDriverMode="1"
    StreamDriverAuthMode="x509/name"
    StreamDriverPermittedPeers="loghub01.example.com"
  )
}

journalctl -u sshd -n 20 --output=json-pretty
journalctl -u rsyslog -n 50
logger -t rsyslog-test "journal smoke test from $(hostname)"

Watch for duplicates. If both imuxsock and imjournal ingest the same messages, every event appears twice. See systemd & journalctl for the journald side of that integration.
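
A quick way to spot double ingestion is to send one uniquely tagged message and count how often it lands in the output file. The sketch below shows only the counting step. `count_marker` is a hypothetical helper, and the demo uses a temporary file with made-up log lines rather than a real log.

```shell
#!/bin/sh
# count_marker (hypothetical helper): print how many lines of a log file
# contain the given marker string.
# Arguments: log file path, marker text.
count_marker() {
  # grep -c exits non-zero when the count is 0, so mask that for callers
  # running under "set -e"; the printed count is still correct.
  grep -cF -- "$2" "$1" || true
}

# Demo: the same event written twice, as happens with two overlapping inputs.
demo_log=$(mktemp)
printf '%s\n' \
  'May  1 12:00:00 web01 rsyslog-test: dup-test-1' \
  'May  1 12:00:00 web01 rsyslog-test: dup-test-1' > "$demo_log"
count_marker "$demo_log" dup-test-1   # prints 2: the event was stored twice
rm -f "$demo_log"
```

On a live host, send the marker with logger, wait a moment, and point count_marker at whichever file your rules write to (the path depends on your configuration). A count of 1 is the healthy result.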

Monitoring queues and delivery

impstats exposes what rsyslog itself is doing: queue depth, enqueued messages, discarded events, action suspensions, and more. Bind it to its own ruleset so statistics are not stuck behind the same broken main queue they are meant to describe.

module(load="impstats"
  interval="60"
  severity="7"
  format="json"
  resetCounters="on"
  ruleset="stats")

ruleset(name="stats"
  queue.type="LinkedList"
  queue.filename="statsq"
  queue.saveOnShutdown="on") {
  action(type="omfile" file="/var/log/rsyslog-impstats.json")
}

tail -f /var/log/rsyslog-impstats.json
journalctl -u rsyslog -n 100 --no-pager

# End-to-end smoke test
logger -p local0.notice "forwarding test from $(hostname) at $(date -Iseconds)"

# Transport checks
ss -tnp | grep rsyslog
tcpdump -ni any port 6514

Alert on trends, not just hard failure: queue size climbing for ten minutes, repeated action suspended messages, or discarded.full becoming non-zero.
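
As a starting point for alerting, the following sketch scans impstats records for queues that discarded messages or exceeded a size threshold. `impstats_alerts` is a hypothetical helper, and the sample record is illustrative: it assumes flat JSON queue records with `name`, `origin`, `size`, and `discarded.full` fields, so compare it with what your own impstats file contains. It checks one snapshot at a time; trend detection over ten minutes belongs in whatever monitoring system consumes the output.

```shell
#!/bin/sh
# impstats_alerts (hypothetical helper): read impstats lines on stdin and
# print "ALERT <queue> <reason>" for queue records that look unhealthy.
# Argument: queue size above which a queue counts as backing up.
impstats_alerts() {
  max_size=$1
  awk -v max="$max_size" '
    # Return the value that follows "key": in a flat JSON record.
    function field(line, key,    re, s) {
      re = "\"" key "\": *\"?[^,\"}]*"
      if (match(line, re) == 0) return ""
      s = substr(line, RSTART, RLENGTH)
      sub(/^"[^"]*": *"?/, "", s)
      return s
    }
    /"origin": *"core.queue"/ {
      name = field($0, "name")
      if (field($0, "discarded.full") + 0 > 0) print "ALERT " name " discarded.full"
      if (field($0, "size") + 0 > max + 0) print "ALERT " name " size"
    }
  '
}

# Illustrative record only; real queue names and counters will differ.
sample='2024-05-01T12:00:00+00:00 relay01 rsyslogd-pstats: { "name": "fwd-main", "origin": "core.queue", "size": 85000, "enqueued": 120000, "full": 3, "discarded.full": 12, "discarded.nf": 0, "maxqsize": 100000 }'
printf '%s\n' "$sample" | impstats_alerts 80000
```

With resetCounters="on" the counters are per-interval values, so any non-zero discarded.full in a record means messages were dropped during that interval. Against the live file, a cron job could feed the most recent lines of /var/log/rsyslog-impstats.json into the function.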

Troubleshooting

Symptom | Likely cause | What to check
No remote logs arrive; sender restarts cleanly | Wrong protocol or port, firewall drop, or receiver not listening | ss -tlnp on the receiver, nc -zv loghub01 514 or 6514 from sender, and tcpdump on both sides
action suspended in journalctl -u rsyslog | Receiver down, TLS handshake failure, or name mismatch | Inspect the journal, then verify with openssl s_client and compare cert SAN/CN to StreamDriverPermittedPeers
Queue files grow under /var/lib/rsyslog | Downstream outage or queue too small for current burst rate | Read impstats, check disk free space, and calculate expected hold time from current log rate
Messages from systemd units are missing fields or doubled | imjournal not loaded, or loaded together with socket input without dedupe | Review the active input modules and compare a raw journalctl --output=json event to what rsyslog emits
rsyslog cannot bind the TLS port on restart | SELinux does not allow that non-default port | ausearch -m AVC -ts recent, semanage port -l | grep syslogd, and the workflow in SELinux Debugging
Network devices send logs but lines are missing under load | UDP loss on the network or sender-side rate limiting | Prefer TCP/RELP where possible; for devices, check their own send queue and rate-limit settings

# Safe debugging sequence
rsyslogd -N1
systemctl restart rsyslog
journalctl -u rsyslog -n 50 --no-pager
logger -p authpriv.notice "rsyslog delivery test"
tcpdump -ni any host loghub01.example.com and port 6514

Cross-reference