Telegraf
Overview
Telegraf is the metrics collection agent that gathers host and service metrics and forwards them to Victoria Metrics for storage, alerting, and visualization.
Architecture
/usr/libexec/telegraf-services ← service status via ubus
/var/run/mwan3/iface_state/ ← WAN interface status via mwan3 state files
/proc filesystem ← CPU, memory, disk, network
│
▼
Telegraf (inputs.exec, inputs.cpu, inputs.mem, …)
│
▼
Victoria Metrics (http://127.0.0.1:8428)
│
└─▶ vmalert (alert rules evaluation)
Configuration Files
| Path | Description |
|---|---|
/etc/telegraf.conf |
Main Telegraf agent config and InfluxDB output |
/etc/telegraf.conf.d/*.conf |
Additional Telegraf input configurations for plugins |
Collected Metrics
To see the list of metrics collected by Telegraf, use:
/usr/bin/telegraf --config /etc/telegraf.conf --config-directory /etc/telegraf.conf.d --test
Service Health Monitoring (services.conf)
How it works: Every 60 seconds, Telegraf executes /usr/libexec/telegraf-services, which queries procd via ubus call service list, filters the fixed monitored service whitelist, and converts the matching configured instances to metrics.
Metric format:
procd_service_running{service="nginx", instance="instance1"} = 1 (running) or 0 (down)
procd_service_pid{service="nginx", instance="instance1"} = process_id
procd_service_exit_code{service="nginx", instance="instance1"} = last_exit_code
Only these services are monitored:
banip
conntrackd
cron
dedalo
dedalo_users_auth
dnsmasq
dropbear
keepalived
mwan3
netifyd
nginx
ns-api-server
ns-clm
ns-flashstart
ns-flows
ns-plug
ns-plug-alert-proxy
ns-stats
ns-ui
odhcpd
openvpn
qosify
rpcd
rsyslog
snort
swanctl
sysntpd
telegraf
victoria-metrics
vmalert
Services with no instances are skipped.
Notable skipped services:
adblock: excluded because ubus info is not reliable (always shows 1 instance even when disabled)
Querying Service Status
# All services and their running state
curl -s 'http://127.0.0.1:8428/api/v1/query?query=procd_service_running'
# Run collection script manually to preview output
/usr/libexec/telegraf-services
Multi-WAN Monitoring (mwan.conf)
How it works: Every 60 seconds, Telegraf executes /usr/libexec/telegraf-mwan, which reads /var/run/mwan3/iface_state/ to determine each WAN interface’s online/offline state (maintained by mwan3 in real-time).
Metric format:
mwan_interface_online{interface="wan"} = 1 (online) or 0 (offline)
Querying WAN Status
# All WAN interfaces and current state
curl -s 'http://127.0.0.1:8428/api/v1/query?query=mwan_interface_online'
# Run collection script manually
/usr/libexec/telegraf-mwan
Storage Status Monitoring (storage.conf)
How it works: Every 60 seconds, Telegraf executes /usr/libexec/telegraf-storage-status, which runs storage-status and exports the current storage health as a metric.
Metric format:
storage_status_error = 1 (error) or 0 (ok / not configured)
Advanced Configuration
To add custom metrics or modify collection intervals, edit the /etc/telegraf.conf.d/ files following Telegraf documentation. Common customizations:
- Modify collection intervals: change
intervalin main config - Add new input plugins: append
[[inputs.plugin_name]]sections
After changes, restart Telegraf:
/etc/init.d/telegraf restart
References
- Telegraf documentation
- Telegraf exec plugin
- OpenWrt procd init scripts
- OpenWrt ubus reference
- Victoria Metrics integration