Logging to IONOS Cloud LaaS with Nomad
Last year, I covered how to set up and configure the IONOS Cloud LaaS offering. Now it is time to showcase how we collect logs from Nomad via FluentBit and send them to the IONOS Cloud LaaS.
Over the past 1.5 years, we've experimented with various techniques to collect logs from Nomad and send them to the IONOS Cloud LaaS. Both Nomad and LaaS have their own limitations, which present challenges.
In IONOS LaaS, you define pipelines to accept logs from your applications. Each pipeline accepts tags, which allow you to group log messages. However, there are limitations: initially we could define 10 tags per pipeline, but today only 5 tags per pipeline are supported. Tag naming is also restricted: a tag may be at most 12 characters long and may only contain letters.
In Nomad, FluentBit operates directly on the log files. The only information we have to identify a task (or job) and route its logs to the correct pipeline and tag is the task name, which is part of the log file name.
Fortunately, FluentBit is highly flexible, and with some Lua scripting, we can overcome these limitations. Ideally, we'd have a tag for each Nomad job. However, since we run many jobs on Nomad, creating multiple LaaS pipelines and maintaining complex logic in FluentBit would be cumbersome.
Instead, we opted for a simpler approach. We created one pipeline with three tags: Nomad, Prod, Stage
- The Nomad tag is used to route all logs from all service jobs (e.g. Traefik, SeaweedFS, ...)
- The Prod tag is used to route all logs from all Nomad jobs with production applications
- The Stage tag is used to route all logs from all Nomad jobs with staging (or testing) applications
This way, we can use the pipeline tags as the main filter for our logs in the Grafana instance. In addition, we push the name of the task to LaaS, which gives us a second way to narrow the results down to a specific task.
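For example, assuming the pipeline tag shows up in Grafana as a stream label named tag and the forwarded record is stored as JSON (both are assumptions about how LaaS exposes the data), a query for a hypothetical task myapp-prod could look like this:
{tag="Prod"} | json | taskname = "myapp-prod"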
Nomad Client configuration
First, we need to make sure that FluentBit can read the task log files on each Nomad client. Therefore, we add the following lines to the Nomad client configuration to expose the "logs" host volume:
client {
  enabled = true

  host_volume "logs" {
    path      = "/var/nomad/data/alloc"
    read_only = true
  }
}
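The Nomad agent on each client has to be restarted so that the new host volume is picked up. Afterwards, the volume should be listed on the node (assuming Nomad runs as a systemd service named nomad):
sudo systemctl restart nomad
# the "logs" volume should now appear under "Host Volumes"
nomad node status -self -verbose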
Nomad Job configuration
We use the following Nomad job file to deploy FluentBit as a system job, so it runs on every client and can access and process the logs:
variable "tag" {
  type    = string
  default = "4.0.3"
}

job "fluentbit" {
  datacenters = ["dc1"]
  type        = "system"
  node_pool   = "all"

  meta {
    Version = "${var.tag}"
  }

  update {
    stagger      = "1s"
    max_parallel = 1
  }

  group "fluentbit" {
    volume "logs" {
      type      = "host"
      read_only = true
      source    = "logs"
    }
    task "fluentbit" {
      driver = "docker"

      volume_mount {
        volume      = "logs"
        destination = "/data"
        read_only   = true
      }

      template {
        data = <<EOH
[SERVICE]
    flush 5
    daemon Off
    log_level debug
    parsers_file parsers.conf
    plugins_file plugins.conf
    http_server on
    http_listen 0.0.0.0
    http_port 2020
    storage.metrics on
    storage.path /var/log/flb-storage/
    storage.sync normal
    storage.checksum off
    storage.backlog.mem_limit 5M
    storage.backlog.flush_on_shutdown off
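# Tail every task log file from the mounted allocation directories.
# Path_Key keeps the file name in the record so the Lua filter can derive the task name;
# Exclude_Path prevents FluentBit from ingesting its own logs.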
[INPUT]
    Name tail
    Tag Nomad
    Path /data/*/alloc/logs/*.*.*
    Exclude_Path *fluentbit*
    Skip_Empty_Lines On
    Path_Key filename
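# Derive the task name from the file name and set the log_tag field (see script.lua below)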
[FILTER]
    Name lua
    Match Nomad
    script /fluent-bit/scripts/script.lua
    call add_taskname
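# Re-tag each record according to its log_tag field so it reaches the matching output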
[FILTER]
    Name rewrite_tag
    Match Nomad
    Rule $log_tag ^Prod$ Prod true
    Rule $log_tag ^Stage$ Stage true
    Rule $log_tag ^Nomad$ Nomad true
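# One forward output per LaaS pipeline tag; host, port and shared key come from Nomad variables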
[OUTPUT]
    Name forward
    Match Nomad
    Port {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .PORT }}{{ end }}
    Host {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .HOST }}{{ end }}
    tls on
    Shared_Key {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .SHARED_KEY }}{{ end }}
    storage.total_limit_size 5M
[OUTPUT]
    Name forward
    Match Prod
    Port {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .PORT }}{{ end }}
    Host {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .HOST }}{{ end }}
    tls on
    Shared_Key {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .SHARED_KEY }}{{ end }}
    storage.total_limit_size 5M
[OUTPUT]
    Name forward
    Match Stage
    Port {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .PORT }}{{ end }}
    Host {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .HOST }}{{ end }}
    tls on
    Shared_Key {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .SHARED_KEY }}{{ end }}
    storage.total_limit_size 5M
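# Additionally print all records to stdout (FluentBit's own log files are excluded from tailing above)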
[OUTPUT]
    Name stdout
    Match *
    storage.total_limit_size 5M
EOH
        destination = "local/fluent-bit.conf"
      }

      template {
        data = <<EOH
function add_taskname(tag, timestamp, record)
    local filename = record["filename"] or ""
    local taskname = string.match(filename, "^.*/(.*)%.std.*$") or "unknown"
    record["taskname"] = taskname
    -- new_tag is used to route to the matching OUTPUT directive via the rewrite_tag filter
    local new_tag = "Nomad"
    if string.match(taskname, "-stage$") then
        new_tag = "Stage"
    elseif string.match(taskname, "-prod$") then
        new_tag = "Prod"
    end
    record["log_tag"] = new_tag
    return 2, timestamp, record, new_tag
end
EOH
        destination = "local/script.lua"
      }

      config {
        image        = "fluent/fluent-bit:${var.tag}"
        force_pull   = true
        hostname     = "fluentbit-prod"
        network_mode = "host"

        volumes = [
          "local/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf",
          "local/script.lua:/fluent-bit/scripts/script.lua"
        ]
      }

      resources {
        cpu    = 100
        memory = 256
      }
    }
  }
}
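Deploying (or updating) the setup is then a single command; the tag variable can be overridden to roll out a different FluentBit version (the file name fluentbit.nomad.hcl is just a placeholder):
nomad job run fluentbit.nomad.hcl
# or pin a specific FluentBit version explicitly
nomad job run -var 'tag=4.0.3' fluentbit.nomad.hcl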
FluentBit configuration
To overcome the limitation of only having access to the task name via the log file's name, we implemented a naming convention for our tasks: we append a specific postfix to each task name so that the corresponding tag can be derived from it. The postfixes used are:
- prod for production tasks
- stage for staging tasks
This convention allows us to differentiate between production and staging tasks and apply the correct tag accordingly.
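In a job file, this simply means naming the task accordingly. A minimal sketch for a hypothetical production application called myapp:
job "myapp" {
  datacenters = ["dc1"]

  group "myapp" {
    task "myapp-prod" {
      driver = "docker"

      config {
        image = "registry.example.com/myapp:1.0.0" # hypothetical image
      }
    }
  }
}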
To extract the postfix and turn it into the corresponding tag, we use the following Lua script:
function add_taskname(tag, timestamp, record)
    local filename = record["filename"] or ""
    local taskname = string.match(filename, "^.*/(.*)%.std.*$") or "unknown"
    record["taskname"] = taskname
    -- new_tag is used to route to the matching OUTPUT directive via the rewrite_tag filter
    local new_tag = "Nomad"
    if string.match(taskname, "-stage$") then
        new_tag = "Stage"
    elseif string.match(taskname, "-prod$") then
        new_tag = "Prod"
    end
    record["log_tag"] = new_tag
    -- return code 2 tells FluentBit to keep the original timestamp and use the modified record
    return 2, timestamp, record, new_tag
end
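For example, with a hypothetical allocation ID and a task named myapp-prod, the patterns resolve like this (the snippet can be run with a standalone Lua interpreter):
local filename = "/data/0b1c2d3e/alloc/logs/myapp-prod.stdout.0"
print(string.match(filename, "^.*/(.*)%.std.*$")) -- prints "myapp-prod"
print(string.match("myapp-prod", "-prod$"))       -- matches, so log_tag becomes "Prod"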
However, that alone was not enough to get the log routing in FluentBit right. We also had to add the following rewrite_tag filter, which re-tags each record based on the log_tag entry set by the Lua script so that it reaches the expected output:
[FILTER]
    Name rewrite_tag
    Match Nomad
    Rule $log_tag ^Prod$ Prod true
    Rule $log_tag ^Stage$ Stage true
    Rule $log_tag ^Nomad$ Nomad true
And since that still was not enough, we also had to define an OUTPUT directive for each tag:
[OUTPUT]
    Name forward
    Match Nomad
    Port {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .PORT }}{{ end }}
    Host {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .HOST }}{{ end }}
    tls on
    Shared_Key {{ with nomadVar "nomad/jobs/fluentbit" }}{{ .SHARED_KEY }}{{ end }}
    storage.total_limit_size 5M
While this is redundant, we could at least pull the configuration (Host, Port, Shared Key) from Nomad variables.
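The corresponding Nomad variable only has to be created once, before the job is deployed; the values below are placeholders for the endpoint and shared key of the LaaS pipeline:
nomad var put nomad/jobs/fluentbit \
  HOST=<laas-endpoint> \
  PORT=<laas-port> \
  SHARED_KEY=<shared-key>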
We deployed this setup more than six months ago, and it has consistently delivered value for our specific use case. The flexibility it provides has been particularly beneficial: logs are routed to the correct pipeline and tag without requiring frequent configuration updates, even as we continue to onboard new jobs to our Nomad cluster.