I knew log management was going to be an issue from the very beginning of setting up my Kubernetes cluster. Four months later, I finally have a solution that works for me.
Overview
I used fluent-bit to continuously tail my container logs, and syslog-ng to collect and store them.
This setup has the following advantages compared to relying on kubectl logs:
- Logs are stored in a central location, not separated across nodes
- They are stored persistently
- Logs are decoupled from any Kubernetes resources, i.e. they are not lost when a pod is deleted, or even when the entire cluster is deleted
Prerequisites
- A Kubernetes cluster
- Some kind of storage class available in the cluster
Set up syslog-ng
This syslog-ng pod will serve as the destination for all logs collected by fluent-bit. It will later be referenced in the fluent-bit configuration.
|
apiVersion: v1
kind: ConfigMap
metadata:
  name: syslog
  namespace: monitoring
data:
  syslog-ng.conf: |
    @version: 4.2
    @include "scl.conf"
    options {
      time-zone("America/Denver");
    };
    source s_fluentd_tcp {
      syslog(port(514) transport("tcp") flags(syslog-protocol));
    };
    source s_fluentd_udp {
      syslog(port(514) transport("udp") flags(syslog-protocol));
    };
    # Dynamic log path with Namespace, Pod, and Container
    destination d_namespace_logs {
      file("/var/log/k8s/${.SDATA.kubernetes.namespace_name}/${.SDATA.kubernetes.app}.log"
        create-dirs(yes)
        template("$ISODATE ${.SDATA.kubernetes.pod_name} ${.SDATA.kubernetes.container_name} $MSG\n"));
    };
    log {
      source(s_fluentd_tcp);
      source(s_fluentd_udp);
      destination(d_namespace_logs);
    };
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: syslog
  namespace: monitoring
  labels:
    app: syslog
spec:
  replicas: 1
  selector:
    matchLabels:
      app: syslog
  template:
    metadata:
      labels:
        app: syslog
    spec:
      containers:
        - name: syslog
          image: lscr.io/linuxserver/syslog-ng:latest
          ports:
            - containerPort: 514
            - containerPort: 601
            - containerPort: 6514
          env:
            - name: PUID
              value: "1000"
            - name: PGID
              value: "1000"
            - name: TZ
              value: "America/Denver"
          volumeMounts:
            - mountPath: /var/log
              name: syslog
            - mountPath: /config/syslog-ng.conf
              name: syslog-config
              subPath: syslog-ng.conf
        - name: logrotate
          image: blacklabelops/logrotate
          env:
            - name: LOGROTATE_COPIES
              value: "10"
            - name: LOGS_DIRECTORIES
              value: /var/log/
            - name: LOGROTATE_INTERVAL
              value: daily
            - name: LOGROTATE_DATEFORMAT
              value: "-%Y%m%d"
          volumeMounts:
            - mountPath: /var/log
              name: syslog
      securityContext:
        fsGroup: 1000
      volumes:
        - name: syslog
          persistentVolumeClaim:
            claimName: syslog
        - name: syslog-config
          configMap:
            name: syslog
---
apiVersion: v1
kind: Service
metadata:
  name: syslog
  namespace: monitoring
spec:
  selector:
    app: syslog
  ports:
    - protocol: UDP
      port: 514
      name: syslog-udp
      targetPort: 514
    - protocol: TCP
      name: syslog-tcp
      port: 601
      targetPort: 601
    - protocol: TCP
      port: 6514
      targetPort: 6514
      name: syslog-tls
|
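For the persistent volume, I used a storage class backed by CephFS, so that logs are stored redundantly across 3 storage nodes. The PersistentVolumeClaim is not included here, but any storage class should work, as long as the claim is named syslog and lives in the monitoring namespace to match the Deployment above.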
syslog-ng configuration
In the syslog-ng configuration, I used three macros that are natively available in syslog-ng. The structured-data values will be populated by fluent-bit when it sends logs to syslog-ng.
- ${.SDATA.kubernetes}: $SDATA is a special field that holds the structured data of the message, a set of named elements that each contain key-value pairs. The kubernetes element will be set up later in the fluent-bit configuration to store the Kubernetes-specific fields.
- $ISODATE: The timestamp of the message in ISO 8601 format. There are a few other date formats available.
- $MSG: The log message.
The above configuration should create a directory structure like this:
|
.
|-- namespace1
| |-- app1.log
| |-- app2.log
|-- namespace2
| |-- app1.log
| |-- noapp.log
|
Logs will be stored according to the namespace and app name of the pod that generated them. The app name is taken from the pod's app label, which usually matches the deployment name. If the app label is not available, the logs will be stored in a file named noapp.log.
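To make the path and line format concrete, here is a minimal Python sketch that mimics how the file() destination and its template() combine the structured-data fields. It is only an illustration and not part of the setup: the function name render_destination and the sample values are hypothetical, and it assumes the app field has already been filled in (with noapp if necessary) on the fluent-bit side.

```python
from datetime import datetime, timezone


def render_destination(sdata, msg, timestamp):
    """Return (path, line) roughly as the syslog-ng destination would build them.

    `sdata` is a hypothetical dict standing in for the message's structured
    data, shaped like {"kubernetes": {"namespace_name": ..., "app": ...}}.
    """
    k8s = sdata["kubernetes"]
    # Mirrors the file() path: one directory per namespace, one file per app.
    path = f"/var/log/k8s/{k8s['namespace_name']}/{k8s['app']}.log"
    # Mirrors the template(): timestamp, pod, container, then the message.
    line = f"{timestamp.isoformat()} {k8s['pod_name']} {k8s['container_name']} {msg}\n"
    return path, line


sample = {
    "kubernetes": {
        "namespace_name": "namespace1",
        "app": "app1",
        "pod_name": "pod1",
        "container_name": "container1",
    }
}
path, line = render_destination(
    sample, "GET /healthz 200", datetime(2025, 3, 27, 12, 0, 0, tzinfo=timezone.utc)
)
print(path)          # /var/log/k8s/namespace1/app1.log
print(line, end="")  # 2025-03-27T12:00:00+00:00 pod1 container1 GET /healthz 200
```

Because the pod and container names are written into each line rather than into the path, all replicas of an app end up in the same file and can still be told apart with grep.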
Log rotation
Along with the syslog-ng container, I also deployed a logrotate container. This container will rotate logs daily and keep the last 10 copies of each log file.
The result will look like this:
|
.
|-- namespace1
| |-- app1.log
| |-- app1.log-20250327
| |-- app1.log-20250326
|
Set up fluent-bit
I used helm to install fluent-bit, with the following configuration:
|
# values.yaml
serviceAccount:
  create: true
  annotations: {}
  name:
rbac:
  create: true
  nodeAccess: false
  eventsAccess: false
## https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file
config:
  service: |
    [SERVICE]
        Daemon Off
        Flush {{ .Values.flush }}
        Log_Level {{ .Values.logLevel }}
        Parsers_File /fluent-bit/etc/parsers.conf
        Parsers_File /fluent-bit/etc/conf/custom_parsers.conf
        HTTP_Server On
        HTTP_Listen 0.0.0.0
        HTTP_Port {{ .Values.metricsPort }}
        Health_Check On
  ## https://docs.fluentbit.io/manual/pipeline/inputs
  inputs: |
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag kube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On
    [INPUT]
        Name systemd
        Tag host.*
        Systemd_Filter _SYSTEMD_UNIT=k3s.service
        Read_From_Tail On
  ## https://docs.fluentbit.io/manual/pipeline/filters
  filters: |
    [FILTER]
        Name kubernetes
        Match kube.*
        Merge_Log On
        Keep_Log On
        K8S-Logging.Parser On
        K8S-Logging.Exclude on
        Buffer_Size 64KB
    [FILTER]
        Name nest
        Match kube.*
        Operation lift
        Nested_Under kubernetes
        Add_prefix k8s_
    [FILTER]
        Name nest
        Match kube.*
        Operation lift
        Nested_Under k8s_labels
        Add_prefix k8s_
    [FILTER]
        Name modify
        Match kube.*
        Add k8s_app noapp
    [FILTER]
        Name nest
        Match kube.*
        Wildcard k8s_*
        Operation nest
        Nest_under kubernetes
        Remove_prefix k8s_
  ## https://docs.fluentbit.io/manual/pipeline/outputs
  outputs: |
    [OUTPUT]
        name syslog
        match kube.*
        host syslog.monitoring.svc.cluster.local
        port 514
        mode udp
        syslog_format rfc5424
        syslog_maxsize 2048
        syslog_severity_key severity
        syslog_facility_key facility
        syslog_hostname_key hostname
        syslog_appname_key appname
        syslog_procid_key app
        syslog_msgid_key msgid
        syslog_sd_key kubernetes
        syslog_message_key log
    [OUTPUT]
        Name stdout
        Match *
        Format json
|
Install fluent-bit with the values:
|
helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-bit fluent/fluent-bit -f values.yaml -n monitoring
|
Now both fluent-bit and syslog-ng should be running, and logs should start appearing under /var/log/k8s on the syslog-ng volume.
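If nothing shows up, it can help to test syslog-ng on its own, without fluent-bit in the path. The following is a minimal Python sketch, not part of the original setup, that builds an RFC 5424 message with a kubernetes structured-data element similar to what the syslog output is configured to send. The function names, the test hostname and app name, and the sample field values are all hypothetical. It assumes it is run from a pod inside the cluster, where the Service name resolves and UDP port 514 is reachable.

```python
import socket
from datetime import datetime, timezone


def escape_sd_value(value):
    """Escape the three characters RFC 5424 reserves inside SD-PARAM values."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")


def build_rfc5424(msg, sd_id, sd_params, hostname="test-host", appname="test",
                  pri=14, timestamp=None):
    """Build '<PRI>1 TIMESTAMP HOSTNAME APP-NAME PROCID MSGID [SD] MSG'.

    pri=14 means facility 1 (user) and severity 6 (info): 1 * 8 + 6.
    PROCID and MSGID are left as '-' (the RFC 5424 nil value).
    """
    ts = (timestamp or datetime.now(timezone.utc)).isoformat(timespec="milliseconds")
    params = " ".join(f'{k}="{escape_sd_value(v)}"' for k, v in sd_params.items())
    return f"<{pri}>1 {ts} {hostname} {appname} - - [{sd_id} {params}] {msg}"


def send_udp(line, host, port=514):
    """Send one syslog datagram. UDP gives no delivery confirmation."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (host, port))


test_line = build_rfc5424(
    "hello from a test client",
    "kubernetes",
    {
        "namespace_name": "namespace1",
        "app": "app1",
        "pod_name": "pod1",
        "container_name": "container1",
    },
)
print(test_line)

# Uncomment when running from a pod inside the cluster:
# send_udp(test_line, "syslog.monitoring.svc.cluster.local")
```

With the syslog-ng configuration above, a message like this should end up in /var/log/k8s/namespace1/app1.log. If it does, syslog-ng is working and any remaining problem is more likely on the fluent-bit side, where the stdout output in the values file shows the records as fluent-bit sees them.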
Filters
Here, I’m using the kubernetes filter to enrich the logs generated by Kubernetes pods. The filter adds a field called kubernetes to each log record, which contains all the Kubernetes-specific fields.
However, the structure will be nested like this:
|
kubernetes:
  namespace_name: namespace1
  pod_name: pod1
  container_name: container1
  labels:
    app: app1
  ...
|
Syslog structured data is a flat list of key-value pairs, so syslog-ng cannot address nested fields. I therefore flattened the fields like this:
|
kubernetes:
  namespace_name: namespace1
  pod_name: pod1
  container_name: container1
  app: app1
  ...
|
This will enable syslog-ng to access fields like ${.SDATA.kubernetes.namespace_name} and ${.SDATA.kubernetes.app}.
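The flattening is done by the chain of nest and modify filters in the values file: lift the kubernetes map, lift the labels map, add a default app, then nest everything back. The following Python sketch simulates that chain on a plain dict to show what each step does to a record. It is a simplified model under my own assumptions, not fluent-bit code, and the helper names lift, add_if_missing, nest and flatten_kubernetes are hypothetical.

```python
def lift(record, nested_under, prefix):
    """Model of nest/lift: move the children of one map up a level, adding a prefix."""
    child = record.get(nested_under)
    if not isinstance(child, dict):
        return dict(record)  # nothing to lift, record is unchanged
    out = {k: v for k, v in record.items() if k != nested_under}
    for key, value in child.items():
        out[prefix + key] = value
    return out


def add_if_missing(record, key, value):
    """Model of modify/Add: set the key only when it does not exist yet."""
    out = dict(record)
    out.setdefault(key, value)
    return out


def nest(record, prefix, nest_under):
    """Model of nest/nest: gather keys starting with `prefix` under one map, prefix removed."""
    out, nested = {}, {}
    for key, value in record.items():
        if key.startswith(prefix):
            nested[key[len(prefix):]] = value
        else:
            out[key] = value
    if nested:
        out[nest_under] = nested
    return out


def flatten_kubernetes(record):
    """Apply the four steps in the same order as the filters in values.yaml."""
    r = lift(record, "kubernetes", "k8s_")      # kubernetes.pod_name -> k8s_pod_name
    r = lift(r, "k8s_labels", "k8s_")           # k8s_labels.app -> k8s_app
    r = add_if_missing(r, "k8s_app", "noapp")   # default when there is no app label
    return nest(r, "k8s_", "kubernetes")        # k8s_* -> kubernetes.*


record = {
    "log": "hello",
    "kubernetes": {
        "namespace_name": "namespace1",
        "pod_name": "pod1",
        "container_name": "container1",
        "labels": {"app": "app1"},
    },
}
print(flatten_kubernetes(record))
```

One consequence of this approach is that every label, not just app, is lifted into the kubernetes map alongside the built-in fields, so a label that shares a name with a built-in field would collide with it.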
Conclusion
It has been a few days since I set this up, and so far it’s running great.
I’ve also looked at some other solutions such as Grafana Loki, but I found them to be overkill for my needs. What I like about fluent-bit and syslog-ng is that the logs are just collected and stored in plain text, with minimal processing.