Description
We are seeing an issue where fluentd will stop processing logs after some time, but the parent and child processes seem to be running normally.
We are running fluentd in a docker container on a kubernetes cluster, mounting the docker log volume /var/log/containers on the host.
In a recent incident, logs stopped being forwarded to the sumologic output, but activity continued in the fluentd log for another 12 minutes. At some point after that, fluentd stopped picking up new logs (e.g. no more "following tail of..." messages). containers.log.pos kept being updated for 1 hour 13 minutes after the first sign of problems, and then stopped changing as well.
Killing the fluentd child process gets everything going again.
Config, strace, lsof and sigdump included below.
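A stall like the one described above might be detected from outside the process by checking how recently the pos file was modified. The following is a minimal sketch under stated assumptions, not part of fluentd: the function name `pos_file_is_stale` and the threshold are hypothetical, and the pos file is a lagging signal, since in this incident it kept updating for over an hour after forwarding had stopped.

```python
import os
import time


def pos_file_is_stale(pos_path, max_age_seconds, now=None):
    """Return True if pos_path was last modified more than max_age_seconds ago.

    pos_path and max_age_seconds are chosen by the caller; both are
    illustrative assumptions, not values taken from the fluentd config.
    """
    if now is None:
        now = time.time()
    # Age of the file in seconds, based on its last modification time.
    age = now - os.path.getmtime(pos_path)
    return age > max_age_seconds
```

A check such as `pos_file_is_stale("containers.log.pos", 900)` (with the real path to the pos file substituted) could feed an alert or a liveness probe. The threshold would need tuning against how often the tailed logs normally change.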
Details:
- fluentd or td-agent version.
  fluentd 0.12.37
- Environment information, e.g. OS.
  host: 4.9.9-coreos-r1
  container: debian jessie
- Your configuration
  see attachments
Attachments:
fluentd config
lsof of child process
sigdump of child process
strace of child process
fluentd log