Hello Chris,
Thank you for the detailed write-up and for the time spent narrowing this down.
We would like to move this to a support ticket so we can collect the necessary environment details and verify whether this is a separate issue or something already addressed to our team recently. Based on your report, the behaviour appears to be tied to imunify-auditd-log-reader activity under audit log load, so we would like to inspect it on a case-by-case basis.
Please submit a ticket here: https://cloudlinux.zendesk.com/hc/en-us/requests/new
Also, please attach the Imunify360 doctor key to it: https://cloudlinux.zendesk.com/hc/en...w-to-obtain-it
imunify-auditd-log-reader bug
Collapse
X
-
imunify-auditd-log-reader bug
Spotted 1 bug, code defect causing measurable CPU waste that scales with server load.
(why link libstdc++ in a go binary anyway?)
daemon consumes sustained CPU on otherwise idle servers. Measured across two independent servers:
server a 5.5% ~30h 72h 2.4× realtime
server b 22.1% ~30h 366h 12.2× realtime
The fact that impact scales proportionally with server load (audit.log write frequency) is itself diagnostic — it points directly to a read loop driven by incoming events rather than a fixed-cost background operation.
binary opens an inotify file descriptor (fd=19) in O_NONBLOCK mode and relies on the Go runtime's netpoller to park the goroutine between events. The netpoller uses epoll internally.
However, inotify file descriptors have specific epoll semantics that this version of the Go runtime cannot satisfy — every epoll_ctl registration attempt fails.
$ strace -p 1268509 -e trace=epoll_wait,epoll_ctl -c -- sleep 10 %
time seconds usecs/call calls errors syscall
100.00 0.000287 5 55 55 epoll_ctl <- 100% error rate
0.00 0.000000 0 0 0 epoll_wait <- never called
epoll_wait is never invoked. The goroutine cannot park. As a result, it immediately retries the read after every EAGAIN, producing a tight spin loop.
The Go scheduler fires SIGURG signals continuously to preempt the spinning goroutine — which simply reschedules and spins again.
Proof — strace read trace:
read(19, "\xc8\x02\x00\x00\x02\x00\x00\x00\x00\x00\x00\ x00\ x00\x00\x00\x00", 65536) = 16
# ^ inotify IN_WRITE event consumed (16 bytes = one inotify_event struct, no filename)
read(19, 0xc0005c7e53, 65536) = -1 EAGAIN ← fd drained, goroutine should park here
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=1268509, si_uid=0} ---
--- SIGURG ... --- (×30+ more — Go scheduler attempting preemption)
read(19, "\xc8\x02\x00\x00\x02\x00...", 65536) = 16 ← next event, loop repeats
read(19, 0xc0005c7e53, 65536) = -1 EAGAIN
...
The decoded inotify event (wd=0x2c8, mask=0x2 = IN_WRITE, cookie=0, name_len=0) shows the mechanism is correctly receiving write notifications on audit.log — the event detection works fine. The failure is entirely in the goroutine's inability to park between events.
The standard Go pattern for inotify bypasses the netpoller entirely by using a blocking fd on a dedicated OS thread.
maybe it should a) use blocking mode, OS parks the thread, no go scheduler involvement
or migrate it to fsnotify ?
by the way, on imunify-agent-proxy you ship debug symbols in production and it's not stripped.Tags: None
Leave a comment: