mbdns 1.0.1
Sunday, Jan 6, 2019
I’ve just released mbdns v1.0.1, a bug fix release that stops mbdns devouring all open file descriptors on the system. While it’s just a 2 line change, the story of how I noticed, investigated and fixed the bug, and verified that the fix was correct, might be useful to others.
mbdns runs an infinite loop to process record updates, and every record update opens a file descriptor (fd) for the socket it needs to make the HTTP request to Mythic Beasts' API endpoint. I’d mistakenly used an (idiomatic) golang language feature called defer to clean up the request’s response, which works by postponing the deferred call until the surrounding function returns. But because the infinite loop never exits by design, the function never returned, so each trip round the loop would consume an extra fd per request and eventually exhaust them. Oops!
How did I notice? Running it for a while on the Ubiquiti EdgeRouter X (ER-X) that powers my home network caused fd exhaustion quite quickly due to the low fd limit on that platform, which stopped the router from doing its other jobs. Internet access suddenly went away!
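Here's a minimal sketch of that failure mode. It is not the actual mbdns code: fakeConn, leakyLoop and fixedLoop are hypothetical names, and a plain counter stands in for real sockets. The loops are bounded so the program terminates, and the count is taken on the final trip, before any deferred call has had a chance to run.

```go
package main

import "fmt"

// fakeConn is a hypothetical stand-in for a socket. It only tracks how many
// "descriptors" are open at once via a shared counter.
type fakeConn struct{ open *int }

func newFakeConn(open *int) *fakeConn {
	*open++
	return &fakeConn{open: open}
}

func (c *fakeConn) Close() { *c.open-- }

// leakyLoop mimics the bug: Close is deferred inside the loop, so nothing is
// released until leakyLoop itself returns. It reports how many connections
// were still open at the end of the last trip round the loop.
func leakyLoop(trips int) int {
	open, held := 0, 0
	for i := 0; i < trips; i++ {
		c := newFakeConn(&open)
		defer c.Close() // postponed until the function returns
		held = open
	}
	return held
}

// fixedLoop closes each connection before the next trip round the loop.
func fixedLoop(trips int) int {
	open, held := 0, 0
	for i := 0; i < trips; i++ {
		c := newFakeConn(&open)
		c.Close()
		held = open
	}
	return held
}

func main() {
	fmt.Println("deferred close, still open after 5 trips:", leakyLoop(5))
	fmt.Println("explicit close, still open after 5 trips:", fixedLoop(5))
}
```

With a loop that never exits, the deferred calls in the first version would never run at all.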
How do you figure out the number of possible open file descriptors on a Unix-like platform? You run ulimit -Hn to see the hard limit (4096 on an ER-X), and ulimit -Sn to see the soft limit (1024 on the ER-X). I’d set up mbdns to update 2 records every 300 seconds (24 fds consumed per hour), so it only took a little over 42 hours to exhaust the 1024 fd soft limit and bring the system to a halt.
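The arithmetic behind that estimate can be sketched in a few lines. The function names are hypothetical, and the sketch assumes one leaked fd per record update and ignores the handful of descriptors the process already holds at startup.

```go
package main

import "fmt"

// fdsPerHour returns how many descriptors leak per hour, assuming one leaked
// fd per record update.
func fdsPerHour(records, intervalSeconds int) float64 {
	return float64(records) * 3600 / float64(intervalSeconds)
}

// hoursToExhaust returns how long it takes the leak to reach the given limit.
func hoursToExhaust(limit, records, intervalSeconds int) float64 {
	return float64(limit) / fdsPerHour(records, intervalSeconds)
}

func main() {
	// 2 records every 300 seconds against the ER-X soft limit of 1024.
	fmt.Printf("%.0f fds per hour\n", fdsPerHour(2, 300))
	fmt.Printf("%.1f hours to reach the soft limit\n", hoursToExhaust(1024, 2, 300))
}
```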
To confirm that mbdns was exhausting descriptors, I took a look at its log to see what was happening after the limit was reached. I should have taken a copy of the exact message returned by golang when an HTTP request fails due to lack of fds, but it was quite descriptive and very clear about what was happening: no fds were available so sockets couldn’t be used.
Knowing that the HTTP request was the reason for exhaustion, because on Unix platforms a socket operation opens a descriptor, all I needed to do was look at the mbdns source code and go through the process() loop to see where it was creating and holding on to fds. mbdns is incredibly simple, so the fix was easy: always read the response body and close it, regardless of whether it’ll be used or not (it only gets printed into the log on error during normal operation).
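As a rough illustration of "always read the response body and close it", here's a sketch using Go's net/http package. It is not the actual mbdns change: updateRecord and runUpdates are hypothetical names, the request is a plain GET, and a local httptest server stands in for the real API endpoint.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// updateRecord is a hypothetical stand-in for one record update. It always
// drains and closes the response body before returning, whether or not the
// caller has any use for the body.
func updateRecord(client *http.Client, url string) (int, error) {
	resp, err := client.Get(url)
	if err != nil {
		return 0, err
	}
	_, copyErr := io.Copy(io.Discard, resp.Body) // read the body to the end
	closeErr := resp.Body.Close()                // release it straight away
	if copyErr != nil {
		return resp.StatusCode, copyErr
	}
	return resp.StatusCode, closeErr
}

// runUpdates starts a local test server, performs n updates against it and
// returns how many of them got a 200 response.
func runUpdates(n int) (int, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close() // fine here: runUpdates returns, so the deferred call runs

	ok := 0
	for i := 0; i < n; i++ {
		status, err := updateRecord(srv.Client(), srv.URL)
		if err != nil {
			return ok, err
		}
		if status == http.StatusOK {
			ok++
		}
	}
	return ok, nil
}

func main() {
	n, err := runUpdates(3)
	fmt.Println("successful updates:", n, "error:", err)
}
```

Moving the per-request work into its own function is a design choice of this sketch. Because updateRecord returns after every request, a deferred close inside it would also run once per request.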
How did I verify the fix was correct? I modified a test version of mbdns to shorten the loop iteration wait time to 10 seconds, ran it, and then asked the ER-X how many open file descriptors mbdns was holding with the following:
lsof -p $(pidof mbdns)
That command lists the open files (lsof) of the process ID belonging to mbdns (pidof mbdns). Running it several times before, during and after a few trips round the loop showed that there was an extra fd per request per loop, but that it was released when the request was complete. Success!
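The same before, during and after check can be sketched from inside a Go program. This assumes Linux, where /proc/self/fd lists the current process's open descriptors. countOpenFDs and fdCountsAroundOpen are hypothetical names, and opening the null device stands in for making a request.

```go
package main

import (
	"fmt"
	"os"
)

// countOpenFDs counts the entries in /proc/self/fd (Linux only). The total
// typically includes the descriptor used to read the directory itself, so it
// is best used for comparing counts rather than as an absolute figure.
func countOpenFDs() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// fdCountsAroundOpen returns the descriptor count before a file is opened,
// while it is held open, and after it has been closed.
func fdCountsAroundOpen() (int, int, int, error) {
	// Warm up first, so any descriptors the runtime creates lazily on the
	// first file operation do not skew the comparison.
	warm, err := os.Open(os.DevNull)
	if err != nil {
		return 0, 0, 0, err
	}
	warm.Close()
	if _, err := countOpenFDs(); err != nil {
		return 0, 0, 0, err
	}

	before, err := countOpenFDs()
	if err != nil {
		return 0, 0, 0, err
	}
	f, err := os.Open(os.DevNull)
	if err != nil {
		return 0, 0, 0, err
	}
	during, err := countOpenFDs()
	if err != nil {
		f.Close()
		return 0, 0, 0, err
	}
	if err := f.Close(); err != nil {
		return 0, 0, 0, err
	}
	after, err := countOpenFDs()
	return before, during, after, err
}

func main() {
	before, during, after, err := fdCountsAroundOpen()
	fmt.Println("before:", before, "during:", during, "after:", after, "error:", err)
}
```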
There was some other interesting output from lsof, showing me the other descriptors the process always keeps open. Those included what the shell was holding to run (one fd to the binary image for mbdns itself on disk), a descriptor for reading /dev/urandom, which is presumably to give the golang runtime a source of randomness, and also the two file handles for stdout that I redirect to the log to see what it’s doing!
lsof is a pretty handy utility to get to know, to help you get a handle (NOT SORRY) on what a process is doing with files on a Unix or Unix-like system.
If you’re an mbdns user, please upgrade to v1.0.1 as soon as possible, lest you run out of file descriptors on the host system eventually (or pretty soon if you run it on a resource constrained platform like the ER-X!).