Hi there,
Still just feeling my way around userspace processing with netfilter,
so anything (such as stupidity on my part) is possible.
In the hope of teaching myself something I've hacked the nfqnl_test.c
code to give me statistics about packets outgoing from our mailserver.
One of the processes here which generates outgoing packets is part of
our Nagios/Icinga setup. It reports how many days remain before the
certificate for our mailserver expires. The Icinga server calls a
plugin on the mailserver. The plugin uses openssl to make the
mailserver connect to itself and query the certificate, then reports
the number of valid days remaining to Icinga. I chose to have the
mailserver connect to its own LAN interface IP to request this
information.
This worked fine for years. Right up until I installed nfqnl_test.
To begin with it didn't appear to cause any problems. But three and a
half days later, all the network connections to the box went down. To
recover, I had to go to a console and kill the nfqnl_test process.
The three and a half day delay turned out to be repeatable, so it took
quite a while to figure out what was going on. It *looks* like what's
happening is this: if nfqnl_test sees a packet with *both* the source
and destination IPv4 addresses equal to that of the Ethernet interface
and then calls
    int result = nfq_set_verdict(qh, id, NF_ACCEPT, 0, NULL);
the packet is not accepted, even though the return value does not
indicate an error:
May 14 16:44:16 mail6 NFqueue[27783]: Line 254: pktlen=64 data=[450000405283400040060eb2c0a82c19c0a82c19cfe80019b5fe6f1b812db3e5]
May 14 16:44:16 mail6 NFqueue[27783]: Line 290: ACCEPT packet (nfq_set_verdict returned [32])
I don't know what happens to the packet. I think it gets stuck in the
queue, and eventually the queue fills up and the network hangs:
May 6 04:19:12 mail6 kernel: nfnetlink_queue: nf_queue: full at 1024 entries, dropping packets(s)
May 6 04:19:12 mail6 kernel: nfnetlink_queue: nf_queue: full at 1024 entries, dropping packets(s)
May 6 04:19:12 mail6 kernel: nfnetlink_queue: nf_queue: full at 1024 entries, dropping packets(s)
...
...
If I put in a rule (rule 4 below) which ACCEPTs these packets before
they're sent to userspace, things work as I expect them to work.
Chain OUTPUT (policy ACCEPT 35456608 packets, 9701968770 bytes)
num  target   prot  opt  in  out  source         destination
1    NFQUEUE  all   --   *   *    0.0.0.0/0      192.168.241.0/28  NFQUEUE num 11 bypass
2    ACCEPT   !tcp  --   *   *    0.0.0.0/0      0.0.0.0/0
3    ACCEPT   all   --   *   *    127.0.0.0/8    0.0.0.0/0
4    ACCEPT   all   --   *   *    192.168.44.25  192.168.44.25
5    NFQUEUE  tcp   --   *   *    0.0.0.0/0      0.0.0.0/0         NFQUEUE num 11 bypass
So to my questions:
1. Does any of this make sense?
2. How can I instrument the queue? I'd really like to know how many
packets are in it at any time I choose to look. I've looked at a few
utilities but nothing seems to answer the question.
3. Any suggestions to take the investigation further?
--
73,
Ged.