Hi!
I keep running into what looks like an endless series of issues with
netfilter hooking into the Linux network stack in the wrong order.
This time it is the netfilter "reject" action, which triggers
"martian source" detection in the network stack in a weird way.
I have two "external" network interfaces (pointing to two ISPs),
eth-mts and eth-rinet, with policy routing set up for them (different
IPs are routed via different ISPs). The main routing table has no
default route; there are two additional routing tables, each with
just one "default" route pointing to one ISP or the other, and a few
rules like 'from foo lookup bar' that route specific IPs via one ISP
or the other.
There is also an "internal" interface, eth-tls, with 192.168.x
addresses behind it, and NAT for certain internal hosts.
The policy rules:

  from 192.168.177.2 lookup route-mts
  from all lookup route-rinet

route-mts:
  default via $route-mts dev eth-mts

route-rinet:
  default via $route-rinet dev eth-rinet
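
To make the intended rule selection explicit, here is a minimal Python sketch of the two policy rules above. It is a toy model, not how the kernel implements the lookup: it ignores the local and main tables, the function name lookup_dev is made up for illustration, and 192.168.177.7 is a hypothetical internal host used only to exercise the "from all" rule.

```python
import ipaddress

# Policy rules in priority order: (source selector, routing table).
RULES = [
    ("192.168.177.2/32", "route-mts"),   # from 192.168.177.2 lookup route-mts
    ("0.0.0.0/0", "route-rinet"),        # from all lookup route-rinet
]

# Each table holds a single default route; only its device matters here.
TABLES = {
    "route-mts": "eth-mts",
    "route-rinet": "eth-rinet",
}

def lookup_dev(src):
    """Return the outgoing device selected for source address `src`."""
    addr = ipaddress.ip_address(src)
    for selector, table in RULES:
        if addr in ipaddress.ip_network(selector):
            return TABLES[table]
    return None

print(lookup_dev("192.168.177.2"))   # host with a dedicated rule
print(lookup_dev("192.168.177.7"))   # hypothetical host, falls through to "from all"
```

The point of the sketch is only that the outgoing device depends on the source address of the packet, not on its destination.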
The relevant netfilter rules:

  ip filter chain forward {
    type filter hook forward priority filter
    ct state {established,related} accept
    iifname eth-tls oifname eth-mts ip saddr 192.168.177.2 <some-condition> accept
    iifname eth-tls oifname eth-mts ip saddr 192.168.177.2 log prefix "tls->ext reject: " reject
  }

  ip nat chain postrouting {
    type nat hook postrouting priority srcnat
    oifname eth-mts ip saddr 192.168.177.2 counter snat to $mts-ip
  }
With these, the following packets are seen on eth-tls interface
(192.168.177.5 is this router box):

  listening on eth-tls, link-type EN10MB (Ethernet), snapshot length 262144 bytes
  10:13:12.353416 IP 192.168.177.2.59152 > 16.15.192.246.443: Flags [S], seq 1254965270, win 64240, options [mss 1460,sackOK,TS val 1778537536 ecr 0,nop,wscale 7], length 0
  10:13:12.353617 IP 192.168.177.5 > 192.168.177.2: ICMP 16.15.192.246 tcp port 443 unreachable, length 68

So far, this all works as expected.
And at the same time, I see this in dmesg:

  Jul 03 10:13:12 gate kernel: tls->ext reject: IN=eth-tls OUT=eth-mts MAC=00:90:27:30:6d:1c:d6:4d:61:a4:e3:03:08:00 SRC=192.168.177.2 DST=16.15.192.246 LEN=60 TOS=0x00 PREC=0x00 TTL=63 ID=58986 DF PROTO=TCP SPT=41754 DPT=443 WINDOW=64240 RES=0x00 SYN URGP=0
  Jul 03 10:13:12 gate kernel: IPv4: martian source 192.168.177.2 from 16.15.192.246, on dev eth-rinet
  Jul 03 10:13:12 gate kernel: ll header: 00000000: 00 90 27 30 6d 1c d6 4d 61 a4 e3 03 08 00

and this logging makes no sense whatsoever.
This *seems* to be triggered by the reject packet (ICMP), which
netfilter injects into the network stack. Why does it trigger
"martian" detection at all? That is the first question: it should
not, since this packet was not sent by any external host.
Next, why is it logged as coming from the eth-rinet interface? It is
not coming from any interface, and it has nothing to do with this
interface either: the original packet was being routed to eth-mts,
not eth-rinet.
And finally, the eth-rinet iface is configured to drop martians (in
addition to logging them): rp_filter is set to 1. Yet this ICMP packet
is actually delivered to the originating host.
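
For reference, here is a minimal Python sketch of what a strict reverse-path check (rp_filter=1) would compute for the address pair in the log. It rests on two assumptions that are not confirmed anywhere above: that "martian source 192.168.177.2 from 16.15.192.246" describes a packet from 16.15.192.246 to 192.168.177.2, and that the reverse lookup consults the same policy rules with the addresses swapped. The names route_dev and strict_rp_ok are made up for illustration; this is a toy model, not the kernel's code.

```python
import ipaddress

# Policy rules from this setup, in priority order: (source selector, device
# of the default route in the table that the rule points to).
RULES = [
    ("192.168.177.2/32", "eth-mts"),   # from 192.168.177.2 lookup route-mts
    ("0.0.0.0/0", "eth-rinet"),        # from all lookup route-rinet
]

def route_dev(src):
    """Device chosen by the policy rules for a packet sent from `src`."""
    addr = ipaddress.ip_address(src)
    for selector, dev in RULES:
        if addr in ipaddress.ip_network(selector):
            return dev
    return None

def strict_rp_ok(pkt_src, pkt_dst, in_dev):
    """Toy strict reverse-path check.

    A reply would travel from pkt_dst back to pkt_src, so the rule is
    selected by pkt_dst. Both tables hold only a default route, so
    pkt_src does not influence the result in this model.
    """
    return route_dev(pkt_dst) == in_dev

# The pair from the log, tested against both external interfaces.
print(strict_rp_ok("16.15.192.246", "192.168.177.2", "eth-rinet"))
print(strict_rp_ok("16.15.192.246", "192.168.177.2", "eth-mts"))
```

Under these assumptions the check fails on eth-rinet and passes on eth-mts, which would at least be consistent with the interface named in the log line. It does not explain why such a lookup happens for a locally generated ICMP packet in the first place.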
What is going on here?
It seems that the interaction between the netfilter code and the
network stack should be reviewed. It looks like there are a few
rather serious bugs in there, all about the ordering of the hooks
between netfilter and the network/routing stack. We're almost there,
but not quite there yet :)
Thanks,
/mjt