When a nexthop object is deleted, it is marked as dead and then fib_table_flush() is called to flush all the routes that are using the dead nexthop. The current logic in fib_table_flush() is to only flush error routes (e.g., blackhole) when it is called as part of network namespace dismantle (i.e., with flush_all=true). Therefore, error routes are not flushed when their nexthop object is deleted: # ip link add name dummy1 up type dummy # ip nexthop add id 1 dev dummy1 # ip route add 198.51.100.1/32 nhid 1 # ip route add blackhole 198.51.100.2/32 nhid 1 # ip nexthop del id 1 # ip route show blackhole 198.51.100.2 nhid 1 dev dummy1 As such, they keep holding a reference on the nexthop object which in turn holds a reference on the nexthop device, resulting in a reference count leak: # ip link del dev dummy1 [ 70.516258] unregister_netdevice: waiting for dummy1 to become free. Usage count = 2 Fix by flushing error routes when their nexthop is marked as dead. IPv6 does not suffer from this problem. Fixes: 493ced1ac47c ("ipv4: Allow routes to use nexthop objects") Reported-by: Tetsuo Handa Closes: https://lore.kernel.org/netdev/d943f806-4da6-4970-ac28-b9373b0e63ac@I-love.SAKURA.ne.jp/ Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com Signed-off-by: Ido Schimmel --- net/ipv4/fib_trie.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 59a6f0a9638f..7e2c17fec3fc 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -2053,10 +2053,11 @@ int fib_table_flush(struct net *net, struct fib_table *tb, bool flush_all) continue; } - /* Do not flush error routes if network namespace is - * not being dismantled + /* When not flushing the entire table, skip error + * routes that are not marked for deletion. */ - if (!flush_all && fib_props[fa->fa_type].error) { + if (!flush_all && fib_props[fa->fa_type].error && + !(fi->fib_flags & RTNH_F_DEAD)) { slen = fa->fa_slen; continue; } -- 2.52.0 Add test cases that check that error routes (e.g., blackhole) are deleted when their nexthop is deleted. Output without "ipv4: Fix reference count leak when using error routes with nexthop objects": # ./fib_nexthops.sh -t "ipv4_fcnal ipv6_fcnal" IPv4 functional ---------------------- [...] WARNING: Unexpected route entry TEST: Error route removed on nexthop deletion [FAIL] IPv6 ---------------------- [...] TEST: Error route removed on nexthop deletion [ OK ] Tests passed: 20 Tests failed: 1 Tests skipped: 0 Output with "ipv4: Fix reference count leak when using error routes with nexthop objects": # ./fib_nexthops.sh -t "ipv4_fcnal ipv6_fcnal" IPv4 functional ---------------------- [...] TEST: Error route removed on nexthop deletion [ OK ] IPv6 ---------------------- [...] TEST: Error route removed on nexthop deletion [ OK ] Tests passed: 21 Tests failed: 0 Tests skipped: 0 Signed-off-by: Ido Schimmel --- tools/testing/selftests/net/fib_nexthops.sh | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/tools/testing/selftests/net/fib_nexthops.sh b/tools/testing/selftests/net/fib_nexthops.sh index 2b0a90581e2f..21026b667667 100755 --- a/tools/testing/selftests/net/fib_nexthops.sh +++ b/tools/testing/selftests/net/fib_nexthops.sh @@ -800,6 +800,14 @@ ipv6_fcnal() set +e check_nexthop "dev veth1" "" log_test $? 0 "Nexthops removed on admin down" + + # error routes should be deleted when their nexthop is deleted + run_cmd "$IP li set dev veth1 up" + run_cmd "$IP -6 nexthop add id 58 dev veth1" + run_cmd "$IP ro add blackhole 2001:db8:101::1/128 nhid 58" + run_cmd "$IP nexthop del id 58" + check_route6 "2001:db8:101::1" "" + log_test $? 0 "Error route removed on nexthop deletion" } ipv6_grp_refs() @@ -1459,6 +1467,13 @@ ipv4_fcnal() run_cmd "$IP ro del 172.16.102.0/24" log_test $? 0 "Delete route when not specifying nexthop attributes" + + # error routes should be deleted when their nexthop is deleted + run_cmd "$IP nexthop add id 23 dev veth1" + run_cmd "$IP ro add blackhole 172.16.102.100/32 nhid 23" + run_cmd "$IP nexthop del id 23" + check_route "172.16.102.100" "" + log_test $? 0 "Error route removed on nexthop deletion" } ipv4_grp_fcnal() -- 2.52.0