8021q(vlan_device_event) will add VLAN 0 when enabling the device, and remove it on disabling it if NETIF_F_HW_VLAN_CTAG_FILTER set. However, if changing filter feature during netdev runtime, null-ptr-unref[1] or bug_on[2] will be triggered by unregister_vlan_dev() for refcount imbalance. [1] BUG: KASAN: null-ptr-deref in unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) Write of size 8 at addr 0000000000000000 by task ip/382 CPU: 2 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #60 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack_lvl (lib/dump_stack.c:123) kasan_report (mm/kasan/report.c:636) unregister_vlan_dev (net/8021q/vlan.h:90 net/8021q/vlan.c:110) rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) [2] kernel BUG at net/8021q/vlan.c:99! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 382 Comm: ip Not tainted 6.16.0-rc3 #61 PREEMPT(voluntary) Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:unregister_vlan_dev (net/8021q/vlan.c:99 (discriminator 1)) RSP: 0018:ffff88810badf310 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88810da84000 RCX: ffffffffb47ceb9a RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88810e8b43c8 RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff6cefe80 R10: ffffffffb677f407 R11: ffff88810badf3c0 R12: ffff88810e8b4000 R13: 0000000000000000 R14: ffff88810642a5c0 R15: 000000000000017e FS: 00007f1ff68c20c0(0000) GS:ffff888163a24000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1ff5dad240 CR3: 0000000107e56000 CR4: 00000000000006f0 Call Trace: rtnl_dellink (net/core/rtnetlink.c:3511 net/core/rtnetlink.c:3553) rtnetlink_rcv_msg (net/core/rtnetlink.c:6945) netlink_rcv_skb (net/netlink/af_netlink.c:2535) netlink_unicast (net/netlink/af_netlink.c:1314 net/netlink/af_netlink.c:1339) netlink_sendmsg (net/netlink/af_netlink.c:1883) ____sys_sendmsg (net/socket.c:712 net/socket.c:727 net/socket.c:2566) ___sys_sendmsg (net/socket.c:2622) __sys_sendmsg (net/socket.c:2652) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) Root cause is as below: step1: add vlan0 for real_dev, such as bond, team. register_vlan_dev vlan_vid_add(real_dev,htons(ETH_P_8021Q),0) //refcnt=1 step2: disable vlan filter feature and enable real_dev step3: change filter from 0 to 1 vlan_device_event vlan_filter_push_vids ndo_vlan_rx_add_vid //No refcnt added to real_dev vlan0 step4: real_dev down vlan_device_event vlan_vid_del(dev, htons(ETH_P_8021Q), 0); //refcnt=0 vlan_info_rcu_free //free vlan0 step5: real_dev up vlan_device_event vlan_vid_add(dev, htons(ETH_P_8021Q), 0); vlan_info_alloc //alloc new empty vid0. refcnt=1 step6: delete vlan0 unregister_vlan_dev BUG_ON(!vlan_info); //will trigger it if step5 was not executed vlan_group_set_device array = vg->vlan_devices_arrays //null-ptr-ref will be triggered after step5 E.g. the following sequence can reproduce null-ptr-ref $ ip link add bond0 type bond mode 0 $ ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q $ ethtool -K bond0 rx-vlan-filter off $ ifconfig bond0 up $ ethtool -K bond0 rx-vlan-filter on $ ifconfig bond0 down $ ifconfig bond0 up $ ip link del vlan0 Add the auto_vid0 flag to track the refcount of vlan0, and use this flag to determine whether to dec refcount while disabling real_dev. Fixes: ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") Reported-by: syzbot+a8b046e462915c65b10b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=a8b046e462915c65b10b Co-developed-by: Ido Schimmel Signed-off-by: Ido Schimmel Signed-off-by: Dong Chenchen --- net/8021q/vlan.c | 42 +++++++++++++++++++++++++++++++++--------- net/8021q/vlan.h | 1 + 2 files changed, 34 insertions(+), 9 deletions(-) diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c index 06908e37c3d9..9a6df8c1daf9 100644 --- a/net/8021q/vlan.c +++ b/net/8021q/vlan.c @@ -357,6 +357,35 @@ static int __vlan_device_event(struct net_device *dev, unsigned long event) return err; } +static void vlan_vid0_add(struct net_device *dev) +{ + struct vlan_info *vlan_info; + int err; + + if (!(dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)) + return; + + pr_info("adding VLAN 0 to HW filter on device %s\n", dev->name); + + err = vlan_vid_add(dev, htons(ETH_P_8021Q), 0); + if (err) + return; + + vlan_info = rtnl_dereference(dev->vlan_info); + vlan_info->auto_vid0 = true; +} + +static void vlan_vid0_del(struct net_device *dev) +{ + struct vlan_info *vlan_info = rtnl_dereference(dev->vlan_info); + + if (!vlan_info || !vlan_info->auto_vid0) + return; + + vlan_info->auto_vid0 = false; + vlan_vid_del(dev, htons(ETH_P_8021Q), 0); +} + static int vlan_device_event(struct notifier_block *unused, unsigned long event, void *ptr) { @@ -378,15 +407,10 @@ static int vlan_device_event(struct notifier_block *unused, unsigned long event, return notifier_from_errno(err); } - if ((event == NETDEV_UP) && - (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)) { - pr_info("adding VLAN 0 to HW filter on device %s\n", - dev->name); - vlan_vid_add(dev, htons(ETH_P_8021Q), 0); - } - if (event == NETDEV_DOWN && - (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)) - vlan_vid_del(dev, htons(ETH_P_8021Q), 0); + if (event == NETDEV_UP) + vlan_vid0_add(dev); + else if (event == NETDEV_DOWN) + vlan_vid0_del(dev); vlan_info = rtnl_dereference(dev->vlan_info); if (!vlan_info) diff --git a/net/8021q/vlan.h b/net/8021q/vlan.h index 5eaf38875554..c7ffe591d593 100644 --- a/net/8021q/vlan.h +++ b/net/8021q/vlan.h @@ -33,6 +33,7 @@ struct vlan_info { struct vlan_group grp; struct list_head vid_list; unsigned int nr_vids; + bool auto_vid0; struct rcu_head rcu; }; -- 2.25.1 Add test cases for vlan_filter modification during runtime, which may triger null-ptr-ref or memory leak of vlan0. Signed-off-by: Dong Chenchen --- tools/testing/selftests/net/vlan_hw_filter.sh | 98 ++++++++++++++++--- 1 file changed, 86 insertions(+), 12 deletions(-) diff --git a/tools/testing/selftests/net/vlan_hw_filter.sh b/tools/testing/selftests/net/vlan_hw_filter.sh index 7bc804ffaf7c..0fb56baf28e4 100755 --- a/tools/testing/selftests/net/vlan_hw_filter.sh +++ b/tools/testing/selftests/net/vlan_hw_filter.sh @@ -3,27 +3,101 @@ readonly NETNS="ns-$(mktemp -u XXXXXX)" +ALL_TESTS=" + test_vlan_filter_check + test_vlan0_del_crash_01 + test_vlan0_del_crash_02 + test_vlan0_del_crash_03 + test_vid0_memleak +" + ret=0 +setup() { + ip netns add ${NETNS} +} + cleanup() { - ip netns del $NETNS + ip netns del $NETNS 2>/dev/null } trap cleanup EXIT fail() { - echo "ERROR: ${1:-unexpected return code} (ret: $_)" >&2 - ret=1 + echo "ERROR: ${1:-unexpected return code} (ret: $_)" >&2 + ret=1 +} + +tests_run() +{ + local current_test + for current_test in ${TESTS:-$ALL_TESTS}; do + $current_test + done +} + +test_vlan_filter_check() { + setup + ip netns exec ${NETNS} ip link add bond0 type bond mode 0 + ip netns exec ${NETNS} ip link add bond_slave_1 type veth peer veth2 + ip netns exec ${NETNS} ip link set bond_slave_1 master bond0 + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off + ip netns exec ${NETNS} ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0 + ip netns exec ${NETNS} ip link add link bond0 name bond0.0 type vlan id 0 + ip netns exec ${NETNS} ip link set bond_slave_1 nomaster + ip netns exec ${NETNS} ip link del veth2 || fail "Please check vlan HW filter function" + cleanup } -ip netns add ${NETNS} -ip netns exec ${NETNS} ip link add bond0 type bond mode 0 -ip netns exec ${NETNS} ip link add bond_slave_1 type veth peer veth2 -ip netns exec ${NETNS} ip link set bond_slave_1 master bond0 -ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off -ip netns exec ${NETNS} ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0 -ip netns exec ${NETNS} ip link add link bond0 name bond0.0 type vlan id 0 -ip netns exec ${NETNS} ip link set bond_slave_1 nomaster -ip netns exec ${NETNS} ip link del veth2 || fail "Please check vlan HW filter function" +#enable vlan_filter feature of real_dev with vlan0 during running time +test_vlan0_del_crash_01() { + setup + ip netns exec ${NETNS} ip link add bond0 type bond mode 0 + ip netns exec ${NETNS} ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off + ip netns exec ${NETNS} ifconfig bond0 up + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter on + ip netns exec ${NETNS} ifconfig bond0 down + ip netns exec ${NETNS} ifconfig bond0 up + ip netns exec ${NETNS} ip link del vlan0 || fail "Please check vlan HW filter function" + cleanup +} + +#enable vlan_filter feature and add vlan0 for real_dev during running time +test_vlan0_del_crash_02() { + setup + ip netns exec ${NETNS} ip link add bond0 type bond mode 0 + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off + ip netns exec ${NETNS} ifconfig bond0 up + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter on + ip netns exec ${NETNS} ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q + ip netns exec ${NETNS} ifconfig bond0 down + ip netns exec ${NETNS} ifconfig bond0 up + ip netns exec ${NETNS} ip link del vlan0 || fail "Please check vlan HW filter function" + cleanup +} + +#enable vlan_filter feature of real_dev during running time +#test kernel_bug of vlan unregister +test_vlan0_del_crash_03() { + setup + ip netns exec ${NETNS} ip link add bond0 type bond mode 0 + ip netns exec ${NETNS} ip link add link bond0 name vlan0 type vlan id 0 protocol 802.1q + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off + ip netns exec ${NETNS} ifconfig bond0 up + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter on + ip netns exec ${NETNS} ifconfig bond0 down + ip netns exec ${NETNS} ip link del vlan0 || fail "Please check vlan HW filter function" + cleanup +} + +test_vid0_memleak() { + setup + ip netns exec ${NETNS} ip link add bond0 up type bond mode 0 + ip netns exec ${NETNS} ethtool -K bond0 rx-vlan-filter off + ip netns exec ${NETNS} ip link del dev bond0 || fail "Please check vlan HW filter function" + cleanup +} +tests_run exit $ret -- 2.25.1