When request_irq() fails because the IRQ vectors are exhausted, the cleanup
path in mlx5_irq_alloc() can inadvertently free the entire rmap and end up
in a crash[1] when other threads try to access it. Modify the cleanup to
remove only the specific IRQ mapping that was just added. This prevents
removal of other valid mappings and ensures precise cleanup of the glue
object associated with the failed IRQ allocation.

Note: This error is observed when both the fwctl and rds configs are enabled.
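For reference, below is a simplified sketch of the relevant flow in
mlx5_irq_alloc(). The irq_cpu_rmap_add() call and the request_irq()
arguments are approximations of the existing driver code and are not part
of this patch; only the cleanup under err_req_irq reflects the change:

  #ifdef CONFIG_RFS_ACCEL
          if (i && rmap && *rmap) {
                  /* Add this vector's glue entry to the shared reverse map. */
                  err = irq_cpu_rmap_add(*rmap, irq->map.virq);
                  if (err)
                          goto err_irq_rmap;
          }
  #endif
          err = request_irq(irq->map.virq, irq_int_handler, 0, irq->name,
                            &irq->nh);
          if (err)
                  goto err_req_irq;
          ...
  err_req_irq:
  #ifdef CONFIG_RFS_ACCEL
          if (i && rmap && *rmap) {
                  /* Undo only the mapping added above; calling
                   * free_irq_cpu_rmap() here frees the whole shared rmap,
                   * including glue objects that other threads still use.
                   */
                  irq_cpu_rmap_remove(*rmap, irq->map.virq);
          }
  err_irq_rmap:
  #endif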
[1]
mlx5_core 0000:05:00.0: Successfully registered panic handler for port 1
mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28
infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support
mlx5_core 0000:05:00.0: Successfully unregistered panic handler for port 1
mlx5_core 0000:06:00.0: Successfully registered panic handler for port 1
mlx5_core 0000:06:00.0: mlx5_irq_alloc:293:(pid 66740): Failed to request irq. err = -28
infiniband mlx5_0: mlx5_ib_test_wc:290:(pid 66740): Error -28 while trying to test write-combining support
mlx5_core 0000:06:00.0: Successfully unregistered panic handler for port 1
mlx5_core 0000:03:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28
mlx5_core 0000:05:00.0: mlx5_irq_alloc:293:(pid 28895): Failed to request irq. err = -28
general protection fault, probably for non-canonical address 0xe277a58fde16f291: 0000 [#1] SMP NOPTI
RIP: 0010:free_irq_cpu_rmap+0x23/0x7d
Call Trace:
 ? show_trace_log_lvl+0x1d6/0x2f9
 ? show_trace_log_lvl+0x1d6/0x2f9
 ? mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core]
 ? __die_body.cold+0x8/0xa
 ? die_addr+0x39/0x53
 ? exc_general_protection+0x1c4/0x3e9
 ? dev_vprintk_emit+0x5f/0x90
 ? asm_exc_general_protection+0x22/0x27
 ? free_irq_cpu_rmap+0x23/0x7d
 mlx5_irq_alloc.cold+0x5d/0xf3 [mlx5_core]
 irq_pool_request_vector+0x7d/0x90 [mlx5_core]
 mlx5_irq_request+0x2e/0xe0 [mlx5_core]
 mlx5_irq_request_vector+0xad/0xf7 [mlx5_core]
 comp_irq_request_pci+0x64/0xf0 [mlx5_core]
 create_comp_eq+0x71/0x385 [mlx5_core]
 ? mlx5e_open_xdpsq+0x11c/0x230 [mlx5_core]
 mlx5_comp_eqn_get+0x72/0x90 [mlx5_core]
 ? xas_load+0x8/0x91
 mlx5_comp_irqn_get+0x40/0x90 [mlx5_core]
 mlx5e_open_channel+0x7d/0x3c7 [mlx5_core]
 mlx5e_open_channels+0xad/0x250 [mlx5_core]
 mlx5e_open_locked+0x3e/0x110 [mlx5_core]
 mlx5e_open+0x23/0x70 [mlx5_core]
 __dev_open+0xf1/0x1a5
 __dev_change_flags+0x1e1/0x249
 dev_change_flags+0x21/0x5c
 do_setlink+0x28b/0xcc4
 ? __nla_parse+0x22/0x3d
 ? inet6_validate_link_af+0x6b/0x108
 ? cpumask_next+0x1f/0x35
 ? __snmp6_fill_stats64.constprop.0+0x66/0x107
 ? __nla_validate_parse+0x48/0x1e6
 __rtnl_newlink+0x5ff/0xa57
 ? kmem_cache_alloc_trace+0x164/0x2ce
 rtnl_newlink+0x44/0x6e
 rtnetlink_rcv_msg+0x2bb/0x362
 ? __netlink_sendskb+0x4c/0x6c
 ? netlink_unicast+0x28f/0x2ce
 ? rtnl_calcit.isra.0+0x150/0x146
 netlink_rcv_skb+0x5f/0x112
 netlink_unicast+0x213/0x2ce
 netlink_sendmsg+0x24f/0x4d9
 __sock_sendmsg+0x65/0x6a
 ____sys_sendmsg+0x28f/0x2c9
 ? import_iovec+0x17/0x2b
 ___sys_sendmsg+0x97/0xe0
 __sys_sendmsg+0x81/0xd8
 do_syscall_64+0x35/0x87
 entry_SYSCALL_64_after_hwframe+0x6e/0x0
RIP: 0033:0x7fc328603727
Code: c3 66 90 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 0b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 35 44 89 c7 48 89 44 24 08 e8 44 ed ff ff 48
RSP: 002b:00007ffe8eb3f1a0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 000000000000000d RCX: 00007fc328603727
RDX: 0000000000000000 RSI: 00007ffe8eb3f1f0 RDI: 000000000000000d
RBP: 00007ffe8eb3f1f0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 0000000000000000 R14: 00007ffe8eb3f3c8 R15: 00007ffe8eb3f3bc
---[ end trace f43ce73c3c2b13a2 ]---
RIP: 0010:free_irq_cpu_rmap+0x23/0x7d
Code: 0f 1f 80 00 00 00 00 48 85 ff 74 6b 55 48 89 fd 53 66 83 7f 06 00 74 24 31 db 48 8b 55 08 0f b7 c3 48 8b 04 c2 48 85 c0 74 09 <8b> 38 31 f6 e8 c4 0a b8 ff 83 c3 01 66 3b 5d 06 72 de b8 ff ff ff
RSP: 0018:ff384881640eaca0 EFLAGS: 00010282
RAX: e277a58fde16f291 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ff2335e2e20b3600 RSI: 0000000000000000 RDI: ff2335e2e20b3400
RBP: ff2335e2e20b3400 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000ffffffe4 R12: ff384881640ead88
R13: ff2335c3760751e0 R14: ff2335e2e1672200 R15: ff2335c3760751f8
FS:  00007fc32ac22480(0000) GS:ff2335e2d6e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f651ab54000 CR3: 00000029f1206003 CR4: 0000000000771ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: 0x1dc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
kvm-guest: disable async PF for cpu 0

Fixes: 3354822cde5a ("net/mlx5: Use dynamic msix vectors allocation")
Signed-off-by: Mohith Kumar Thummaluru
Tested-by: Mohith Kumar Thummaluru
Reviewed-by: Moshe Shemesh
---
 drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
index 40024cfa3099..822e92ed2d45 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
@@ -325,8 +325,7 @@ struct mlx5_irq *mlx5_irq_alloc(struct mlx5_irq_pool *pool, int i,
 err_req_irq:
 #ifdef CONFIG_RFS_ACCEL
 	if (i && rmap && *rmap) {
-		free_irq_cpu_rmap(*rmap);
-		*rmap = NULL;
+		irq_cpu_rmap_remove(*rmap, irq->map.virq);
 	}
 err_irq_rmap:
 #endif
-- 
2.43.5