From: Cosmin Ratiu esw_functions_changed_event_handler -> esw_vfs_changed_event_handler is called from the esw->work_queue and acquires the devlink lock. Changing the esw mode is done via .eswitch_mode_set (acquires devlink lock in the devlink_nl_pre_doit call) -> mlx5_devlink_eswitch_mode_set -> mlx5_eswitch_disable_locked -> mlx5_eswitch_event_handler_unregister -> flush_workqueue. This creates a circular lock dependency which could lead to a real deadlock, as the code flushing the workqueue is holding the devlink lock, and the work handler being flushed could try to acquire it. Fix that by adding a new bool field 'notifier_enabled' next to the event handler scheduling the work, keeping it true while the notifier is active, and using it to repeatedly try to acquire the devlink lock from the work handler while true, with a slight delay to avoid busy looping. This avoids the deadlock because the event handler will be removed first (turning 'notifier_enabled' false), and the work handler will eventually give up in acquiring the lock because the work is no longer necessary. Fixes: f1bc646c9a06 ("net/mlx5: Use devl_ API in mlx5_esw_offloads_devlink_port_register") Signed-off-by: Cosmin Ratiu Reviewed-by: Moshe Shemesh Reviewed-by: Dragos Tatulea Signed-off-by: Tariq Toukan --- drivers/net/ethernet/mellanox/mlx5/core/eswitch.c | 6 +++++- drivers/net/ethernet/mellanox/mlx5/core/eswitch.h | 1 + .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 12 +++++++++++- 3 files changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c index 4b7a1ce7f406..fddc3b33222d 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c @@ -1066,14 +1066,18 @@ static void mlx5_eswitch_event_handler_register(struct mlx5_eswitch *esw) if (esw->mode == MLX5_ESWITCH_OFFLOADS && mlx5_eswitch_is_funcs_handler(esw->dev)) { MLX5_NB_INIT(&esw->esw_funcs.nb, mlx5_esw_funcs_changed_handler, ESW_FUNCTIONS_CHANGED); + esw->esw_funcs.notifier_enabled = true; mlx5_eq_notifier_register(esw->dev, &esw->esw_funcs.nb); } } static void mlx5_eswitch_event_handler_unregister(struct mlx5_eswitch *esw) { - if (esw->mode == MLX5_ESWITCH_OFFLOADS && mlx5_eswitch_is_funcs_handler(esw->dev)) + if (esw->mode == MLX5_ESWITCH_OFFLOADS && + mlx5_eswitch_is_funcs_handler(esw->dev)) { + esw->esw_funcs.notifier_enabled = false; mlx5_eq_notifier_unregister(esw->dev, &esw->esw_funcs.nb); + } flush_workqueue(esw->work_queue); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h index ad1073f7b79f..e20574a197e4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h @@ -338,6 +338,7 @@ struct mlx5_host_work { struct mlx5_esw_functions { struct mlx5_nb nb; + bool notifier_enabled; bool host_funcs_disabled; u16 num_vfs; u16 num_ec_vfs; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index ea94a727633f..0199bea2cb31 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -3597,7 +3597,17 @@ esw_vfs_changed_event_handler(struct mlx5_eswitch *esw, const u32 *out) return; devlink = priv_to_devlink(esw->dev); - devl_lock(devlink); + /* Repeatedly try to grab the lock with a delay while this work is + * still relevant. + * This allows a concurrent mlx5_eswitch_event_handler_unregister + * (holding the devlink lock) to flush the wq without deadlocking. + */ + while (!devl_trylock(devlink)) { + if (!esw->esw_funcs.notifier_enabled) + return; + schedule_timeout_interruptible(msecs_to_jiffies(10)); + } + /* Number of VFs can only change from "0 to x" or "x to 0". */ if (esw->esw_funcs.num_vfs > 0) { mlx5_eswitch_unload_vf_vports(esw, esw->esw_funcs.num_vfs); -- 2.44.0