From: Aaron Ma

ice_resume() schedules an asynchronous PF reset and returns immediately.
The reset runs later in ice_service_task(). If userspace tries to bring
up the net device before the reset finishes, ice_open() fails with
-EBUSY:

  ice_resume()
    ice_schedule_reset()          # sets ICE_PFR_REQ, returns
  ...
  ice_open()
    ice_is_reset_in_progress()    # ICE_PFR_REQ still set, -EBUSY
  ...
  ice_service_task()
    ice_do_reset()
      ice_rebuild()               # clears ICE_PFR_REQ, too late

Reproduced on E800 series NICs during suspend/resume with irdma enabled,
where the aux device probe widens the race window.

  ice 0000:81:00.0: can't open net device while reset is in progress

Add a best-effort wait (10s timeout, matching ice_devlink_info_get())
for the reset to complete before returning from ice_resume(). In
practice the reset completes in ~300ms.

Fixes: 769c500dcc1e ("ice: Add advanced power mgmt for WoL")
Cc: stable@vger.kernel.org
Reviewed-by: Kohei Enju
Reviewed-by: Aleksandr Loktionov
Reviewed-by: Przemek Kitszel
Signed-off-by: Aaron Ma
Tested-by: Alexander Nowlin
Signed-off-by: Tony Nguyen
---
 drivers/net/ethernet/intel/ice/ice_main.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index e2fd2dab03e3..d88835482d3a 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -5637,6 +5637,16 @@ static int ice_resume(struct device *dev)
 	/* Restart the service task */
 	mod_timer(&pf->serv_tmr, round_jiffies(jiffies + pf->serv_tmr_period));
 
+	/* Best-effort wait for the scheduled reset to finish so that the
+	 * device is operational before returning. Without this, userspace
+	 * (e.g. NetworkManager) may try to open the net device while the
+	 * asynchronous reset is still in progress, hitting -EBUSY.
+	 */
+	ret = ice_wait_for_reset(pf, secs_to_jiffies(10));
+	if (ret)
+		dev_err(dev, "Wait for reset timed out (10s) during resume: %d\n",
+			ret);
+
 	return 0;
 }
 
-- 
2.47.1

From: Dawid Osuchowski

When a virtual function sends an IRQ map command, the PF will set up
interrupts according to that request. However, because these interrupts
are never reset, the next time Virtual Function initializes, the
interrupts are still enabled for a given VF, which leads to performance
degradation in certain cases due to interrupts being unexpectedly
enabled and thus causing interrupt floods.

Cc: stable@vger.kernel.org
Fixes: 1071a8358a28 ("ice: Implement virtchnl commands for AVF support")
Suggested-by: Vladimir Medvedkin
Reviewed-by: Aleksandr Loktionov
Signed-off-by: Dawid Osuchowski
Reviewed-by: Simon Horman
Tested-by: Patryk Holda
Signed-off-by: Tony Nguyen
---
 drivers/net/ethernet/intel/ice/ice_vf_lib.c   | 27 +++++++++++++++++++
 .../ethernet/intel/ice/ice_vf_lib_private.h   |  1 +
 drivers/net/ethernet/intel/ice/virt/queues.c  | 21 +++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
index 27e4acb1620f..7ce8f66eebbf 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
@@ -847,6 +847,30 @@ static void ice_notify_vf_reset(struct ice_vf *vf)
 			      NULL);
 }
 
+/**
+ * ice_reset_interrupts - clear all queue interrupt configuration for a VSI
+ * @vsi: the VSI whose interrupt registers should be cleared
+ *
+ * Zero the QINT_RQCTL and QINT_TQCTL registers for all allocated queues
+ * in the VSI. This clears the entire register including MSIX_INDX, ITR_INDX,
+ * CAUSE_ENA and NEXTQ fields, unlike ice_vf_dis_rxq_interrupt() which only
+ * clears the CAUSE_ENA bit.
+ */
+void ice_reset_interrupts(struct ice_vsi *vsi)
+{
+	struct ice_pf *pf = vsi->back;
+	struct ice_hw *hw = &pf->hw;
+	int i;
+
+	ice_for_each_alloc_rxq(vsi, i)
+		wr32(hw, QINT_RQCTL(vsi->rxq_map[i]), 0);
+
+	ice_for_each_alloc_txq(vsi, i)
+		wr32(hw, QINT_TQCTL(vsi->txq_map[i]), 0);
+
+	ice_flush(hw);
+}
+
 /**
  * ice_reset_vf - Reset a particular VF
  * @vf: pointer to the VF structure
@@ -918,6 +942,9 @@ int ice_reset_vf(struct ice_vf *vf, u32 flags)
 
 	ice_dis_vf_qs(vf);
 
+	/* cleanup interrupt registers */
+	ice_reset_interrupts(vsi);
+
 	/* Call Disable LAN Tx queue AQ whether or not queues are
 	 * enabled. This is needed for successful completion of VFR.
 	 */
diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib_private.h b/drivers/net/ethernet/intel/ice/ice_vf_lib_private.h
index 5392b0404986..321d29c25b7c 100644
--- a/drivers/net/ethernet/intel/ice/ice_vf_lib_private.h
+++ b/drivers/net/ethernet/intel/ice/ice_vf_lib_private.h
@@ -26,6 +26,7 @@
 void ice_initialize_vf_entry(struct ice_vf *vf);
 void ice_deinitialize_vf_entry(struct ice_vf *vf);
 void ice_dis_vf_qs(struct ice_vf *vf);
+void ice_reset_interrupts(struct ice_vsi *vsi);
 int ice_check_vf_init(struct ice_vf *vf);
 enum virtchnl_status_code ice_err_to_virt_err(int err);
 struct ice_port_info *ice_vf_get_port_info(struct ice_vf *vf);
diff --git a/drivers/net/ethernet/intel/ice/virt/queues.c b/drivers/net/ethernet/intel/ice/virt/queues.c
index 31be2f76181c..431c9c546b04 100644
--- a/drivers/net/ethernet/intel/ice/virt/queues.c
+++ b/drivers/net/ethernet/intel/ice/virt/queues.c
@@ -224,6 +224,24 @@ void ice_vf_ena_rxq_interrupt(struct ice_vsi *vsi, u32 q_idx)
 	wr32(hw, QINT_RQCTL(pfq), reg | QINT_RQCTL_CAUSE_ENA_M);
 }
 
+/**
+ * ice_vf_dis_rxq_interrupt - disable Rx queue interrupt via QINT_RQCTL
+ * @vsi: VSI of the VF to configure
+ * @q_idx: VF queue index used to determine the queue in the PF's space
+ */
+static void ice_vf_dis_rxq_interrupt(struct ice_vsi *vsi, u32 q_idx)
+{
+	struct ice_hw *hw = &vsi->back->hw;
+	u32 pfq = vsi->rxq_map[q_idx];
+	u32 reg;
+
+	reg = rd32(hw, QINT_RQCTL(pfq));
+	reg &= ~QINT_RQCTL_CAUSE_ENA_M;
+	wr32(hw, QINT_RQCTL(pfq), reg);
+
+	ice_flush(hw);
+}
+
 /**
  * ice_vc_ena_qs_msg
  * @vf: pointer to the VF info
@@ -416,6 +434,8 @@ int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg)
 			goto error_param;
 		}
 
+		for_each_set_bit(vf_q_id, &q_map, ICE_MAX_RSS_QS_PER_VF)
+			ice_vf_dis_rxq_interrupt(vsi, vf_q_id);
 		bitmap_zero(vf->rxq_ena, ICE_MAX_RSS_QS_PER_VF);
 	} else if (q_map) {
 		for_each_set_bit(vf_q_id, &q_map, ICE_MAX_RSS_QS_PER_VF) {
@@ -436,6 +456,7 @@ int ice_vc_dis_qs_msg(struct ice_vf *vf, u8 *msg)
 				goto error_param;
 			}
 
+			ice_vf_dis_rxq_interrupt(vsi, vf_q_id);
 			/* Clear enabled queues flag */
 			clear_bit(vf_q_id, vf->rxq_ena);
 		}
-- 
2.47.1

From: David Carlier

idpf_idc_vport_dev_ctrl(adapter, false) clears vport->vdev_info->adev to
NULL but keeps vport->vdev_info itself. An MTU change after that calls
idpf_idc_vdev_mtu_event(), which dereferences vdev_info->adev for
device_lock() before reaching the (!adev || ...) check.

Cache vdev_info->adev once with READ_ONCE() and bail out if NULL before
locking. Use the cached pointer on both the lock and unlock paths so the
unlock matches the device actually acquired and cannot re-fetch a NULL
slot.

Fixes: ed6e1c8796a4 ("idpf: implement IDC vport aux driver MTU change handler")
Cc: stable@vger.kernel.org
Signed-off-by: David Carlier
Reviewed-by: Aleksandr Loktionov
Tested-by: Jakub Andrysiak
Signed-off-by: Tony Nguyen
---
 drivers/net/ethernet/intel/idpf/idpf_idc.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/idpf/idpf_idc.c b/drivers/net/ethernet/intel/idpf/idpf_idc.c
index b7d6b08fc89e..9f764135507c 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_idc.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_idc.c
@@ -162,9 +162,12 @@ void idpf_idc_vdev_mtu_event(struct iidc_rdma_vport_dev_info *vdev_info,
 
 	set_bit(event_type, event.type);
 
-	device_lock(&vdev_info->adev->dev);
-	adev = vdev_info->adev;
-	if (!adev || !adev->dev.driver)
+	adev = READ_ONCE(vdev_info->adev);
+	if (!adev)
+		return;
+
+	device_lock(&adev->dev);
+	if (!adev->dev.driver)
 		goto unlock;
 	iadrv = container_of(adev->dev.driver,
 			     struct iidc_rdma_vport_auxiliary_drv,
@@ -172,7 +175,7 @@ void idpf_idc_vdev_mtu_event(struct iidc_rdma_vport_dev_info *vdev_info,
 	if (iadrv->event_handler)
 		iadrv->event_handler(vdev_info, &event);
 unlock:
-	device_unlock(&vdev_info->adev->dev);
+	device_unlock(&adev->dev);
 }
 
 /**
-- 
2.47.1

From: Matt Vollrath

If an error is encountered while mapping TX buffers, the driver should
unmap any buffers already mapped for that skb.

Because count is incremented before each frag mapping, it will always
match the correct number of unmappings needed when dma_error is reached.
Decrementing count before the while loop in dma_error causes an
off-by-one error. If any mapping was successful before an unsuccessful
mapping, exactly one DMA mapping (the head) would leak.

This bug was introduced by a 2010 fix for an endless loop in dma_error.
All other affected drivers have already been fixed.
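
The count bookkeeping described above can be illustrated with a minimal
userspace sketch. This is not driver code: RING_SIZE, mapped[] and
tx_map_sim() are hypothetical stand-ins for the TX ring, its buffer_info
DMA state and igbvf_tx_map_adv(). Only the order of "count++, then map"
and the backwards unwind loop are meant to mirror the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define RING_SIZE 8	/* hypothetical ring size, for illustration only */

/* true means the slot currently holds a live (simulated) DMA mapping */
static bool mapped[RING_SIZE];

/*
 * Simulate mapping a head buffer plus nr_frags fragments starting at ring
 * slot 'start'. The mapping with ordinal fail_at fails (0 = head,
 * 1 = first frag, ...). extra_decrement selects the old error path that
 * decrements count before unwinding.
 *
 * Returns the number of mappings left behind by the error path, or 0 if
 * no mapping failed.
 */
static int tx_map_sim(unsigned int start, int nr_frags, int fail_at,
		      bool extra_decrement)
{
	unsigned int count = 0, i = start % RING_SIZE;
	int f, leaked = 0;

	assert(nr_frags >= 0 && nr_frags + 1 <= RING_SIZE);

	for (f = 0; f < RING_SIZE; f++)
		mapped[f] = false;

	/* head mapping: count is still 0 here */
	if (fail_at == 0)
		goto dma_error;
	mapped[i] = true;

	for (f = 0; f < nr_frags; f++) {
		/* count is bumped before the frag mapping is attempted */
		count++;
		i++;
		if (i == RING_SIZE)
			i = 0;

		if (fail_at == f + 1)
			goto dma_error;
		mapped[i] = true;
	}

	return 0;

dma_error:
	if (extra_decrement && count)
		count--;

	/* walk backwards from the failed slot, undoing earlier mappings */
	while (count--) {
		if (i == 0)
			i += RING_SIZE;
		i--;
		mapped[i] = false;
	}

	for (f = 0; f < RING_SIZE; f++)
		if (mapped[f])
			leaked++;

	return leaked;
}
```

In this sketch, a head plus two successful frags followed by a failure on
the third frag, tx_map_sim(6, 3, 3, false), leaves nothing mapped, while
tx_map_sim(6, 3, 3, true) leaves exactly one mapping behind (the head slot).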

Fixes: c1fa347f20f1 ("e1000/e1000e/igb/igbvf/ixgb/ixgbe: Fix tests of unsigned in *_tx_map()")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-4-7-opus
Signed-off-by: Matt Vollrath
Signed-off-by: Tony Nguyen
---
 drivers/net/ethernet/intel/igbvf/netdev.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index 0a3d0a1cba43..c686ee120a14 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -2190,8 +2190,6 @@ static inline int igbvf_tx_map_adv(struct igbvf_adapter *adapter,
 	buffer_info->time_stamp = 0;
 	buffer_info->length = 0;
 	buffer_info->mapped_as_page = false;
-	if (count)
-		count--;
 
 	/* clear timestamp and dma mappings for remaining portion of packet */
 	while (count--) {
-- 
2.47.1