Hi, Reminder On 5/28/25 10:25 AM, Muhammad Usama Anjum wrote: > Soft reminder > > On 5/16/25 11:49 PM, Muhammad Usama Anjum wrote: >> Fix dma_direct_alloc() failure at resume time during bhie_table >> allocation because of memory pressure. There is a report where at >> resume time, the memory from the dma doesn't get allocated and MHI >> fails to re-initialize. >> >> To fix it, don't free the memory at power down during suspend / >> hibernation. Instead, use the same allocated memory again after every >> resume / hibernation. This patch has been tested with resume and >> hibernation both. >> >> Optimize the rddm and fbc bhie allocations. The rddm is of constant >> size for a given hardware. While the fbc_image size depends on the >> firmware. If the firmware changes, we'll free and allocate new memory >> for it. This patch is motivated from the ath12k [1] and ath11k [2] >> patches. They don't free the memory and reuse the same memory if new >> size is same. The firmware caching hasn't been implemented for the >> drivers other than in the nouveau. (The changing of firmware isn't >> tested/supported for wireless drivers. But let's follow the example >> patches here.) >> >> [1] https://lore.kernel.org/all/20240419034034.2842-1-quic_bqiang@xxxxxxxxxxx/ >> [2] https://lore.kernel.org/all/20220506141448.10340-1-quic_akolli@xxxxxxxxxxx/ >> >> Tested-on: WCN6855 WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6 >> Tested-on: WCN7850 hw2.0 WLAN.HMT.1.1.c5-00284-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 >> >> Acked-by: Jeff Johnson <jeff.johnson@xxxxxxxxxxxxxxxx> >> Tested-by: Baochen Qiang <quic_bqiang@xxxxxxxxxxx> >> Signed-off-by: Muhammad Usama Anjum <usama.anjum@xxxxxxxxxxxxx> >> --- >> Changes since v1: >> - Don't free bhie tables during suspend/hibernation only >> - Handle fbc_image changed size correctly >> - Remove fbc_image getting set to NULL in *free_bhie_table() >> >> Changes since v2: >> - Remove the new mhi_partial_unprepare_after_power_down() and instead >> update mhi_power_down_keep_dev() to use >> mhi_power_down_unprepare_keep_dev() as suggested by Mani >> - Update all users of this API such as ath12k (previously only ath11k >> was updated) >> - Define prev_fw_sz in docs >> - Do better alignment of comments >> >> Changes since v3: >> - Fix state machine of ath12k by setting ATH12K_MHI_DEINIT with >> ATH12K_MHI_POWER_OFF_KEEP_DEV state (Thanks Sebastian for testing and >> finding the problem) >> - Use static with mhi_power_down_unprepare_keep_dev() >> - Remove crash log as it was showing that kworker wasn't able to >> allocate memory. >> >> Changes since v4: >> - Update description >> - Use __mhi_power_down_unprepare_keep_dev() in >> mhi_unprepare_after_power_down() >> >> Changes since v5: >> - Update description to don't give an impression that all bhie >> allocations are being fixed. mhi_load_image_bhie() doesn't require >> this optimization. >> >> This patch doesn't have fixes tag as we are avoiding error in case of >> memory pressure. We are just making this driver more robust by not >> freeing the memory and using the same after resuming. >> --- >> drivers/bus/mhi/host/boot.c | 15 +++++++++++---- >> drivers/bus/mhi/host/init.c | 18 ++++++++++++------ >> drivers/bus/mhi/host/internal.h | 2 ++ >> drivers/bus/mhi/host/pm.c | 1 + >> drivers/net/wireless/ath/ath11k/mhi.c | 8 ++++---- >> drivers/net/wireless/ath/ath12k/mhi.c | 14 ++++++++++---- >> include/linux/mhi.h | 2 ++ >> 7 files changed, 42 insertions(+), 18 deletions(-) >> >> diff --git a/drivers/bus/mhi/host/boot.c b/drivers/bus/mhi/host/boot.c >> index efa3b6dddf4d2..bc8459798bbee 100644 >> --- a/drivers/bus/mhi/host/boot.c >> +++ b/drivers/bus/mhi/host/boot.c >> @@ -584,10 +584,17 @@ void mhi_fw_load_handler(struct mhi_controller *mhi_cntrl) >> * device transitioning into MHI READY state >> */ >> if (fw_load_type == MHI_FW_LOAD_FBC) { >> - ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz); >> - if (ret) { >> - release_firmware(firmware); >> - goto error_fw_load; >> + if (mhi_cntrl->fbc_image && fw_sz != mhi_cntrl->prev_fw_sz) { >> + mhi_free_bhie_table(mhi_cntrl, mhi_cntrl->fbc_image); >> + mhi_cntrl->fbc_image = NULL; >> + } >> + if (!mhi_cntrl->fbc_image) { >> + ret = mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->fbc_image, fw_sz); >> + if (ret) { >> + release_firmware(firmware); >> + goto error_fw_load; >> + } >> + mhi_cntrl->prev_fw_sz = fw_sz; >> } >> >> /* Load the firmware into BHIE vec table */ >> diff --git a/drivers/bus/mhi/host/init.c b/drivers/bus/mhi/host/init.c >> index 13e7a55f54ff4..8419ea8a5419b 100644 >> --- a/drivers/bus/mhi/host/init.c >> +++ b/drivers/bus/mhi/host/init.c >> @@ -1173,8 +1173,9 @@ int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl) >> /* >> * Allocate RDDM table for debugging purpose if specified >> */ >> - mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->rddm_image, >> - mhi_cntrl->rddm_size); >> + if (!mhi_cntrl->rddm_image) >> + mhi_alloc_bhie_table(mhi_cntrl, &mhi_cntrl->rddm_image, >> + mhi_cntrl->rddm_size); >> if (mhi_cntrl->rddm_image) { >> ret = mhi_rddm_prepare(mhi_cntrl, >> mhi_cntrl->rddm_image); >> @@ -1200,6 +1201,14 @@ int mhi_prepare_for_power_up(struct mhi_controller *mhi_cntrl) >> } >> EXPORT_SYMBOL_GPL(mhi_prepare_for_power_up); >> >> +void __mhi_unprepare_keep_dev(struct mhi_controller *mhi_cntrl) >> +{ >> + mhi_cntrl->bhi = NULL; >> + mhi_cntrl->bhie = NULL; >> + >> + mhi_deinit_dev_ctxt(mhi_cntrl); >> +} >> + >> void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl) >> { >> if (mhi_cntrl->fbc_image) { >> @@ -1212,10 +1221,7 @@ void mhi_unprepare_after_power_down(struct mhi_controller *mhi_cntrl) >> mhi_cntrl->rddm_image = NULL; >> } >> >> - mhi_cntrl->bhi = NULL; >> - mhi_cntrl->bhie = NULL; >> - >> - mhi_deinit_dev_ctxt(mhi_cntrl); >> + __mhi_unprepare_keep_dev(mhi_cntrl); >> } >> EXPORT_SYMBOL_GPL(mhi_unprepare_after_power_down); >> >> diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h >> index ce566f7d2e924..41b3fb835880b 100644 >> --- a/drivers/bus/mhi/host/internal.h >> +++ b/drivers/bus/mhi/host/internal.h >> @@ -427,4 +427,6 @@ void mhi_unmap_single_no_bb(struct mhi_controller *mhi_cntrl, >> void mhi_unmap_single_use_bb(struct mhi_controller *mhi_cntrl, >> struct mhi_buf_info *buf_info); >> >> +void __mhi_unprepare_keep_dev(struct mhi_controller *mhi_cntrl); >> + >> #endif /* _MHI_INT_H */ >> diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c >> index e6c3ff62bab1d..c2c09c308b9b7 100644 >> --- a/drivers/bus/mhi/host/pm.c >> +++ b/drivers/bus/mhi/host/pm.c >> @@ -1263,6 +1263,7 @@ void mhi_power_down_keep_dev(struct mhi_controller *mhi_cntrl, >> bool graceful) >> { >> __mhi_power_down(mhi_cntrl, graceful, false); >> + __mhi_unprepare_keep_dev(mhi_cntrl); >> } >> EXPORT_SYMBOL_GPL(mhi_power_down_keep_dev); >> >> diff --git a/drivers/net/wireless/ath/ath11k/mhi.c b/drivers/net/wireless/ath/ath11k/mhi.c >> index acd76e9392d31..c5dc776b23643 100644 >> --- a/drivers/net/wireless/ath/ath11k/mhi.c >> +++ b/drivers/net/wireless/ath/ath11k/mhi.c >> @@ -460,12 +460,12 @@ void ath11k_mhi_stop(struct ath11k_pci *ab_pci, bool is_suspend) >> * workaround, otherwise ath11k_core_resume() will timeout >> * during resume. >> */ >> - if (is_suspend) >> + if (is_suspend) { >> mhi_power_down_keep_dev(ab_pci->mhi_ctrl, true); >> - else >> + } else { >> mhi_power_down(ab_pci->mhi_ctrl, true); >> - >> - mhi_unprepare_after_power_down(ab_pci->mhi_ctrl); >> + mhi_unprepare_after_power_down(ab_pci->mhi_ctrl); >> + } >> } >> >> int ath11k_mhi_suspend(struct ath11k_pci *ab_pci) >> diff --git a/drivers/net/wireless/ath/ath12k/mhi.c b/drivers/net/wireless/ath/ath12k/mhi.c >> index 08f44baf182a5..3af524ccf4a5a 100644 >> --- a/drivers/net/wireless/ath/ath12k/mhi.c >> +++ b/drivers/net/wireless/ath/ath12k/mhi.c >> @@ -601,6 +601,12 @@ static int ath12k_mhi_set_state(struct ath12k_pci *ab_pci, >> >> ath12k_mhi_set_state_bit(ab_pci, mhi_state); >> >> + /* mhi_power_down_keep_dev() has been updated to DEINIT without >> + * freeing bhie tables >> + */ >> + if (mhi_state == ATH12K_MHI_POWER_OFF_KEEP_DEV) >> + ath12k_mhi_set_state_bit(ab_pci, ATH12K_MHI_DEINIT); >> + >> return 0; >> >> out: >> @@ -635,12 +641,12 @@ void ath12k_mhi_stop(struct ath12k_pci *ab_pci, bool is_suspend) >> * workaround, otherwise ath12k_core_resume() will timeout >> * during resume. >> */ >> - if (is_suspend) >> + if (is_suspend) { >> ath12k_mhi_set_state(ab_pci, ATH12K_MHI_POWER_OFF_KEEP_DEV); >> - else >> + } else { >> ath12k_mhi_set_state(ab_pci, ATH12K_MHI_POWER_OFF); >> - >> - ath12k_mhi_set_state(ab_pci, ATH12K_MHI_DEINIT); >> + ath12k_mhi_set_state(ab_pci, ATH12K_MHI_DEINIT); >> + } >> } >> >> void ath12k_mhi_suspend(struct ath12k_pci *ab_pci) >> diff --git a/include/linux/mhi.h b/include/linux/mhi.h >> index dd372b0123a6d..6fd218a877855 100644 >> --- a/include/linux/mhi.h >> +++ b/include/linux/mhi.h >> @@ -306,6 +306,7 @@ struct mhi_controller_config { >> * if fw_image is NULL and fbc_download is true (optional) >> * @fw_sz: Firmware image data size for normal booting, used only if fw_image >> * is NULL and fbc_download is true (optional) >> + * @prev_fw_sz: Previous firmware image data size, when fbc_download is true >> * @edl_image: Firmware image name for emergency download mode (optional) >> * @rddm_size: RAM dump size that host should allocate for debugging purpose >> * @sbl_size: SBL image size downloaded through BHIe (optional) >> @@ -382,6 +383,7 @@ struct mhi_controller { >> const char *fw_image; >> const u8 *fw_data; >> size_t fw_sz; >> + size_t prev_fw_sz; >> const char *edl_image; >> size_t rddm_size; >> size_t sbl_size; > > -- Regards, Usama