Hi Dan,

On Wed, 6 Aug 2025 at 20:24, Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote:
>
> On Tue, Aug 05, 2025 at 12:50:28AM +0530, Naresh Kamboju wrote:
> > While booting and testing selftest cgroups and filesystem testing on arm64
> > dragonboard-410c the following kernel warnings / errors noticed and system
> > halted and did not recover with selftests Kconfig enabled running the kernel
> > Linux next tag next-20250804.
> >
> > Regression Analysis:
> > - New regression? Yes
> > - Reproducibility? Re-validation is in progress
> >
> > First seen on the next-20250804
> > Good: next-20250801
> > Bad: next-20250804
> >
> > Test regression: next-20250804 Unable to handle kernel execute from
> > non-executable memory at virtual address idem_hash
> > Test regression: next-20250804 refcount_t: addition on 0;
> > use-after-free refcount_warn_saturate
> >
> > Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> >
> > ## Test crash log
> > [ 9.811341] Unable to handle kernel NULL pointer dereference at
> > virtual address 000000000000002e
> > [ 9.811444] Mem abort info:
> > [ 9.821150] ESR = 0x0000000096000004
> > [ 9.833499] SET = 0, FnV = 0
> > [ 9.833566] EA = 0, S1PTW = 0
> > [ 9.835511] FSC = 0x04: level 0 translation fault
> > [ 9.838901] Data abort info:
> > [ 9.843788] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> > [ 9.846565] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> > [ 9.851938] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > [ 9.853510] rtc-pm8xxx 200f000.spmi:pmic@0:rtc@6000: registered as rtc0
> > [ 9.856992] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000856f8000
> > [ 9.862446] rtc-pm8xxx 200f000.spmi:pmic@0:rtc@6000: setting system
> > clock to 1970-01-01T00:00:31 UTC (31)
> > [ 9.868789] [000000000000002e] pgd=0000000000000000, p4d=0000000000000000
> > [ 9.875459] Internal error: Oops: 0000000096000004 [#1] SMP
> > [ 9.889547] input: pm8941_pwrkey as
> > /devices/platform/soc@0/200f000.spmi/spmi-0/0-00/200f000.spmi:pmic@0:pon@800/200f000.spmi:pmic@0:pon@800:pwrkey/input/input1
> > [ 9.891545] Modules linked in: qcom_spmi_temp_alarm rtc_pm8xxx
> > qcom_pon(+) qcom_pil_info videobuf2_dma_sg ubwc_config qcom_q6v5
> > venus_core(+) qcom_sysmon qcom_spmi_vadc v4l2_fwnode llcc_qcom
> > v4l2_async qcom_vadc_common qcom_common ocmem v4l2_mem2mem drm_gpuvm
> > videobuf2_memops qcom_glink_smem videobuf2_v4l2 drm_exec mdt_loader
> > qmi_helpers gpu_sched drm_dp_aux_bus qnoc_msm8916 videodev
> > drm_display_helper qcom_stats videobuf2_common cec qcom_rng
> > drm_client_lib mc phy_qcom_usb_hs socinfo rpmsg_ctrl display_connector
> > rpmsg_char ramoops rmtfs_mem reed_solomon drm_kms_helper fuse drm
> > backlight
> > [ 9.912286] input: pm8941_resin as
> > /devices/platform/soc@0/200f000.spmi/spmi-0/0-00/200f000.spmi:pmic@0:pon@800/200f000.spmi:pmic@0:pon@800:resin/input/input2
> > [ 9.941186] CPU: 2 UID: 0 PID: 221 Comm: (udev-worker) Not tainted
> > 6.16.0-next-20250804 #1 PREEMPT
> > [ 9.941200] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
> > [ 9.941206] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ 9.941215] pc : dev_pm_opp_put (/builds/linux/drivers/opp/core.c:1685)
> > [ 9.941233] lr : core_clks_enable+0x54/0x148 venus_core
> > [ 10.004266] sp : ffff8000842b35f0
> > [ 10.004273] x29: ffff8000842b35f0 x28: ffff8000842b3ba0 x27: ffff0000047be938
> > [ 10.004289] x26: 0000000000000000 x25: 0000000000000000 x24: ffff80007b350ba0
> > [ 10.004303] x23: ffff00000ba380c8 x22: ffff00000ba38080 x21: 0000000000000000
> > [ 10.004316] x20: 0000000000000000 x19: ffffffffffffffee x18: 00000000ffffffff
> > [ 10.004330] x17: 0000000000000000 x16: 1fffe000017541a1 x15: ffff8000842b3560
> > [ 10.004344] x14: 0000000000000000 x13: 007473696c5f7974 x12: 696e696666615f65
> > [ 10.004358] x11: 00000000000000c0 x10: 0000000000000020 x9 : ffff80007b33f2bc
> > [ 10.004371] x8 : ffffffffffffffde x7 : ffff0000044a4800 x6 : 0000000000000000
> > [ 10.004384] x5 : 0000000000000002 x4 : 00000000c0000000 x3 : 0000000000000001
> > [ 10.004397] x2 : 0000000000000002 x1 : ffffffffffffffde x0 : ffffffffffffffee
> > [ 10.004412] Call trace:
> > [ 10.004417] dev_pm_opp_put (/builds/linux/drivers/opp/core.c:1685) (P)
> > [ 10.004435] core_clks_enable+0x54/0x148 venus_core
> > [ 10.004504] core_power_v1+0x78/0x90 venus_core
> > [ 10.004560] venus_runtime_resume+0x6c/0x98 venus_core
> > [ 10.004616] pm_generic_runtime_resume
>
> Could you try adding some error checking to core_clks_enable()?
> Does the patch below help?

Your patch works.

The attached patch from Sasha fixes this reported problem on today's
Linux next tag.

$ git log --oneline next-20250805..next-20250807 -- drivers/media/platform/qcom/venus/pm_helpers.c
7881cd6886a89 media: venus: Fix OPP table error handling

- Naresh
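
For reference, the register dump above looks consistent with an error pointer reaching dev_pm_opp_put(): x1 and x8 hold ffffffffffffffde, which is -34 as a signed 64-bit value, and ERANGE is 34 on Linux. The following is a userspace sketch only, not kernel code. ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented here to mirror the kernel helpers of the same names, decode_err_reg() and wrapped_offset() are hypothetical helpers invented for this illustration, and 64-bit pointers are assumed.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins mirroring the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }

/* True when ptr falls in the top MAX_ERRNO bytes of the address space. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/*
 * Hypothetical helper: interpret a raw 64-bit register value as a pointer
 * and return the positive errno it encodes, or 0 if it is not an error
 * pointer. Assumes a 64-bit build.
 */
static inline long decode_err_reg(uint64_t reg)
{
	const void *p = (const void *)(uintptr_t)reg;

	return IS_ERR(p) ? -PTR_ERR(p) : 0;
}

/*
 * Hypothetical helper: distance from base to fault modulo 2^64, i.e. the
 * field offset that would make base + offset wrap around to fault.
 */
static inline uint64_t wrapped_offset(uint64_t fault, uint64_t base)
{
	return fault - base;
}
```

With these helpers, decode_err_reg(0xffffffffffffffde) evaluates to ERANGE, and wrapped_offset(0x2e, 0xffffffffffffffde) evaluates to 0x50. That would fit an access to a field at a small offset from the bogus pointer, which is why the oops reports a tiny virtual address rather than an obviously invalid one. This reading is an inference from the log, not something stated in the thread.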
commit 7881cd6886a89eda848192d3f5759ce08672e084
Author: Sasha Levin <sashal@xxxxxxxxxx>
Date:   Tue Aug 5 08:58:20 2025 -0400

    media: venus: Fix OPP table error handling

    The venus driver fails to check if dev_pm_opp_find_freq_{ceil,floor}()
    returns an error pointer before calling dev_pm_opp_put(). This causes
    a crash when OPP tables are not present in device tree.

        Unable to handle kernel access to user memory outside uaccess
        routines at virtual address 000000000000002e
        ...
        pc : dev_pm_opp_put+0x1c/0x4c
        lr : core_clks_enable+0x4c/0x16c [venus_core]

    Add IS_ERR() checks before calling dev_pm_opp_put() to avoid
    dereferencing error pointers.

    Fixes: b179234b5e59 ("media: venus: pm_helpers: use opp-table for the frequency")
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
    Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

diff --git a/drivers/media/platform/qcom/venus/pm_helpers.c b/drivers/media/platform/qcom/venus/pm_helpers.c
index 8dd5a9b0d060c..e32f8862a9f90 100644
--- a/drivers/media/platform/qcom/venus/pm_helpers.c
+++ b/drivers/media/platform/qcom/venus/pm_helpers.c
@@ -48,7 +48,8 @@ static int core_clks_enable(struct venus_core *core)
 	int ret;
 
 	opp = dev_pm_opp_find_freq_ceil(dev, &freq);
-	dev_pm_opp_put(opp);
+	if (!IS_ERR(opp))
+		dev_pm_opp_put(opp);
 
 	for (i = 0; i < res->clks_num; i++) {
 		if (IS_V6(core)) {
@@ -660,7 +661,8 @@ static int decide_core(struct venus_inst *inst)
 	/*TODO : divide this inst->load by work_route */
 
 	opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
-	dev_pm_opp_put(opp);
+	if (!IS_ERR(opp))
+		dev_pm_opp_put(opp);
 
 	min_loaded_core(inst, &min_coreid, &min_load, false);
 	min_loaded_core(inst, &min_lp_coreid, &min_lp_load, true);
@@ -1121,7 +1123,8 @@ static int load_scale_v4(struct venus_inst *inst)
 	freq = max(freq_core1, freq_core2);
 
 	opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
-	dev_pm_opp_put(opp);
+	if (!IS_ERR(opp))
+		dev_pm_opp_put(opp);
 
 	if (freq > max_freq) {
 		dev_dbg(dev, VDBGL "requested clock rate: %lu scaling clock rate : %lu\n",
@@ -1131,7 +1134,8 @@ static int load_scale_v4(struct venus_inst *inst)
 	}
 
 	opp = dev_pm_opp_find_freq_ceil(dev, &freq);
-	dev_pm_opp_put(opp);
+	if (!IS_ERR(opp))
+		dev_pm_opp_put(opp);
 
 set_freq:
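
The pattern the patch applies at each call site can be tried outside the kernel with a small mock. This is a minimal sketch under stated assumptions, not the venus driver or the OPP core: struct mock_opp, mock_find_freq_ceil(), mock_opp_put() and mock_round_rate() are hypothetical names, the table is assumed to be sorted by ascending rate, and the ERR_PTR()/PTR_ERR()/IS_ERR() helpers are userspace re-implementations that mirror the kernel ones.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins mirroring the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for an OPP entry; not the kernel's struct dev_pm_opp. */
struct mock_opp {
	unsigned long rate;
	int refcount;
};

/*
 * Hypothetical lookup: return the first entry whose rate is >= *freq with a
 * reference taken and *freq updated, or ERR_PTR(-ERANGE) when nothing
 * matches, which includes the empty-table case.
 */
static inline struct mock_opp *mock_find_freq_ceil(struct mock_opp *table,
						   size_t n, unsigned long *freq)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].rate >= *freq) {
			*freq = table[i].rate;
			table[i].refcount++;
			return &table[i];
		}
	}
	return ERR_PTR(-ERANGE);
}

/* Drops a reference; dereferences opp, so it must never see an error pointer. */
static inline void mock_opp_put(struct mock_opp *opp)
{
	opp->refcount--;
}

/*
 * Guarded caller: only a valid entry is passed to mock_opp_put(). Returning
 * the errno is a choice made for this sketch; the patch itself only skips
 * the put and carries on.
 */
static inline int mock_round_rate(struct mock_opp *table, size_t n,
				  unsigned long *freq)
{
	struct mock_opp *opp = mock_find_freq_ceil(table, n, freq);

	if (IS_ERR(opp))
		return (int)PTR_ERR(opp);

	mock_opp_put(opp);
	return 0;
}
```

Calling mock_round_rate() with an empty table returns -ERANGE and leaves *freq untouched, whereas an unguarded mock_opp_put() on the same result would dereference an address near the top of the address space, which is the failure mode the commit message describes.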