On Thu, Jul 24, 2025 at 04:54:43PM +0300, Eugen Hristev wrote:
> kmemdump is a mechanism which allows the kernel to mark specific memory
> areas for dumping or specific backend usage.
> Once regions are marked, kmemdump keeps an internal list with the regions
> and registers them in the backend.
> Further, depending on the backend driver, these regions can be dumped using
> firmware or a different hardware block.
> Because regions are marked beforehand, while the system is up and running,
> there is no need for, or dependency on, a panic handler or a working kernel
> that can dump the debug information.
> The kmemdump approach works when pstore, kdump, or another mechanism does not.
> Pstore relies on persistent storage, a dedicated RAM area or flash, which
> has the disadvantage of having the memory reserved all the time, or another
> specific non-volatile memory. Some devices cannot keep the RAM contents on
> reboot so ramoops does not work. Some devices do not allow kexec to run
> another kernel to debug the crashed one.
> For such devices, that have another mechanism to help debugging, like
> firmware, kmemdump is a viable solution.
>
> kmemdump can create a core image, similar to /proc/vmcore, with only
> the registered regions included. This can be loaded into crash tool/gdb and
> analyzed.
> To have this working, specific information from the kernel is registered,
> and this is done at kmemdump init time, no need for the kmemdump user to
> do anything.
>
> This version of the kmemdump patch series includes two backend drivers:
> one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
> backend for Android devices, reworked from this source here:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
> written originally by Jone Chou <jonechou@xxxxxxxxxx>
>
> Initial version of kmemdump and discussion is available here:
> https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@xxxxxxxxxx/
>
> Kmemdump has been presented and discussed at Linaro Connect 2025,
> including motivation, scope, usability and feasibility.
> Video of the recording is available here for anyone interested:
> https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
>
> The implementation is based on the initial Pstore/directly mapped zones
> published as an RFC here:
> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@xxxxxxxxxx/
>
> The back-end implementation for qcom_minidump is based on the minidump
> patch series and driver written by Mukesh Ojha, thanks:
> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@xxxxxxxxxxx/
>
> *** How to use kmemdump with minidump backend on Qualcomm platform guide ***
>
> Prerequisites:
> Crash tool with target=ARM64 and minor changes required for usual crash mode
> (minimal mode works without the patch)
> A patch can be applied from here https://p.calebs.dev/49a048
> This patch will be eventually sent in a reworked way to crash tool.
>
> Target kernel must be built with :
> CONFIG_DEBUG_INFO_REDUCED=n ; this will have vmlinux include all the debugging
> information needed for crash tool.
>
> Otherwise, the kernel requires these as well:
> CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
> CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
>
> Kernel arguments:
> Kernel firmware must be set to mode 'mini' by kernel module parameter
> like this : qcom_scm.download_mode=mini
>
> After the kernel boots, and qcom_minidump module is loaded, everything is ready for
> a possible crash.
>
> Once the crash happens, the firmware will kick in and you will see on
> the console the message saying Sahara init, etc, that the firmware is
> waiting in download mode. (this is subject to firmware supporting this
> mode, I am using sa8775p-ride board)
>
> Example of log on the console:
> "
> [...]
> B - 1096414 - usb: init start
> B - 1100287 - usb: qusb_dci_platform , 0x19
> B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
> B - 1107455 - usb: usb2phy: PRIM success , 0x4
> B - 1112670 - usb: dci, chgr_type_det_err
> B - 1117154 - usb: ID:0x260, value: 0x4
> B - 1121942 - usb: ID:0x108, value: 0x1d90
> B - 1124992 - usb: timer_start , 0x4c4b40
> B - 1129140 - usb: vbus_det_pm_unavail
> B - 1133136 - usb: ID:0x252, value: 0x4
> B - 1148874 - usb: SUPER , 0x900e
> B - 1275510 - usb: SUPER , 0x900e
> B - 1388970 - usb: ID:0x20d, value: 0x0
> B - 1411113 - usb: ENUM success
> B - 1411113 - Sahara Init
> B - 1414285 - Sahara Open
> "
>
> Once the board is in download mode, you can use the qdl tool (I
> personally use edl , have not tried qdl yet), to get all the regions as
> separate files.
> The tool from the host computer will list the regions in the order they
> were downloaded.
>
> Once you have all the files simply use `cat` to put them all together,
> in the order of the indexes.
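
A minimal sketch of how this concatenation step could be scripted rather than typed by hand. It assumes the downloaded region files follow the md_K<name><index>.BIN naming visible in the cat command in this thread, with the trailing decimal digits being the index. The function names are hypothetical, and a region whose name itself ends in a digit would be parsed with the wrong index, so check the printed order before trusting the image.

```python
#!/usr/bin/env python3
"""Concatenate minidump region files in ascending index order (sketch)."""
import re
import shutil
import sys
from pathlib import Path

# Assumption: files are named md_K<region name><index>.BIN and the index
# is the run of decimal digits right before ".BIN".
REGION_RE = re.compile(r"^md_K.*?(\d+)\.BIN$")


def region_index(path: Path) -> int:
    """Return the numeric index encoded at the end of a region file name."""
    match = REGION_RE.match(path.name)
    if match is None:
        raise ValueError(f"unexpected region file name: {path.name}")
    return int(match.group(1))


def ordered_regions(directory: Path) -> list[Path]:
    """List region files in the directory, sorted numerically by index."""
    files = [p for p in directory.iterdir() if REGION_RE.match(p.name)]
    return sorted(files, key=region_index)


def build_image(directory: Path, output: Path) -> list[Path]:
    """Write all regions back to back into output; return the order used."""
    regions = ordered_regions(directory)
    with output.open("wb") as out:
        for region in regions:
            with region.open("rb") as src:
                shutil.copyfileobj(src, out)
    return regions


# Usage: script.py <region-dir> <output-image>
if __name__ == "__main__" and len(sys.argv) == 3:
    for used in build_image(Path(sys.argv[1]), Path(sys.argv[2])):
        print(used.name)
```

Sorting on the parsed integer rather than on the file name matters here, because a plain lexical sort would place index 10 before index 2. Files that do not match the pattern are skipped, and gaps in the index sequence are simply passed over.
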
> For my kernel config and setup, here is my cat command : (you can use a script
> or something, I haven't done that so far):
>
> `cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
> memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
> memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
> memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
> memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
> memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
> memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
> memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
> memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
> memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
> memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
> memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
> memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
> memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
> memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
> memory/md_Kunknown46.BIN memory/md_Kunknown47.BIN memory/md_Kunknown50.BIN \
> memory/md_Kunknown51.BIN memory/md_Kunknown52.BIN \
> memory/md_Kunknown53.BIN > ~/minidump_image`
>
> Once you have the resulting file, use `crash` tool to load it, like this:
> `./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
>
> There is also a --minimal mode for ./crash that would work without any patch applied
> to crash tool, but you can't inspect symbols, etc.

Unfortunately, I could see the 'log' only with the --minimal option.

./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image

WARNING: kernel version inconsistency between vmlinux and dumpfile

crash: read error: kernel virtual address: ffffff8ed7f380d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f510d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f6a0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f830d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f9c0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fb50d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fce0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fe70d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffffc0817c5d80 type: "maple_init read mt_slots"
crash: read error: kernel virtual address: ffffffc0817c5d78 type: "maple_init read mt_pivots"
crash: read error: kernel virtual address: ffffff8efb89e2c0 type: "memory section root table"

It looks like you are using something more in your setup to make this work.

-Mukesh

>
> Once you load crash you will see something like this :
>
> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
> DUMPFILE: /home/eugen/new
> CPUS: 8 [OFFLINE: 7]
> DATE: Thu Jan 1 02:00:00 EET 1970
> UPTIME: 00:00:29
> TASKS: 0
> NODENAME: qemuarm64
> RELEASE: 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty
> VERSION: #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
> MACHINE: aarch64 (unknown Mhz)
> MEMORY: 34.2 GB
> PANIC: ""
> crash> log
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
> [ 0.000000] Linux version 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty (eugen@eugen-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
>
>
> *** Debug Kinfo backend driver ***
> I don't have any device to actually test this. So I have not.
> I hacked the driver to just use a kmalloc'ed area to save things instead
> of the shared memory, and dumped everything there and checked whether it looks
> sane. If someone is willing to try it out, thanks, and let me know.
> I know there is no binding documentation for the compatible either.
>
> Thanks to everyone reviewing and bringing ideas into the discussion.
>
> Eugen
>
> Changelog since the v1 of the RFC:
> - Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
>   This means new API, macros, new way to store the regions inside kmemdump
>   (ditched the IDR, moved to static allocation, have a static default backend, etc)
> - Reworked qcom_minidump driver based on review from Bjorn Andersson
> - Reworked printk log buffer registration based on review from Petr Mladek
>
> I apologize if I missed any review comments. I know there is still lots of work
> on this series and hope I will improve it more and more.
> Patches are sent on top of next-20250721
>
> Eugen Hristev (29):
>   kmemdump: introduce kmemdump
>   Documentation: add kmemdump
>   kmemdump: add coreimage ELF layer
>   Documentation: kmemdump: add section for coreimage ELF
>   kmemdump: introduce qcom-minidump backend driver
>   soc: qcom: smem: add minidump device
>   init/version: Annotate static information into Kmemdump
>   cpu: Annotate static information into Kmemdump
>   genirq/irqdesc: Annotate static information into Kmemdump
>   panic: Annotate static information into Kmemdump
>   sched/core: Annotate static information into Kmemdump
>   timers: Annotate static information into Kmemdump
>   kernel/fork: Annotate static information into Kmemdump
>   mm/page_alloc: Annotate static information into Kmemdump
>   mm/init-mm: Annotate static information into Kmemdump
>   mm/show_mem: Annotate static information into Kmemdump
>   mm/swapfile: Annotate static information into Kmemdump
>   mm/percpu: Annotate static information into Kmemdump
>   mm/mm_init: Annotate static information into Kmemdump
>   printk: Register information into Kmemdump
>   kernel/configs: Register dynamic information into Kmemdump
>   mm/numa: Register information into Kmemdump
>   mm/sparse: Register information into Kmemdump
>   kernel/vmcore_info: Register dynamic information into Kmemdump
>   kmemdump: Add additional symbols to the coreimage
>   init/version: Annotate init uts name separately into Kmemdump
>   kallsyms: Annotate static information into Kmemdump
>   mm/init-mm: Annotate additional information into Kmemdump
>   kmemdump: Add Kinfo backend driver
>
>  Documentation/debug/index.rst      |  17 ++
>  Documentation/debug/kmemdump.rst   | 104 +++++++++
>  MAINTAINERS                        |  18 ++
>  drivers/Kconfig                    |   4 +
>  drivers/Makefile                   |   2 +
>  drivers/debug/Kconfig              |  55 +++++
>  drivers/debug/Makefile             |   6 +
>  drivers/debug/kinfo.c              | 304 +++++++++++++++++++++++++
>  drivers/debug/kmemdump.c           | 239 +++++++++++++++++++
>  drivers/debug/kmemdump_coreimage.c | 223 ++++++++++++++++++
>  drivers/debug/qcom_minidump.c      | 353 +++++++++++++++++++++++++++++
>  drivers/soc/qcom/smem.c            |  10 +
>  include/asm-generic/vmlinux.lds.h  |  13 ++
>  include/linux/kmemdump.h           | 219 ++++++++++++++++++
>  init/version.c                     |   6 +
>  kernel/configs.c                   |   6 +
>  kernel/cpu.c                       |   5 +
>  kernel/fork.c                      |   2 +
>  kernel/irq/irqdesc.c               |   2 +
>  kernel/kallsyms.c                  |  10 +
>  kernel/panic.c                     |   4 +
>  kernel/printk/printk.c             |  28 ++-
>  kernel/sched/core.c                |   2 +
>  kernel/time/timer.c                |   3 +-
>  kernel/vmcore_info.c               |   3 +
>  mm/init-mm.c                       |  12 +
>  mm/mm_init.c                       |   2 +
>  mm/numa.c                          |   5 +-
>  mm/page_alloc.c                    |   2 +
>  mm/percpu.c                        |   3 +
>  mm/show_mem.c                      |   2 +
>  mm/sparse.c                        |  16 +-
>  mm/swapfile.c                      |   2 +
>  33 files changed, 1670 insertions(+), 12 deletions(-)
>  create mode 100644 Documentation/debug/index.rst
>  create mode 100644 Documentation/debug/kmemdump.rst
>  create mode 100644 drivers/debug/Kconfig
>  create mode 100644 drivers/debug/Makefile
>  create mode 100644 drivers/debug/kinfo.c
>  create mode 100644 drivers/debug/kmemdump.c
>  create mode 100644 drivers/debug/kmemdump_coreimage.c
>  create mode 100644 drivers/debug/qcom_minidump.c
>  create mode 100644 include/linux/kmemdump.h
>
> --
> 2.43.0
>

--
-Mukesh Ojha