Re: ublk: kernel crash when killing SPDK application

On Tue, Apr 15, 2025 at 10:58:37AM +0000, Guy Eisenberg wrote:
> I am writing to report a kernel crash that occurred after terminating (kill -9) an SPDK application using ublk.
> Below are the details of the incident, including steps to reproduce the issue and the call stack.
> 
> Incident Description:
> After terminating an SPDK application, the system occasionally experiences a kernel crash.
> This issue is not consistent but happens once every few tries under the following conditions.
> We are using kernel 6.14.0-061400-generic
> 
> Steps to Reproduce:
> 1. install SPDK:
>       git clone https://github.com/spdk/spdk ;
>       cd spdk
>       ./configure --disable-coverage --disable-debug --disable-tests --enable-unit-tests --without-crypto --without-fio --with-vhost --with-rdma --without-nvme-cuse --without-fuse --without-vfio-user --without-vtune --without-iscsi-initiator --without-shared --with-ublk --with-uring --with-raid5f
>       make
>       make install
> 2.  Create SPDK bdev (here we used PCI 0000.8b.00.0 as the nvme target, and named the bdev as guy_bdev):
>       ./spdk/scripts/setup.sh reset
>       ./spdk/scripts/setup.sh
>       /usr/local/bin/spdk_tgt --mem-size 2048 -m 0xff
>       ./spdk/scripts/rpc.py bdev_nvme_attach_controller -b guy_bdev -t PCIe -a 0000.8b.00.0
> 3. Expose it via ublk
>       modprobe ublk_drv
>       ./spdk/scripts/rpc.py ublk_create_target
>       ./spdk/scripts/rpc.py ublk_start_disk -q 8 -d 128 guy_bdevn1 0
> 4. Run IO to the /dev/ublkb0 that was created
>       Kill the spdk_tgt process (kill -9)
> 
> 
> Call Stack:
>       Below is the call stack captured during one of the crashes:
> 
> [54346.157495] [ T288311] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [54346.157625] [ T288311] #PF: supervisor write access in kernel mode
> [54346.157708] [ T288311] #PF: error_code(0x0002) - not-present page
> [54346.157790] [ T288311] PGD 0 P4D 0 
> [54346.157911] [ T288311] Oops: Oops: 0002 [#1] PREEMPT SMP PTI
> [54346.158010] [ T288311] CPU: 0 UID: 0 PID: 288311 Comm: reactor_0 Kdump: loaded Tainted: G           OE      6.14.0-061400-generic #202503241442
> [54346.158264] [ T288311] Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
> [54346.158374] [ T288311] Hardware name: Supermicro SYS-2028BT-HNR+/X10DRT-B+, BIOS 2.0 01/10/2017
> [54346.158490] [ T288311] RIP: 0010:percpu_ref_get_many+0x35/0x50

This looks like a uring_cmd use-after-free issue.

The following patchset may avoid it:

	https://lore.kernel.org/linux-block/20250414112554.3025113-1-ming.lei@xxxxxxxxxx/

If you can build and test a kernel, please apply the following debug patch
against v6.14 and post the panic log.


diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index ca9a67b5b537..6e50e8b9f836 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1127,6 +1127,7 @@ static void ubq_complete_io_cmd(struct ublk_io *io, int res,
 
 	/* tell ublksrv one io request is coming */
 	io_uring_cmd_done(io->cmd, res, 0, issue_flags);
+	io->cmd = NULL;
 }
 
 #define UBLK_REQUEUE_DELAY_MS	3
@@ -1498,8 +1499,10 @@ static void ublk_cancel_cmd(struct ublk_queue *ubq, struct ublk_io *io,
 		io->flags |= UBLK_IO_FLAG_CANCELED;
 	spin_unlock(&ubq->cancel_lock);
 
-	if (!done)
+	if (!done) {
 		io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, 0, issue_flags);
+		io->cmd = NULL;
+	}
 }
 
 /*
@@ -1770,6 +1773,8 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 	if (!ubq || ub_cmd->q_id != ubq->q_id)
 		goto out;
 
+	WARN_ON_ONCE(ubq->canceling);
+
 	if (ubq->ubq_daemon && ubq->ubq_daemon != current)
 		goto out;
 



Thanks,
Ming




