On 7/22/25 6:37 AM, Damien Le Moal wrote: > On 7/21/25 11:04 PM, Nilay Shroff wrote: >> When setting up a null block device, we initialize a tagset that >> includes a driver_data field—typically used by block drivers to >> store a pointer to driver-specific data. In the case of null_blk, >> this should point to the struct nullb instance. >> >> However, due to recent tagset refactoring in the null_blk driver, we >> missed initializing driver_data when creating a shared tagset. As a >> result, software queues (ctx) fail to map correctly to new hardware >> queues (hctx). For example, increasing the number of submit queues >> triggers an nr_hw_queues update, which invokes null_map_queues() to >> remap queues. Since set->driver_data is unset, null_map_queues() >> fails to map any ctx to the new hctxs, leading to hctx->nr_ctx == 0, >> effectively making the hardware queues unusable for I/O. >> >> This patch fixes the issue by ensuring that set->driver_data is properly >> initialized to point to the struct nullb during tagset setup. >> >> Fixes: 72ca28765fc4 ("null_blk: refactor tag_set setup") >> Reviewed-by: Hannes Reinecke <hare@xxxxxxx> >> Signed-off-by: Nilay Shroff <nilay@xxxxxxxxxxxxx> >> --- >> drivers/block/null_blk/main.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c >> index aa163ae9b2aa..fbae0427263d 100644 >> --- a/drivers/block/null_blk/main.c >> +++ b/drivers/block/null_blk/main.c >> @@ -1856,6 +1856,7 @@ static int null_setup_tagset(struct nullb *nullb) >> { >> if (nullb->dev->shared_tags) { >> nullb->tag_set = &tag_set; >> + nullb->tag_set->driver_data = nullb; > > This looks better, but in the end, why is this even needed ? Since this is a > shared tagset, multiple nullb devices can use that same tagset, so setting the > driver_data pointer to this device only seems incorrect. > > Checking the code, the only function that makes use of this pointer is > null_map_queues(), which correctly test for private_data being NULL. > > So why do we need this ? Isn't your patch 1/2 enough to fix the crash you got ? > I think you were right — the first patch alone is sufficient to prevent the crash. Initially, I thought the second patch might also be necessary, especially for the scenario where the user increases the number of submit_queues for a null_blk device. While the block layer does create a new hardware queue (hctx) in response to such an update, null_blk (null_map_queues()) does not map any software queue (ctx) to it. As a result, the newly added hctx remains unused for I/O. Given this behavior, I believe we should not allow users to update submit_queues on a null_blk device if it is using a shared tagset. Currently, the code allows this update, but it’s misleading since the change has no actual effect. Would it make sense to explicitly prevent updating submit_queues in this case? That would align the interface with the actual behavior and avoid potential confusion and also saves us allocating memory for hctx which remains unused. Maybe we should have this check, in nullb_update_nr_hw_queues(): diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index fbae0427263d..aed2a6904db1 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -388,6 +388,12 @@ static int nullb_update_nr_hw_queues(struct nullb_device *dev, if (!submit_queues) return -EINVAL; + /* + * Cannot update queues with shared tagset. + */ + if (dev->shared_tags) + return -EINVAL; + /* * Make sure that null_init_hctx() does not access nullb->queues[] past * the end of that array. If the above change looks good, maybe I can update second patch with the above change. Thanks, --Nilay