On 7/22/25 5:30 PM, Nilay Shroff wrote: > > > On 7/22/25 6:37 AM, Damien Le Moal wrote: >> On 7/21/25 11:04 PM, Nilay Shroff wrote: >>> When setting up a null block device, we initialize a tagset that >>> includes a driver_data field—typically used by block drivers to >>> store a pointer to driver-specific data. In the case of null_blk, >>> this should point to the struct nullb instance. >>> >>> However, due to recent tagset refactoring in the null_blk driver, we >>> missed initializing driver_data when creating a shared tagset. As a >>> result, software queues (ctx) fail to map correctly to new hardware >>> queues (hctx). For example, increasing the number of submit queues >>> triggers an nr_hw_queues update, which invokes null_map_queues() to >>> remap queues. Since set->driver_data is unset, null_map_queues() >>> fails to map any ctx to the new hctxs, leading to hctx->nr_ctx == 0, >>> effectively making the hardware queues unusable for I/O. >>> >>> This patch fixes the issue by ensuring that set->driver_data is properly >>> initialized to point to the struct nullb during tagset setup. >>> >>> Fixes: 72ca28765fc4 ("null_blk: refactor tag_set setup") >>> Reviewed-by: Hannes Reinecke <hare@xxxxxxx> >>> Signed-off-by: Nilay Shroff <nilay@xxxxxxxxxxxxx> >>> --- >>> drivers/block/null_blk/main.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c >>> index aa163ae9b2aa..fbae0427263d 100644 >>> --- a/drivers/block/null_blk/main.c >>> +++ b/drivers/block/null_blk/main.c >>> @@ -1856,6 +1856,7 @@ static int null_setup_tagset(struct nullb *nullb) >>> { >>> if (nullb->dev->shared_tags) { >>> nullb->tag_set = &tag_set; >>> + nullb->tag_set->driver_data = nullb; >> >> This looks better, but in the end, why is this even needed ? Since this is a >> shared tagset, multiple nullb devices can use that same tagset, so setting the >> driver_data pointer to this device only seems incorrect. >> >> Checking the code, the only function that makes use of this pointer is >> null_map_queues(), which correctly test for private_data being NULL. >> >> So why do we need this ? Isn't your patch 1/2 enough to fix the crash you got ? >> > I think you were right — the first patch alone is sufficient to prevent the crash. > Initially, I thought the second patch might also be necessary, especially for the > scenario where the user increases the number of submit_queues for a null_blk device. > While the block layer does create a new hardware queue (hctx) in response to such > an update, null_blk (null_map_queues()) does not map any software queue (ctx) to it. > As a result, the newly added hctx remains unused for I/O. > > Given this behavior, I believe we should not allow users to update submit_queues > on a null_blk device if it is using a shared tagset. Currently, the code allows > this update, but it’s misleading since the change has no actual effect. > > Would it make sense to explicitly prevent updating submit_queues in this case? > That would align the interface with the actual behavior and avoid potential > confusion and also saves us allocating memory for hctx which remains unused. > Maybe we should have this check, in nullb_update_nr_hw_queues(): > > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c > index fbae0427263d..aed2a6904db1 100644 > --- a/drivers/block/null_blk/main.c > +++ b/drivers/block/null_blk/main.c > @@ -388,6 +388,12 @@ static int nullb_update_nr_hw_queues(struct nullb_device *dev, > if (!submit_queues) > return -EINVAL; > > + /* > + * Cannot update queues with shared tagset. > + */ > + if (dev->shared_tags) > + return -EINVAL; > + I think that is fine since null_init_global_tag_set() uses g_submit_queues (module parameter). > /* > * Make sure that null_init_hctx() does not access nullb->queues[] past > * the end of that array. > > If the above change looks good, maybe I can update second patch with > the above change. > > Thanks, > --Nilay > -- Damien Le Moal Western Digital Research