On 7/22/25 6:17 PM, Yu Kuai wrote: > Hi, > > 在 2025/07/22 16:30, Nilay Shroff 写道: >> >> >> On 7/22/25 6:37 AM, Damien Le Moal wrote: >>> On 7/21/25 11:04 PM, Nilay Shroff wrote: >>>> When setting up a null block device, we initialize a tagset that >>>> includes a driver_data field—typically used by block drivers to >>>> store a pointer to driver-specific data. In the case of null_blk, >>>> this should point to the struct nullb instance. >>>> >>>> However, due to recent tagset refactoring in the null_blk driver, we >>>> missed initializing driver_data when creating a shared tagset. As a >>>> result, software queues (ctx) fail to map correctly to new hardware >>>> queues (hctx). For example, increasing the number of submit queues >>>> triggers an nr_hw_queues update, which invokes null_map_queues() to >>>> remap queues. Since set->driver_data is unset, null_map_queues() >>>> fails to map any ctx to the new hctxs, leading to hctx->nr_ctx == 0, >>>> effectively making the hardware queues unusable for I/O. >>>> >>>> This patch fixes the issue by ensuring that set->driver_data is properly >>>> initialized to point to the struct nullb during tagset setup. >>>> >>>> Fixes: 72ca28765fc4 ("null_blk: refactor tag_set setup") >>>> Reviewed-by: Hannes Reinecke <hare@xxxxxxx> >>>> Signed-off-by: Nilay Shroff <nilay@xxxxxxxxxxxxx> >>>> --- >>>> drivers/block/null_blk/main.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c >>>> index aa163ae9b2aa..fbae0427263d 100644 >>>> --- a/drivers/block/null_blk/main.c >>>> +++ b/drivers/block/null_blk/main.c >>>> @@ -1856,6 +1856,7 @@ static int null_setup_tagset(struct nullb *nullb) >>>> { >>>> if (nullb->dev->shared_tags) { >>>> nullb->tag_set = &tag_set; >>>> + nullb->tag_set->driver_data = nullb; >>> >>> This looks better, but in the end, why is this even needed ? Since this is a >>> shared tagset, multiple nullb devices can use that same tagset, so setting the >>> driver_data pointer to this device only seems incorrect. >>> >>> Checking the code, the only function that makes use of this pointer is >>> null_map_queues(), which correctly test for private_data being NULL. >>> >>> So why do we need this ? Isn't your patch 1/2 enough to fix the crash you got ? >>> >> I think you were right — the first patch alone is sufficient to prevent the >> crash. >> Initially, I thought the second patch might also be necessary, especially for >> the >> scenario where the user increases the number of submit_queues for a null_blk >> device. >> While the block layer does create a new hardware queue (hctx) in response to >> such >> an update, null_blk (null_map_queues()) does not map any software queue (ctx) >> to it. >> As a result, the newly added hctx remains unused for I/O. >> >> Given this behavior, I believe we should not allow users to update submit_queues >> on a null_blk device if it is using a shared tagset. Currently, the code allows >> this update, but it’s misleading since the change has no actual effect. >> >> Would it make sense to explicitly prevent updating submit_queues in this case? >> That would align the interface with the actual behavior and avoid potential >> confusion and also saves us allocating memory for hctx which remains unused. >> Maybe we should have this check, in nullb_update_nr_hw_queues(): >> >> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c >> index fbae0427263d..aed2a6904db1 100644 >> --- a/drivers/block/null_blk/main.c >> +++ b/drivers/block/null_blk/main.c >> @@ -388,6 +388,12 @@ static int nullb_update_nr_hw_queues(struct nullb_device >> *dev, >> if (!submit_queues) >> return -EINVAL; >> + /* >> + * Cannot update queues with shared tagset. >> + */ >> + if (dev->shared_tags) >> + return -EINVAL; >> + >> /* >> * Make sure that null_init_hctx() does not access nullb->queues[] past >> * the end of that array. >> >> If the above change looks good, maybe I can update second patch with >> the above change. > > I'm good with this change, howerver, with a quick look it seems lots of > configfs api are broken for shared_tags. For example: > > If we switch shared_tags from 0 to 1, and then read submit_queues, > looks like it's the old dev->submit_queues instead of g_submit_queues; g_submit_queues is used as the initial value for dev->submit_queues. See nullb_alloc_dev(). So if we prevent changing dev->submit_queues with configfs, we'll get whatever g_submit_queues was set on modprobe for the shared tagset. -- Damien Le Moal Western Digital Research