Re: [RFC PATCH v2] hfs: add return values to hfs_brec_lenoff and hfs_bnode_read to improve robustness

杨晨志 <yang.chenzhi@xxxxxxxx> · Thu, 28 Aug 2025 12:35:01 +0000

在 2025/8/28 4:04, Viacheslav Dubeyko 写道:
> Frankly speaking, I don't think that reworking this method for
> returning the error code is necessary. Currently, we return the length
> value (u16) and we can return U16_MAX as for off as for len for the
> case of incorrect offset or erroneous logic. We can treat U16_MAX as
> error condition and we can check off and len for this value. Usually,
> HFS b-tree node has 512 bytes in size and as offset as length cannot be
> equal to U16_MAX (or bigger). And we don't need to change the input and
> output arguments if we will check for U16_MAX value.

Using U16_MAX as the error return value is reasonable. This change also 
applies to hfsplus, since the maximum node_size in hfsplus is 32768 
(0x8000), which is less than U16_MAX. Adopting this approach helps avoid 
extensive interface changes. I agree with your point.

>> diff --git a/fs/hfs/bnode.c b/fs/hfs/bnode.c
>> index e8cd1a31f247..b0bbaf016b8d 100644
>> --- a/fs/hfs/bnode.c
>> +++ b/fs/hfs/bnode.c
>> @@ -57,26 +57,16 @@ int check_and_correct_requested_length(struct
>> hfs_bnode *node, int off, int len)
>>   	return len;
>>   }
>>   
>> -void hfs_bnode_read(struct hfs_bnode *node, void *buf, int off, int
>> len)
>> +int __hfs_bnode_read(struct hfs_bnode *node, void *buf, u16 off, u16
>> len)
> 
> I don't follow why do we need to introduce __hfs_bnode_read(). One
> method for all is enough. And I think we still can use only
> hfs_bnode_read(). Because, we can initialize the buffer by 0x00 or 0xFF
> in the case we cannot read if offset or length are invalid. Usually,
> every method checks (or should) check the returning value of
> hfs_bnode_read().
> 
>>   {
>>   	struct page *page;
>>   	int pagenum;
>>   	int bytes_read;
>>   	int bytes_to_read;
>>   
>> -	if (!is_bnode_offset_valid(node, off))
>> -		return;
>> -
>> -	if (len == 0) {
>> -		pr_err("requested zero length: "
>> -		       "NODE: id %u, type %#x, height %u, "
>> -		       "node_size %u, offset %d, len %d\n",
>> -		       node->this, node->type, node->height,
>> -		       node->tree->node_size, off, len);
>> -		return;
>> -	}
>> -
>> -	len = check_and_correct_requested_length(node, off, len);
>> +	/* len = 0 is invalid: prevent use of an uninitalized
>> buffer*/
>> +	if (!len || !hfs_off_and_len_is_valid(node, off, len))
>> +		return -EINVAL;
>>   
>>   	off += node->page_offset;
>>   	pagenum = off >> PAGE_SHIFT;
>> @@ -93,6 +83,47 @@ void hfs_bnode_read(struct hfs_bnode *node, void
>> *buf, int off, int len)
>>   		pagenum++;
>>   		off = 0; /* page offset only applies to the first
>> page */
>>   	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int __hfs_bnode_read_u16(struct hfs_bnode *node, u16* buf,
>> u16 off)
> 
> I don't see the point to introduce another version of method because we
> can return U16_MAX as invalid value.
> 
>> +{
>> +	__be16 data;
>> +	int res;
>> +
>> +	res = __hfs_bnode_read(node, (void*)(&data), off, 2);
>> +	if (res)
>> +		return res;
>> +	*buf = be16_to_cpu(data);
>> +	return 0;
>> +}
>> +
>> +
>> +static int __hfs_bnode_read_u8(struct hfs_bnode *node, u8* buf, u16
>> off)
> 
> And we can return U8_MAX as invalid value here too.

My original plan was to add a return value to hfs_bnode_read(). However, 
since modifying this interface would require changes across dozens of 
call sites—which seems too extensive—and introducing a return value 
might necessitate additional error handling in certain functions (e.g., 
hfs_bnode_split()), I wasn’t entirely sure how to proceed.

Therefore, I created __hfs_bnode_read() -- a function that behaves 
identically to hfs_bnode_read but includes a return value—as a way to 
experiment and facilitate discussion. I don’t think we ultimately need 
an additional function; its purpose is strictly to help evaluate, within 
this RFC patch, whether adding a return value to hfs_bnode_read is 
desirable and whether the approach taken in __hfs_bnode_read() would be 
suitable.

I agree that we should check return values at every call site—there are 
too many issues in HFS caused by missing checks. However, I’m a bit 
confused by the suggestion to validate the hfs_bnode_read buffer 
directly. If the buffer is a structured type (such as btree_key), 
detecting errors becomes more challenging. In such cases, we may need to 
rely on methods like memcmp() or memchr_inv(). From that perspective, 
adding a return value to hfs_bnode_read seems an easier way.

If you have a lightweight method to quickly verify the validity of a 
read without introducing a return value, please let me know. I can 
follow up with a new patch to address it.
>> @@ -251,23 +283,28 @@ void hfs_bnode_dump(struct hfs_bnode *node)
>>   
>>   	off = node->tree->node_size - 2;
>>   	for (i = be16_to_cpu(desc.num_recs); i >= 0; off -= 2, i--)
>> {
>> -		key_off = hfs_bnode_read_u16(node, off);
>> +		res = __hfs_bnode_read_u16(node, &key_off, off);
>> +		if (res) return;
>>   		hfs_dbg_cont(BNODE_MOD, " %d", key_off);
>>   		if (i && node->type == HFS_NODE_INDEX) {
>> -			int tmp;
>> -
>> -			if (node->tree->attributes &
>> HFS_TREE_VARIDXKEYS)
>> -				tmp = (hfs_bnode_read_u8(node,
>> key_off) | 1) + 1;
>> -			else
>> +			u8 tmp, data;
>> +			if (node->tree->attributes &
>> HFS_TREE_VARIDXKEYS) {
>> +				res = __hfs_bnode_read_u8(node,
>> &data, key_off);
>> +				if (res) return;
> 
> This breaks the kernel code style.

Thank you for pointing out the formatting issues. I’ll be more careful 
about that in future development.

> 
> Please, see my comments above. I think we can use U16_MAX as invalid
> value.
> 

I understand your point. For functions like hfs_brec_lenoff, 
hfs_bnode_read_u16, and hfs_bnode_read_u8, using U16_MAX or U8_MAX as 
error return values does seem reasonable.

I will submit a new patch for modifying hfs_brec_lenoff using U16_MAX as 
returned-value to prevent potential out-of-bounds issues in 
hfs_bmap_alloc and hfs_bmap_free.

Regarding changes to hfs_bnode_read, I’d appreciate further guidance 
from you before proceeding with any modifications.

>> @@ -394,7 +401,9 @@ void hfs_bmap_free(struct hfs_bnode *node)
>>   			hfs_bnode_put(node);
>>   			return;
>>   		}
>> -		len = hfs_brec_lenoff(node, 0, &off);
>> +		res = hfs_brec_lenoff(node, 0, &off, &len);
>> +		if (res)
>> +			goto fail;
>>   	}
>>   	off += node->page_offset + nidx / 8;
>>   	page = node->page[off >> PAGE_SHIFT];
>> @@ -415,4 +424,8 @@ void hfs_bmap_free(struct hfs_bnode *node)
>>   	hfs_bnode_put(node);
>>   	tree->free_nodes++;
>>   	mark_inode_dirty(tree->inode);
>> +	return;
>> +fail:
>> +	hfs_bnode_put(node);
>> +	pr_err("fail to free a bnode due to invalid off or len\n");
> 
> I am not sure that breaking free bnode logic because of incorrect
> length (and, maybe, even offset) is correct behavior. Because, we can
> correct length and continue to free bnode. So, I am not sure that
> complete failure is the correct way of managing the situation.
> 

I'll look into this more carefully — the current approach does seem 
suboptimal.

Thanks,
Chenzhi Yang