On Thu, Jul 24, 2025 at 09:45:51PM +0800, Fan Gong wrote:
> > > +
> > > +/* Data provided to/by cmdq is arranged in structs with little endian fields but
> > > + * every dword (32bits) should be swapped since HW swaps it again when it
> > > + * copies it from/to host memory.
> > > + */
> >
> > This scheme may work on little endian hosts.
> > But if so it seems unlikely to work on big endian hosts.
> >
> > I expect you want be32_to_cpu_array() for data coming from hw,
> > with a source buffer as an array of __be32 while
> > the destination buffer is an array of u32.
> >
> > And cpu_to_be32_array() for data going to the hw,
> > with the types of the source and destination buffers reversed.
> >
> > If those types don't match your data, then we have
> > a framework to have that discussion.
> >
> >
> > That said, it is more usual for drivers to keep structures in the byte
> > order they are received. Stored in structures with members with types, in
> > this case it seems that would be __be32, and accessed using a combination
> > of BIT/GENMASK, FIELD_PREP/FIELD_GET, and cpu_to_be*/be*_to_cpu (in this
> > case cpu_to_be32/be32_to_cpu).
> >
> > An advantage of this approach is that the byte order of
> > data is only changed when needed. Another is that it is clear
> > what the byte order of data is.
>
> There is a simplified example:
>
> Here is a 64 bit little endian that may appear in cmdq:
> __le64 x
> After the swap it will become:
> __be32 x_lo
> __be32 x_hi
> This is NOT __be64.
> __be64 looks like this:
> __be32 x_hi
> __be32 x_lo

Sure, byte swapping 64 bit entities is different to byte swapping two
consecutive 32 bit entities. I completely agree.

>
> So the swapped data by HW is neither BE or LE. In this case, we should use
> swab32 to obtain the correct LE data because our driver currently supports LE.
> This is for compensating for bad HW decisions.

Let us assume that the host is reading data provided by HW.
If the swab32 approach works on a little endian host, allowing the host
to access 32-bit values in host byte order, then this is because it
outputs 32-bit little endian values.

But, given the same input, it will not work on a big endian host.
This is because the same little endian output will be produced,
while the host byte order is big endian.

I think you need something based on be32_to_cpu()/cpu_to_be32().
This will effectively be swab32 on little endian hosts (no change!),
and a no-op on big endian hosts (addressing my point above).

More specifically, I think you should use be32_to_cpu_array() and
cpu_to_be32_array() instead of swab32_array().

>
> > > +void hinic3_cmdq_buf_swab32(void *data, int len)
> > > +{
> > > +	u32 *mem = data;
> > > +	u32 i;
> > > +
> > > +	for (i = 0; i < len / sizeof(u32); i++)
> > > +		mem[i] = swab32(mem[i]);
> > > +}
> >
> > This seems to open code swab32_array().
>
> We will use swab32_array in next patch.
> Besides, we will use LE for cmdq structs to avoid confusion and enhance
> readability.

Thanks.
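
For reference, the difference between the two approaches can be sketched
in plain userspace C. This is only an illustration, not driver code: the
demo_* names are hypothetical stand-ins for the kernel's swab32(),
be32_to_cpu() and be32_to_cpu_array(), and it assumes, as discussed in
this thread, that each dword arrives from HW in big endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for swab32(): always reverses the four bytes. */
static uint32_t demo_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Hypothetical stand-in for be32_to_cpu(): reads four bytes as a big
 * endian value and returns it in host byte order. Assembling the value
 * byte by byte gives the same numeric result on any host.
 */
static uint32_t demo_be32_to_cpu(const void *p)
{
	const uint8_t *b = p;

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Returns 1 on a little endian host, 0 on a big endian host. */
static int demo_host_is_le(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, sizeof(first));
	return first == 1;
}

/* In-place conversion of big endian dwords to host order (assumed HW layout). */
static void demo_be32_to_cpu_buf(void *data, size_t len_bytes)
{
	uint8_t *p = data;
	size_t i;

	assert(len_bytes % sizeof(uint32_t) == 0);
	for (i = 0; i < len_bytes; i += sizeof(uint32_t)) {
		uint32_t v = demo_be32_to_cpu(p + i);

		memcpy(p + i, &v, sizeof(v));
	}
}

/* In-place unconditional swap of every dword, for comparison. */
static void demo_swab32_buf(void *data, size_t len_bytes)
{
	uint8_t *p = data;
	size_t i;

	assert(len_bytes % sizeof(uint32_t) == 0);
	for (i = 0; i < len_bytes; i += sizeof(uint32_t)) {
		uint32_t v;

		memcpy(&v, p + i, sizeof(v));
		v = demo_swab32(v);
		memcpy(p + i, &v, sizeof(v));
	}
}
```

Given the bytes 12 34 56 78 in memory, demo_be32_to_cpu_buf() yields
0x12345678 on either kind of host, while demo_swab32_buf() only yields
that value when the host is little endian.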