On Tue, Apr 01, 2025 at 11:51:07PM +0200, Heiner Kallweit wrote: > On 01.04.2025 22:55, Bjorn Helgaas wrote: > > Hi, > > > > The PCIe spec is ambiguous about how the VPD checksum should be > > computed, and resolving this ambiguity might break drivers. > > > > PCIe r6.0 sec 6.27 says only the VPD-R list should be included in the > > checksum: > > > > One VPD-R (10h) tag is used as a header for the read-only keywords. > > The VPD-R list (including tag and length) must checksum to zero. > > This requirement I don't find in the PCI 3.0 spec, not sure with which > version it was added. I think it's there; that same text is in PCI r3.0, Appendix I, about five lines before I.1. > Interestingly this doesn't even require the presence of a RV > keyword. Or the presence of the RV keyword is implicitly assumed. I.3.1.1 says RV is required, and I guess it has to be last in VPD-R to cover any reserved space (as needed, I suppose to align to the VPD-W area, which might be in a different chip). > Maybe this part isn't meant literally. I can imagine they wanted to > clarify that checksum calculation excludes the VPD-W area. > And unfortunately they weren't precise enough, and introduced the > ambiguity you found. > > > But sec 6.27.2.2 says "all bytes in VPD ... up to the checksum byte": > > > > RV The first byte of this item is a checksum byte. The checksum is > > correct if the sum of all bytes in VPD (from VPD address 0 up > > to and including this byte) is zero. > > This one can be found identically in the PCI v3.0 spec already: > > The checksum is correct if the sum of all bytes in VPD (from > VPD address 0 up to and including this byte) is zero. > > I don't think they want to break backwards-compatibility, therefore > this requirement should still be valid. Yes, and I think the RV description is more specific and is what I would have used to implement it. > > These are obviously different unless VPD-R happens to be the first > > item in VPD. But sec 6.27 and 6.27.2.1 suggest that the Identifier > > String item should be the first item, preceding the VPD-R list: > > > > The first VPD tag is the Identifier String (02h) and provides the > > product name of the device. [6.27] > > > > Large resource type Identifier String (02h) > > > > This tag is the first item in the VPD storage component. It > > contains the name of the add-in card in alphanumeric characters. > > [6.27.2.1, Table 6-23] > > > > I think pci_vpd_check_csum() follows sec 6.27.2.2: it sums all the > > bytes in the buffer up to and including the checksum byte of the RV > > keyword. The range starts at 0, not at the beginning of the VPD-R > > read-only list, so it likely includes the Identifier String. > > > > As far as I can tell, only the broadcom/tg3 and chelsio/cxgb4/t4 > > drivers use pci_vpd_check_csum(). Of course, other drivers might > > compute the checksum themselves. > > > > Any thoughts on how this spec ambiguity should be resolved? > > > > Any idea how devices in the field populate their VPD? > > > > Can you share any VPD dumps from devices that include an RV keyword > > item? > > I have only very dated devices which likely date back to before > the existence of PCIe r6.0. So their VPD dump may not really help. > > IIRC there's an ongoing discussion regarding making VPD content > user-readable on mlx5 devices. Maybe check with the Mellanox/Nvidia > guys how they interpret the spec and implemented VPD checksumming. Good idea, cc'd.