Re: [PATCH v4] crypto: riscv/poly1305 - import OpenSSL/CRYPTOGAMS implementation

+#ifndef	__CHERI_PURE_CAPABILITY__
+	andi	$tmp0,$inp,7		# $inp % 8
+	andi	$inp,$inp,-8		# align $inp
+	slli	$tmp0,$tmp0,3		# byte to bit offset
+#endif
+	ld	$in0,0($inp)
+	ld	$in1,8($inp)
+#ifndef	__CHERI_PURE_CAPABILITY__
+	beqz	$tmp0,.Laligned_key
+
+	ld	$tmp2,16($inp)
+	neg	$tmp1,$tmp0		# implicit &63 in sll
+	srl	$in0,$in0,$tmp0
+	sll	$tmp3,$in1,$tmp1
+	srl	$in1,$in1,$tmp0
+	sll	$tmp2,$tmp2,$tmp1
+	or	$in0,$in0,$tmp3
+	or	$in1,$in1,$tmp2
+
+.Laligned_key:

This code is going through a lot of trouble to work on RISC-V CPUs that don't
support efficient misaligned memory accesses.  That includes issuing loads of
memory outside the bounds of the given buffer, which is questionable (even if
it's guaranteed to not cross a page boundary).

It's indeed guaranteed not to cross a page boundary, *nor* even a cache-line
boundary. Hence the aligned loads can't trigger any externally observable
side effects that the corresponding unaligned loads wouldn't. What is the
concern otherwise? [Do note that the buffer bounds are not exceeded on a
bounds-enforcing CHERI platform ;-)]
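For readers who don't speak perlasm, here is a rough C model of what the quoted key-load sequence does. It is only a sketch: memory is modelled as a byte array whose base is treated as 8-byte aligned, aligned_word() stands in for an aligned ld, and the names load_le64_bytes, aligned_word and load_key_aligned are made up for illustration and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: byte-wise little-endian load of 8 bytes starting at p. */
static uint64_t load_le64_bytes(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Stand-in for an aligned "ld": reads the idx-th 8-byte word of mem. */
static uint64_t aligned_word(const unsigned char *mem, size_t idx)
{
    return load_le64_bytes(mem + 8 * idx);
}

/*
 * Fetch the two 64-bit values that start at byte offset off of mem,
 * using aligned word reads only. The third word is read only when
 * off is not a multiple of 8, as in the assembly.
 */
static void load_key_aligned(const unsigned char *mem, size_t off,
                             uint64_t out[2])
{
    size_t   idx   = off / 8;                    /* andi $inp,$inp,-8 */
    unsigned shift = (unsigned)(off % 8) * 8;    /* andi + slli       */
    uint64_t in0 = aligned_word(mem, idx);
    uint64_t in1 = aligned_word(mem, idx + 1);

    if (shift != 0) {                            /* beqz .Laligned_key */
        uint64_t in2    = aligned_word(mem, idx + 2);
        unsigned rshift = 64 - shift;            /* neg, implicit &63 */

        assert(rshift > 0 && rshift < 64);
        in0 = (in0 >> shift) | (in1 << rshift);
        in1 = (in1 >> shift) | (in2 << rshift);
    }
    out[0] = in0;
    out[1] = in1;
}
```

The explicit shift != 0 branch matters in C as well, because shifting a 64-bit value by 64 is undefined there, whereas the RISC-V shifts simply use the low six bits of the count.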

With this, we get:

- More complex code.

My rationale is as follows. It's beneficial for this code to cover the whole spectrum of processor implementations. I for one would even say it's important, because the penalties on processors that can't handle misaligned accesses efficiently are just too high to ignore. Now, it's possible to bypass the alignment handling with #ifdef-s, but to make things less confusing, a.k.a. *less* complex, it's preferable to rely on compiler predefines (as done for CHERI) rather than on custom macros. Later compiler versions introduced apparently suitable predefines for this: __riscv_misaligned_slow, __riscv_misaligned_fast and __riscv_misaligned_avoid. However, at the time of writing the macros in question don't seem to depend on the -mcpu parameter, though it's probably reasonable to assume that they will at some later point. So the suggestion would be to use these. Does that sound reasonable? Or would you insist on a custom macro that would need to be set depending on CONFIG_RISCV_EFFICIENT_UNALIGNED_ACCESS?
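To make the suggestion concrete, the selection could look roughly like the preprocessor sketch below. The __riscv_misaligned_* and __CHERI_PURE_CAPABILITY__ names are the compiler predefines mentioned above; POLY1305_ALIGNED_LOADS_ONLY and use_aligned_loads_only() are hypothetical names invented for this example, and defaulting to the aligned-only path when nothing is defined is an assumption on my part, not something the patch does.

```c
#include <assert.h>

/*
 * Hypothetical switch, not part of the patch:
 *   0 = issue plain (possibly misaligned) loads,
 *   1 = align the pointer and merge words with shifts.
 */
#if defined(__CHERI_PURE_CAPABILITY__)
  /* Bounds are enforced, so loads must stay inside the buffer. */
# define POLY1305_ALIGNED_LOADS_ONLY 0
#elif defined(__riscv_misaligned_fast)
# define POLY1305_ALIGNED_LOADS_ONLY 0
#elif defined(__riscv_misaligned_slow) || defined(__riscv_misaligned_avoid)
# define POLY1305_ALIGNED_LOADS_ONLY 1
#else
  /* Predefines unavailable: assume the conservative path. */
# define POLY1305_ALIGNED_LOADS_ONLY 1
#endif

/* Trivial accessor so the choice can be inspected at run time. */
static int use_aligned_loads_only(void)
{
    assert(POLY1305_ALIGNED_LOADS_ONLY == 0 ||
           POLY1305_ALIGNED_LOADS_ONLY == 1);
    return POLY1305_ALIGNED_LOADS_ONLY;
}
```

The custom-macro alternative would replace the two __riscv_misaligned_* branches with a test of a macro derived from CONFIG_RISCV_EFFICIENT_UNALIGNED_ACCESS.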

- Slower on CPUs that do support efficient misaligned accesses.

With an arguably marginal penalty, as discussed in the previous message. In this context one can also view it as a trade-off between a small penalty and increased #ifdef spaghetti :-)

- The buffer underflows and overflows could cause problems with future CPU
   behavior.  (Did you consider the palette memory extension, for example?)

The palette memory extension colours fixed-size, and hence correspondingly aligned, blocks. Since the block size is larger than the word load size, any aligned load is safe: an aligned word that contains even a single byte of the buffer lies entirely within the block holding that byte, and even a single "excess" or "short" byte colours the whole block accordingly.

Just to be clear, the argument is about loads. Misaligned stores are naturally a different matter, and it would be inappropriate to handle them in a similar manner.
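The block-granularity argument is plain address arithmetic and can be checked mechanically. The sketch below assumes power-of-two sizes with the block at least as large as the load; the 16-byte block used in the usage note is purely illustrative, and all function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr down to a multiple of size (size must be a power of two). */
static uintptr_t align_down(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return addr & ~((uintptr_t)size - 1);
}

/* Does the byte range [start, start + len) fall inside one block? */
static int span_in_one_block(uintptr_t start, size_t len, size_t block)
{
    return start / block == (start + len - 1) / block;
}

/*
 * Walk every aligned word that overlaps the buffer [buf, buf + len)
 * and check that it stays inside a single block. Each such word holds
 * at least one byte of the buffer, so its block is one the buffer
 * already occupies.
 */
static int aligned_loads_are_safe(uintptr_t buf, size_t len,
                                  size_t word, size_t block)
{
    uintptr_t first = align_down(buf, word);
    uintptr_t last  = align_down(buf + len - 1, word);

    for (uintptr_t w = first; w <= last; w += word)
        if (!span_in_one_block(w, word, block))
            return 0;
    return 1;
}
```

For a 16-byte key and 8-byte loads, aligned_loads_are_safe() holds for any start address with 16-byte blocks, 64-byte cache lines and 4096-byte pages alike, whereas a misaligned 8-byte load at address 60 does straddle a 64-byte line.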

That being said, if there will continue to be many RISC-V CPUs that don't
support efficient misaligned accesses, then we effectively need to do this
anyway.  I hoped that things might be developing along the lines of ARM, where
eventually misaligned accesses started being supported uniformly.  But perhaps
RISC-V is still in the process of learning that lesson.

One has to recognize that it can also be a matter of cost. I mean, imagine you want to license the least expensive IP from SiFive, or have very limited space for an MCU. Well, Linux, naturally having higher minimum requirements, doesn't have to care about these, but that doesn't mean nobody would :-)

The rest of the kernel's RISC-V crypto code, which is based on the vector
extension, just assumes that efficient misaligned memory accesses are supported.

Was it tested on real hardware though? I wonder what hardware is out there
that supports the vector crypto extensions?

If I remember correctly, SiFive tested it on their hardware.

Cool! The question was rather "how did it do performance-wise in the context of this discussion," but never mind. Thanks! In a way there is a contradiction. RISC-V as a concept is about openness to everybody, while SiFive is naturally about itself ;-)

Cheers.





