On Thu, Sep 11, 2025 at 11:27:42AM -0700, Eric Biggers wrote:
> On Thu, Sep 11, 2025 at 03:41:59PM +0800, Guan-Chun Wu wrote:
> > Rework base64_encode() and base64_decode() with extended interfaces
> > that support custom 64-character tables and optional '=' padding.
> > This makes them flexible enough to cover both standard RFC4648 Base64
> > and non-standard variants such as base64url.
>
> RFC4648 specifies both base64 and base64url.
>

Got it, I'll update the commit message in the next version.

> > The encoder is redesigned to process input in 3-byte blocks, each
> > mapped directly into 4 output symbols. Base64 naturally encodes
> > 24 bits of input as four 6-bit values, so operating on aligned
> > 3-byte chunks matches the algorithm's structure. This block-based
> > approach eliminates the need for bit-by-bit streaming, reduces shifts,
> > masks, and loop iterations, and removes data-dependent branches from
> > the main loop.
>
> There already weren't any data-dependent branches in the encoder.
>

Got it, I'll update the commit message in the next version.

> > The decoder replaces strchr()-based lookups with direct table-indexed
> > mapping. It processes input in 4-character groups and supports both
> > padded and non-padded forms. Validation has been strengthened: illegal
> > characters and misplaced '=' padding now cause errors, preventing
> > silent data corruption.
>
> The decoder already detected invalid inputs.
>

You're right, the decoder already rejected invalid inputs. What has been
strengthened in the new version is the padding handling (the length must
be a multiple of 4, and '=' is only allowed in the last two positions).

> > While this is a mechanical update following the lib/base64 rework,
> > nvme-auth also benefits from the performance improvements in the new
> > encoder/decoder, achieving faster encode/decode without altering the
> > output format.
> >
> > The reworked encoder and decoder unify Base64 handling across the kernel
> > with higher performance, stricter correctness, and flexibility to support
> > subsystem-specific variants.
>
> Which part is more strictly correct?
>

The stricter correctness here refers to the decoder, specifically the
padding checks (the length must be a multiple of 4, and '=' is only
allowed in the last two positions).

> > diff --git a/lib/base64.c b/lib/base64.c
> > index 9416bded2..b2bd5dab5 100644
> > --- a/lib/base64.c
> > +++ b/lib/base64.c
> > @@ -15,104 +15,236 @@
> >  #include <linux/string.h>
> >  #include <linux/base64.h>
> >
> > -static const char base64_table[65] =
> > -	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
> > +#define BASE64_6BIT_MASK 0x3f /* Mask to extract lowest 6 bits */
> > +#define BASE64_BITS_PER_BYTE 8
> > +#define BASE64_CHUNK_BITS 6
> > +
> > +/* Output-char-indexed shifts: for output chars 0,1,2,3 respectively */
> > +#define BASE64_SHIFT_OUT0 (BASE64_CHUNK_BITS * 3) /* 18 */
> > +#define BASE64_SHIFT_OUT1 (BASE64_CHUNK_BITS * 2) /* 12 */
> > +#define BASE64_SHIFT_OUT2 (BASE64_CHUNK_BITS * 1) /* 6 */
> > +/* OUT3 uses 0 shift and just masks with BASE64_6BIT_MASK */
> > +
> > +/* For extracting bytes from the 24-bit value (decode main loop) */
> > +#define BASE64_SHIFT_BYTE0 (BASE64_BITS_PER_BYTE * 2) /* 16 */
> > +#define BASE64_SHIFT_BYTE1 (BASE64_BITS_PER_BYTE * 1) /* 8 */
> > +
> > +/* Tail (no padding) shifts to extract bytes */
> > +#define BASE64_TAIL2_BYTE0_SHIFT ((BASE64_CHUNK_BITS * 2) - BASE64_BITS_PER_BYTE) /* 4 */
> > +#define BASE64_TAIL3_BYTE0_SHIFT ((BASE64_CHUNK_BITS * 3) - BASE64_BITS_PER_BYTE) /* 10 */
> > +#define BASE64_TAIL3_BYTE1_SHIFT ((BASE64_CHUNK_BITS * 3) - (BASE64_BITS_PER_BYTE * 2)) /* 2 */
> > +
> > +/* Extra: masks for leftover validation (no padding) */
> > +#define BASE64_MASK(n) ({ \
> > +	unsigned int __n = (n); \
> > +	__n ? ((1U << __n) - 1U) : 0U; \
> > +})
> > +#define BASE64_TAIL2_UNUSED_BITS (BASE64_CHUNK_BITS * 2 - BASE64_BITS_PER_BYTE) /* 4 */
> > +#define BASE64_TAIL3_UNUSED_BITS (BASE64_CHUNK_BITS * 3 - BASE64_BITS_PER_BYTE * 2) /* 2 */
>
> These #defines make the code unnecessarily hard to read. Most of them
> should just be replaced with the integer literals.
>

Got it, thanks for the feedback. I'll simplify this in the next version.

> >  * This implementation hasn't been optimized for performance.
>
> But the commit message claims performance improvements.
>

That was my mistake. I forgot to update this part of the comment.
I’ll fix it in the next version.

> >  *
> >  * Return: the length of the resulting decoded binary data in bytes,
> >  * or -1 if the string isn't a valid base64 string.
>
> base64 => Base64, since multiple variants are supported now. Refer to
> the terminology used by RFC4648. Base64 is the general term, and
> "base64" and "base64url" are specific variants of Base64.
>
> - Eric

Ok, I'll update the comments to use Base64.

Best regards,
Guan-chun
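
For illustration only, the padding rule discussed above (length a
multiple of 4, '=' only in the last two positions) can be sketched as
standalone userspace C. This is not the code from the patch: the helper
name base64_padding_ok() is hypothetical, it checks only the shape of
the padding and not the alphabet, and the extra check that a '=' is
never followed by a non-'=' character is an assumption that goes beyond
what the thread states.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch. It validates only the
 * padding shape of a padded Base64 string of length len; it does not
 * check that the remaining characters belong to the alphabet.
 */
static bool base64_padding_ok(const char *src, size_t len)
{
	size_t i;

	/* Padded input must be a whole number of 4-character groups. */
	if (len % 4 != 0)
		return false;

	for (i = 0; i < len; i++) {
		if (src[i] != '=')
			continue;
		/* '=' may appear only in the last two positions. */
		if (i + 2 < len)
			return false;
		/* Assumption: once padding starts, only '=' may follow. */
		if (i + 1 < len && src[i + 1] != '=')
			return false;
	}
	return true;
}
```

Under these assumptions, "TWFu", "TWE=" and "TQ==" pass, while "TQ="
(length not a multiple of 4), "T=Q=" ('=' too early) and "TQ=Q" (data
after padding) are rejected.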