Em qua., 16 de jul. de 2025 às 20:52, Dave Chinner <david@xxxxxxxxxxxxx> escreveu: > > On Wed, Jul 16, 2025 at 03:20:37PM -0300, Marcelo Moreira wrote: > > The `strncpy` function is deprecated for NUL-terminated strings as > > explained in the "strncpy() on NUL-terminated strings" section of > > Documentation/process/deprecated.rst. > > > > In `xrep_symlink_salvage_inline()`, the `target_buf` (which is `sc->buf`) > > is intended to hold a NUL-terminated symlink path. The original code > > used `strncpy(target_buf, ifp->if_data, nr)`, where `nr` is the maximum > > number of bytes to copy. > > Yes, this prevents source buffer overrun in the event the corrupted > symlink data buffer is not correctly nul terminated or there is a > length mismatch between the symlink data and the inode metadata. > > This patch removes the explicit source buffer overrun protection the > code currently provides. > > > This approach is problematic because `strncpy()` > > does not guarantee NUL-termination if the source string is truncated > > exactly at `nr` bytes, which can lead to out-of-bounds read issues > > if the buffer is later treated as a NUL-terminated string. > > No, that can't happen, because sc->buf is initialised to contain > NULs and is large enough to hold a max length symlink plus the > trailing NUL termination. Hence it will always be NUL-terminated, > even if the symlink length (nr) is capped at XFS_SYMLINK_MAXLEN. > > > Evidence from `fs/xfs/scrub/symlink.c` (e.g., `strnlen(sc->buf, > > XFS_SYMLINK_MAXLEN)`) confirms that `sc->buf` is indeed expected to be > > NUL-terminated. > > Please read the code more carefully. This is -explicitly- called out > in a comment in xrep_symlink_salvage() before it starts to process > the symlink data buffer that we just used strncpy() to place the > data in: > > /* > * NULL-terminate the buffer because the ondisk target does not > * do that for us. If salvage didn't find the exact amount of > * data that we expected to find, don't salvage anything. > */ > target_buf[buflen] = 0; > if (strlen(target_buf) != sc->ip->i_disk_size) > buflen = 0; > > Also, have a look at the remote symlink data copy above the inline > salvage code you are changing (xrep_symlink_salvage_remote()). > > It uses memcpy() to reconstruct the symlink data from multiple > source buffers. It does *not* explicitly NUL-terminate sc->buf after > using memcpy() to copy from the on disk structures to sc->buf. The > on-disk symlink data is *not* NUL-terminated, either. > > IOWs, the salvage code that reconstructs the symlink data does not > guarantee NUL termination, so we do it explicitly before the data in > the returned buffer is used. > > > Furthermore, `sc->buf` is allocated with > > `kvzalloc(XFS_SYMLINK_MAXLEN + 1, ...)`, explicitly reserving space for > > the NUL terminator. > > .... and initialising the entire buffer to contain NULs. IOWs, we > have multiple layers of defence against data extraction not doing > NUL-termination of the data it extracts. > > > This change improves code safety and clarity by using a safer function for > > string copying. > > I disagree. It is a bad change because it uses strscpy() incorrectly > for the context. i.e. it removes explicit source buffer overrun > protection whilst making the incorrect assumption that the callers > need to be protected from unterminated strings in the destination > buffer. > > This code is dealing with *corrupt structures*, so it -must not- > make any assumptions about the validity of incoming data structures, > nor the validity of recovered data. It has to treat them as is they > are always invalid, and explicitly protect against all types of > buffer overruns. > > IOW, if you must replace strncpy() in xrep_symlink_salvage_inline(), > then the correct replacement memcpy(). Using some other strcpy() > variant is just as easy to get wrong as strncpy() if you don't > understand why strncpy() is safe to use in the first place. > > -Dave. > -- > Dave Chinner > david@xxxxxxxxxxxxx got it, `kvzalloc` ensures that `sc->buf` is indeed NUL-terminated from the start, and the explicit NUL termination (I hadn't seen that) in `xrep_symlink_salvage()` (target_buf[buflen] = 0;) further clarifies that the data on disk is treated as raw, non-NUL-terminated bytes. Thank you very much Dave for your detailed review and for taking the time to explain the nuances of this code. I clearly misunderstood several critical aspects of how `strncpy()` was being used here and the protective mechanisms already in place. My apologies for the incorrect assumptions in my commit message. Given that the original `strncpy()` is safe and correctly implemented for this context, and understanding that `memcpy()` would be the correct replacement if a change were deemed necessary for non-NUL-terminated raw data, I have a question: Do you still think a replacement is necessary here, or would you prefer to keep the existing `strncpy()` given its correct and safe usage in this context? If a replacement is preferred, I will resubmit a V2 using `memcpy()` instead. Thank you again for teaching me these important details. I'm learning a lot from your feedback :D -- Cheers, Marcelo Moreira