On Tue, Apr 22, 2025 at 1:03 PM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote: > > On 4/22/25 20:47, Pavel Begunkov wrote: > > On 4/22/25 19:30, Mina Almasry wrote: > >> On Tue, Apr 22, 2025 at 11:19 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote: > >>> > >>> On 4/22/25 14:56, Mina Almasry wrote: > >>>> On Tue, Apr 22, 2025 at 1:43 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote: > >>>>> > >>>>> On 4/18/25 00:15, Mina Almasry wrote: > >>>>>> Currently net_iovs support only pp ref counts, and do not support a > >>>>>> page ref equivalent. > >>>>> > >>>>> Makes me wonder why it's needed. In theory, nobody should ever be > >>>>> taking page references without going through struct ubuf_info > >>>>> handling first, all in kernel users of these pages should always > >>>>> be paired with ubuf_info, as it's user memory, it's not stable, > >>>>> and without ubuf_info the user is allowed to overwrite it. > >>>>> > >>>> > >>>> The concern about the stability of the from-userspace data is already > >>>> called out in the MSG_ZEROCOPY documentation that we're piggybacking > >>>> devmem TX onto: > >>> > >>> Sure, I didn't object that. There is no problem as long as the > >>> ubuf_info semantics is followed, which by extension mean that > >>> any ref manipulation should already be gated on ubuf_info, and > >>> there should be no need in changing generic paths. > >>> > >> > >> I'm sorry I'm not following. skb_frag_ref is how the net stack obtains > >> references on an skb_frag, regardless on whether the frag is a > >> MSG_ZEROCOPY one with ubuf info, or a regular tx frag without a > >> ubuf_info, or even an io_uring frag which I think have the > > > > Yep > > > >> msg->ubuf_info like we discussed previously. I don't see the net stack > >> in the current code special casing how it obtains refs on frags, and I > >> don't see the need to add special casing. Can you elaborate in more > > > > You'll be special casing it either way, it's probably unavoidable, > > just here it is in put/get_netmem. > > > >> detail what is the gating you expect, and why? Are you asking that I > >> check the skb has a ubuf_info before allowing to grab the reference on > >> the dmabuf binding? Or something else? > > > > get_page() already shouldn't be a valid operation for ubuf backed frags > > apart from few cases where frags are copied/moved together with ubuf. This is where I'm not following. Per the v5 changelog of this commit, all these skb_helpers hit skb_frag_ref (which is just get_page underneath): tcp_grow_skb, __skb_zcopy_downgrade_managed, __pskb_copy_fclone, pskb_expand_head, skb_zerocopy, skb_split, pksb_carve_inside_header, pskb_care_inside_nonlinear, tcp_clone_payload, skb_segment, skb_shift, skb_try_coalesce. I don't see many of them opt-out of skb_frag_ref if the skb is unreadable or has ubuf_info. Are you saying all/most/some of these callers are invalid? I tend to assume merged code is the correct one unless I have ample expertise to say otherwise. > > The frags are essentially bundled with ubuf and shouldn't exist without > > it, because otherwise user can overwrite memory with all the following > > nastiness. If there are some spots violating that, I'd rather say they > > should be addressed. > > > > Instead of adding net_iov / devmem handling in generic paths affecting > > everyone, you could change those functions where it's get_page() are > > called legitimately. The niov/devmem part of get/put_netmem doesn't > > even have the same semantics as the page counterparts as it cannot > > prevent from reallocation. That might be fine, but it's not clear > > Actually, maybe it's not that exclusive to netiov, same reallocation > argument is true for user pages, even though they're reffed > separately. > > It might be fine to leave this approach, while suboptimal it should > be easier for you. Depends on how folks feel about the extra > overhead in the normal tx path. > Right, I think there is only 2 ways to handle all the code paths in the tcp stack that hit skb_frag_ref: 1. We go over all of them and make sure they're unreachable for unreadable skbs: if (!skb_frags_readable()) return; // or something 2. or, we just add net_iov support in skb_frag_ref. This patch series does the latter, which IMO is much preferred. FWIW I'm surprised that adding net_iov support to skb_frag_ref/unref is facing uncertainty. I've added net_iov support for many skb helpers in commit 65249feb6b3df ("net: add support for skbs with unreadable frags") and commit 9f6b619edf2e8 ("net: support non paged skb frags"). skb_frag_ref/unref is just 1 helper I "missed" because it's mostly (but not entirely) used by the TX path. -- Thanks, Mina