> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@xxxxxxxxxx> On Behalf Of
> Alexander Lobakin
> Sent: Tuesday, August 26, 2025 9:25 PM
> To: intel-wired-lan@xxxxxxxxxxxxxxxx
> Cc: Lobakin, Aleksander <aleksander.lobakin@xxxxxxxxx>; Kubiak, Michal
> <michal.kubiak@xxxxxxxxx>; Fijalkowski, Maciej
> <maciej.fijalkowski@xxxxxxxxx>; Nguyen, Anthony L
> <anthony.l.nguyen@xxxxxxxxx>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@xxxxxxxxx>; Andrew Lunn <andrew+netdev@xxxxxxx>;
> David S. Miller <davem@xxxxxxxxxxxxx>; Eric Dumazet
> <edumazet@xxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Paolo Abeni
> <pabeni@xxxxxxxxxx>; Alexei Starovoitov <ast@xxxxxxxxxx>; Daniel
> Borkmann <daniel@xxxxxxxxxxxxx>; Simon Horman <horms@xxxxxxxxxx>;
> NXNE CNSE OSDT ITP Upstreaming
> <nxne.cnse.osdt.itp.upstreaming@xxxxxxxxx>; bpf@xxxxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: [Intel-wired-lan] [PATCH iwl-next v5 01/13] xdp, libeth: make the
> xdp_init_buff() micro-optimization generic
>
> Often times the compilers are not able to expand two consecutive 32-bit
> writes into one 64-bit on the corresponding architectures. This applies to
> xdp_init_buff() called for every received frame (or at least once per each 64
> frames when the frag size is fixed).
> Move the not-so-pretty hack from libeth_xdp straight to xdp_init_buff(), but
> using a proper union around ::frame_sz and ::flags.
> The optimization is limited to LE architectures due to the structure layout.
>
> One simple example from idpf with the XDP series applied (Clang 22-git,
> CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE => -O2):
>
> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-27 (-27)
> Function                                     old     new   delta
> idpf_vport_splitq_napi_poll                 5076    5049     -27
>
> The perf difference with XDP_DROP is around +0.8-1% which I see as more
> than satisfying.
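
For anyone following along without the diff at hand, here is a rough userspace sketch of the idea described above: overlay the two adjacent 32-bit fields with one 64-bit word and initialize both with a single store on little-endian builds. This is only an illustration under stated assumptions. `struct demo_buff`, `init_word` and `demo_init_buff()` are hypothetical names, and the layout is not the real `struct xdp_buff`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer descriptor. This is NOT the real
 * struct xdp_buff; only the frame_sz/flags pairing mirrors the description. */
struct demo_buff {
	void *rxq;
	union {
		struct {
			uint32_t frame_sz;	/* low 32 bits on little-endian */
			uint32_t flags;		/* high 32 bits on little-endian */
		};
		uint64_t init_word;	/* hypothetical overlay, used only for init */
	};
};

/* The single-store trick relies on flags sitting right after frame_sz. */
static_assert(offsetof(struct demo_buff, flags) ==
	      offsetof(struct demo_buff, frame_sz) + sizeof(uint32_t),
	      "flags must directly follow frame_sz");

static inline void demo_init_buff(struct demo_buff *b, uint32_t frame_sz,
				  void *rxq)
{
	b->rxq = rxq;

#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
	/* One 64-bit store: the zero-extended value lands in frame_sz and
	 * clears flags at the same time. */
	b->init_word = frame_sz;
#else
	/* On big-endian the halves would be swapped, so fall back to the
	 * two plain 32-bit writes. */
	b->frame_sz = frame_sz;
	b->flags = 0;
#endif
}
```

Calling `demo_init_buff(&b, 4096, NULL)` leaves `b.frame_sz` at 4096 and `b.flags` at 0 on either byte order; the little-endian branch just gets there with one write instead of hoping the compiler merges two.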
>
> Suggested-by: Simon Horman <horms@xxxxxxxxxx>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@xxxxxxxxx>
> ---
>  include/net/libeth/xdp.h | 11 +----------
>  include/net/xdp.h        | 28 +++++++++++++++++++++++++---
>  2 files changed, 26 insertions(+), 13 deletions(-)
>

Tested-by: R,Ramu <ramu.r@xxxxxxxxx>