On Tue, May 13, 2025 at 11:21 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Tue, 13 May 2025 10:03:01 +0800 Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > > > On Tue, May 13, 2025 at 8:52 AM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: > > > > > > On Mon, 12 May 2025 10:49:35 +0800 Jason Xing <kerneljasonxing@xxxxxxxxx> wrote: > > > > > > > From: Jason Xing <kernelxing@xxxxxxxxxxx> > > > > > > > > Use for_each_possible_cpu to create per-cpu relayfs file to avoid later > > > > hotplug cpu which doesn't have its own file. > > > > > > I don't understand this. Exactly what problem are we trying to solve? > > > > The reason behind this change is can we directly allocate per possible > > cpu at the initialization phase. After this, even if some cpu goes > > online, we don't need to care about it. > > > > The idea is directly borrowed from the networking area where people > > use possible cpu iteration for most cases. > > I'd rephrase that as "I know this is wrong but I'm lazy so I'll do it > this way for now and I'll fix it up later but I never actually do so". > > Yes, handling hotplug is a hassle and for_each_possible_cpu() saves a > couple of hours at development time, but its cost is forever. > > > > > > > > Reviewed-by: Yushan Zhou <katrinzhou@xxxxxxxxxxx> > > > > Signed-off-by: Jason Xing <kernelxing@xxxxxxxxxxx> > > > > --- > > > > kernel/relay.c | 4 ++-- > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/kernel/relay.c b/kernel/relay.c > > > > index 27f7e701724f..dcb099859e83 100644 > > > > --- a/kernel/relay.c > > > > +++ b/kernel/relay.c > > > > @@ -519,7 +519,7 @@ struct rchan *relay_open(const char *base_filename, > > > > kref_init(&chan->kref); > > > > > > > > mutex_lock(&relay_channels_mutex); > > > > - for_each_online_cpu(i) { > > > > + for_each_possible_cpu(i) { > > > > > > num_possible_cpus() can sometimes greatly exceed num_online_cpus(), so > > > this is an unfortunate change. > > > > Are you worried about too much extra memory to waste in this case? > > That, and the cost of iterating over 1024 CPUs instead of 4 CPUs! > > > A relevant thing I would like to share here: > > To keep the high performance of transferring data between kernel space > > and user space, the per-cpu mechanism is needed like how relay works > > at the moment. It allocates many unnecessary/big memory chunks > > especially when the cpu number is very large, say, 256. I'm still > > working on this to see if we can figure out a good approach to balance > > the performance and memory. > > > > > It would be better to implement the > > > hotplug notifier? > > > > Right, but sorry, I hesitate to do so because it involves much more > > work and corresponding tests. > > I did this conversion a few times, a million years ago. It's > surprisingly easy. I see. Let me try this :) Thanks, Jason