Hello Tiffany.

On Sun, Jul 13, 2025 at 10:00:09PM -0700, Tiffany Yang <ynaffit@xxxxxxxxxx> wrote:
> Other sources of delay can cause similar issues, but this change focuses
> on allowing frozen time to be accounted for in particular because of how
> large it can grow and how unevenly it can affect applications running on
> the system.

I'd like to incorporate the reason from your other mail:

| Since there isn't yet a clear way to identify a set of "lost" time
| that everyone (or at least a wider group of users) cares about, it
| seems like iterating over components of interest is the best way

into this commit message (because that's a stronger point than your
use case alone).

> Any feedback would be much appreciated!

I can see the benefits of this new stat field conceptually; I have
some remarks on the implementation and suggestions on conventions
below.

> --- a/Documentation/admin-guide/cgroup-v2.rst
> +++ b/Documentation/admin-guide/cgroup-v2.rst
> @@ -1018,6 +1018,14 @@ All cgroup core files are prefixed with "cgroup."
>  	it's possible to delete a frozen (and empty) cgroup, as well as
>  	create new sub-cgroups.
>
> +  cgroup.freeze.stat

With the given implementation (and use scenario), this'd better be
exposed in cgroup.freeze.stat.local. I grok that hierarchical summing
would make little sense and that it'd make the implementation more
complex. With that in mind, I'm thinking about a formulation like:

	Cumulative time that the cgroup has spent between freezing
	and thawing, regardless of whether it was frozen by itself or
	by an ancestor cgroup. NB: (not) reaching the "frozen" state
	is not accounted for here.

> +	A read-only flat-keyed file which exists in non-root cgroups.
> +	The following entry is defined:
> +
> +	  freeze_time_total_ns
> +		Cumulative time that this cgroup has spent in the freezing
> +		state, regardless of whether or not it reaches "frozen".
> +

Rather use microseconds; it's the cgroup API convention, and I'm not
sure the nanosecond precision exposed here is needed.

  1      _____        frozen
  0   __/     \__
        ab    cd

(a = freeze requested, b = "frozen" state reached, c = thaw
requested, d = fully thawed)

Yeah, I find the measurement between a and c the sanest.

> +static int cgroup_freeze_stat_show(struct seq_file *seq, void *v)
> +{
> +	struct cgroup *cgrp = seq_css(seq)->cgroup;
> +	u64 freeze_time = 0;
> +
> +	spin_lock_irq(&css_set_lock);
> +	if (test_bit(CGRP_FREEZE, &cgrp->flags))
> +		freeze_time = ktime_get_ns() - cgrp->freezer.freeze_time_start_ns;
> +
> +	freeze_time += cgrp->freezer.freeze_time_total_ns;
> +	spin_unlock_irq(&css_set_lock);

I don't like taking this spinlock only for the matter of reading this
attribute. The intention should be to keep the (un)freezing paths
mostly unaffected at the expense of these readers (seqcount or u64
stats? see the sketch at the end of this mail).

Alternative approach: either there's an outer watcher who can be
notified by cgroup.events:frozen, or it's an inner watcher who
couldn't actively read the field anyway. So the field could only show
completed freeze/thaw cycles from the past (i.e. not substitute
clock_gettime(2) when the cgroup is frozen), which could simplify
querying the flag too.

> @@ -5758,6 +5780,7 @@ static struct cgroup *cgroup_create(struct cgroup *parent, const char *name,
>  	 * if the parent has to be frozen, the child has too.
>  	 */
>  	cgrp->freezer.e_freeze = parent->freezer.e_freeze;
> +	cgrp->freezer.freeze_time_total_ns = 0;

struct cgroup is kzalloc'd, so this is unnecessary.
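
For illustration, a rough and untested sketch of the seqcount variant
I have in mind (the freeze_seq member and both helper names are made
up; it assumes the existing freeze/thaw transitions already serialize
writers, e.g. under css_set_lock, and it includes the microsecond
conversion suggested above):

/*
 * New member next to freeze_time_start_ns in struct
 * cgroup_freezer_state; it'd need seqcount_init() in cgroup_create().
 */
seqcount_t freeze_seq;

/* Writer side, called from the already serialized freeze/thaw transitions. */
static void cgroup_freeze_time_update(struct cgroup *cgrp, bool freeze)
{
	write_seqcount_begin(&cgrp->freezer.freeze_seq);
	if (freeze)
		cgrp->freezer.freeze_time_start_ns = ktime_get_ns();
	else
		cgrp->freezer.freeze_time_total_ns +=
			ktime_get_ns() - cgrp->freezer.freeze_time_start_ns;
	write_seqcount_end(&cgrp->freezer.freeze_seq);
}

/* Reader side, no css_set_lock taken; retries on a concurrent update. */
static u64 cgroup_freeze_time_usec(struct cgroup *cgrp)
{
	unsigned int seq;
	u64 total;

	do {
		seq = read_seqcount_begin(&cgrp->freezer.freeze_seq);
		total = cgrp->freezer.freeze_time_total_ns;
		if (test_bit(CGRP_FREEZE, &cgrp->flags))
			total += ktime_get_ns() -
				 cgrp->freezer.freeze_time_start_ns;
	} while (read_seqcount_retry(&cgrp->freezer.freeze_seq, seq));

	return div_u64(total, NSEC_PER_USEC);
}

With the completed-cycles-only alternative, the reader would drop the
CGRP_FREEZE branch and return just the accumulated total; then a
plain READ_ONCE()/WRITE_ONCE() pair suffices on 64-bit, and
u64_stats_sync would cover the 32-bit tearing case.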