On Fri, 20 Jun 2025 20:44:59 +0200, Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:

> On Fri, 20 Jun 2025 15:05:39 +0200, Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> wrote:
>
> > On Fri, 20 Jun 2025 20:14:57 +0900, Akira Yokosawa <akiyks@xxxxxxxxx> wrote:
> >
> > > Mauro!
> > >
> > > On Fri, 20 Jun 2025 09:44:30 +0200, Mauro Carvalho Chehab wrote:
> > > > On Fri, 20 Jun 2025 11:22:48 +0900, Akira Yokosawa <akiyks@xxxxxxxxx> wrote:
> > > >
> > > [...]
> > > >
> > > > I didn't test it yet, but yesterday I wrote a script which allows us
> > > > to test for Sphinx version breakages on multiple versions in one go.
> > > >
> > > > Using it (and again before this patch, but after my parser-yaml
> > > > series), I noticed that 6.0.1 with "-jauto" with those packages:
> > >
> > > Why did you pick 6.0.1, which was in the middle of successive releases
> > > in the early 6.x days?
> >
> > I added all major.minor latest-patch versions since 3.4.3 to the script.
> > I didn't check which of those are shipped by a distro or not.
> >
> > > No distro Sphinx packagers have picked this version.
> >
> > The whole idea is to have a script where we can automate build tests
> > with old versions. Perhaps it makes sense to add a flag to the table
> > indicating which major distros ship which Sphinx version, plus a
> > command line parameter to either test all versions or just the ones
> > shipped on major distros.
> >
> > > Just see the release history:
> > >
> > >   [2022-10-16] 5.3.0  ### stable ###
> > >   [2022-12-29] 6.0.0
> > >   [2023-01-05] 6.0.1
> > >   [2023-01-05] 6.1.0 6.1.1
> > >   [2023-01-07] 6.1.2
> > >   [2023-01-10] 6.1.3  ### stable ###
> > >   [2023-04-23] 6.2.0
> > >
> > > The crash you observed is hardly related to this fix.
> >
> > Almost certainly, the breakage with 6.0.1 is unrelated to this change.
>
> Heh, I'm not even sure whether the problem is with 6.0.1 or with the
> Fedora OOM killer setup...
>
> Even with 64GB of RAM and 8GB of swap(*), I'm getting lots of those:
>
> jun 20 03:23:46 myhost kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
> jun 20 03:23:46 myhost kernel: [   1762]   998  1762     4074      467       96      371         0          77824      144          -900 systemd-oomd
> jun 20 03:23:46 myhost kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.kde.konsole-433443.scope,task=sphinx-build,pid=1043271,uid=1000
> jun 20 03:23:46 myhost kernel: Out of memory: Killed process 1043271 (sphinx-build) total-vm:4222280kB, anon-rss:3934380kB, file-rss:688kB, shmem-rss:0kB, UID:1000 pgtables:7812kB oom_score_adj:200
> jun 20 03:24:28 myhost kernel: sphinx-build invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=200
> jun 20 03:24:28 myhost kernel:  oom_kill_process.cold+0xa/0xbe
>
> Will do some extra tests here and try to adjust this.
>
> (*) Granted, I need more swap... the FS was generated when 8GB
>     were good enough ;-)
>     Still, 64GB of RAM should be enough. Will try to change overcommit
>     and see how it goes.

Yeah, the problem with 6.0.1 was indeed with the OOM killer. Once I added
another 64GB of swap and waited for a long time, it finally completed the
task without crashes:

$ ./scripts/test_doc_build.py -m -V 6.0.1 -v
...
Finished doc build for Sphinx 6.0.1. Elapsed time: 00:31:02

Summary:
Sphinx 6.0.1 elapsed time: 00:31:02
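Just to make it easier to picture what a run like the one above boils down
to, and what the distro flag idea mentioned earlier could look like, here
is a rough sketch. It is only an illustration: the version/distro table,
the venv layout and the build invocation below are assumptions made for
the sake of the example, not the actual code of scripts/test_doc_build.py.

#!/usr/bin/env python3
# Illustration only, NOT the actual scripts/test_doc_build.py: the version
# table with distro tags, the venv layout and the build invocation are
# assumptions made for the sake of the example.

import argparse
import os
import subprocess
import time
import venv
from pathlib import Path

# Hypothetical table: Sphinx version -> major distros known to ship it
# (an empty list meaning "not shipped by any major distro").
SPHINX_VERSIONS = {
    "3.4.3": ["debian-11"],
    "6.1.3": [],
    "7.4.7": ["fedora-40"],
    "8.2.3": [],
}

def build_with(version: str, jobs: str = "auto") -> float:
    """Install the given Sphinx version in a venv and time 'make htmldocs'."""
    venv_dir = Path(f"sphinx-venv-{version}").resolve()
    if not venv_dir.exists():
        venv.create(venv_dir, with_pip=True)
        subprocess.run([venv_dir / "bin" / "pip", "install",
                        f"Sphinx=={version}"], check=True)

    # Put the venv first in PATH so the doc build picks its sphinx-build
    env = dict(os.environ, PATH=f"{venv_dir}/bin:{os.environ['PATH']}")
    start = time.monotonic()
    subprocess.run(["make", f"SPHINXOPTS=-j{jobs}", "htmldocs"],
                   check=True, env=env)
    return time.monotonic() - start

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("-V", "--version", help="test only this Sphinx version")
    parser.add_argument("--distro-only", action="store_true",
                        help="test only versions shipped by major distros")
    args = parser.parse_args()

    for version, distros in SPHINX_VERSIONS.items():
        if args.version and version != args.version:
            continue
        if args.distro_only and not distros:
            continue
        secs = int(build_with(version))
        print(f"Sphinx {version} elapsed time: "
              f"{secs // 3600:02d}:{secs % 3600 // 60:02d}:{secs % 60:02d}")

if __name__ == "__main__":
    main()

With a table like that, a "--distro-only" switch would skip versions such
as 6.0.1 entirely, while the default run would still exercise every
major.minor series.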
Looking at the past logs I have handy, this is by far the worst one:

Finished doc build for Sphinx 6.1.3. Elapsed time: 00:11:15
Finished doc build for Sphinx 6.2.1. Elapsed time: 00:09:21
Finished doc build for Sphinx 7.0.1. Elapsed time: 00:09:17
Finished doc build for Sphinx 7.1.2. Elapsed time: 00:09:22
Finished doc build for Sphinx 7.2.3. Elapsed time: 00:09:17
Finished doc build for Sphinx 7.3.7. Elapsed time: 00:09:34
Finished doc build for Sphinx 7.4.7. Elapsed time: 00:04:54
Finished doc build for Sphinx 8.0.2. Elapsed time: 00:03:40
Finished doc build for Sphinx 8.1.3. Elapsed time: 00:03:47
Finished doc build for Sphinx 8.2.3. Elapsed time: 00:03:45

(3.4.3 was the previous "champion", at about 14 minutes.)

All of them were built with "-jauto" on a machine with 24 CPU threads.

The only one that didn't work with my past scenario was 6.0.1, so the OOM
killer seems to be the one to blame: it kills a sub-process but keeps the
main one active, causing Sphinx to run for a long time, only to notice at
the end that something bad happened and produce a completely bogus log.

Heh, systemd-oomd, shame on you!

Thanks,
Mauro