Re: Maintenance Notification: Branch Push Behavior Change - 8/19/25


 



I will do a better job of communicating things like this in the future. We ran into some additional issues that resulted in another revert to non-pipeline builds on 8/22.

1. Error: creating events dirs: mkdir /run/user/1101: permission denied – Builds were getting OOM-killed, which essentially logged out the jenkins-build user and in turn broke systemd lingering.  We realized the ceph spec file contains logic that sets -j for RPM builds based on how much memory the system has.  The make-debs.sh script in ceph.git, which ceph-dev-pipeline uses and non-containerized builds do not, lacked this memory calculation.  It has it now.

2. basic_outcome_exception_observers.hpp: Operation not permitted – The source tarball created inside a container was being untarred as root.  In some scenarios this left root-owned files in the jenkins-build user’s home directory.  Subsequent Wipe Workspace plugin runs would then fail because there is no way to tell the plugin to use sudo.  make-debs.sh now untars files as the host user’s UID.

3. error running container: from /usr/bin/crun creating container – Same root cause as the OOM kills in item 1.

4. Error: current system boot ID differs from cached boot ID – Systemd didn’t correctly clean up Podman-related files on boot.  Dan got a fix into upstream Podman, but we also merged a fix to manually put that tmpfiles.d config into place before jobs run.
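
For anyone curious about the memory-based -j calculation mentioned in item 1, here is a rough Python sketch of the idea. It is not the logic from the ceph spec file or make-debs.sh; the 3 GiB-per-job figure and the function names are assumptions chosen only for illustration.

```python
import os

# Assumption: each compile job may need roughly this much RAM. The value
# actually used by the ceph spec file or make-debs.sh may differ.
ASSUMED_GIB_PER_JOB = 3.0


def read_mem_available_kib(meminfo_text: str) -> int:
    """Return MemAvailable (or MemTotal as a fallback) in KiB from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            fields[key.strip()] = int(rest.split()[0])
    if "MemAvailable" in fields:
        return fields["MemAvailable"]
    return fields["MemTotal"]


def pick_jobs(mem_kib: int, cpus: int, gib_per_job: float = ASSUMED_GIB_PER_JOB) -> int:
    """Cap parallel jobs by both CPU count and memory; never return less than 1."""
    by_memory = int(mem_kib / (gib_per_job * 1024 * 1024))
    return max(1, min(cpus, by_memory))


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        mem_kib = read_mem_available_kib(f.read())
    print(f"-j{pick_jobs(mem_kib, os.cpu_count() or 1)}")
```

The point is simply that a builder with many cores but little memory ends up with a smaller -j than its core count, which is what keeps the build from being OOM-killed.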
 
TL;DR - we would like to switch back to the pipeline tomorrow (9/12) and we’ll keep an eye on the builds.  Thanks for your patience as we worked through these issues.

 

From: David Galloway <David.Galloway@xxxxxxx>
Date: Thursday, August 21, 2025 at 3:41 PM
To: dev <dev@xxxxxxx>, sepia@xxxxxxxx <sepia@xxxxxxx>
Subject: Re: Maintenance Notification: Branch Push Behavior Change - 8/19/25

We believe the root cause of all failures has been identified and resolved.  A recent shellcheck suggestion broke some bash logic so a podman command to prepare the container environment wasn’t getting run.

 

Additionally, a post-job script that chowns files was missing a GID.  The Wipe Workspace plugin, which attempts to clean up the build environment before each job, doesn’t use sudo and was unable to delete some leftover files.
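
To illustrate the ownership point, here is a minimal Python sketch of a cleanup step that sets both the owner and the group on everything under a directory. It is not the actual post-job script (see the PR below for the real change); the function name chown_tree is hypothetical.

```python
import os


def chown_tree(root: str, uid: int, gid: int) -> int:
    """Set owner AND group on root and everything below it; return the path count."""
    paths = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        paths.extend(os.path.join(dirpath, name) for name in dirnames + filenames)
    for path in paths:
        # lchown changes a symlink itself rather than whatever it points to.
        os.lchown(path, uid, gid)
    return len(paths)


if __name__ == "__main__":
    # Hand the current directory tree back to the invoking user and group.
    print(chown_tree(".", os.getuid(), os.getgid()))
```

Passing both IDs explicitly matters because a later cleanup that runs without sudo may be blocked by group ownership or group permissions even when the owner is correct.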

 

https://github.com/ceph/ceph-build/pull/2428/files

 

Thanks Dan, Zack, and John for the quick identification and resolution.

 

From: David Galloway <David.Galloway@xxxxxxx>
Date: Thursday, August 21, 2025 at 12:00 PM
To: dev <dev@xxxxxxx>, sepia@xxxxxxxx <sepia@xxxxxxx>
Subject: Re: Maintenance Notification: Branch Push Behavior Change - 8/19/25

All,

 

I have reverted this change for the time being.  We are seeing a few (what I believe to be) container-related permissions issues that need to be resolved.

 

https://github.com/ceph/ceph-build/pull/2427

 

Please re-push any branches you need rebuilt.  Apologies for the inconvenience.

 

From: David Galloway <David.Galloway@xxxxxxx>
Date: Tuesday, August 19, 2025 at 4:10 PM
To: dev <dev@xxxxxxx>, sepia@xxxxxxxx <sepia@xxxxxxx>
Subject: Re: Maintenance Notification: Branch Push Behavior Change - 8/19/25

This is complete.  Any branches pushed to ceph-ci after this message will be using the new ceph-dev-pipeline job by default.  Let us know if you encounter any issues.

 

From: David Galloway <David.Galloway@xxxxxxx>
Date: Wednesday, August 13, 2025 at 4:11 PM
To: dev <dev@xxxxxxx>, sepia@xxxxxxxx <sepia@xxxxxxx>
Subject: Maintenance Notification: Branch Push Behavior Change - 8/19/25

Upstream development community,

 

You may recall we announced the availability of a new Jenkins pipeline in June: https://lists.ceph.io/hyperkitty/list/sepia@xxxxxxx/thread/HBA4CVO2F6VVBZEOLCUPZJ6SOPB7KCDF/

 

Since that announcement, this pipeline has been opt-in only.  We have now observed enough successful builds, and ironed out enough bugs, that we’re comfortable making this pipeline the default Jenkins job for both ceph.git and ceph-ci.git branches.

 

This switch will be made Tuesday, August 19 at roughly 3PM Eastern.

 

There is no additional action required on your part.  Some things to remember:

  • DWZ is disabled by default
  • SCCACHE is enabled by default
  • The above behavior, as well as which DISTROs/ARCHs to build for, can be overridden using git trailers
  • Builds will be done inside a container

 

For a refresher on how to use the git trailers and their behavior in our environment, see https://github.com/ceph/ceph-build/blob/main/ceph-trigger-build/README.md.
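
As a rough illustration of how trailer overrides could combine with the defaults listed above, here is a small Python sketch. It is not the pipeline's actual parser, and it is simpler than git's own trailer rules. The exact trailer spellings (DWZ, SCCACHE, DISTROS), the "true"/"false" values, and the distro names are assumptions; the README is the authority on what is accepted.

```python
import re

# Assumption: a trailer is a "Key: value" line in the last paragraph of the
# commit message. Git's real trailer handling has more rules than this.
TRAILER_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9-]*):\s*(.*)$")


def parse_trailers(message: str) -> dict:
    """Collect "Key: value" lines from the final paragraph of a commit message."""
    paragraphs = [p for p in re.split(r"\n\s*\n", message.strip()) if p.strip()]
    if len(paragraphs) < 2:
        return {}  # only a subject line, so there is no trailer block
    trailers = {}
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            trailers[match.group(1)] = match.group(2).strip()
    return trailers


def build_options(message: str) -> dict:
    """Start from the announced defaults, then apply any trailer overrides."""
    options = {"DWZ": "false", "SCCACHE": "true"}  # defaults per the announcement
    options.update(parse_trailers(message))
    return options


# Hypothetical commit message; the trailer values are made up for illustration.
EXAMPLE_MESSAGE = """rgw: fix something

Longer description of the change.

DWZ: true
DISTROS: jammy centos9
Signed-off-by: Dev <dev@example.com>
"""

if __name__ == "__main__":
    print(build_options(EXAMPLE_MESSAGE))
```

With the example message, DWZ is overridden to "true" while SCCACHE keeps its default, since no trailer mentions it.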

 

-- 

David Galloway

Ceph Engineering Labs – Infrastructure Architect

+1 989 295 0091 - Mobile

david.galloway@xxxxxxx

IBM

