Skip to content

ci: fix package versioning across DEB/RPM/TGZ builds#299

Merged
thananon merged 6 commits into
developfrom
ci/fix-package-versioning
May 13, 2026
Merged

ci: fix package versioning across DEB/RPM/TGZ builds#299
thananon merged 6 commits into
developfrom
ci/fix-package-versioning

Conversation

@thananon
Copy link
Copy Markdown
Contributor

Summary

Run 25393634415 on release/test_artifact exposed two versioning bugs in the relocatable package flow:

  • manylinux job silently falls back to PATCH="02". The container runs as root against a host-UID-owned workspace, which trips git's dubious ownership guard. CMake's git describe / git rev-list --count probe (CMakeLists.txt:21-41) returns no patch and uses TRANSFERBENCH_VERSION_PATCH_FALLBACK. Same commit produced 1.66.11 on Ubuntu and 1.66.02 on manylinux in that run.
  • PKG_RELEASE is computed but invisible in binary filenames. The script exports CPACK_DEBIAN_PACKAGE_RELEASE / CPACK_RPM_PACKAGE_RELEASE, but CPack still wrote amdrocm7-transferbench-<version>-Linux.{deb,rpm,tar.gz}. On release/* branches every run overwrites the same S3 keys and the run number can't be recovered from the filename.

Changes

  • build_packages_local.sh: git config --global --add safe.directory \"$REPO_ROOT\" early in the script. Unblocks CMake's git probe inside the manylinux container so both jobs resolve the same 1.66.<N> patch.
  • CMakeLists.txt (BUILD_RELOCATABLE_PACKAGE block):
    • Resolve _tb_pkg_release once from either env var.
    • CPACK_DEBIAN_FILE_NAME=DEB-DEFAULT and CPACK_RPM_FILE_NAME=RPM-DEFAULT so DEB/RPM use canonical filenames (<name>_<version>-<release>_<arch>.deb, <name>-<version>-<release>.<arch>.rpm).
    • Embed the release tag in CPACK_ARCHIVE_FILE_NAME for the TGZ.

Expected filenames after this PR

For a release/test_artifact build at run 46 with ROCm 7.13:

Generator Before After
DEB amdrocm7-transferbench-1.66.11-Linux.deb amdrocm7-transferbench_1.66.X-46_amd64.deb
RPM amdrocm7-transferbench-1.66.02-Linux.rpm amdrocm7-transferbench-1.66.X-46.x86_64.rpm
TGZ …-1.66.{02,11}-Linux.tar.gz …-1.66.X-46-Linux.tar.gz

X is now the same on both jobs.

Test plan

  • Build (Ubuntu 22.04) green; DEB filename includes release + arch.
  • Build (manylinux_2_28) green; RPM filename includes release + arch.
  • Both jobs report the same TransferBench version: 1.66.X (no 1.66.02 fallback on manylinux).
  • PR upload lands under s3://.../transferbench/ci_fix-package-versioning/<run>/... with the new filenames.

Two issues surfaced from run 25393634415 on release/test_artifact:

1. manylinux container falls back to TRANSFERBENCH_VERSION_PATCH="02"
   because git's "dubious ownership" guard rejects the host-UID-owned
   workspace when the container runs as root. Same commit produced
   1.66.11 on Ubuntu and 1.66.02 on manylinux. Mark the repo safe for
   the container's root user before CMake's git probe runs.

2. CPack writes <name>-<version>-Linux.{deb,rpm,tar.gz}, omitting the
   release tag. Successive release/* runs overwrite each other in S3
   and the run number is invisible from the filename. Switch DEB/RPM
   to canonical CPack filenames (which embed release + arch) and
   thread the release tag into CPACK_ARCHIVE_FILE_NAME for the TGZ.
@thananon thananon requested a review from a team as a code owner May 13, 2026 18:43
Run 25819267330 confirmed the version-sync fix works (both jobs report
1.66.13) and that DEB/RPM filenames now embed the release. But the
Ubuntu TGZ filename still came out as `…-1.66.13-Linux.tar.gz` — no
release tag — while manylinux's TGZ embedded it correctly.

Root cause: CMakeLists derived the release tag from
`ENV{CPACK_RPM_PACKAGE_RELEASE}` at configure time. Ubuntu invokes the
script via `sudo -E …`, and although the script then `export`s those
env vars, they don't reliably reach CMake's environment on the sudo
path. The DEB filename still worked because `DEB-DEFAULT` is processed
by CPack at package time, not configure time, so the env var
propagated to the `cpack -G DEB` child directly.

Fix: pass `PKG_RELEASE` as `-DTRANSFERBENCH_PACKAGE_RELEASE=…` from
the build script. CMake reads it directly from the cache, bypassing
env propagation entirely. Env-var fallbacks remain so direct cmake
invocations (without the wrapper script) keep working.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes inconsistent and collision-prone versioning/filenames in the relocatable packaging flow (DEB/RPM/TGZ) by ensuring the git-derived patch version is consistently computed in container builds and by threading a per-build “release” tag into package metadata and artifact filenames.

Changes:

  • Add a git safe.directory configuration step in build_packages_local.sh to prevent git describe from failing under “dubious ownership” in containerized builds.
  • Centralize resolution of a per-build release tag in CMakeLists.txt and apply it to DEB/RPM release metadata and TGZ archive naming.
  • Switch DEB/RPM output naming to CPack “default/canonical” filename modes so release + arch are reflected in the generated artifact names.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
CMakeLists.txt Adds unified release-tag resolution and applies it to CPack settings for DEB/RPM/TGZ filenames/metadata.
build_packages_local.sh Marks the repo as a git safe.directory in container/root contexts and passes an explicit package release tag into CMake.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread build_packages_local.sh Outdated
Comment thread CMakeLists.txt Outdated
Comment thread CMakeLists.txt
thananon added 4 commits May 13, 2026 14:34
CMake 3.26 (manylinux container) honors CPACK_ARCHIVE_FILE_NAME for
the TGZ generator. CMake 3.22 (Ubuntu 22.04 system cmake) falls back
to CPACK_PACKAGE_FILE_NAME instead, ignoring my ARCHIVE_FILE_NAME
override. Diagnostic dump confirmed both jobs wrote identical
CPackConfig.cmake — the divergence is purely in cpack runtime
behavior between the two CMake versions.

Set both to the same suffixed value. DEB/RPM are unaffected because
they use the explicit DEB-DEFAULT / RPM-DEFAULT canonical-naming
tokens, which take precedence over CPACK_PACKAGE_FILE_NAME.

Also remove the temporary debug prints from CMakeLists.txt and the
CPackConfig.cmake dump from build_packages_local.sh.
- Replace `git config --global --add safe.directory` with the GIT_CONFIG_*
  env-var triple so the safe.directory entry is scoped to this build (and
  inherited by CMake's `execute_process(git …)` children) instead of being
  written to root's ~/.gitconfig under sudo. Also drops the `|| true` so a
  setup failure surfaces immediately instead of letting the version probe
  silently fall back.

- Quote `"${VAR}"` in `STREQUAL ""` comparisons. CMake's `if(AND ...)` does
  not short-circuit, and bare-identifier dereferencing inside `if()` depends
  on policy CMP0054. Quoting makes the comparison unambiguous.

- Fail fast when BUILD_RELOCATABLE_PACKAGE is requested on CMake older than
  3.13. The block uses CPACK_ARCHIVE_FILE_NAME (3.13+) and the
  DEB-DEFAULT / RPM-DEFAULT canonical-naming sentinels (3.6+); on a 3.5
  CMake those would silently produce a literal "DEB-DEFAULT" filename. The
  project-wide cmake_minimum_required of 3.5 is left alone (out of scope
  here); the gate is local to the relocatable-package path.
@thananon thananon merged commit 0e824cc into develop May 13, 2026
10 checks passed
@thananon thananon deleted the ci/fix-package-versioning branch May 13, 2026 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants