Conversation
prepared an execution contract draft Signed-off-by: Dan Calavrezo <195309321+dcalavrezo-qorix@users.noreply.github.com>
The created documentation from the pull request is available at: docu-html |
AlexanderLanin
left a comment
There was a problem hiding this comment.
Lots of comments, but this was a great read! Good content IMHO!
| - Tools affecting build outputs must either be: | ||
| - managed by Bazel, or | ||
| - explicitly injected as Bazel action inputs, or | ||
| - reflected in cache partitioning |
There was a problem hiding this comment.
Not sure what you mean here. Theoretically it's enough if they are documented as in R2, but of course we want more.
What about "mirrorable"... no idea how to describe it. I'm talking about pypi for example.
| - managed by Bazel, or | ||
| - explicitly injected as Bazel action inputs, or | ||
| - reflected in cache partitioning | ||
| - Reliance on host state must be minimized and documented where unavoidable. |
There was a problem hiding this comment.
This is not the same indentation as above. So "Tools affecting build outputs must either be...documented" is missing?
| Build actions must not depend on **undeclared inputs**. | ||
| In practice: | ||
| - Tools affecting build outputs must either be: |
There was a problem hiding this comment.
Let's explain "Tools affecting build outputs" to be very clear. With example of such tools. And tools that are less relevant.
e.g. I'm currently not sure whether pytest affects build outputs. Are test results build output?
There was a problem hiding this comment.
Test results are not build outputs, but they do affect CI decisions (quality gates, I guess), so the tools that produce them must still be Bazel-visible if you want correctness, reproducibility and traceability.
Theoretically we should use the Bazel Python rules (rules_python). This would ensure:
- Reproducible test outcomes
- Correct test caching
But I guess it is an acceptable fallback to have Pytest installed in the devcontainer.
Do we have a file like a fingerprint.txt which contains the versions of the tools?
pytest==7.4.2
python==3.11.6
container=sha256:deadbeeeeeeeeef.......
That one can be used as an input for the test action. That way Bazel sees when the fingerprint changes and can invalidate the caches accordingly.
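A minimal sketch of how such a fingerprint could be reduced to a single stable value, assuming the `name==version` / `name=value` line format shown above. The function names `parse_fingerprint` and `fingerprint_digest` are hypothetical, and how the result would be wired into a Bazel test action is not covered here.

```python
import hashlib


def parse_fingerprint(text: str) -> dict[str, str]:
    """Parse lines like 'pytest==7.4.2' or 'container=sha256:...' into a dict."""
    entries: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Split at the first '='; a second '=' (as in 'name==version') is stripped.
        name, _, value = line.partition("=")
        entries[name.strip()] = value.lstrip("=").strip()
    return entries


def fingerprint_digest(entries: dict[str, str]) -> str:
    """Return a SHA-256 over the sorted tool/version pairs (order-independent)."""
    canonical = "\n".join(f"{name}={value}" for name, value in sorted(entries.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting the entries before hashing means that reordering lines in the file does not change the digest, while any version bump does.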
There was a problem hiding this comment.
Ignore caching for now ;-)
We have e.g. this for pinning tool versions: https://github.com/eclipse-score/devcontainer/blob/main/src/s-core-devcontainer/.devcontainer/s-core-local/versions.yaml
And such files for pinning python packages: https://github.com/eclipse-score/docs-as-code/blob/main/src/requirements.txt
| ## Three-Layer Execution Contract | ||
| ### Layer 1 — Host Platform Contract |
There was a problem hiding this comment.
x86, arm for Macs would also belong to host?!
There was a problem hiding this comment.
Good point
There was a problem hiding this comment.
But I've no idea what's the correct list of supported archs for hosts. Who could help clarify this?
There was a problem hiding this comment.
At least the devcontainers are built for both arm64 and x86_64. I would say that pretty much covers all hosts at the moment. I successfully ran the devcontainer on macOS (Apple M3, arm64), Windows WSL2 (arm64 on a Snapdragon laptop, and x86_64) and Linux (arm64 on a Snapdragon laptop, and x86_64). I cannot imagine what else we need.
There was a problem hiding this comment.
Note: I ran the devcontainer and started the tools; I did not successfully build S-CORE. The issue here is that the Bazelized tools are not all available for arm64.
| - Provide consistent runtime ABI (`glibc`, `libstdc++`) | ||
| - Ensure tool binaries (e.g. rustc) can execute reliably | ||
| - Eliminate “works on my machine” discrepancies | ||
| - Enable local reproduction of CI builds |
There was a problem hiding this comment.
We need to figure out what exactly has to be identical. Or is it this list?
e.g. devcontainer is based on Ubuntu 24. GitHub Runners are based on Ubuntu 24. Is it now enough to ensure the same python version for python scripts?
There was a problem hiding this comment.
You can specify in GitHub Actions workflows that a specific container image and version shall be used: https://docs.github.com/en/actions/reference/workflows-and-actions/workflow-syntax#jobsjob_idcontainerimage
Then there is no more crossing fingers that CI and the devcontainer have the same tools and versions installed.
There was a problem hiding this comment.
@lurtz correct
I guess it's time we start converting our workflows to use the devcontainers.
Not sure if anything in particular is needed (in terms of container configuration) for the Bazel local cache.
@AlexanderLanin if we go ahead with using devcontainers, we can make the clean-up action even more aggressive.
| --> | ||
| # DR-001-Infra-Extension: S-CORE Build Execution Contract |
| ## Minimum Supported Baselines | ||
| ### OS and Runtime Baseline | ||
| - Minimum supported baseline: **Ubuntu 20.04 LTS** (subject to revision) |
There was a problem hiding this comment.
Baseline is defined in layer 2. So layer 3 should not mention an exact version?
There was a problem hiding this comment.
I think this is relevant for the underlying OS / kernel version / glibc.
Co-authored-by: Alexander Lanin <Alexander.Lanin@etas.com> Signed-off-by: Dan Calavrezo <195309321+dcalavrezo-qorix@users.noreply.github.com>
| In practice: | ||
| - Tools affecting build outputs must either be: | ||
| - managed by Bazel, or |
There was a problem hiding this comment.
The writing here suggests that if it's managed by bazel it then automatically is solved and we don't need to worry about it anymore.
Is that the case, or am I misreading here?
There was a problem hiding this comment.
I have doubts about that. bazelisk downloads even bazel from the Internet, which then might download more dependencies.
There was a problem hiding this comment.
So we for sure would need to have a test build for official releases that checks if the archiving is complete by building it without access to the internet?
There was a problem hiding this comment.
@MaximilianSoerenPollak @lurtz You are both correct
Maybe we should explicitly mention
“Bazel-managed” improves cache correctness and traceability, but does not by itself
guarantee long-term reproducibility; artifact availability must also be ensured
via pinning, checksums, internal mirroring/archiving and offline verification.
OK, we do have our own Bazel Container Registry, but it is probably a good idea to mirror the other artifacts that Bazel/Bazelisk downloads (even Bazel itself: in 10 years, version 7.1.0 (random example) may no longer be available for download).
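The "pinning, checksums" part could look roughly like the sketch below: an archived or mirrored artifact is only accepted if its SHA-256 matches a value recorded in the repository. The helper names `sha256_of` and `verify_artifact` are hypothetical, and where the expected checksum is stored is left open.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 of a file without loading it into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file content matches the pinned checksum."""
    return sha256_of(path) == expected_sha256.lower()
```

A check like this detects a changed or corrupted download, but it does not by itself guarantee that the artifact is still available; that remains the job of the mirror or archive.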
| #### Non-Goals | ||
| - The devcontainer must **not silently override** repository-declared Bazel versions. | ||
| - The devcontainer must **not be the only place** where critical tool versions are defined. |
There was a problem hiding this comment.
Where else would you define it then? In a global ledger or so?
There was a problem hiding this comment.
Not a global ledger. The source of truth should be the repo (e.g. .bazelversion, MODULE.bazel/lockfile, pinned toolchain deps). The devcontainer may provide binaries, but it must not be the only place where versions are defined; otherwise changes to the container silently change builds. I guess this is part of the three-layer approach.
There was a problem hiding this comment.
Well, one can see this the other way around: the devcontainer defines the environment, including the Bazel version.
BUT
Changing the devcontainer in a repository is not silent. It is an explicit PR, with a version change of the container. That change must do a build & test of the repository content using the updated container. If that builds - all good, right? If not --> PR fails, investigation required. No blocking of development or surprises at any point in time.
There was a problem hiding this comment.
otherwise changes to the container silently change builds
This can only happen if no fixed revisions are used. E.g. ghcr.io/eclipse-score/devcontainer:latest will silently change, but ghcr.io/eclipse-score/devcontainer:v1.1.0 will always stay the same.
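A small sketch of how a CI lint could reject floating image references, assuming references are plain strings such as the two examples above. The function name `is_pinned` is hypothetical.

```python
def is_pinned(image_ref: str) -> bool:
    """True if the reference uses a digest or an explicit tag other than 'latest'."""
    if "@sha256:" in image_ref:
        return True  # digest references are content-addressed
    # Only look at the last path segment, so a registry port
    # such as 'localhost:5000/image' is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag: container runtimes default to 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")
```

Note that a version tag stays fixed only as long as nobody re-pushes it; a digest reference is the only form that is immutable by construction.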
| - versioned | ||
| - immutable | ||
| - built against a documented baseline | ||
| - Tools affecting outputs must be known to Bazel or reflected in action inputs. |
There was a problem hiding this comment.
Outputs in which way?
Does this mean for example if something saves a .json that is used as cache it should only work via Bazel actions?
Or what about the test frameworks that can affect the output xml?
There was a problem hiding this comment.
“Outputs” should mean any build- or CI-relevant result that we rely on, not just binaries.
- Build artifacts (most important)
Examples:
- binaries, libraries, containers, packages
- generated source code checked-in or shipped
- compiled outputs used downstream
- Decision artifacts (CI gating) - I know that we don't have them, but I guess we need to add them at some point 😄
Examples:
- test pass/fail outcome
- coverage percentage used as a gate
- lint results used to pass/fail
If a tool can change these → Bazel must track the tool version / inputs (or we risk wrong decisions / wrong cached results). Yeah, yeah, I know, where's the cache?
Thanks for the write-up, it looks quite good overall and is a great baseline from which we can work out the remaining small stuff.
| #### Responsibilities | ||
| - User-space runtime libraries | ||
| - Bootstrap tooling (git, bash, coreutils, python, etc.) | ||
| - Bazel entrypoint (preferably Bazelisk) |
There was a problem hiding this comment.
I am a bit puzzled. IIRC Bazelisk transparently downloads whatever bazel version has been specified via .bazelversion from the internet. How will this ensure stable builds in like 10+ years? We have no guarantee that the server address is still the same and the version still available.
There was a problem hiding this comment.
Good point again!
Bazelisk is fine as an entrypoint, but only if we pair it with an archived, controlled source of Bazel binaries.
Would that mean creating our own internal mirror for Bazel releases (within Eclipse)?
What do you think?
There was a problem hiding this comment.
I have to be honest: I do not like all of Bazel's concepts. It is amazing for tracking dependencies, caching and its best-effort sandboxing when building and running tests. But I would rather not let it download itself, tools or toolchains, and instead store these within the devcontainer image. However, this is against the point you made about having the devcontainer optional.
That being said you also stated that the entire S-CORE should use a single bazel version:
To me the best solution would be to include exactly THIS bazel version in the devcontainer instead of using bazelisk.
Maybe we should discuss how much infrastructure (Bazel mirror, Bazel archive, devcontainer registry) we want to build, or if we want to create a design which needs less infrastructure. I lean towards a solution which fulfills all requirements but needs as little infrastructure in the background as possible.
There was a problem hiding this comment.
That's fair, but at the moment all the repos have their own .bazelversion file, right? Example: https://github.com/eclipse-score/communication/blob/main/.bazelversion
Wouldn't that override whatever we set in the devcontainer?
Would we have to set up an enforcing mechanism, or is there one already?
There was a problem hiding this comment.
IMHO there are multiple ways to achieve this inside the devcontainer.
- You install a specific Bazel version in the devcontainer and set an environment variable which overrides any `.bazelversion` file. However, this happens silently.
- We could also not ship Bazelisk in the devcontainer; then the build will fail if the `.bazelversion` does not match the version of the preinstalled Bazel binary.
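To make a mismatch fail loudly rather than silently, an explicit check could be run at container start or as a CI step. The sketch below assumes the installed version is already known as a string (for example baked into the image); `read_bazelversion` and `check_bazel_version` are hypothetical helper names, not existing tooling.

```python
from pathlib import Path
from typing import Optional


def read_bazelversion(repo_root: Path) -> Optional[str]:
    """Return the version on the first line of .bazelversion, or None if absent/empty."""
    version_file = repo_root / ".bazelversion"
    if not version_file.is_file():
        return None
    lines = version_file.read_text(encoding="utf-8").splitlines()
    return lines[0].strip() if lines else None


def check_bazel_version(repo_root: Path, installed_version: str) -> None:
    """Raise if the repository pins a Bazel version other than the installed one."""
    pinned = read_bazelversion(repo_root)
    if pinned is not None and pinned != installed_version:
        raise RuntimeError(
            f".bazelversion pins {pinned}, but the devcontainer provides {installed_version}"
        )
```

With this approach the repository remains the place where the version is declared, and the devcontainer only has to prove that it provides exactly that version.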
| This must remain possible even if: | ||
| - GitHub runner images change or are retired | ||
| - upstream toolchains are no longer available | ||
| - external services are unavailable |
There was a problem hiding this comment.
should we have a test build, which has no access to the internet? This way we can be sure all needed tools and code are present.
There was a problem hiding this comment.
I think that would only work if we had a remote repository cache that is populated for Bazel fetching, or if we configured our own local mirrors, right?
How would you see that? For the devcontainer part I think it is straightforward, for Bazel, I'm not too sure
There was a problem hiding this comment.
The Bazel mirror would also need to act like an archive, because it should not remove anything needed to build a release for 10+ years. And I guess all S-CORE builds should ideally make use of that mirror.
btw. in addition to bazel dependencies being downloaded from the internet, I expect that Rust code might also download some crates from crates.io.
If the needed dependencies are stored within the devcontainer, it would be rather easy, because you need the devcontainer image from the release and the S-CORE code and you should be able to build. No internet access or mirror needed.
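If a mirror is used, one cheap complement to an offline test build is to check that every download URL a build references points at the mirror. A rough sketch follows; the function name `urls_outside_mirror` is hypothetical, the host `mirror.example.org` in the usage note is a placeholder, and how the URL list is extracted (lockfiles, Cargo metadata, and so on) is not covered.

```python
from urllib.parse import urlparse


def urls_outside_mirror(urls: list[str], allowed_hosts: set[str]) -> list[str]:
    """Return every URL whose host is not one of the allowed mirror hosts."""
    allowed = {host.lower() for host in allowed_hosts}
    # urlparse().hostname is already lowercased; it is None for malformed URLs,
    # which are then reported as outside the mirror.
    return [url for url in urls if (urlparse(url).hostname or "") not in allowed]
```

For example, with `allowed_hosts={"mirror.example.org"}`, a dependency fetched directly from crates.io would be reported, while one fetched from the mirror would pass.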
| #### Definition | ||
| - A **versioned devcontainer image** is the default execution context. | ||
| - The container image must be: | ||
| - built from a known OS baseline (Ubuntu LTS) |
There was a problem hiding this comment.
we should keep in mind that Microsoft removed support for older Ubuntu versions running in devcontainers from Visual Studio Code. Thus if your devcontainer image is based on Ubuntu 18.04, you are at the moment only able to run it via command line.
There was a problem hiding this comment.
very good point !
So we might lose the "works in VS Code" and be stuck with CLI-only.
@lurtz can you propose a better wording, please?
#### Definition
- A **versioned devcontainer image** is the default execution context for CI and local builds.
- The container image must be:
- built from a **defined Ubuntu LTS baseline**
- compatible with common developer tooling (e.g. VS Code Dev Containers)
- referenced by an **immutable image digest**
- archived for **long-term reproducibility**
and maybe add a section like
#### Baseline Preservation and Reproducibility
- Once a devcontainer image is used in CI, its image digest becomes part of the build provenance.
- All such images must be archived and retrievable for **at least 10 years**.
- Reproducing historical builds may rely on legacy container runtimes or CLI-only execution,
and does not require continued IDE support.
Co-authored-by: Alexander Lanin <Alexander.Lanin@etas.com> Signed-off-by: Dan Calavrezo <195309321+dcalavrezo-qorix@users.noreply.github.com>
Co-authored-by: lurtz <727209+lurtz@users.noreply.github.com> Signed-off-by: Dan Calavrezo <195309321+dcalavrezo-qorix@users.noreply.github.com>
solved comments from PRs Signed-off-by: Dan Calavrezo <195309321+dcalavrezo-qorix@users.noreply.github.com>
No description provided.