Reproducible Builds

From Yocto Project
Revision as of 20:55, 13 March 2017 by Anibal Limon

Current Status

The Yocto Project aims to have builds which are entirely reproducible. That is, if you run a build today and then run the same configuration again at some point in the future, the binaries you get out of the build should be identical.

This implies that the host system you run the build on and the path you run the build in should not affect the target system output.

Our build system doesn't produce binary reproducible builds today, but we are actively working towards that goal and fixing issues as we identify them. We also plan to improve our testing to help find reproducibility issues.

The design of the system lends itself very well to producing reproducible builds, as we provide a reproducible build environment with minimal dependencies on the host/build OS.

We have several technologies in place which aid reproducibility:

Our shared state (sstate) mechanism bases our builds on hashes of input metadata, reusing the outputs if the inputs are the same.
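The idea of keying outputs on a hash of the inputs can be illustrated with a minimal sketch. This is not the sstate implementation: `input_signature`, `OutputCache` and `fake_build` are hypothetical names invented for the example, and the metadata is reduced to a flat dictionary.

```python
import hashlib


def input_signature(metadata: dict) -> str:
    """Hash the inputs in sorted key order so equal inputs give equal signatures."""
    h = hashlib.sha256()
    for key in sorted(metadata):
        h.update(key.encode("utf-8"))
        h.update(b"\0")
        h.update(str(metadata[key]).encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


class OutputCache:
    """Toy stand-in for a shared state cache: outputs keyed by input signature."""

    def __init__(self):
        self._store = {}
        self.builds = 0  # counts how often the build step really ran

    def build(self, metadata: dict, build_fn):
        sig = input_signature(metadata)
        if sig not in self._store:
            # Inputs not seen before: run the build and remember the output.
            self._store[sig] = build_fn(metadata)
            self.builds += 1
        return self._store[sig]


def fake_build(metadata: dict) -> str:
    # Hypothetical build step; a real one would compile and package software.
    return f"{metadata['PN']}-{metadata['PV']}.bin"


cache = OutputCache()
a = cache.build({"PN": "zlib", "PV": "1.2.11", "CFLAGS": "-O2"}, fake_build)
b = cache.build({"CFLAGS": "-O2", "PV": "1.2.11", "PN": "zlib"}, fake_build)  # same inputs, reused
c = cache.build({"PN": "zlib", "PV": "1.2.11", "CFLAGS": "-O3"}, fake_build)  # changed input, rebuilt
```

Reuse of this kind is only safe if identical inputs really do produce identical outputs, which is why reproducibility matters for the cache.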

As our sstate files need to be reusable regardless of build path and can be target or native binaries, we have mechanisms for working around various issues such as hard-coded paths (though we'd prefer to remove the need for them entirely).
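One common way to work around hard-coded paths is to swap the build path for a placeholder when an output is stored, then substitute the real location when it is installed. The sketch below shows that general technique for text files only. It is not the sstate code, and the placeholder string, function names and example paths are all assumptions made up for illustration.

```python
PLACEHOLDER = "@@BUILD_ROOT@@"  # hypothetical marker, not a real sstate token


def make_relocatable(text: str, build_root: str) -> str:
    """Replace the build path with a placeholder before storing the file."""
    return text.replace(build_root, PLACEHOLDER)


def relocate(text: str, install_root: str) -> str:
    """Substitute the real location for the placeholder when installing."""
    return text.replace(PLACEHOLDER, install_root)


def config_script(build_root: str) -> str:
    # Hypothetical generated file that embeds the build path.
    return f"prefix={build_root}/sysroot/usr\nlibdir={build_root}/sysroot/usr/lib\n"


# Two builds in different directories produce the same stored form.
stored_a = make_relocatable(config_script("/home/alice/build"), "/home/alice/build")
stored_b = make_relocatable(config_script("/srv/ci/job-42"), "/srv/ci/job-42")

# The stored form can then be installed under any path.
installed = relocate(stored_a, "/opt/work")
```

Plain substitution like this does not cover paths embedded in compiled binaries, which is one reason removing hard-coded paths entirely is preferable.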

A problem related to sstate is knowing whether the output has changed when the input has changed. This is useful in the context of package feeds, amongst other things, to know whether we should update them or not. We have binary build comparison tools at an early stage of development that allow us to reduce unnecessary churn in package feeds; further development is planned in this area using the tools from the reproducible-builds project.

Our SDKs need to be relocatable and run anywhere they are installed to. We include our own C library to do this in a "run anywhere" scenario and are able to generate fully relocatable toolchains.

Recipe specific sysroots, new in Yocto Project 2.3 (Pyro), ensure that the output of a recipe doesn't change depending on whether other recipes have already been built, or in which order, even when the software tries to autodetect available features.

At this point the project does not intend to target timestamp-level reproducibility. Whilst the binary content should be the same, file timestamps may not be, which means package manager tarballs would not be binary identical.
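The effect of timestamps on archives can be demonstrated with a small sketch using Python's standard `tarfile` module. It packs the same file content twice with different modification times and once with a fixed one. The file name, content and timestamps are made up, and `make_tarball` is a hypothetical helper, not part of any Yocto tooling.

```python
import hashlib
import io
import tarfile


def make_tarball(files: dict, mtime: int) -> bytes:
    """Pack files (name -> bytes) into an uncompressed tar, stamping each with mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for name in sorted(files):  # fixed member order
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


content = {"usr/bin/hello": b"identical binary content"}

first = make_tarball(content, mtime=1489438500)   # first build
second = make_tarball(content, mtime=1489524900)  # same content, built a day later
same_content_differs = digest(first) != digest(second)

# With a fixed timestamp the archives become byte-for-byte identical.
fixed_a = make_tarball(content, mtime=0)
fixed_b = make_tarball(content, mtime=0)
```

The payload is identical in every case; only the timestamp stored in the tar header changes the archive digest.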

Why do we want reproducible builds?

The many benefits of reproducible builds listed by the reproducible-builds project, not least the ability to verify that built output matches the source, are key motivations for our work on reproducible builds. Additional benefits specific to the Yocto Project include:

  • ability to reduce the churn in package feeds
  • ability to improve reliability of allarch recipes and enable wider use of them

Related Work

Some key bugs that are important in our reproducibility efforts are:

  • #1560 Enable recipe specific sysroots — DONE for 2.3
  • #5866 Reproducible builds: identical binaries
  • #10813 Replace build-compare with tools from the reproducible-builds project
  • Make use of SOURCE_DATE_EPOCH, most likely with patches from Debian to ensure tools are using it.
  • Develop tests to catch issues with reproducibility
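For the SOURCE_DATE_EPOCH item above, the convention is that a tool embedding a date takes it from that environment variable (seconds since the Unix epoch) instead of the current clock. A minimal sketch of a tool honouring it might look like the following; `build_timestamp` and `version_banner` are hypothetical names, and the fallback to the current time is an assumption about reasonable behaviour when the variable is unset.

```python
import os
import time


def build_timestamp(environ=os.environ) -> int:
    """Return SOURCE_DATE_EPOCH if it is set, otherwise the current time."""
    value = environ.get("SOURCE_DATE_EPOCH")
    if value is not None:
        return int(value)
    return int(time.time())


def version_banner(environ=os.environ) -> str:
    """Format a build date string in UTC so the host timezone has no effect."""
    stamp = time.gmtime(build_timestamp(environ))
    return time.strftime("built on %Y-%m-%d %H:%M:%S UTC", stamp)


# With the variable pinned, repeated runs embed the same date.
pinned = {"SOURCE_DATE_EPOCH": "1489438500"}
banner_today = version_banner(pinned)
banner_later = version_banner(pinned)
```

Passing the environment as a parameter keeps the example easy to exercise without modifying the real process environment.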

Verification

To ensure that our builds are reproducible we need to implement verification infrastructure covering the build system's outputs: images, SDKs, binary packages, etc.

The idea is to run diffoscope over the target shared state (populate_sysroot) output, because it is the first output from which the other artifacts (images, SDKs, binary packages) are generated. The build output will be generated by running bitbake world on two different host systems; this can be done using two virtual machines as Autobuilder workers. The comparison process (diffoscope) is then launched against the two build outputs.
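The comparison step could be sketched roughly as follows: hash every file in the two output trees, list the paths that differ, and prepare a diffoscope invocation for each of them. This is an illustration under stated assumptions, not the Autobuilder implementation. The helper names and the demo files are hypothetical, the sketch assumes regular files with no broken symlinks, and the only diffoscope usage relied on is passing the two paths to compare as positional arguments. The commands are built but not executed.

```python
import hashlib
import os
import tempfile


def tree_digests(root: str) -> dict:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def differing_paths(root_a: str, root_b: str) -> list:
    """Paths whose content differs, or which exist in only one of the trees."""
    a, b = tree_digests(root_a), tree_digests(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))


def diffoscope_command(root_a: str, root_b: str, rel_path: str) -> list:
    """Argument list for comparing one file pair with diffoscope (not run here)."""
    return ["diffoscope", os.path.join(root_a, rel_path), os.path.join(root_b, rel_path)]


# Demo with two made-up output trees standing in for the two host builds.
with tempfile.TemporaryDirectory() as host_a, tempfile.TemporaryDirectory() as host_b:
    for root, payload in ((host_a, b"built in /home/a"), (host_b, b"built in /home/b")):
        os.makedirs(os.path.join(root, "usr", "lib"))
        with open(os.path.join(root, "usr", "lib", "libsame.so"), "wb") as fh:
            fh.write(b"identical")
        with open(os.path.join(root, "usr", "lib", "libdiff.so"), "wb") as fh:
            fh.write(payload)
    changed = differing_paths(host_a, host_b)
    commands = [diffoscope_command(host_a, host_b, p) for p in changed]
```

Hashing first keeps the detailed and slower diffoscope comparison limited to the files that actually differ.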

An Autobuilder development branch for reproducible builds is available. [1]

To preserve the results a web system needs to be implemented, to which the Autobuilder (AB) will push the results. It will also serve to publish a status page, something like www.yoctoproject.org/reproduciblebuilds, showing the current status and which recipes need work in order to be 100% reproducible. An initial implementation exists with database models only. [2]