Yocto Build Failure Swat Team: Difference between revisions
RossBurton (talk | contribs) |
RossBurton (talk | contribs) (→Report) |
Revision as of 14:21, 21 December 2020
Overview
The role of the Bug Swat Team is to monitor the autobuilder and do preliminary investigation of failures, to ensure that they are logged and brought to the attention of the appropriate owner.
All builds that are run on the public autobuilder are important for the Yocto Project, whether they are routine validation runs (master or release branches) or pre-integration test builds (master-next, stable/*, and others). Random failures, if ignored, accumulate and can result in a significant number of builds failing.
Each week a different member of the team is on call. Every build that fails on the autobuilder should be monitored unless told otherwise. The rotation happens at the end of Friday (deliberately vague); any failures over the weekend should be triaged by the incoming member on Monday.
Importantly, the Swat Team isn't responsible for resolving issues encountered on the autobuilder, only for doing enough analysis that each issue can be logged and the appropriate owner notified.
The Swat Chairs are the primary contact for the Swat Team. The current Swat Chairs are Ross Burton, Armin Kuster and Richard Purdie. The Chairs are assisted by Stephen K. Jolley who handles the rotation process.
Process
The process is simply three steps:
- Identify build failures
- Report the build failures
- Update the build log
Identify
To be notified when a build fails, it is best to subscribe to the yocto-builds mailing list. The list is sent a mail whenever a build fails, which includes direct links to the autobuilder job summary, the BuildLog, and the Error Reporting Service.
Alternatively, these services can be manually monitored. The Autobuilder 'Yocto console view' is an overview of the top-level builds (a-full and a-quick) and the sub-builds they trigger. The BuildLog is a wiki page that is updated when builds fail with links to the appropriate logs. The Error Reporting Service collates errors from the autobuilder.
Both the mail notification and the BuildLog will include notes from the build owner, so check these for any useful context. For example, the notes may request that failures are reported directly to a specific person instead of bugs being created, or list particular failures that are expected.
Report
There are two categories of builds that Swat will be monitoring: official branches and staging branches. The official branches are the primary top-level branches in Poky, that is, master and all of the release branches (gatesgarth, dunfell, etc.). The staging branches are where patches are held for testing, such as master-next, stable/dunfell-nmut.
For builds against master or a release branch, all issues observed should be filed in Bugzilla. Remember to search first to ensure that the issue isn't already filed: many bugs that occur intermittently, for example, are already filed and have "AB-INT" in the whiteboard field.
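As a rough illustration of the "search first" step, the sketch below builds a search URL for bugs that carry "AB-INT" in the whiteboard field. It assumes the Yocto Bugzilla instance exposes the standard Bugzilla REST search endpoint at the address shown; that address and the helper name ab_int_search_url are assumptions, not something this page states. The function only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: the standard Bugzilla REST search endpoint is enabled here.
BUGZILLA_REST = "https://bugzilla.yoctoproject.org/rest/bug"


def ab_int_search_url(summary_text, base=BUGZILLA_REST):
    """Build a search URL for bugs tagged AB-INT in the whiteboard.

    summary_text is matched by Bugzilla as a substring of the bug summary.
    """
    params = {
        "whiteboard": "AB-INT",        # substring match on the whiteboard field
        "summary": summary_text,       # substring match on the summary
        "include_fields": "id,summary,status",
    }
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical search text, chosen only for illustration.
    print(ab_int_search_url("oe-selftest timeout"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; searching through the normal Bugzilla web form is equally valid.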
For builds against staging branches (master-next, stable/dunfell-nut, etc.), attempt to identify which patch in the branch is likely responsible for the failure. For example, if wget fails with libgnutls errors and there is a GnuTLS upgrade in the branch, that is the likely candidate. If a patch can be identified, reply on the mailing list with the failure details. If it isn't obvious which patch is responsible for the failure, or a patch can be identified but it has already been merged to the release branch, then file a bug and ensure the branch owner is either the assignee or on the CC list.
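The wget/GnuTLS reasoning above can be sketched as a simple text match. This is only a heuristic aid under one assumption: that patch subjects in the staging branch follow the "recipe: summary" convention. The helper name likely_patches and the sample subjects are hypothetical, and the result still needs human judgement.

```python
def likely_patches(error_text, patch_subjects):
    """Return patch subjects whose recipe prefix appears in the error text.

    Assumes each subject looks like "recipe: summary"; subjects without
    a colon are ignored.
    """
    haystack = error_text.lower()
    hits = []
    for subject in patch_subjects:
        prefix, sep, _ = subject.partition(":")
        name = prefix.strip().lower()
        # Substring match, so "gnutls" also matches "libgnutls.so".
        if sep and name and name in haystack:
            hits.append(subject)
    return hits


if __name__ == "__main__":
    # Hypothetical error text and patch subjects, for illustration only.
    error = "wget: error while loading shared libraries: libgnutls.so.30"
    subjects = ["gnutls: upgrade to a newer release", "busybox: fix a typo"]
    print(likely_patches(error, subjects))
```

An empty result does not mean no patch is responsible; it only means this simple match found no candidate, in which case the guidance about filing a bug applies.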
If in doubt, file a bug. All observed errors must be actioned unless a patch has already been sent for the issue, in which case please make note of this in the BuildLog.
If the issue is in the infrastructure or autobuilder itself, then file a bug against Infrastructure: Autobuilder. Infrastructure bugs should be assigned to Michael Halstead and autobuilder logic bugs to Richard Purdie.
The results of pre-triage for an issue should be added to the corresponding entry in the BuildLog, including a link to the resolution (patch name, bug link, etc) and a brief summary of the issue. Every issue should be added to the build log so it acts as a build status report.
The net result is that every failure listed in the BuildLog should have an outcome recorded against it by the person on call at the time.
Communication is key: if the build owner is on IRC then it's always worth discussing with them first before filing bugs. Also, if the build owner triages the build failures then they should update the BuildLog so that Swat doesn't duplicate the work.
Filing bugs
When filing the bug, several items must be included:
- Relevant details about the build configuration. For example, did the failure happen just once, or in all PowerPC builds? Was it specific to multilib builds? Look across the entire build run and identify any patterns.
- The error itself. Trim the log down to just the error and any relevant context in the bug description.
- A link to the build failure. Ideally a link to the error reports page (such as http://errors.yoctoproject.org/Errors/Details/199667/) but a link to the autobuilder build log is acceptable (such as https://autobuilder.yoctoproject.org/typhoon/#/builders/34/builds/168). If referring to an autobuilder build log, also attach the complete build log as build logs are not kept forever.
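Trimming a log down to the error and its context can be partly automated. The sketch below keeps each line containing a marker plus a few lines either side; the default marker "ERROR:" and the helper name trim_log are assumptions chosen for illustration, and the output should still be checked by hand before it goes into a bug description.

```python
def trim_log(log_text, marker="ERROR:", context=3):
    """Keep only lines containing marker, plus `context` lines around each."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if marker in line:
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            keep.update(range(start, end))
    # Emit the kept lines in their original order.
    return "\n".join(lines[i] for i in sorted(keep))


if __name__ == "__main__":
    # Hypothetical log content, for illustration only.
    sample = "\n".join(
        ["step %d" % n for n in range(5)]
        + ["ERROR: do_compile failed"]
        + ["step %d" % n for n in range(5, 10)]
    )
    print(trim_log(sample, context=1))
```

The complete log should still be attached when the bug refers to an autobuilder build log, since the trimmed excerpt is only meant for the description.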
Members
Armin Kuster (place me anywhere)
Lee Chee Yang