Toaster and bitbake communications
This is a write-up of some investigation I did into how toaster asks bitbake to do stuff, and how it hears about what bitbake is doing. It's not definitive and not guaranteed to be correct, but it might provide some pointers if you are working on toaster. If anyone knows better, please feel free to correct this.
Any paths given below are relative to the root of the poky/poky-contrib source tree.
toaster asking bitbake to do stuff
First off I wanted to figure out how clicking on the "Build" button in the toaster interface triggers a bitbake build.
toaster only asks bitbake to perform builds when in managed mode. This is the point I started from when doing this investigation. I think analysis mode is very similar, except toaster doesn't ask bitbake to do anything, it just listens to stuff which is already happening.
The conversation with bitbake is handled via the runbuilds command (bitbake/lib/toaster/bldcontrol/management/commands/runbuilds.py). This command is run in a loop once every second from the toaster start script (bitbake/bin/toaster).
The runbuilds.schedule() function looks for BuildRequests in the database. A BuildRequest is created each time you press a "Build" button in the toaster web interface. Build requests are created by the schedule_build method on the project object (orm/models.py). Creating a BuildRequest also creates a BRLayer for each of the layers in your project, a BRTarget for the target to be built, and a BRBitbake for the version of bitbake to use. These are all defined in bldcontrol/models.py and contain copies of the information from the project.
The localhostbecontroller takes this BuildRequest
- schedule() gets a localhostbecontroller (be = build environment) instance (there is a becontroller for remote bitbakes, but I don't think the implementation is complete) and assigns it to a variable bec.
- schedule() then calls bec.triggerBuild() for a build request which is in the BuildRequest.REQ_QUEUED state. Only one build request gets picked up each time the runbuilds script runs: the one with the largest ID.
- schedule() also creates a build identification variable for each build request, combining the primary key of the build request with the primary key of the build environment controller's build environment (!).
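The scheduling steps above can be sketched roughly as follows. This is a minimal stand-in, not the runbuilds code: BuildRequest, BuildEnvironment and FakeController are hypothetical plain-Python classes (the real ones are Django models and a controller that talks to bitbake), REQ_INPROGRESS is an assumed follow-on state, and the "request pk:environment pk" format for the identifier is taken from the brbe description later in this write-up.

```python
from dataclasses import dataclass
from typing import List, Optional

REQ_QUEUED = "queued"            # stand-in for BuildRequest.REQ_QUEUED
REQ_INPROGRESS = "in progress"   # hypothetical state once a request is picked up


@dataclass
class BuildRequest:
    pk: int
    state: str = REQ_QUEUED


@dataclass
class BuildEnvironment:
    pk: int


class FakeController:
    """Hypothetical stand-in for the build environment controller (bec)."""

    def __init__(self, be: BuildEnvironment):
        self.be = be
        self.triggered = []

    def triggerBuild(self, request: BuildRequest, brbe: str) -> None:
        # The real method clones layers and talks to bitbake; this one
        # only records what it was asked to build.
        self.triggered.append((request.pk, brbe))


def schedule(requests: List[BuildRequest], bec: FakeController) -> Optional[str]:
    """Pick up at most one queued request per call."""
    queued = [r for r in requests if r.state == REQ_QUEUED]
    if not queued:
        return None
    request = max(queued, key=lambda r: r.pk)   # the one with the largest ID
    # Build identifier: request pk and build environment pk around a colon.
    brbe = "%d:%d" % (request.pk, bec.be.pk)
    request.state = REQ_INPROGRESS
    bec.triggerBuild(request, brbe)
    return brbe
```

Calling schedule() repeatedly, as the once-a-second loop in the start script does, drains the queued requests one at a time.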
localhostbecontroller is the build environment controller, which handles instances of build environments. It is also responsible for cloning any layers that are in the build request.
It is a subclass of BuildEnvironmentController.
- triggerBuild() calls getBBController(), which returns a BitBakeController instance (bbctrl). getBBController() actually instantiates the controller if it isn't already instantiated, passing it a "server" object. This server object is an instance of bb.server.xmlrpc.BitBakeXMLRPCClient().
- triggerBuild() then calls the BitBakeController build() method: bbctrl.build()
The BitBakeController is constructed with the server object, which is an XMLRPC connection to a BitBake server.
- build() invokes the "buildTargets" command on the connection using runCommand().
BitBakeXMLRPCClient is a subclass of BitBakeServer.
It has a "connection" member, which has runCommand() invoked on it.
When constructed by getBBController(), initServer() is called as part of its construction. This is a method from BitBakeServer which sets the interface address (default: (localhost,0)) and creates an XMLRPCServer() using this interface address.
Once initServer() is done, establishConnection() is invoked, which creates the "real" socket.
The established connection (which runCommand() is invoked on) is a BitBakeXMLRPCServerConnection.
BitBakeServer is a subclass of BitBakeBaseServer (declared in bitbake/lib/bb/server/__init__.py); this actually has the addcooker() method which associates a bitbake cooker instance with the connection.
BitBakeBaseServer acts like a decorator around a server implementation; in our case, it's a BitBakeXMLRPCServerConnection.
Calling addcooker() on the BitBakeBaseServer also calls it on any server implementation (serverImpl) which is wrapped by BitBakeServer.
BitBakeXMLRPCServerConnection is a subclass of BitBakeBaseServerConnection.
When runCommand() is invoked on this, it's passed on to the cooker object associated with it, i.e. it actually invokes self.cooker.command.runCommand().
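To make the "command goes over XMLRPC and ends up on the cooker" idea concrete, here is a small self-contained sketch using only Python 3's standard xmlrpc modules (the code discussed here uses the older xmlrpclib). FakeCooker, start_server(), ask() and the "qemux86" value are all made up for illustration; none of this is bitbake's actual server code.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class FakeCooker:
    """Hypothetical stand-in for the cooker; it only knows one variable."""

    def __init__(self):
        self.variables = {"MACHINE": "qemux86"}   # made-up example value

    def runCommand(self, commandline):
        # commandline is a list: the command name followed by its arguments.
        name, args = commandline[0], commandline[1:]
        if name == "getVariable":
            return self.variables.get(args[0], "")
        return "unknown command: %s" % name


def start_server(cooker):
    # Port 0 asks the OS for any free port, like the (localhost, 0) default.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Expose the cooker's runCommand under the XMLRPC name "runCommand".
    server.register_function(cooker.runCommand, "runCommand")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def ask(server, commandline):
    """Send one command to the server over XMLRPC and return the reply."""
    host, port = server.server_address
    with ServerProxy("http://%s:%d/" % (host, port)) as proxy:
        return proxy.runCommand(commandline)
```

The client never touches the cooker directly: it only sees a proxy whose runCommand() call is serialised, sent to the server, and handed to the object registered there.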
The cooker object is associated with the BitBakeXMLRPCServerConnection when the bitbake/lib/bb/main.py script runs.
bitbake/lib/bb/main.py is the main script which starts a bitbake server and UI. This is what runs when you use "bitbake" from the command line.
toaster starts the bitbake server with the --server-only switch, which calls bitbake_main() in this file; this in turn calls the main() function.
This instantiates a bb.cooker.BBCooker and adds it to the server implementation via addcooker(). The cooker is what actually enables commands to be sent to the bitbake server.
The BBCooker has a "command" property which is an instance of BBCommand; this is what runCommand() finally gets invoked on.
runCommand() calls a method from an instance of CommandsSync (all the synchronous commands bitbake understands) or CommandsAsync (the asynchronous ones), depending on the type of command passed to runCommand(). Both the CommandsSync and CommandsAsync instances are added to the BBCooker when it is created.
For example, runCommand("getVariable", ...) is invoked via CommandsSync.getVariable().
The commands which go through BBCommand.runCommand() now make their way to the bitbake server over XMLRPC.
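The sync/async split described above can be illustrated with a small dispatcher. This is only a sketch of the pattern under my own assumptions: SketchCommand, its attributes, runAsyncCommand() and the "poky" value are illustrative, and the real BBCommand does considerably more (error handling, events, idle callbacks).

```python
class CommandsSync:
    """Stand-in for the synchronous commands: they return a value at once."""

    def getVariable(self, command, params):
        return command.variables.get(params[0])


class CommandsAsync:
    """Stand-in for the asynchronous commands: they are run later."""

    def buildTargets(self, command, params):
        command.built.extend(params[0])


class SketchCommand:
    """Hypothetical dispatcher playing the role of the "command" object."""

    def __init__(self):
        self.cmds_sync = CommandsSync()
        self.cmds_async = CommandsAsync()
        self.variables = {"DISTRO": "poky"}   # made-up example data
        self.pending = None                   # at most one queued async command
        self.built = []

    def runCommand(self, commandline):
        name, params = commandline[0], commandline[1:]
        if name.startswith("_"):
            raise ValueError("no such command: %s" % name)
        if hasattr(CommandsSync, name):
            # Synchronous: call the method and hand back its result.
            return getattr(self.cmds_sync, name)(self, params)
        if hasattr(CommandsAsync, name):
            # Asynchronous: remember it and return immediately.
            self.pending = (name, params)
            return None
        raise ValueError("no such command: %s" % name)

    def runAsyncCommand(self):
        """Run the queued asynchronous command, if there is one."""
        if self.pending is None:
            return False
        name, params = self.pending
        self.pending = None
        getattr(self.cmds_async, name)(self, params)
        return True
```

In this sketch, runCommand(["getVariable", "DISTRO"]) returns a value straight away, while runCommand(["buildTargets", [...]]) returns None and the work only happens on the next runAsyncCommand() call.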
toaster listening to what bitbake is doing
At this point, I realised I probably understood enough to see how bitbake was being invoked from toaster: asking toaster to start a build sends a "buildTargets" command to the bitbake XMLRPC server, via a rather indirect series of objects and method calls.
What I wanted to know now was how toaster listens to the result of that command. At a high level, I understood that it gathered events from the XMLRPC connection and converted them into database objects. However, I didn't really get the code path.
I started from BuildInfoHelper, which is where bitbake events are converted into toaster db objects.
BuildInfoHelper is passed build events by toasterui.py, which routes them to the buildinfohelper based on the event types.
BuildInfoHelper is responsible for constructing toaster ORM objects from events. The BuildInfoHelper is constructed with a server (instance of BitBakeXMLRPCServerConnection in the case of toaster), so it can also interrogate the bitbake server for extra environmental data via getVariable().
BuildInfoHelper also tries to match the data from the build recipe events from toasterui.py to data in toaster's database, so that toaster can "learn" from the build. This means that we get recipe and package information from the build which we would otherwise be unable to determine.
The recipes and associated data are matched when we get the recipe information from the store task event. This contains the path of the recipe, e.g. "/home/yocto/cloned_layers/_mylayer_master.toaster_cloned/mydir/example.bb". The buildinfohelper then tries to find a layer in toaster whose checkout directory location is a prefix of the path provided by the event.
It does this by asking the "bc" BuildController (which can only be localhostbecontroller.py at the moment, as this is the only controller which contains the implementation) to return the git checkout path via 'getGitCloneDirectory'. This returns what it believes was the git checkout base location; in our example, hopefully something like "/home/yocto/cloned_layers/_mylayer_master.toaster_cloned". The buildinfohelper then adds the directory name (brl.dirpath) from the information in the build request layer, e.g. "mydir". After all this reconstruction, we hope that the recipe path we got from the event starts with the reconstructed path. For example: does "/home/yocto/cloned_layers/_mylayer_master.toaster_cloned/mydir/example.bb" start with "/home/yocto/cloned_layers/_mylayer_master.toaster_cloned/mydir/"? If yes, then we have a layer in toaster with which we can associate the information from the build.
One problem here is that the getGitCloneDirectory function a) doesn't exist in the other hostcontrollers and b) doesn't always return the correct value due to some special case.
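The prefix test described above boils down to something like the following. layer_matches_recipe() is a hypothetical helper written for this write-up, not a function in buildinfohelper; it just reproduces the "checkout directory + dirpath is a prefix of the recipe path" check on POSIX-style paths.

```python
import posixpath


def layer_matches_recipe(checkout_dir, dirpath, recipe_path):
    """Return True if recipe_path lives under checkout_dir/dirpath."""
    layer_root = posixpath.join(checkout_dir, dirpath)
    # Joining with "" adds a trailing slash, so that ".../mydir" does not
    # accidentally match a sibling directory such as ".../mydir2".
    layer_root = posixpath.join(layer_root, "")
    return recipe_path.startswith(layer_root)
```

With the example paths above, the checkout base plus "mydir" matches ".../mydir/example.bb" but would not match a recipe under ".../mydir2/".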
toasterui.py has a main() function which sets up an event listener loop.
main() is passed a server, eventHandler and params. The eventHandler is the object which listens to bitbake events.
toasterui is called from main.py (see below).
The toasterui.py main() method is invoked from main.py.
It's passed two parameters: server_connection.connection and server_connection.events.
The second of these is the eventHandler, which toasterui.py reads from in a loop.
server_connection comes from a call to establishConnection() in main.py. The server is constructed via a servermodule, which is dynamically chosen in main.py according to the parameters used to invoke it (-t). It will either be "process" or "xmlrpc".
The UI type is set in the main.py script via the -u option.
toaster invokes bitbake with: -t xmlrpc -u toasterui
which means that we get the toasterui.main() called, and we get an xmlrpc bitbake server.
Back to server_connection.events...
This refers to an xmlrpc bitbake server connection's events object.
So toasterui.main() gets:
- server_connection.connection => BitBakeXMLRPCServerConnection.connection
- server_connection.events => BitBakeXMLRPCServerConnection.events
The events object is an instance of uievent.BBUIEventQueue in our case, as we're using toasterui.
BBUIEventQueue is our event handler ("events") in toasterui.main(), which is receiving events on the XMLRPC connection (see later).
toasterui.main() calls events.waitEvent(0.25), which looks for events on the queue every 0.25s. If the queue has events, one is popped off.
Each event popped from the queue is passed to uihelper.BBUIHelper.eventHandler(), which adds build tracking information (how many packages built, tasks completed etc.).
The event then goes to the BuildInfoHelper, where it gets stored in toaster's database.
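The polling loop described above has roughly this shape. This is a sketch only: SketchEventQueue and main_loop() are made-up names, events are plain strings rather than bitbake event objects, and the two handlers stand in for BBUIHelper.eventHandler() and the BuildInfoHelper.

```python
import queue


class SketchEventQueue:
    """Hypothetical stand-in for the event queue used by the UI."""

    def __init__(self):
        self._events = queue.Queue()

    def queue_event(self, event):
        self._events.put(event)

    def waitEvent(self, delay):
        # Wait up to `delay` seconds for an event; None means "nothing yet".
        try:
            return self._events.get(timeout=delay)
        except queue.Empty:
            return None


def main_loop(events, handlers, stop_on="BuildCompleted", poll=0.25):
    """Pop events one at a time and pass each to every handler in order."""
    seen = []
    while True:
        event = events.waitEvent(poll)
        if event is None:
            continue            # queue was empty; poll again
        for handle in handlers:
            handle(event)
        seen.append(event)
        if event == stop_on:
            return seen
```

Note that the sketch keeps polling until the stop event arrives, so the caller is expected to make sure it eventually does.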
How do events get on BBUIEventQueue?
BBUIEventQueue is instantiated on the BitBakeXMLRPCServerConnection object.
It is passed a BBServer, which is an xmlrpclib.ServerProxy (from the Python standard library).
The ServerProxy has a method corresponding to each method on the server it is proxying for; in this case, the proxied server is a BBServer, so the proxy has the same methods as BBServer.
Event handling is set up by calling self.BBServer.registerEventHandler() from BBUIEventQueue.
registerEventHandler() is defined in bitbake/lib/bb/server/xmlrpc.py, BitBakeServerCommands. This in turn calls bb.event.register_UIHandler, defined in bitbake/lib/bb/event.py.
When an event is fired, each handler registered for it is invoked with that event (in event.py).
The main function for firing events in event.py is fire_from_worker(), which is called from bitbake/lib/bb/runqueue.py.
Events are constructed from xmlrpc messages coming from the bitbake server (see runqueue.py, runQueuePipe.read()).
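The register-then-fire pattern described above can be pictured like this. SketchEventBus and its method names are invented for the illustration and are not the functions in event.py; the point is only that a UI registers a handler once, and every event fired afterwards is delivered to each registered handler.

```python
class SketchEventBus:
    """Hypothetical registry of UI handlers, keyed by a numeric ID."""

    def __init__(self):
        self._handlers = {}
        self._next_id = 0

    def register_ui_handler(self, handler):
        # Hand back an ID so that the handler can be unregistered later.
        self._next_id += 1
        self._handlers[self._next_id] = handler
        return self._next_id

    def unregister_ui_handler(self, handler_id):
        self._handlers.pop(handler_id, None)

    def fire(self, event):
        # Copy the values so that a handler may unregister itself safely.
        for handler in list(self._handlers.values()):
            handler(event)
```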
How do toasterui + buildinfohelper manage events?
toasterui runs as part of the bitbake instance: bitbake is invoked with a -u toasterui option, which means that toasterui is instantiated as the event handler for the bitbake instance.
As events occur in bitbake, they are passed to toasterui; toasterui then hands those events off to buildinfohelper, which uses the event data to create records in the Toaster database. Note that buildinfohelper has to start up the Django database machinery manually for this to be possible, and that this is happening outside the main Django instance, in a separate process. This might be part of the reason why we get database locking issues with Django 1.8.
As buildinfohelper receives bitbake events, it sets variables in its internal_state dictionary. These variables are used to represent events which are "partial" from the perspective of Toaster: that is, events which can't create a complete, useful record in Toaster's database.
The best example of this comes from bitbake's Task* events. We receive two events for a task in buildinfohelper:
- The event notifying that the task started, e.g. TaskStarted, runQueueTaskStarted, sceneQueueTaskStarted
- The end state of the task, e.g. TaskFailedSilent, TaskCompleted, runQueueTaskFailed, sceneQueueTaskFailed, runQueueTaskCompleted, sceneQueueTaskCompleted, runQueueTaskSkipped
Because Toaster represents a Task as a single entity, with an outcome state, we can't add a complete Task record when a task starts: we add a partial Task record, then update that when the task end event is received.
To get this to work, buildinfohelper saves the partial Task to the database when the task start event is received; then retrieves and updates that record when the task end event arrives. In between these two points, buildinfohelper keeps some internal state about tasks which have started but which don't have a "done" outcome. However, the identifiers used in this internal state are composed of the task file + name, which is a fairly arbitrary algorithm (I quote: "we do a bit of guessing"). When the task end event is received, it is matched up to the internal state (where we have a list of tasks which haven't ended) using this fairly arbitrary identifier. This seems like it might be prone to error.
The reason this approach has been used, though, is that (as far as I can tell) bitbake doesn't provide any identifiers on its events which would enable them to be tied together. A TaskStarted event doesn't specify which task started (tasks don't have unique IDs), just the name of that task and its .bb file; there is also nothing to tie a task to the build it's associated with. This means that any event handler waiting for bitbake events has to manually tie together events and maintain local state to be able to do that.
The following internal state is maintained in buildinfohelper for this purpose:
- lvs (layer versions): layer versions known to Toaster
- recipes: recipes known to Toaster
- backlog: list of log events which haven't been saved for the current build yet
- brbe: the build request and the build environment primary keys, concatenated together around a colon (e.g. "1:2")
- build: the ongoing build (there's an implicit assumption that bitbake can only run one build at a time)
- taskdata: a list of ongoing tasks (i.e. task events for which we have received a task started event, but haven't yet received a task ended event); as task end events are received, the task goes into the database and is removed from taskdata
- targets: targets for the current build
- task_order: a counter which is added to each task as it is saved to the database, which marks the order in which the task events were received; it gets incremented each time a certain type of task event is received
Note that lvs and recipes are set before the build started event, using metadata events fired by bitbake.
While the build is ongoing, the internal state affects how data is added to the database: it is used to implicitly associate events with the build which is assumed to occur between a BuildStarted event and the next BuildCompleted/BuildFailed event.
Here's an example of the workflow to make it clearer:
1. toasterui receives a BuildStarted event and passes it to buildinfohelper.
2. buildinfohelper creates a database record B for the new build, and stores it as a property on itself.
3. toasterui receives a TaskStarted event with name "foo" and file "/bar/bar/humbug"; it passes the event to buildinfohelper.
4. buildinfohelper stores the partial task in the database as T, associating it with the build record B; it also stores T under the key "/bar/bar/humbug:foo" in taskdata.
5. toasterui receives a TaskCompleted event with name "foo" and file "/bar/bar/humbug"; it passes the event to buildinfohelper.
6. buildinfohelper marries up the new TaskCompleted event with the partial record T, by matching the new event's key "/bar/bar/humbug:foo" with the key for the TaskStarted event which is already in taskdata (see step 4); T is updated in the database.
7. toasterui receives a BuildCompleted event and passes it to buildinfohelper.
8. buildinfohelper assumes that the BuildCompleted event applies to the build B already stored in its internal state. It updates the database record for B with data about when the build ended, build artifacts etc.
9. toasterui creates a new buildinfohelper, ready to deal with the next build.
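The workflow above can be condensed into a toy version of the state handling. Everything here is an assumption made for illustration: SketchBuildInfoHelper is not the real class, events are plain dicts with "type", "file" and "name" keys, and a Python list stands in for Toaster's database. Only the "file:name" key and the start/update pattern come from the description above.

```python
class SketchBuildInfoHelper:
    """Toy model of tying task start and end events together."""

    def __init__(self):
        self.database = []   # stands in for Toaster's database
        self.internal_state = {"build": None, "taskdata": {}, "task_order": 0}

    @staticmethod
    def _task_key(event):
        # Events carry no unique task ID, so file + name is used instead.
        return "%s:%s" % (event["file"], event["name"])

    def handle(self, event):
        state = self.internal_state
        kind = event["type"]
        if kind == "BuildStarted":
            state["build"] = {"kind": "build", "outcome": "in progress"}
            self.database.append(state["build"])
        elif kind == "TaskStarted":
            state["task_order"] += 1
            task = {
                "kind": "task",
                "build": state["build"],   # implicit link to the current build
                "order": state["task_order"],
                "outcome": None,           # partial record: no outcome yet
            }
            self.database.append(task)
            state["taskdata"][self._task_key(event)] = task
        elif kind == "TaskCompleted":
            # Find the partial record by key, complete it, stop tracking it.
            task = state["taskdata"].pop(self._task_key(event))
            task["outcome"] = "completed"
        elif kind == "BuildCompleted":
            state["build"]["outcome"] = "completed"
```

The weakness mentioned earlier shows up directly in _task_key(): if two distinct tasks ever produced the same file and name, the second start event would overwrite the first in taskdata.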
backlog is a list of log events which haven't yet been attached to a build; if a log event occurs before the build has been saved, it is added to the list; then, once a BuildStarted event has occurred, the list of events in the backlog is added to the database and associated with that build.
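The backlog behaviour can be sketched in a few lines. SketchLogStore and its method names are hypothetical, log events are plain strings, and the "saved" list stands in for the database; the only thing taken from the description above is that log events arriving before the build exists are held back and flushed once it does.

```python
class SketchLogStore:
    """Toy model of the backlog of log events."""

    def __init__(self):
        self.build = None
        self.backlog = []   # log events seen before any build exists
        self.saved = []     # (build, message) pairs; stands in for the database

    def store_log_event(self, message):
        if self.build is None:
            # No build to attach the event to yet, so hold on to it.
            self.backlog.append(message)
            return
        self.saved.append((self.build, message))

    def build_started(self, build):
        self.build = build
        # Flush the backlog in arrival order now that a build exists.
        while self.backlog:
            self.store_log_event(self.backlog.pop(0))
```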
At this point, I felt I understood enough about how events are processed for my purpose, so I didn't dig any further. I knew where I could amend an event to add/remove properties on it, which is what I was after.