TipsAndTricks/BuildingAndRunningClearContainersonTarget

From Yocto Project

Revision as of 23:38, 25 July 2017

What & Why

Clear Containers (CC) offer a hybrid solution that combines the security of a hypervisor with the deployment model of containers, so we wanted to see if they could be used in a YP environment. This was done for Clear Containers 2.2, based on YP master around the time of 2.4 RC1/2.
Note: this is a proof of concept, done by building on the target. The eventual goal would be to create standard recipes so that Clear Containers can be built in the usual way. Hopefully, this guide will help with that by outlining the parts, dependencies, and configuration steps. This guide assumes you already have Docker running on your target by having followed Running Docker on your image. The target in this example is an Intel NUC. I have also successfully run the same code on a MinnowBoard Turbot.

Dependencies you need

Layers

The layers I am using:

meta-openembedded/meta-oe                                                                               
meta-openembedded/meta-python                                                                                 
meta-openembedded/meta-networking                                                                      
meta-openembedded/meta-filesystems                                                                            
meta-virtualization                                                                                          
meta-clear

All of these layers can be found on layers.openembedded.org except meta-clear. The meta-clear layer was created with the yocto-layer script. Its only purpose is to turn on CONFIG_VHOST_NET=m for the kernel. Here's a tree of the layer:

├── conf
│  └── layer.conf
├── COPYING.MIT
├── README
└── recipes-kernel
   └── linux
       ├── linux-yocto
       │   ├── clear.cfg
       │   └── clear.scc
       ├── linux-yocto_4.10.bbappend
       └── linux-yocto_4.9.bbappend

I am using the 4.9 kernel. Here's the linux-yocto_4.9.bbappend:

FILESEXTRAPATHS_prepend := "${THISDIR}/${PN}:"
SRC_URI += "file://clear.scc \
          "
KERNEL_MODULE_AUTOLOAD += "vhost-net"

The scc file:

define KFEATURE_DESCRIPTION "Enable clearcon support"
define KFEATURE_COMPATIBILITY board
kconf non-hardware clear.cfg

And finally the cfg file:

CONFIG_VHOST_NET=m
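
After rebuilding the kernel, it can be worth confirming that the fragment actually took effect. The following is a minimal sketch, assuming the final kernel configuration is available as a plain-text file (for example the .config in the kernel build directory). The helper name has_kconfig is made up for this page and is not part of any Yocto tooling.

```shell
# has_kconfig FILE OPTION VALUE
# Succeeds if FILE contains the exact line OPTION=VALUE.
# Hypothetical helper, not part of any Yocto tooling.
has_kconfig() {
    grep -q -x -F "$2=$3" "$1"
}

# Example (the path is an assumption; point it at your kernel .config):
#   has_kconfig /path/to/.config CONFIG_VHOST_NET m && echo "vhost-net is modular"
```

On the running target, lsmod | grep vhost_net should also list the module once it has been autoloaded.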

Conf Changes

This guide presumes your conf file is set up as described in Running Docker on your image. In addition, to make on-target building easier, I add the following to my conf/local.conf:

EXTRA_IMAGE_FEATURES += " dev-pkgs tools-sdk tools-debug tools-profile   "

Additional Dependencies to Bitbake

These are the recipes I built in addition to the base outlined above. If you want, they can all be added to the image at once in local.conf:

IMAGE_INSTALL_append = " libcheck mdadm psmisc json-glib libmnl ossp-uuid autoconf-archive python-setuptools libcap-ng tunctl go wget sudo"

Alternatively, build them directly:

bitbake libcheck mdadm psmisc json-glib libmnl ossp-uuid autoconf-archive python-setuptools libcap-ng tunctl go wget sudo

These are additional packages I built for convenience, but they are not required:

bitbake less zile ntp rsync minicom

Once built, these can be installed on the board. Note that we need the -dev packages because we are mostly satisfying build requirements for pieces of CC.

dnf install tunctl python-dev python-setuptools-dev libcap-ng-dev libcheck-dev libmnl-dev libjson-glib-1.0-dev autoconf-archive-dev go wget sudo

and the convenience packages:

dnf install zile less ntp rsync minicom

The Image to Build

bitbake core-image-base


The Pieces of CC

Clear Containers consist of a set of software and binaries. The main pieces are a slightly forked qemu hypervisor (currently based on 2.9) configured to be minimal, a command proxy, a shim, and the OCI runtime. The command proxy is written in Go; the rest is C/C++. We build the hypervisor itself, but the kernel and image that run inside the hypervisor are downloaded from the CC site.

The Runtime, Shim & Proxy

This comes from [clear oci runtime]. While getting it to work, I followed the development model outlined in Leveraging Rpm Package Feeds. Here I will list the dependencies to make it shorter.
Which Clear was this?

cc-oci-runtime version: 2.2.0
spec version: 1.0.0-rc1
commit: f92d50ad54003298c139de59777f07588683cdc2

Getting the Source Code

We will pretty much follow the (very good) instructions in the README. Because it is a Go project, we will follow the Go flow:

go get -d github.com/01org/cc-oci-runtime/...

The ... is necessary. If you are behind a proxy, make sure you export http_proxy and https_proxy into your shell.
This will put the src in ~/go/src/github.com/01org/cc-oci-runtime/ by default.
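
The two notes above can be sketched as follows. The proxy URL is only a placeholder, and cc_src_dir is a hypothetical helper name; the sketch assumes a single-entry GOPATH, or none at all, in which case Go 1.8 and later fall back to $HOME/go.

```shell
# Proxy settings are only needed behind a proxy; the URL is a placeholder.
# export http_proxy=http://proxy.example.com:8080
# export https_proxy=http://proxy.example.com:8080

# cc_src_dir: print where "go get" places the cc-oci-runtime sources.
# Assumes GOPATH is unset or holds a single directory.
cc_src_dir() {
    printf '%s\n' "${GOPATH:-$HOME/go}/src/github.com/01org/cc-oci-runtime"
}
```

With this in place, cd "$(cc_src_dir)" lands in the source tree before running the build steps.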

Building It

Again, following their README, run:

./autogen.sh --disable-functional-tests

We disable the functional tests because I was not able to easily find a recipe for BATS (Bash Automated Testing System).

make

The make install is quite clean and shows where everything is going:

# make install
make[1]: Entering directory '/home/root/go/src/github.com/01org/cc-oci-runtime'
 /bin/mkdir -p '/usr/bin'
  /bin/sh ./libtool   --mode=install /usr/bin/install -c cc-oci-runtime '/usr/bin'
libtool: install: /usr/bin/install -c cc-oci-runtime /usr/bin/cc-oci-runtime
 /bin/mkdir -p '/usr/bin'
 /usr/bin/install -c data/cc-oci-runtime.sh '/usr/bin'
 /bin/mkdir -p '/usr/libexec'
  /bin/sh ./libtool   --mode=install /usr/bin/install -c cc-shim '/usr/libexec'
libtool: install: /usr/bin/install -c cc-shim /usr/libexec/cc-shim
 /bin/mkdir -p '/usr/libexec'
 /usr/bin/install -c cc-proxy '/usr/libexec'
 /bin/mkdir -p '/usr/share/defaults/cc-oci-runtime'
 /usr/bin/install -c -m 644 data/vm.json data/hypervisor.args data/kernel-cmdline '/usr/share/defaults/cc-oci-runtime'
 /bin/mkdir -p '/lib/systemd/system'
 /usr/bin/install -c -m 644 proxy/cc-proxy.service proxy/cc-proxy.socket '/lib/systemd/system'

Getting the Artifacts

We need a kernel and an image for the hypervisor.

wget http://download.clearlinux.org/releases/16050/clear/x86_64/os/Packages/linux-container-4.9.33-74.x86_64.rpm
wget https://download.clearlinux.org/releases/16050/clear/clear-16050-containers.img.xz
rpm --install ./linux-container-4.9.33-74.x86_64.rpm
xz --decompress clear-16050-containers.img.xz 
cp clear-16050-containers.img /usr/share/clear-containers/
pushd /usr/share/clear-containers/
ln -s clear-16050-containers.img clear-containers.img
popd

After this, you should have a /usr/share/clear-containers directory that looks like this:

|-- clear-16050-containers.img
|-- clear-containers.img -> clear-16050-containers.img
|-- vmlinux-4.9.33-74.container
|-- vmlinux.container -> vmlinux-4.9.33-74.container
|-- vmlinuz-4.9.33-74.container
`-- vmlinuz.container -> vmlinuz-4.9.33-74.container
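
A quick way to confirm this layout is to check that each generic name resolves to a real, non-empty file. This is a minimal sketch; check_cc_artifacts is a hypothetical helper name, and the three file names are the ones from the listing above.

```shell
# check_cc_artifacts DIR
# Succeeds only if the generic names in DIR resolve to non-empty files.
check_cc_artifacts() {
    dir=$1
    status=0
    for name in clear-containers.img vmlinux.container vmlinuz.container; do
        # -s follows symlinks, so a dangling link fails here as well
        if [ ! -s "$dir/$name" ]; then
            echo "missing or empty: $dir/$name" >&2
            status=1
        fi
    done
    return $status
}
```

On the target this would be called as check_cc_artifacts /usr/share/clear-containers.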

Configuring It

The configuration is located in /usr/share/defaults/cc-oci-runtime. There are 3 files.

  • vm.json defines which hypervisor, kernel and rootfs we use:
{
       "vm": {
               "path": "/usr/local/bin/qemu-system-x86_64",
               "image": "/usr/share/clear-containers/clear-containers.img",
               "kernel": {
                       "path": "/usr/share/clear-containers/vmlinux.container",
                       "parameters": "root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug init=/usr/lib/systemd/systemd systemd.unit=cc-agent.target iommu=off quiet systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket systemd.show_status=false cryptomgr.notests net.ifnames=0"
               }
       }
}
  • kernel-cmdline is, well, the kernel command line (a single line):
root=/dev/pmem0p1 rootflags=dax,data=ordered,errors=remount-ro rw rootfstype=ext4 tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k panic=1 console=hvc0 console=hvc1 initcall_debug init=/usr/lib/systemd/systemd systemd.unit=cc-agent.target iommu=off quiet systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket systemd.show_status=false cryptomgr.notests net.ifnames=0
  • hypervisor.args defines how we start up the qemu hypervisor
/usr/bin/qemu-system-x86_64
-name
@NAME@
-machine
pc-lite,accel=kvm,kernel_irqchip,nvdimm
-device
nvdimm,memdev=mem0,id=nv0
-object
memory-backend-file,id=mem0,mem-path=@IMAGE@,size=@SIZE@
-m
2G,slots=2,maxmem=3G
-kernel
@KERNEL@
-append
@KERNEL_PARAMS@ @KERNEL_NET_PARAMS@
-smp
2,sockets=1,cores=2,threads=1
-cpu
host

This is the file we need to change to accommodate the small amount of memory available on the MinnowBoard Turbot (4 GB total). We are also using a newer qemu and machine type than the hypervisor.args file assumes, which helps to keep the Atom processor happy. Here's the diff:

# diff hypervisor.args.orig hypervisor.args
5c5
< pc-lite,accel=kvm,kernel_irqchip,nvdimm
---
> q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit,nofw
11c11
< 2G,slots=2,maxmem=3G
---
> 256M,slots=2,maxmem=1G

I capped it at 1G for now. This setting would definitely need to be tweaked depending on needs/usage...
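
The same two-line change can be applied with sed instead of editing by hand. This is a minimal sketch under the assumption that the file still contains the stock pc-lite and 2G lines exactly as listed above; shrink_hypervisor_args is a hypothetical helper name.

```shell
# shrink_hypervisor_args FILE
# Rewrites the machine type and memory lines of FILE in place, keeping a
# copy of the original as FILE.orig. Lines that do not match the stock
# values are left untouched.
shrink_hypervisor_args() {
    file=$1
    cp "$file" "$file.orig" || return 1
    sed -e 's/^pc-lite,accel=kvm,kernel_irqchip,nvdimm$/q35,accel=kvm,kernel_irqchip,nvdimm,nosmm,nosmbus,nosata,nopit,nofw/' \
        -e 's/^2G,slots=2,maxmem=3G$/256M,slots=2,maxmem=1G/' \
        "$file.orig" > "$file"
}
```

On the target this would be run as shrink_hypervisor_args /usr/share/defaults/cc-oci-runtime/hypervisor.args, after which the diff command shown above should report the same two changes.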

The Qemu Hypervisor

Getting the Source Code

git clone https://github.com/clearcontainers/qemu qemu-cc
cd qemu-cc
git checkout -b qemu-lite-v2.9.0 origin/qemu-lite-v2.9.0

Building It

mkdir build
cd build
 ../configure '--disable-tools' '--disable-libssh2' '--disable-tcmalloc' '--disable-glusterfs' '--disable-seccomp' '--disable-bzip2' '--disable-snappy' '--disable-lzo' '--disable-usb-redir' '--disable-libusb' '--disable-libnfs' '--disable-tcg-interpreter' '--disable-debug-tcg' '--disable-libiscsi' '--disable-rbd' '--disable-spice' '--disable-attr' '--disable-cap-ng' '--disable-linux-aio' '--disable-brlapi' '--disable-vnc-jpeg' '--disable-vnc-png' '--disable-vnc-sasl' '--disable-rdma' '--disable-bluez' '--disable-fdt' '--disable-curl' '--disable-curses' '--disable-sdl' '--disable-gtk' '--disable-tpm' '--disable-vte' '--disable-vnc' '--disable-xen' '--disable-opengl' '--disable-slirp' '--enable-trace-backend=nop' '--enable-virtfs' '--enable-attr' '--enable-cap-ng' '--target-list=x86_64-softmmu' "$@"

To build it with gcc-7 (the current YP compiler, which is ahead of most standard distros), I needed to hack the following files:

      modified:   ../block/blkdebug.c
      modified:   ../block/blkverify.c
      modified:   ../hw/usb/bus.c

Here are the individual diffs:

  • block/blkdebug.c
diff --git a/block/blkdebug.c b/block/blkdebug.c
index 67e8024e36..e3ab19b947 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -668,7 +668,7 @@ static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)

static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
{
-    BDRVBlkdebugState *s = bs->opaque;
+//    BDRVBlkdebugState *s = bs->opaque;
    QDict *opts;
    const QDictEntry *e;
    bool force_json = false;
@@ -688,11 +688,13 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
        return;
    }

+#if 0
    if (!force_json && bs->file->bs->exact_filename[0]) {
        snprintf(bs->exact_filename, sizeof(bs->exact_filename),
                 "blkdebug:%s:%s", s->config_file ?: "",
                 bs->file->bs->exact_filename);
    }
+#endif

    opts = qdict_new();
    qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
  • block/blkverify.c
diff --git a/block/blkverify.c b/block/blkverify.c
index 9a1e21c6ad..aa5d5ccdad 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -302,6 +302,7 @@ static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
        bs->full_open_options = opts;
    }

+#if 0  
    if (bs->file->bs->exact_filename[0]
        && s->test_file->bs->exact_filename[0])
    {
@@ -310,6 +311,7 @@ static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
                 bs->file->bs->exact_filename,
                 s->test_file->bs->exact_filename);
    }  
+#endif
}

static BlockDriver bdrv_blkverify = {
  • hw/usb/bus.c
diff --git a/hw/usb/bus.c b/hw/usb/bus.c
index 24f1608b4b..2f2abc1824 100644
--- a/hw/usb/bus.c
+++ b/hw/usb/bus.c
@@ -407,8 +407,8 @@ void usb_register_companion(const char *masterbus, USBPort *ports[],
void usb_port_location(USBPort *downstream, USBPort *upstream, int portnr)
{
    if (upstream) {
-        snprintf(downstream->path, sizeof(downstream->path), "%s.%d",
-                 upstream->path, portnr);
+//        snprintf(downstream->path, sizeof(downstream->path), "%s.%d",
+//                 upstream->path, portnr);
        downstream->hubcount = upstream->hubcount + 1;
    } else {
        snprintf(downstream->path, sizeof(downstream->path), "%d", portnr);

I verified with fprintf(stderr....) that these code paths are not hit in typical usage. I also expect they will be patched upstream soon. Now we can do:

make
make install
# qemu-system-x86_64 -machine help| grep q35
q35                  Standard PC (Q35 + ICH9, 2009) (alias of pc-q35-2.9)
pc-q35-2.9           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.8           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.7           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.6           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.5           Standard PC (Q35 + ICH9, 2009)
pc-q35-2.4           Standard PC (Q35 + ICH9, 2009)

The last command shows that we can run qemu and that the q35 machine type is available.
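
For scripted use, the same check can be done on the first column of the help output rather than by eye. This is a minimal sketch; has_machine is a hypothetical helper name.

```shell
# has_machine NAME
# Reads "qemu-system-x86_64 -machine help" output on stdin and succeeds
# if NAME appears in the first column.
has_machine() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Example:
#   qemu-system-x86_64 -machine help | has_machine q35 && echo "q35 available"
```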



Running Clear Containers

You will need a couple of logins onto the target.

More configuration

We have the CC runtime; now we need to let docker know about it. We change the startup command of the docker systemd service so that it knows about the new cor runtime, and we also turn on debugging. Note: for faster performance, we would drop the -D and use --add-runtime cor=/usr/bin/cc-oci-runtime instead. The .sh script is a clever interceptor that lets us turn on additional debugging features in case we need them.

# diff /lib/systemd/system/docker.service ~/docker.service
12c12
< ExecStart=/usr/bin/dockerd -D --add-runtime cor=/usr/bin/cc-oci-runtime.sh -H fd://
---
> ExecStart=/usr/bin/dockerd -H fd://

Once the systemd service has been modified, you need to tell systemd to reload its configuration and then restart docker:

# systemctl daemon-reload   
# systemctl  stop docker
# systemctl  start docker
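
After the restart, it is worth checking that docker picked up the new runtime. This is a minimal sketch, assuming docker info prints a "Runtimes:" line listing the registered runtimes, as Docker releases of this era do; has_docker_runtime is a hypothetical helper name.

```shell
# has_docker_runtime NAME
# Reads "docker info" output on stdin and succeeds if NAME is listed on
# the "Runtimes:" line.
has_docker_runtime() {
    awk -v r="$1" '
        $1 == "Runtimes:" { for (i = 2; i <= NF; i++) if ($i == r) found = 1 }
        END { exit !found }'
}

# Example:
#   docker info 2>/dev/null | has_docker_runtime cor && echo "cor is registered"
```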

The Proxy

  • by hand:
/usr/libexec/cc-proxy -v 3

This will show the connections, which is quite reassuring early on. The proxy can also be run using the systemd service that was installed while building the runtime.

# ls /lib/systemd/system/cc-proxy.s*
/lib/systemd/system/cc-proxy.service  /lib/systemd/system/cc-proxy.socket

A Container

Now all the pieces are in place for the big finale. This will run a "normal" docker container:

# docker run -it --rm busybox sh

and

# ps augxww | grep qemu 

should be empty. Now, let's run a Clear Container:

# docker run -it --rm --runtime cor busybox sh                                               
/ # 

This is the success we've been aiming for all along!!! Yay!
How do we know it worked?

  • ps: this will show us a qemu instance running:
# ps augxww | grep qemu
root     10192  0.7  0.9 716620 73864 ?        Ssl  18:00   0:00 /usr/local/bin/qemu-system-x86_64 -name 0966e97e19a2 -machine q35,accel=kvm,kernel_irqchip,nvdimm ..... and much more ....
  • our proxy will have some nifty information, here's a snippet:
I0708 18:00:35.512724   10139 vm.go:114] [vm e9276e31 hyperstart] hyper_channel_read
I0708 18:00:35.512813   10139 vm.go:114] [vm e9276e31 hyperstart] hyper send type 14, len 4
I0708 18:00:35.512933   10139 vm.go:114] [vm e9276e31 hyperstart] get length 41
I0708 18:00:35.513016   10139 vm.go:114] [vm e9276e31 hyperstart] hyper send type 14, len 4
I0708 18:00:35.513148   10139 vm.go:114] [vm e9276e31 hyperstart] 0 0 0 b 0 0 0 29 7b 22 73 65 71 22 3a 31 2c 20 22 72 6f 77 22 3a 34 38 2c 20 22 63 6f 6c 75 6d 6e 22 3a 31 31 33 7d 
I0708 18:00:35.513213   10139 vm.go:114] [vm e9276e31 hyperstart]  hyper_channel_handle, type 11, len 41
I0708 18:00:35.513295   10139 vm.go:114] [vm e9276e31 hyperstart] call hyper_win_size, json {"seq":1, "row":48, "column":113}, len 33
I0708 18:00:35.513429   10139 vm.go:114] [vm e9276e31 hyperstart] exec seq 1, seq 1
I0708 18:00:35.513525   10139 vm.go:114] [vm e9276e31 hyperstart] hyper send type 9, len 0
  • and, since we are running the debug version (remember, we set docker up with -D and pointed the runtime at /usr/bin/cc-oci-runtime.sh), we also have a CC logfile in /run/cc-oci-runtime/cc-oci-runtime.log:
# cat /run/cc-oci-runtime/cc-oci-runtime.log
... much stuff ...
2017-07-08T18:00:35.487831Z:10216:cc-oci-runtime:debug:proxy msg length: 16
2017-07-08T18:00:35.487868Z:10216:cc-oci-runtime:debug:message read from proxy socket: {"success":true}
2017-07-08T18:00:35.487940Z:10216:cc-oci-runtime:debug:msg received: {"success":true}
2017-07-08T18:00:35.487954Z:10216:cc-oci-runtime:debug:disconnecting from proxy
2017-07-08T18:00:35.488579Z:10216:cc-oci-runtime:debug:created state file /var/run/cc-oci-runtime/e9276e3130766473ecf58edaf76fea76c3df46db9c1dff5dc5113e6e7a85a205/state.json
  • as you can see, we also have a state file. It shows things like:
"vm" : {
   "pid" : 10192,
   "hypervisor_path" : "/usr/local/bin/qemu-system-x86_64",
   "image_path" : "/usr/share/clear-containers/clear-16050-containers.img", 
   "kernel_path" : "/usr/share/clear-containers/vmlinux-4.9.33-74.container",  
   "workload_path" : "",  
 ....
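
The pid recorded in the state file can be cross-checked against the ps output above. This is a minimal sketch, assuming the state file is pretty-printed with one key per line as in the excerpt; vm_pid is a hypothetical helper name.

```shell
# vm_pid STATE_FILE
# Prints the first "pid" value found after the "vm" key in STATE_FILE.
# Assumes one key per line, as in the excerpt above.
vm_pid() {
    awk '
        /"vm"[ \t]*:/ { in_vm = 1 }
        in_vm && /"pid"[ \t]*:/ {
            gsub(/[^0-9]/, "")   # keep only the digits of this line
            print
            exit
        }' "$1"
}
```

Given the path of a container's state.json, ps -p "$(vm_pid state.json)" should then show the same qemu process.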


Debugging

There is a lot of debugging information on the CC site, so I won't repeat it here. I used 3 main things when I was debugging this:

  • the stderr/stdout for the proxy.
  • the CC runtime log /run/cc-oci-runtime/cc-oci-runtime.log
  • I also added an extra option to /usr/bin/cc-oci-runtime.sh to enable logging for the hypervisor itself:
# mkdir /tmp/hypervisor
# diff /usr/bin/cc-oci-runtime.sh /usr/bin/cc-oci-runtime.sh.orig
64c64
< runtime_args="$runtime_args --global-log=\"$global_log\"  --hypervisor-log-dir=/tmp/hypervisor "
---
> runtime_args="$runtime_args --global-log=\"$global_log\""

The hypervisor logs are empty unless there is a problem.
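
Because empty logs mean "no problem", a quick scan for non-empty files in the log directory points straight at the containers that had trouble. This is a minimal sketch; nonempty_logs is a hypothetical helper name, and /tmp/hypervisor is the directory created above.

```shell
# nonempty_logs DIR
# Prints the path of every regular, non-empty file directly inside DIR.
nonempty_logs() {
    dir=$1
    for f in "$dir"/*; do
        if [ -f "$f" ] && [ -s "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Example:
#   nonempty_logs /tmp/hypervisor
```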