Apr 07

Cron the Eater of (CPU) Time

Note: This behaviour was observed on a Pi B+; it was not observed on a Pi 2.

I happened to have top up while doing the bonded NIC test and noticed that the load was awfully high. Then the following popped up:

[screenshot cron1: top output]

Cron pretty much has the CPU pegged.

Once I saw this, I wondered what on Earth cron could be doing that was so intensive. So I looked in /var/spool/cron/crontabs and it was empty.

A few minutes later this followed:

[screenshot cron2: top output]

It was still eating roughly half the CPU time.

So I rebooted and observed the same behaviour.

I then did a sudo service cron stop and started my tests from scratch…

Apr 07

Swarming Raspberry Pi: Of Network Bondage

Apologies to W. Somerset Maugham for the title.

For my Cloud in a Box, I want the Data hosts, at the least, to have “more” network I/O capability, whether it be for a Docker Registry or other data. (As an aside, I am playing with doing substructure matching of chemical compounds and/or other Cheminformatics with the Pi. Compressed, the base data from the NIH is 50GB. News to follow shortly.)

One way to do this is via Link Aggregation. This post explores this topic.

My test is rather artificial; I am reading 100MB from /dev/zero and sending it across the wire.

In the first case, I used ssh for the transport. Afterwards, I used netcat with similar results.
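The transfer amounts to something like the following (the hostname and port are placeholders; the exact flags I used varied):

```shell
# ssh variant: pipe 100MB of zeros across the wire to the remote host
dd if=/dev/zero bs=1M count=100 | ssh pi-receiver 'cat > /dev/null'

# netcat variant: start a listener on the receiving host first...
nc -l -p 1234 > /dev/null

# ...then push the same 100MB from the sending host
dd if=/dev/zero bs=1M count=100 | nc -q 1 pi-receiver 1234
```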

Additionally, there are different modes of balancing and/or HA possible; UbuntuBonding provides a good description of the types of bonding as well as configuration.

In the test, the following types of bonding are tested:

  • No bonding; this is the baseline for comparison.
  • balance-rr – Balanced Round Robin; uses all the NICs to send packets round-robin.
  • balance-alb – Adaptive Load Balancing; balances both transmit and receive traffic across the NICs.
  • balance-tlb – Transmit Load Balancing; balances outgoing traffic only, according to the load on each NIC.

The hosts used in testing were plugged into an otherwise empty gigabit switch; only one Pi at a time was connected.

The second NIC is a USB 2.0 10/100 Mb/s device. I didn’t grab a picture, but the Pi with both NICs in use was drawing about 2 watts (or, if you will, less than 500mA). Without the second NIC, both the B+ and 2B were drawing ~1.25 watts. So my compute section of the cloud will be drawing < 20 watts. Likely < 15.

Setup

My current switch does not support 802.3ad link aggregation (LACP), so I am doing without in this test.

There is a kernel driver, bonding, which needs to be loaded. To have it loaded at boot, add the line bonding to /etc/modules.

Once you’ve done so, you can run lsmod to verify it’s loaded. If it’s not, then do a modprobe bonding. You should not need to do so again; the module should be loaded on the next boot.
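Concretely, something like this should do it (a sketch; stock Raspbian/Hypriot paths):

```shell
# Load the bonding driver immediately
sudo modprobe bonding

# Ensure it is loaded on every boot
echo 'bonding' | sudo tee -a /etc/modules

# Verify the module is present
lsmod | grep bonding
```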

/etc/network/interfaces
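As an illustration, a balance-rr bond of the onboard eth0 and the USB eth1 might look like the following (addresses hypothetical; the other modes differ only in the bond-mode line):

```
auto bond0
iface bond0 inet static
    address 192.168.1.50
    netmask 255.255.255.0
    gateway 192.168.1.1
    bond-mode balance-rr
    bond-miimon 100
    bond-slaves eth0 eth1
```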

Pi B+

unbonded

balance-alb

balance-rr

balance-tlb

Pi 2B

Unbonded

balance-rr

balance-alb

balance-tlb

Results and Observations

I believe that /dev/zero is implemented “oddly”, or at least differently, on the B+ than on the Pi 2. It seems to require more work by the system.

I also noticed that there appears to be an issue with cron on the Pi B+. It is eating roughly half the CPU time. This will be investigated in another blog post.

  • I didn’t notice any difference between the bonded and unbonded tests on the Pi B+.

  • The bonding is working on the Pi 2; this is evidenced by the time spent transferring the files: approximately half the time of the un-bonded host.

All in all, it was a useful little experiment to see how well bonding works on the Pi.

Apr 06

Why it’s Important to use TLS with Docker Swarm

There is nothing wrong with your docker host. Do not attempt to adjust the picture. We are controlling transmission. If we wish to make it louder, we will bring up the volume. If we wish to make it softer, we will tune it to a whisper. We will control the horizontal. We will control the vertical. We can roll the image, make it flutter. We can change the focus to a soft blur or sharpen it to crystal clarity. For the next hour, sit quietly and we will control all that you see and hear. We repeat: there is nothing wrong with your docker host. You are about to participate in a great adventure. You are about to experience the awe and mystery which reaches from the inner mind to – The Outer Limits.

It’s nothing new, but I thought I might explore things that can be done with an open docker port. The example below is somewhat innocuous, but illustrates a point…
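As an illustration of the shape of the problem (host and image names hypothetical), anyone who can reach an unprotected daemon port can do something like:

```shell
# Point a stock docker client at the victim's unprotected TCP port
export DOCKER_HOST=tcp://victim.example.com:2375

# Mount the host's root filesystem into a container and wander around in it
docker run -it -v /:/hostfs debian chroot /hostfs /bin/bash
```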

Dude, you just got p0wned.

Other applications are left as an exercise for the student.

Apr 05

SSD on a Raspberry Pi

As a part of building out the Cloud in a Box, I wanted some storage for Docker images, as well as data.

Based upon my previous experience, I believed that a SSD would be faster than a Micro SD, but I hadn’t tested it as yet. The challenge from Dieter Reuter (@Quintus23M), asking how I’d hooked up the SSD as well as whether it was faster than the Micro SD was a good motivator. I did find a couple of surprises along the way.

SSD Config

The SSDs are (at the moment) in a USB 2.0 case such as the Inland 2.5″ SATA to High-Speed USB 2.0 External Hard Drive Enclosure 434746 – Micro Center. There wasn’t much reason to go with USB 3 as the Pi doesn’t support it. I mainly wanted it for the circuit — a case with circuit and cable was $5 USD, whereas a conversion cable is over $10.

At some point I am thinking about building a SATA to USB converter supporting multiple drives — the JM20337 is an inexpensive chip and appears to be what the converter is using.

The Pi, even when plugged into a 2 Amp power supply, didn’t have enough juice to run the drive. Consequently I needed a powered hub. I think it’s quite possible that another, more expensive case/circuit might not need to have a powered USB hub.

Devices

Device         Mount Point   Type            Notes
/dev/mmcblk0   /             Micro SD        Class 10, 16GB in Pi Micro SD slot
/dev/sdb       /data         SSD             240GB SATA 3 SSD
/dev/sda       /opt2         Spinning Disk   2.5″ 5400 RPM, 160GB

Testing

In order to make as accurate a test as possible, the buffers and cache are dumped prior to every run:
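On Linux, dropping the caches looks like this (requires root):

```shell
# Flush dirty pages to disk, then drop the page cache, dentries, and inodes
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches
```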

Additionally, the data is sync‘d in order to ensure that the reads/writes are finished and measured as a part of the time elapsed in the test.

The hdparm tests have an implicit dump of the cache/buffers.
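The hdparm invocations are of the form (device names per the table above; requires root):

```shell
# -t: timed buffered disk reads -- reads without prior caching,
# so it measures the device and the bus
sudo hdparm -t /dev/sda

# -T: timed cached reads -- effectively measures memory/CPU/cache throughput
sudo hdparm -T /dev/sda
```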

Sequential Write Test
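The write test boiled down to something like this (the file path is a placeholder for a file on the device under test; each run was wrapped in time):

```shell
# Write 100MB of zeros sequentially; conv=fsync forces the data to disk
# before dd exits, so the elapsed time includes the actual writes
dd if=/dev/zero of=testfile.bin bs=1M count=100 conv=fsync
```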

Sequential Read Test
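And the read side, after dropping the caches (again, the path is a placeholder):

```shell
# Create a file to read back (on the real devices this already existed)
dd if=/dev/zero of=testfile.bin bs=1M count=100 conv=fsync

# Read it back sequentially into /dev/null; with caches dropped first,
# this measures the device rather than the page cache
dd if=testfile.bin of=/dev/null bs=1M
```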

hdparm Buffered Disk Reads

hdparm Cached Disk Reads

Results

The main limitation of I/O on the Pi is the speed of the USB bus. Given it’s limited to 480 Mb/s, the bus is much slower than the speeds which the hard drive and SSD support.

By comparison, here are the hdparm results for a SSD on a desktop:

and a 5400 spinning disk on a laptop (doing other work at the same time):

The Micro SD device, however, is half the speed of the other two devices — I believe that this is to spec. “Difference between Speed Class, UHS Speed Class, and Speed Ratings (performance) for SD/SDHC/SDXC cards” indicates that the class refers to the minimum speed that a card supports.

The Best microSD Card states that

It’s important to test SD cards via USB 3.0 to prevent bottlenecks, since USB 2.0 tops out around 33 MB/s

This is the approximate speed being seen for the SSD and spinning disk. Since the Micro SD is half the speed of the others, I’ve probably reached the limit of the card — approximately 1.6 times the minimum speed for reads, and right at the spec for writes.

There are UHS cards which are faster, however. One of these should be able to match or exceed the performance of the SSD or spinning disk. For instance, the Samsung EVO Micro SDXC claims, depending on the version, speeds up to 48MB/s or 90MB/s.

Interestingly enough, the hdparm tests for cached reads have all three roughly within range of each other. The buffered tests, however, are similar to the Sequential Read test in the reported rates.

On writes, the spinning disk is slightly slower than the SSD. However, the SSD pulls a bit ahead on reads — ~1 MB/s for buffered and ~20 MB/s on cached reads.

At the end of the day, the SSD and Spinning Disk are faster than the Micro SD — however, this might be due to the Micro SD card I’m using.

Apr 03

Nifty Things for Week Ending 3 April

Questioning Assumptions

DevOps

HAProxy

Docker

Business

Philosophy & Paradigm

Software devs are professional learners: we’re paid to learn how to work with new languages, frameworks, code bases and apps. @pat_shaughnessy

Google Chrome

Ruby

Raspberry Pi

Cheminformatics

Machine Learning & Big Data

Graph Database

Apr 03

Swarming Raspberry Pi: Docker Swarm Discovery Options

Docker Swarm supports a variety of methods for discovering swarm members. Each has arguments in its favor. In this post I shall discuss the various methods and thoughts regarding each.

Background

I originally started with the idea of having a portable cluster, a “cloud in a box” if you will, so that I could go and give talks without having to worry about network dependencies and so forth. I also was intrigued by the idea of low power, inexpensive devices which could be used for building a cloud. Two days after my initial purchase of 5 Pi B+, the Pi 2 was released. Despite my initial grump, I realized that this presented possibilities for distributing workloads across a heterogeneous environment, which is an interesting problem space — determining how best to distribute work across an infrastructure.

I still have the goal, for the present, of having a portable cloud. I’ve been challenged, however, to build a larger one than Britain’s GCHQ Pi cloud. It is tempting. Since they’re using all single-core Pis, it wouldn’t be terribly difficult to build a cloud with more oomph and far fewer nodes. Of course, if the workload is I/O intensive then more members are needed.

At present, my cloud consists of the following:

  1. 5x Pi B+ Worker Nodes, 16GB micro SD
  2. 5x Pi 2B Worker Nodes, 16GB micro SD
  3. 1x Pi 2B Master Node (temporarily being used as a porting/compiling node), 16GB Micro SD
  4. 2x Pi 2B Data Nodes (one of these will become a docker registry, among other things)
    a. One has 2x 240GB SSD
    b. One has a 240GB SSD and a 160GB Spinning disk (for “slow” data)
  5. 16 Port 100Mbit Switch. This may shortly be swapped out for gigabit switch(es).

Criteria for Evaluation

I strongly believe that metrics, monitoring, and alerting are necessary in building any infrastructure.

I am seeking maximum portability; my Cloud in a Box™ should be able to do Real Work™ without depending on anything outside the cluster. Additionally, the less I need to know ahead of time the better. Names trump numbers — being able to use a name, or to look a number up by name, is better than having to remember a number.

Given the limited resources of the Pi, lightweight solutions are preferred over heavyweight, save where they can serve dual purposes.

The Contenders

The list of Discovery Services can be found in the Docker Documentation.

Hosted Discovery Service

The hosted discovery service presents an easy way to test and get started with Swarm. Swarm communicates with the Docker Hub in order to maintain a list of swarm members.

The Good

It’s easy, presented in the tutorial, and is supported by Docker.

The Bad

Unfortunately the requirement of connecting to the Docker Hub means that it’s not self-contained; in order for it to work a network connection is needed.

As of today, there are a couple of issues with it:

  1. There is no way to remove a host from the swarm.

  2. docker -H $SWARM_MANAGER info returns what I believe is an incorrect count:

As an example, in the case of apis-rpi-04, it’s claiming that there are six (6) containers. However, there are not six (6) containers running:

There are, however, six in the ps -a results:

On a whim, I removed the containers which were not up:
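A sketch of the removal, using a status filter:

```shell
# Remove every container whose status is "exited"
docker rm $(docker ps -a -q -f status=exited)
```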

At this point, info returns the three (3) which I’d expect. However, on further investigation, it turns out that docker info outside of swarm also returns the total number of containers, not the number of running containers. I’ve opened an issue about this. I think that having an entry for the number of running containers would be useful; barring that, documenting the behaviour would be good.

Static File Describing the Cluster

In this case, a file is used by all the members which has a list of all the hosts IP addresses and ports.

The Good

It’s pretty simple.

The Bad

Since my cluster is portable, I don’t necessarily know what the IP addresses are — I may happen to be on a network where the addresses I’ve chosen are already in use. For simplicity’s sake I don’t want to have to worry at this point about NAT translation.

The file needs to be copied to all of the servers. Additionally, it violates principles espoused by The Twelve-Factor App — primarily there is a configuration artifact which needs to be maintained.

A Static List of IPs

The Good

Same as the file list, but this also has the added goodness of being more 12-Factor compliant.

The Bad

Same as the file list.

Range pattern for IP addresses

The Good

The Bad

etcd

When I investigated coreos/etcd, I found that it wouldn’t compile on the Raspberry Pi without patching the code — to wit, a structure needs to be edited. I don’t view it as very portable, and I have concerns whether the structure change will keep it from working properly with other frameworks. At least for the moment I don’t consider it to be a good choice.

zookeeper

The Good

Established codebase with lots of users.

The Bad

It’s really heavyweight compared to some of the other options. However, if planning to use Hadoop or anything in the Hadoop ecosystem, it might be a good choice.

Consul

The Good

Consul can serve multiple purposes — service discovery, monitoring, and DNS. Additionally there’s a fairly useful web UI which provides a dashboard showing the status of the members.

It’s fairly lightweight — the client takes approximately 3MB:

The Server, on the other hand, takes about 12MB for managing 13 hosts:

That 821 minutes is over 11 days:

That’s about 1.25 hours/day of CPU time. By comparison, the client is a bit lighter:

The Bad

It’s rather chattier than the other methods. Every 10 seconds or so the client wakes up and does some work.

Verdict

For the moment, based upon my goals and an analysis of the good and the bad of the various methods available today, I think that Consul is the best choice.

Mar 29

Swarming Raspberry Pi, Part 2: Registry & Mirror

This episode will consist of a quick aside to build a Docker Registry Mirror.

Previous: Part 1

Why Do We Need a Mirror??

The Docker Registry is, to my mind, one of the greatest contributors to the rapid growth of Docker. Having a central location for images encourages sharing and re-use. Obviously some images are more useful than others, but it is a wonderful resource for learning and leveraging the work of others. I like to look at Dockerfiles written by others; I learn a lot that way. It also helps me to determine which image to use when more than one provides a service. (Hint: I prefer to avoid those who do not share their Dockerfile.)

In the Raspberry Pi Swarm, there are containers which will be pulled by all of the hosts. It takes a while to pull down images, even in parallel (at which point network congestion can cause problems). If an image is 300MB, unless we have a local mirror the entirety of the image is going to be pulled down across the internet by each host. It’s a lot quicker to cache the image locally and pull it once. Also, without a local mirror, there is a dependency on internet connectivity — something I am attempting to avoid with my Pi Swarm.

Configuring the Mirror Host

Requires:
1. Raspberry Pi
2. Hypriot Image (or other image supporting recent Docker)
3. External Disk (see below)

As before, I am starting from a Hypriot install. I have attached an external disk (in my case a SSD with LVM), mounted at /opt/docker-cache, to serve as my mirror. I’m not sure that using a micro SD for the cache is the best way to go — it’s slow, for one thing. I’m not certain that the benefit of less internet traffic and latency overcomes the slow disk.

I’ve uploaded a Docker Registry image for the Raspberry Pi to the Docker Hub. You can pull it via:
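Something like the following; the image name here is hypothetical, patterned on the other nimblestratus/rpi-* ports:

```shell
# Image name is illustrative -- check the Hub for the actual repository
docker pull nimblestratus/rpi-registry
```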

The Dockerfile follows — there are a few differences from the stock Dockerfile:

If you want to build it yourself, you can do the following (on a pi with Docker):
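Roughly (the repository URL here is hypothetical):

```shell
# Clone the Dockerfile and build context, then build on the Pi itself
git clone https://github.com/nimblestratus/rpi-docker-registry.git
cd rpi-docker-registry
docker build -t nimblestratus/rpi-registry .
```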

Running the Mirror

Once you’ve built (or pulled) the registry image, it can be started on the registry host. In order for the mirroring to work properly, modifications to the other Docker hosts will need to be made. That follows below.

I am assuming that the external disk is mounted at: /opt/docker-cache.

There are a couple arguments of interest:

  • -p 80:5000 – The registry is exposed on port 80. This serves two purposes. First, if you wish to have a local registry of private images running on host registry, you can specify registry/project/image instead of registry:5000/project/image when pushing or pulling an image. Additionally, should you need to delete an image, docker is (at the moment) unable to delete images of the form HOST:PORT/PROJECT/IMAGE.
  • -v /opt/docker-cache:/tmp/registry – This points to the location outside of the container which is used for the mirror’s cache. If you don’t specify a location, it’s kept within the container, and once the container ends you’ve lost the benefits of a mirror.
  • -e GUNICORN_OPTS=["--preload"] – This works around a bug; the registry will not start up correctly without it. See Issue #892 · docker/docker-registry for a discussion of it.
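Putting those together, starting the mirror looks something like this (the image name is illustrative; the STANDALONE/MIRROR_SOURCE settings are the stock docker-registry mirroring knobs):

```shell
docker run -d --name registry-mirror \
    -p 80:5000 \
    -v /opt/docker-cache:/tmp/registry \
    -e 'GUNICORN_OPTS=["--preload"]' \
    -e STANDALONE=false \
    -e MIRROR_SOURCE=https://registry-1.docker.io \
    -e MIRROR_SOURCE_INDEX=https://index.docker.io \
    nimblestratus/rpi-registry
```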

Using the Mirror

Docker hosts which you wish to use the mirror need extra arguments passed into the daemon upon start. Assuming you are running a debian based version, the default configuration file is located at /etc/default/docker.

The DOCKER_OPTS variable needs to be edited to add --registry-mirror=http://REGISTRY_MIRROR. Replace REGISTRY_MIRROR with either the name or IP address of the host on which the mirror is running.
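For example, the line in /etc/default/docker ends up looking like:

```shell
# /etc/default/docker -- REGISTRY_MIRROR is the mirror host's name or IP
DOCKER_OPTS="--registry-mirror=http://REGISTRY_MIRROR"
```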

Once the file is edited, restart docker:

sudo service docker restart

To test, pull in a docker image you’ve not pulled before, remove it, then pull in again: (you may briefly see a message saying it’s using the mirror)
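For example (the image is an arbitrary choice of one not yet present on the host):

```shell
time docker pull resin/rpi-raspbian   # first pull: the mirror fetches and caches it
docker rmi resin/rpi-raspbian
time docker pull resin/rpi-raspbian   # second pull: served from the local cache
```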

We went from ~44 to ~16 seconds total, almost 1/3 the time.

Limitations

Ultimately the limiting factor for improvement is I/O, namely the speed of the USB Bus and the bandwidth of the local network. In this case, the mirror is plugged into a 10/100Mb switch. The host performing the pull is connected over wireless. Kidzilla is watching Netflix.

As a quick test of network speed:
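The quick-and-dirty test was along these lines (hostnames hypothetical):

```shell
# On the mirror host: discard whatever arrives
nc -l -p 1234 > /dev/null

# On the pulling host: time 100MB of zeros across the network
time sh -c 'dd if=/dev/zero bs=1M count=100 | nc -q 1 mirror-host 1234'
```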

Ok, so in this totally unscientific test, I can fill 48Mb/s. Let’s try the same, but wired:

A little bit better; so at this point either the 10/100 Mb/s switch is being flooded or we’ve hit the limit of the USB bus. There are other processes running on the Pi, too, some of which are chatty.

Next Time

Next time is using consul and registrator along with swarm. If you’d like to play in the mean time, the images are out on docker hub:

  • consul – nimblestratus/rpi-consul – Port of progrium/consul
  • registrator – nimblestratus/rpi-registrator – Port of gliderlabs/registrator
  • swarm – nimblestratus/rpi-swarm – Port of docker/swarm. Details on how to use it at: Swarming Raspberry Pi Part 1

Mar 27

Nifty Things for Week Ending 27 March

Geekly Toys & Games

Hardware

Funny

Cryptography

Tools

Archeology

  • 10,000-Year-Old Stone Tool Site Discovered in Suburban Seattle
  • Megalithic Site Discovered In Russia:
    On Mount Shoria in southern Siberia, researchers have found an absolutely massive wall of granite stones. Some of these gigantic granite stones are estimated to weigh more than 3,000 tons, and as you will see below, many of them were cut “with flat surfaces, right angles, and sharp corners.” Nothing of this magnitude has ever been discovered before. The largest stone found at the megalithic ruins at Baalbek, Lebanon is less than 1,500 tons.

Testing

Linux

Astronomy

Programming

Docker

LVM

SSD

Site of the Week

Unicornfree with Amy Hoy: Creating And Selling Your Own Products — I’ve been reading what Amy has to say for at least eight years; she really has a lot to offer. I started first reading her blog on slash7 and followed as she migrated to Unicorn Free and elsewhere. I am not quite at the point where I’m ready to build a product, but when I am, I know I’ll be glad for her lessons.

Mar 23

Shrinking Docker Images

Size does matter. Docker images can become quite large, as each RUN generates a new layer which becomes part of the image; even files which are deleted by a later RUN still take up space in the earlier layer. This wastes disk space and network bandwidth. The following are some steps for shrinking the size of a docker container in which builds have been performed — they work particularly well for containers which have Go executables.

In the Dockerfile

  1. Remove any build archives
  2. Remove any packages which were installed to build/compile which are not needed later
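On a Debian-based image, the trick is to chain the build and the cleanup into a single Dockerfile RUN instruction, so the removed files never land in a layer. Package names and the build step here are illustrative:

```shell
# Body of one Dockerfile RUN instruction (written out as shell):
apt-get update \
    && apt-get install -y build-essential \
    && make && make install \
    && apt-get purge -y build-essential \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/* /tmp/*
```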

Compact the Image

If you perform a docker export of a container, it produces a tarball of the flattened Docker image. This can then be re-imported at a great size reduction1. For example, assuming the “big” image is named “consul-big” and the small one named “consul-small”, executing the following command:
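A sketch: docker export works on containers, not images, so a stopped container is created from the big image first.

```shell
# Create (but don't start) a container from the big image
docker create --name flatten consul-big

# Export its flattened filesystem and re-import it as the small image
docker export flatten | docker import - consul-small

# Clean up the scratch container
docker rm flatten
```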

The result was a reduction of over 250MB. Not too shabby. I might be able to reduce it further, but I’m close to diminishing returns. A chunk of the size is due to dependencies which progrium/consul has, primarily in some accessory shell scripts.

May your Docker images grow ever smaller!


1 Your mileage may vary. The tip is provided without warranty of any kind. No images were harmed in the making of this blog post.

Mar 22

Registrator for Pi: /gliderlabs/registrator ported to Pi

I’ve made a port of gliderlabs/registrator to the Raspberry Pi. The repository for the Pi port is at: nimblestratus/registrator. At first glance, the part I find most interesting is the size of the resulting image — ~12MB. I firmly believe it’s a direct result of starting from Alpine Linux.

This will definitely warrant further investigation. I think quite a few other docker images could be built with Alpine. It will require some testing, but I think that the Consul container should be able to get down to 50MB or less. Swarm should get down to the 20MB or less range. Definitely something to try (and soon!).

I’m also going to do a writeup shortly on service discovery for the pi swarm…

As always, the true thanks go to those who wrote the apps and tools.

Dockerfile contents:
