Apr 28

Heterogeneous Docker Swarms Teaser

Note: This is all very experimental; Docker does not officially support any architecture other than x86_64.

The last few evenings I’ve been working on Multifarious, a means of creating heterogeneous Docker Swarms. I’d previously found that I can create a swarm with heterogeneous members — a swarm which has, say, x86_64 and Raspberry Pi members. The problem arose, of course, once I attempted to run containers in the swarm. Containers are architecture specific.

Enter Multifarious. And no, multifarious isn’t nefarious, even if the words sound similar. Rather it means “many varied parts or aspects” (Google)

Multifarious uses dependency injection to tell Docker the name of an image suited to the host’s architecture.

In the preliminary version, ClusterHQ’s powerstrip is used to inject the proper image name into the request to create a Docker container. Powerstrip, in turn, calls a small Sinatra application which performs a lookup in Redis to find the proper image name for the host’s architecture. If the image name is not registered in Redis, it is passed through without modification. Multifarious can be configured either to provide an architecture-specific image name for every canonical name, or to replace the default name only for special cases.

(Diagram: multifarious data flow)

Quite possibly a future version will be written in Go and rather than requiring multiple executables to perform the injection, I expect to merge powerstrip and the adapter into one. This should reduce the footprint a good deal.

I am still working on a cohesive demo, but the following will show that the dependency injection is working:
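
A sketch of the invocation, assuming powerstrip is listening on the local Docker TCP port (2375 in my setup) and that ‘hello’ is the canonical name registered in Redis:

    # the client talks to powerstrip, which rewrites the image name on its way to Docker
    DOCKER_HOST=tcp://localhost:2375 docker run -i hello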

The -i is needed due to a powerstrip quirk. However, take note that the docker image being invoked on the command line is ‘hello’, while the image actually being run is ‘hello-world’; there is no ‘hello’ image. Injection is working, and I can configure which images to run based upon the architecture.

I’ve injected the proper name for the image based upon a Redis lookup. I chose Redis because it’s available for multiple platforms and is pretty easy to use. It just needs to have the lookup table fed to it.

The items are stored in Redis as an HSET:
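
Sketched with redis-cli; the key layout and the ARM image name below are illustrative rather than the exact schema:

    # one hash per architecture, mapping a canonical name to an architecture-specific image
    redis-cli HSET images:x86_64 hello hello-world
    redis-cli HSET images:armv7l hello someuser/armhf-hello-world

    # the adapter's lookup then amounts to:
    redis-cli HGET images:armv7l hello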

At runtime the image is chosen and injected and life proceeds.

The repository is available on GitHub and will be added to over the next couple of days, with a full-fledged writeup and demo to follow.

The Featured Image is a modification of a photo by JD Hancock:


flickr photo shared by JD Hancock under a Creative Commons ( BY ) license

Apr 21

‘Piping’ Hot Docker Containers

One of the possibly lesser-used flags for docker run is -a, which allows you to attach the container’s STDIN, STDOUT, or STDERR and pipe it back to the shell which invoked the container. This allows you to construct pipelines of commands, just as you can with UNIX processes. For instance, using UNIX commands to count the number of files in a directory, you would do something like:
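
The classic one-liner, counting one filename per line:

    ls | wc -l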

Since the Docker container acts as a command, it has its own STDIN, STDOUT, and STDERR. You can string together multiple commands and containers.

After I ‘docker’ized the ‘grep’ discussed in Naive Substructure Substance Matching on the Raspberry Pi, I was able to attach the STDOUT from the grep to wc -l to get a count of the matching substances.
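
In shell terms it is just another pipe. A sketch, with the image name and SMILES pattern as stand-ins for my actual ones:

    # -a stdout hands the container's STDOUT to the local pipeline
    docker run -a stdout rpi-substructure-grep 'C(=O)O' | wc -l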

This works just fine. In fact, it opens up opportunities for all sorts of other commands/suites running inside a container. Pandoc running in a container to generate PDFs comes to mind. Or ImageMagick. Or any of a number of other commands. All of the advantages of docker containers with all of the fun of UNIX pipes.

Then the imp of the perverse struck. If I could redirect the STDOUT of a container running on a local host, would it work as well on another? In short…. yes.

You can attach to the streams of a docker container running on a different host. The docker daemon needs to be bound to a port on the other host(s).
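
The only change is pointing the client at the remote daemon with -H; a sketch (address, image, and pattern are examples):

    # runs on 192.168.1.101, but the output lands in the local shell's pipeline
    docker -H tcp://192.168.1.101:2375 run -a stdout rpi-substructure-grep 'C(=O)O' | wc -l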

So, if I can run one at a time, why not five? I knocked out a couple of one line shell scripts (harness and runner) and, for grins and giggles, added a ‘-x’ magick cookie to demonstrate what’s happening. The lines below with the ‘+’ inside show the commands which are being performed behind the scenes:
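
The two scripts boil down to something like this sketch (addresses, image, and pattern are placeholders):

    # runner -- run the search container on one remote host, emitting its STDOUT locally
    #   usage: runner <host> <pattern>
    docker -H "tcp://$1:2375" run -a stdout rpi-substructure-grep "$2"

    # harness -- fan one pattern out to five hosts, up to five at a time
    #   usage: sh -x harness <pattern>   (the -x is what produces the '+' trace lines)
    printf '%s\n' 192.168.1.101 192.168.1.102 192.168.1.103 192.168.1.104 192.168.1.105 \
        | xargs -P 5 -I{} sh runner {} "$1" | tee results.txt | wc -l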

In less than six seconds, it’s spawned docker containers on five other hosts. Each of these containers is performing a substructure (read grep) search of ~13.7 million chemical compounds, for a total of ~69M compounds. The results are then sent back to the initiating host, which is dumping the results to a file as well as counting them. Not too shabby. And it scales roughly linearly with the number of hosts, too — IO is the main limiting factor here.

I can think of lots of uses for this. Poor man’s parallel processing. Map/Reduce. Many more.

The disadvantage of this quick and dirty method is that you need to know the IP addresses on which to run the commands. Swarm alleviates the necessity of knowing the addresses or of coming up with a methodology for distributing the workload, which is always a plus.

It’s not necessarily something I’d take to production, but for testing or experimentation it works quite well. It also leads to other experiments.

Docker is really awesome; I’m learning new things to do with it all the time.

Apr 19

Docker Containers: Smaller is not always better

Generally, smaller Docker containers are preferred to larger ones. However, a smaller container is not always as performant as a larger one. By using a (slightly) larger container, I improved performance by over 30x.

TL;DR

The grep included in busybox is painfully slow. When using grep to process lots of data, add a (real) grep to the container.

Background

As discussed in Naive Substructure Substance Matching on the Raspberry Pi, I am exploring the limits of the Raspberry Pi for processing data. I chose substructure searching as a problem set because it is non-trivial and a decent demonstration for co-workers of the processing power of the Pi.

I’ve pre-processed the NIH PubChem Compounds database to extract SMILES data — this is a language for describing the structure of chemical compounds. As a relatively naive first implementation I’m using grep to match substructures. I have split the files amongst five Pi 2s; each is processing ~840MB in ~730 files. xargs is used to do concurrent processing across multiple cores. After a few cycles, the entire data set is read into cache and the Pi is able to process it in 1-2 seconds for realistic searches. A ridiculous search, finding all of the carbon-containing compounds (over 13 million), takes 8-10 seconds.

Having developed a solution, I then set about dockerizing it.

I chose voxxit/alpine-rpi for my base — it’s quite small, about 5MB, and has almost everything needed. I discovered that the version of xargs which ships with the container does not support -P. So xargs is added via:
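
On Alpine, a GNU xargs that understands -P comes from the findutils package; something like:

    apk update && apk add findutils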

I ran my test and found that the performance was horrid.

I decided to drop into an interactive shell so that I could tweak. You can see the performance below in the ‘Before’.

Before:

Typically the performance of a large IO operation will improve after a few cycles; the system is able to cache disk reads. It generally takes 3 cycles before all of the data is in the cache. However, the numbers above did not improve. I did verify that multiple cores were, indeed, being used.

I proceeded down a rabbit hole, looking at IO and VM statistics. Horrible. From there I googled to see if, indeed, Docker uses the disk cache (it does) and/or if there was a flag I needed to set (I didn’t). Admittedly, I couldn’t believe that IO using Docker could be that much slower, but I am a firm believer in testing my assumptions.

After poking about in /proc and /sys and running the search outside of Docker, I decided to see if there might be a faster grep. As it turns out, the container uses busybox:
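
A quick check from inside the container makes that clear (sketch):

    # inside the container: grep resolves to a busybox applet
    ls -l "$(which grep)"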

This is generally a good choice in terms of size. However, it appears that the embedded grep is considerably slower than molasses in January. On a whim I decided to install grep:
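
Again a one-liner with apk (the package is simply called grep):

    apk update && apk add grep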

I then re-ran the test and did a Snoopy Dance.

After:

Lessons Learned

This episode drove home the need to question assumptions. In this case the assumption is that a smaller sized container is inherently better. I believe that smaller and lighter containers are a Good Practice and an admirable goal. However, as seen here, smaller is not always better.

I also habitually look at a container’s Dockerfile before pulling it. In this case it wasn’t enough. It reinforced the lesson that I need to know what’s running in a container before I try to use it.

Apr 18

Naive Substructure Substance Matching on the Raspberry Pi

Chemists can search databases using parts of structures, parts of their IUPAC names as well as based on constraints on properties. Chemical databases are particularly different from other general purpose databases in their support for sub-structure search. This kind of search is achieved by looking for subgraph isomorphism (sometimes also called a monomorphism) and is a widely studied application of Graph theory. The algorithms for searching are computationally intensive, often of O(n³) or O(n⁴) time complexity (where n is the number of atoms involved). The intensive component of search is called atom-by-atom-searching (ABAS), in which a mapping of the search substructure atoms and bonds with the target molecule is sought. ABAS searching usually makes use of the Ullman algorithm or variations of it (i.e. SMSD). Speedups are achieved by time amortization, that is, some of the time on search tasks are saved by using precomputed information. This pre-computation typically involves creation of bitstrings representing presence or absence of molecular fragments. By looking at the fragments present in a search structure it is possible to eliminate the need for ABAS comparison with target molecules that do not possess the fragments that are present in the search structure. This elimination is called screening (not to be confused with the screening procedures used in drug-discovery). The bit-strings used for these applications are also called structural-keys. The performance of such keys depends on the choice of the fragments used for constructing the keys and the probability of their presence in the database molecules. Another kind of key makes use of hash-codes based on fragments derived computationally. These are called ‘fingerprints’ although the term is sometimes used synonymously with structural-keys. The amount of memory needed to store these structural-keys and fingerprints can be reduced by ‘folding’, which is achieved by combining parts of the key using bitwise-operations and thereby reducing the overall length. — Chemical database

Substructure substance matching is, in many ways, a non-trivial exercise in Cheminformatics. The amount of data used to determine matches grows very quickly. For instance, one method of describing a molecule’s “fingerprint” uses 880 bits. Or 2^880 combinations. This space is very sparsely populated, but there are still many potential combinations.

Another way of describing the structure of a molecule is Simplified molecular-input line-entry system or SMILES. This method uses a string which describes the structure of a molecule. Hydrogen atoms are generally stripped from the structure, so the SMILES representation for water is ‘O’. Likewise, methane is ‘C’. Single bonds are assumed. Double bonds are described by ‘=’, so carbon dioxide is ‘O=C=O’.

As it turns out, grep happens to work very well for finding substructure matches in SMILES data. The following searches are performed on a subset of the NIH PubChem Compound database, 13,689,519 compounds in total. The original data has been processed on a Raspberry Pi — compressed, this portion of the database is ~13GB. Pulling out the SMILES representation and the compound ID, the resultant flat data is 842MB in 733 files.

The 842MB happens to fit into the RAM of the Pi. After a few searches, the files are buffered in RAM. At that point, the speed increases mightily. The limit for reads of a MicroSD card is ~15MB/s. Once cached in RAM, however, it is able to read >400MB/s:
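
A crude way to watch the cache kick in; the path is a placeholder, and the drop_caches line needs to run as root:

    sync && echo 3 > /proc/sys/vm/drop_caches    # flush the page cache (as root)
    time cat /data/smiles/* > /dev/null          # first pass: limited by the MicroSD card
    time cat /data/smiles/* > /dev/null          # second pass: served from RAM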

Following is a series of searches demonstrating how the search speeds up as the data is read into cache.

Once the files are buffered in memory, the greps occur in close to constant time for reasonable searches, with the results sorted by the compound ID — the previous search matched 123 compounds; by comparison, the following is a search for a ring structure:

However, a ridiculous search for substances containing carbon does take a bit longer — there are limits to IO. This search matches almost all of the substances:

How, then, is the Pi processing so much data so quickly? Part of the secret lies in splitting the data into “reasonable” chunks of ~55MB. The other secret is in how xargs is invoked. Not all versions of xargs support multiple concurrent processes. The -P 4 says to run four instances of grep concurrently.
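
The invocation is shaped something like this; the paths and pattern are stand-ins for the real ones:

    # -n caps the files handed to each grep so there is enough work to keep four going;
    # -P 4 runs four greps at once, one per core on the Pi 2
    ls /data/smiles/* | xargs -n 50 -P 4 grep 'C1CCCCC1' > matches.txt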

Notice that the improvement in the time required is not linear; there is not much difference in time between three (3) and four (4) concurrent threads. The limit of IO has been reached.

With five Pi 2 boards, substructure searches of all 68279512 compounds can be performed in seconds.

It’s not perfect; some structures can be described in more than one way with SMILES. However, it’s fast and simple.

The next substructure search will utilize fingerprints.

Apr 15

Raspberry Pi and First World Problems

And now, dear reader, a brief intercalary segue…..

Stating the obvious, I’ve been doing a lot of work recently with the Raspberry Pi. In truth, I’ve been trying to discern and/or work around its limitations. Consequently, I’ve caught myself wishing for just a little bit more bandwidth — thus far I/O is the limiting factor for me and the types of work I’ve been doing with the Pi.

Limiting as in 30MB/s disk read/writes. ~13MB/s MicroSD read/writes. ~7MB/s network I/O over ethernet — I wonder whether, if I went with wireless, I could squeeze out a little bit more… See? I’m doing it again.

I can remember not terribly long ago that I’d have killed for such performance. For that matter, I’ve (only) got 30Mb/s (note the lowercase ‘b’) coming into the house. The Pi could consume the entire bandwidth into the house.

And then I think back a decade, when I thought that dedicated 768Kb/s up and down was quite nice. Two and a half decades ago, I thought that transferring files from White Sands to CMU at 9600 baud was quite impressive.

Then I start to think about all that I now take for granted in the Day-to-Day which not terribly long ago would have been considered a “hard problem” if not Magick. I told my daughter about a year ago that I had a magick mirror which would allow me to see and talk to people on the other side of the world. She didn’t believe me, so I pulled out my phone. “Dad, that’s not Magick, that’s Technology.” Out of the mouths of babes and innocents, I am reminded of Arthur C. Clarke:

Any sufficiently advanced technology is indistinguishable from Magic

Frankly, a decade ago the idea that I could have a computer with 4 cores was not something I’d contemplated. In 2001 I purchased a laptop with a single-core 1000MHz processor and a gigabyte of RAM for over $1700. At the time I thought it was something quite nice. Go back a bit further and I had a computer with a 1MHz processor and 64K of RAM. In the mid-90s I was running systems with hundreds of users on 60MHz processors and 128MB of RAM.

Yet I’m complaining about limitations of I/O on a machine which is considerably more powerful.

And then I think about all of the regions in the world where there isn’t good, stable electricity. Or internet access. Or libraries and books.

Or Food.

Or Water.

Or stable government. Of being able to walk outside my house with a reasonable expectation that I won’t be kidnapped or killed. My daughter can leave the house and go to school without worrying that she’ll be shot or stolen.

Suddenly I’m ashamed to be complaining about I/O constraints. First World problems, indeed.

Apr 15

Swarming Raspberry Pi: Private Registry for Swarm Images

Some more backstory on the Pi Swarm

I was really excited when Amazon announced their Lambda offering. I thought that it was an awesome idea, except for the lack of an open solution and the fact that it locked you into JavaScript.

I believe that using Docker, we can have a relatively simple Amazon Lambda work-alike which allows code from arbitrary languages to be run.

Along the way, I’ve investigated using Kubernetes, but it didn’t support ephemeral containers. It kept trying to resurrect the dead container. Hilarity ensued after a fashion.

Enter Swarm….

Swarm

I’d seen it mentioned that Swarm wasn’t working with registries which require authentication. Issue #374, in its history, indicates that there is no (as of yet) support for a registry requiring authentication.

It occurred to me that it might be possible to have swarm working with a local private registry via an insecure registry. A few tests later, and it’s alive!!!

Configuration

Registry

You’ll need to have a registry running. Swarming Raspberry Pi Part 2: Registry and Mirror has details.

Private registries have the standalone=true flag set. According to the documentation:

On first reading, it looks as though a registry cannot work as both a mirror and a private registry. In tests, however, I was able to use a private registry as both a mirror and a private registry. I believe, but have not verified, that the registry is passing index queries to the Docker Hub. I have verified that images cached on the local private registry are served locally. So… it might be thought of as a private registry which just so happens to act as an image cache for images from the canonical registry. However, it is not indexing the images. If you care to experiment, you can start a registry as follows:
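
Something along these lines, using the stock (v1) registry image; STANDALONE stays true, and the two MIRROR_* variables are the documented mirroring settings:

    docker run -d -p 5000:5000 \
        -e STANDALONE=true \
        -e MIRROR_SOURCE=https://registry-1.docker.io \
        -e MIRROR_SOURCE_INDEX=https://index.docker.io \
        --name registry registry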

On a Raspberry Pi, substitute nimblestratus/rpi-docker-registry for the image. One thing which I didn’t get working (although didn’t test extensively — the TCP/IP stack on my laptop gets fussy after repeatedly connecting to different networks and connecting/disconnecting from multiple VPNs) is to run a mirror registry as a docker container and pointing the docker daemon to the registry container. Part of me thinks it might work, but I can also see where it wouldn’t — Docker might attempt to talk to the registry on startup, realize it isn’t up, then give up. There’s a good chance that it’s a race condition, though I have not looked at the code as yet.

Note that it is both a STANDALONE and mirroring registry.

Daemon

On the host(s) which access the local private registry, the docker daemon needs to be configured to allow access to the private registry. You can set an environment variable, pass a command-line argument, or edit a config file. More information may be found in the Docker Documentation. I’ve used the config file; on Debian-based systems it’s generally located at /etc/default/docker. Add a line similar to the following at the end of the file:
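
For example (registry.local and port 5000 are placeholders for your own registry’s name and port):

    # /etc/default/docker
    DOCKER_OPTS="$DOCKER_OPTS --insecure-registry registry.local:5000"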

Once done, the daemon will need to be restarted. On a debian system it is typically sudo service docker restart.

Note: In my test, I added the private registry to a host which already had port 80 bound. Hence the specification of the port. Your name, etc., will vary. I have written some thoughts about the pros and cons of various private registry schemes at Good Practices for Configuring Docker Private Registries.

Catching My Breath

At this point, the Raspberry Pi Swarm has:

  1. Swarm
  2. Consul and Registrator
  3. A Private Registry and a Mirror
  4. Monitoring, available through Consul and Registrator. I am not sure how well they work with ephemeral containers, however; that is the subject of a future test. I may need to hack registrator to ignore ephemeral containers.

The remainder:

  1. Bootstrapping the Swarm and basic services
  2. Storage. I’ve got NFS working (it’s easy). I intend to evaluate:
    a. S3
    b. HDFS
    c. Ceph or Gluster
  3. Log aggregation
  4. Solving some “real” problems. ${WORK} is involved in Cheminformatics and authoritative chemical information. I’ve decided as a way to stretch the abilities of the swarm to do some substructure searching of chemical substances. I am not a chemist; I remember a good deal of my Advanced Placement Chemistry from ’87, but let’s just say I’m learning a lot. It’s good though, I think. I don’t know what’s impossible!

"Lamb"da in the Cloud

At this point I’d like to introduce Agni. Agni is my answer to Amazon’s Lambda. However, it differs in two major areas:

  • It’s Open Source and built upon Open Source.
  • It supports multiple languages — not just Javascript.

On a high level, code is registered with Agni and a container image is created. Part of the creation process entails specifying an event/message which will trigger the running of an instance of the container. When events are received, a listener spawns instances via swarm, passing the details of the message to the newly created Docker container.

More on Agni shortly…. Meanwhile I’m back to the cluster and seeing how I can leverage the work which the good people at Hypriot have done with Docker Machine (and to a lesser degree Kitematic since I don’t have a Mac).

Apr 14

Good Practices for Configuring Docker Private Registries

Private registries can be very helpful when using Docker — particularly if you want to be able to share images locally without either making them public or incurring the cost of a round trip. This post presents some practices which I think make life easier when using a private registry.

Where to look

Docker recognizes that an image is on a private registry when any of the following conditions occurs:

  • An explicit port is specified in the image name, such as registry:5000/foobar.
  • An IP address is used, such as 127.0.0.1 or 192.168.1.123.
  • A fully qualified domain name (FQDN) is used, such as registry.nimblestrat.us or registry.local.

By default, the registry port is 5000. By adhering to convention, it’s easy to look at an image and tell that it is coming from a private location. However, it’s extra typing and more to remember. I prefer using an FQDN and having the registry bind to port 80 — then the name alone suffices, assuming that the host has a good name (or CNAME record) such as registry.foo.bar.

How to use a private registry

In order to place an image into a private registry, you must first tag it with a name in which you have specified the location of the registry.

Each of these examples would work (assuming that a registry is bound to the IP/Port):

  • docker tag a1b2c3d4e5f6 127.0.0.1:5000/gnomovision
  • docker tag a1b2c3d4e5f6 registry.foo.bar/gnomovision
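
Once tagged, the push uses the same name (assuming a registry actually answers at that address):

    docker push registry.foo.bar/gnomovision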

However, the following wouldn’t work for pushing an image to a private registry:

  • docker tag a1b2c3d4e5f6 gnomovision — mere mortals cannot “bless” an image and make it part of the “Official Repositories”
  • docker tag a1b2c3d4e5f6 registry/gnomovision — in this case, it considers registry to be a userid for the Docker Hub. There is not enough information to tell it that you’re trying to send it to a host named registry.

Recommended Practices

  1. Either name a host registry or, better yet, use a CNAME record to alias a host as registry. That way you don’t have to remember that xyz.pdq.io is the registry.
  2. Bind to the HTTP port.
  3. Where possible, use authentication. Since my major use case is with Swarm and it does not as yet support authentication, I am investigating other means, such as only allowing connections from a local network. Socketplane is an option, too — have the registry listening on a private network address. Neither is perfect, but for the moment….

I’d love to hear what other folk think — are there practices which you use?

Apr 12

Docker Workers Scale Nicely with Multiple Cores

Disclaimer: The title might be a bit misleading. For this workload, it’s scaling pretty much linearly. Other workloads might scale differently.

I was running a quick-ish test on a Pi to see how long it would take to churn through 50GB of compressed data.

This processing consists of:

  1. Determining the files to be processed — this is done via an offset, since ultimately there will be 10 workers processing the data. I also think I get a slightly more representative set this way than by simply taking the first N files.
  2. For each file, start a Docker container which uncompresses the file to stdout, where data is extracted from the stream and appended to a file.
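
In shell terms each worker does something like the sketch below; the image is the one I pushed to the Hub, but the paths and the extract_smiles step are placeholders for the actual SDF Toolkit invocation:

    # worker.sh <compressed-sdf-file> -- hypothetical sketch
    # the container streams the extracted data to STDOUT via -a stdout,
    # and the host appends it to a local file
    docker run -a stdout -v /mnt/nfs/pubchem:/data nimblestratus/rpi-sdf-toolkit \
        sh -c "zcat /data/$1 | extract_smiles" >> extracted.smi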

The input is read over NFS from a server with a single 10/100 NIC and the output is written locally. Why NFS? In this case, it’s easy to configure and works well enough until proven otherwise.

top output demonstrates that the process doing the extracting is, indeed, working hard:

Eeek! The extractor (from the SDF Toolkit) is pretty much eating a CPU by itself. It may need to be replaced, depending on whether I have good enough results.

However, I am not going to optimize without testing and evaluating. It might just be the case that pegging the CPU is OK — when the whole swarm is working, I believe that IO is ultimately the limiting factor. I haven’t tested it yet, so I don’t have much confidence in it. As I’m writing, I begin to doubt it — even if NFS is stupidly chatty, this extraction should only be a one- or two-time event. I’d spend more time writing code to parse and then testing it than just letting it run. If this were happening on a regular basis, I think that I’d be more concerned. (I just had the glimmer of a fairly easy-to-implement AWK or Ruby streaming parser, so if I find myself performing the extraction more than I anticipate…)

That usage pattern remains consistent with more processors:

The following tests were performed on a Pi 2 with increasing amounts of parallelism:

Files / Total Size    Concurrent Containers    Elapsed Time                                     Avg Time/File (rounded)
3 / 43.6477MB         1                        real 13m46.731s (user 0m1.420s, sys 0m1.350s)    4:35
10 / 135.167MB        3                        real 17m22.104s (user 0m2.910s, sys 0m1.420s)    1:44
12 / 162.307MB        4                        real 15m45.713s (user 0m3.840s, sys 0m0.920s)    1:19

Note: 10 is not evenly divisible by 3, so the last file was running by itself.

Yes, the individual runs are slower (~4:35 for a single run vs. ~5:15 per file when 4 cores are in use); however, the multiple cores more than make up for it.

The number of concurrent processes was controlled by xargs:
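
Roughly like so (the glob and the script name are placeholders):

    ls /mnt/nfs/pubchem/*.sdf.gz | xargs -n 1 -P 4 ./worker.sh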

The -n 1 specifies that one argument (file) is sent to each invocation of the worker script. One advantage of doing it this way is that if one file finishes quicker than another (or is smaller) then processing is not held up.

The output is, on average, slightly more than 7MB for each of the 12 files. Small enough that I’m not too concerned about compressing them (yet) — I have 16GB MicroSD cards which are less than 25% full.

So…. there are 3664 files. Assuming 4 processes per Pi 2 and 1 per Pi B+, that gives me 25 workers among the 10 worker nodes. If I press additional hosts into service I could get up to ~37, at the expense of more hosts hitting a single NFS server. I think I shall copy the data to another data host and split the reads in half.

So, assuming 25 workers and each file taking about 1.5 minutes of wall clock time (padding for IO latency), I should be able to churn through the files in approximately 3 hours and 40 minutes (3664 files / 25 workers ≈ 147 files apiece). Even at 2 minutes per file, that is just under 5 hours. Not too terribly bad.

I might be able to get a little more performance if I allow the docker containers to use the host’s network stack. That’s a test for another day, however.

Apr 12

Docker Commandline Arguments are Context Sensitive

All I can say is that it was late when I wrote the script. And I was distracted between the feline overlords (one of whom is attempting to climb into my lap) and the babble box. That’s my story and I’m sticking to it. PEBKAC and ID10T errors were not involved.


Creative Commons licensed ( BY ) flickr photo shared by JeepersMedia

I’m processing 50GB of compressed cheminformatics data on the Pi Swarm, extracting certain pieces of data from substance records from NIH. I created a docker image containing Perl and the SDF Toolkit rather than writing my own parser. I tested a trivial case and pushed it to Docker Hub as nimblestratus/rpi-sdf-toolkit. So far, so good. Then, in order to get an idea of how long it would take to process the lot, I wrote a quick script to split up a chunk of the data among the Pis.

After the script determines its next chunk to process, it starts the container. However, my script was invoking Docker incorrectly.
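
The line was shaped something like this (the image is mine, the path a placeholder); note where the -v landed:

    # with -v before the subcommand, docker reads it as the global "print version"
    # flag rather than a volume, prints the version, and exits without running anything
    docker -v /mnt/nfs/pubchem:/data run nimblestratus/rpi-sdf-toolkit zcat /data/somefile.sdf.gz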

And it printed the version and came back immediately. Did not pass Go. Did not collect $200. Certainly didn’t process any of my data.

What I should have typed to mount a volume was:
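
Something along these lines (paths again placeholders); after run, -v means "mount a volume":

    docker run -v /mnt/nfs/pubchem:/data nimblestratus/rpi-sdf-toolkit zcat /data/somefile.sdf.gz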

So…. long story made short: order does matter. Now serving number 35.

Apr 09

Abusing Awk

Almost ashamed to admit I did this, yet it’s still kinda cool.

I use awk for a lot of commandline parsing; I learned it back in 1989…. before perl was much of a thing. For some problems, awk “just works”. So I wanted to count the number of instances of ‘A’ in a collection (25K) of long strings (each one >150 characters).

I thought about a quick way of counting these characters…. and it occurred to me that I could split() the string, using ‘A’ as the delimiter, then count the array size:
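
Something along these lines (the filename is a placeholder):

    # split() returns the number of fields it produced; N fields means N-1 'A's
    awk '{ count += split($0, parts, "A") - 1 } END { print count }' strings.txt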

This worked. Quite well, actually. I’m sure that there’s a much “better” way to do it, but this one works.

A little later it occurred to me that awk was already splitting the string.
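
In other words, set the field separator to ‘A’ and let NF do the counting:

    # with FS set to 'A', each line has (number of 'A's + 1) fields
    # (assumes no blank lines, which holds for this data)
    awk -F 'A' '{ count += NF - 1 } END { print count }' strings.txt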

One of the nice things about Unix is that there are usually five ways to do something, and it’s usually faster to do it the way you know how rather than spend the time looking up the “right” way.
