TL;DR: When using AUFS in a memory-constrained environment, Java can spawn (lots of!) zombies. A workaround is to change the Docker storage driver to devicemapper.
While working on the Hadoop-in-a-box CDH cluster with Cloudera Manager, I discovered a few interesting things about AUFS. These experiences are with Ubuntu 14.04 and Docker 1.9.1. Others have reported similar results using Java in Docker without CDH.
I did my initial development of the CDH-in-a-box containers in environments with 32 GB and 24 GB of RAM, switching to the latter when I was informed the target was a host with 24 GB. With that much memory, everything just worked: no zombies. However, people started running it on hosts with less RAM, and Java started spawning zombies. So I took a closer look.
I had previously noticed that the amount of cached and buffered memory seemed awfully high to me, but I knew that Linux uses it to optimize I/O. As it turns out, this memory doesn't seem to be "free-able" when using AUFS. Add Java to the mix, and weird things occur.
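A quick way to check whether that cached memory is actually reclaimable is to ask the kernel to drop its clean caches and compare `free` output before and after (a minimal check, run as root; `drop_caches` only releases clean pages, hence the `sync` first):

```
# Snapshot memory, drop clean page cache/dentries/inodes, compare.
free -m
sync                                 # flush dirty pages so caches are clean
echo 3 > /proc/sys/vm/drop_caches    # requires root
free -m
```

If AUFS is holding on to that memory, the cached figure stays stubbornly high even after the drop.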
I tested on a quad-core, 12 GB host, running up the manager and three workers. And then the zombies appeared. In very short order (minutes) I had 260 zombies! This is in part due to supervisord restarting the failed JVMs.
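For the curious, a one-liner to watch the carnage (standard procps `ps`; zombies report process state `Z`):

```
# Count processes in the zombie (Z) state
ps -eo stat= | grep -c '^Z'

# List them with their parent PIDs to see which process should be reaping them
ps -eo pid,ppid,stat,comm | awk '$3 ~ /^Z/'
```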
This necessitated a reboot. Once rebooted, I started doing some research.
I found a couple of items hinting at issues and workarounds. I then decided to test the devicemapper driver and set about converting my AUFS rig to it. After a few iterations, the least invasive steps are as follows:
- `docker ps -aq | xargs docker rm -f`
- `docker images -q | xargs docker rmi`
- `service docker stop`
- Edit `/etc/default/docker` and add the following to the end: `DOCKER_OPTS="${DOCKER_OPTS} --storage-driver=devicemapper"`
- `service docker start`
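Once Docker is back up, it's worth verifying that the new driver actually took; if `docker info` still reports aufs, the `DOCKER_OPTS` line isn't being picked up:

```
# Should now report "Storage Driver: devicemapper"
docker info | grep -i 'storage driver'
```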
Now you can restart the cluster. I did so, and once things stabilized, I started adding services back to the cluster. I did not tweak any parameters, except:
- DataNode Default Group / Resource Management: `dfs.datanode.max.locked.memory = 65536 B`; this alleviates "Cannot start datanode because the configured max locked memory size… is more than the datanode's available RLIMIT_MEMLOCK ulimit," as documented at Apache Hadoop 2.4.1 – Hadoop Distributed File System-2.4.1 – Centralized Cache Management in HDFS.
- Service-Wide / Replication: `dfs.replication = 1`
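For reference, if you manage configurations by hand rather than through Cloudera Manager, the same two settings would live in `hdfs-site.xml`, roughly like this (an illustrative sketch with the values above; `dfs.datanode.max.locked.memory` is specified in bytes):

```
<!-- hdfs-site.xml (sketch): the two tweaks made above -->
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <!-- bytes; must not exceed the datanode's RLIMIT_MEMLOCK ulimit -->
  <value>65536</value>
</property>
<property>
  <name>dfs.replication</name>
  <!-- single-host test cluster, so one replica suffices -->
  <value>1</value>
</property>
```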
Notes and Caveats
- YMMV, but changing to devicemapper seems to slow things down by about 10%. However, particularly in a test/development environment, I'd rather be stable and not spawning zombies!
- This is not using LVM-backed storage.
- Ubuntu 14.04 is on kernel 3.13; other storage-driver options (such as overlay) emerge post-3.18.
- I have been able to run quite a few more services on an OpenStack instance with 24 GB of RAM: