Aug 01

Cloudy Update

I’ve been working with Docker a good bit and have updated my list of tools.  Here’s a quick dump of where I am in the design of the infrastructure.

  • collectd will be used to monitor cgroup statistics.  This necessitates compiling all or part of collectd — the current packages do not contain the cgroup plugin. This will run in the hosts which run the containers.  Additionally, information about the hosts will be collected.  Thresholds will be used to send alerts, scale up or down services, etc..  Graphite or some other tool will provide graphs for the dashboards.
  • A log aggregation tool will capture logs from the various containers.  I’m considering logstash due to the large number of inputs which are already defined.  OpenTSDB is another option — it looks like it may be more poweful in some senses, but more difficult to configure in others.  My main concern with both is that they’re java based and in the case of logstash it requires a java collector running in each container and even though it’s a default sized jvm, assume it will require ~128 mb for each container.  That adds up quickly and I’d rather have something lighter.  I’ve not done enough research on OpenTSDB to speak to what it uses.
  • Dynamic DNS will be used to register the various services.  At present I am leaning toward Power DNS using PyPdnsRedis as a pipe backend  and redis to store the dynamic data.
  • Nginx will sit in front of hipache which is a dynamic proxy/load balancer which uses redis. I have not decided whether to use the node flavour of hipache or the one embedded in nginx — that needs to be tested. Calls to the services will be routed through the proxy.
  • An image repository will be available for local hosting of images.
  • A web-based front end to configure hosts, services, and provide a dashboard view.  Given the many other pieces which are using redis, I am seriously contemplating using it to store the data used to define services and hosts.  I’m not 100% sure of this yet, however.
  • Services consist of a particular process, such as a restful service running in a web server, or a jvm, or …. In their definition, the following information will be stored:
    • The name of the service
    • Ports which need to be exposed
    • Whether the service is active
    • The image
    • Dependencies – particularly services which need to be running prior to the start of this service
    • The minimum and maximum number of instances of the service
    • Threshholds
    • System requirements (cpu, memory)
    • Heartbeat – this is in addition to the thresholds
    • There may be a sort of inheritance to help cut down on duplicate information.
  • Hosts run the docker daemon and host containers.  Their information consists of:
    • Name of the host
    • IP address (may be dynamic)
    • Is the host available
    • Does the host need to be started (is it out in the public cloud)
    • Any cloud information needed to start it.
    • Priority – this determines the order in which containers and services are started – I anticipate that private hosts will have priority over public cloud hosts — due to expense; it makes sense to have overflow go to the public cloud.
    • Server specs
    • Currently available resources — this can be grabbed, in part from collectd.
  • There may be a discovery process, akin to the old Sun Jini project whereby a service can advertise itself and other services can utilize that service.  I could see this being used, for instance, if a service needs a cache.  Databases likely would not be as useful.

A picture will follow soon.  However, a good bit of the work’s already been done — where possible I’m integrating existing tools and projects.  Obviously the front end will need to be written.

I’ve decided that I’m going to repurpose my nimblestratus project on github — it’s not like I’d really done much of anything with it.  Unfortunately someone registered nimblestratus.com two weeks ago, but .org is still available.

I’m pretty excited, though — the pieces/parts are coming together in my head and I believe that this is do-able in fairly short order.  I really, really want to have a simple proof of concept done in the next couple of weeks, or by the end of August — I’m going to Cloud Develop and would love to have something for a “show-and-tell”.

Leave a Reply

%d bloggers like this: