“In the Information Age, the first step to sanity is FILTERING. Filter the information: extract for knowledge.
Filter first for substance. Filter second for significance. These filters protect against advertising.
Filter third for reliability. This filter protects against politicians.
Filter fourth for completeness. This filter protects against the media.”
— Marc Stiegler, David’s Sling
I first read this book in 1988 after reading excerpts from it in Analog Magazine. In a lot of ways it has shaped my thinking as an adult and how I approach information. This quote came up in a conversation at the house the other evening so I found and re-read it.
Over twenty-five years have elapsed since these words were penned, yet I think that they’re just as true now as they were then. The thought that struck me, however, as I read the quote was that the ideas apply to other endeavors as well. These ideas could also be used as a benchmark for data centers and DevOps.
Measuring all the things is an ideal worth striving for, because it is hard to know in advance what is important to monitor in a system. Moreover, some patterns emerge only after the fact, and unless enough data has been collected, the pattern is lost.
By the same token, too much data is indistinguishable from noise. It also takes a lot of space, time, and energy to store, not to mention the cost of making it searchable.
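One common way to ease the tension between keeping everything and drowning in it is to roll raw samples up into coarser summaries as they age. The following is a minimal sketch of that idea, not a recommendation for any particular tool; the `roll_up` function, the 60-second bucket, and the sample data are all made up for illustration.

```python
from statistics import mean

def roll_up(samples, bucket_seconds=60):
    """Collapse (timestamp, value) samples into one summary per time bucket."""
    buckets = {}
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        buckets.setdefault(ts - ts % bucket_seconds, []).append(value)
    return [
        {
            "start": start,
            "min": min(vals),
            "max": max(vals),
            "mean": mean(vals),
            "count": len(vals),
        }
        for start, vals in sorted(buckets.items())
    ]

# Hypothetical data: one reading per second for five minutes.
raw = [(t, 20 + (t % 7)) for t in range(300)]
summary = roll_up(raw)
print(len(raw), "raw samples ->", len(summary), "summaries")
```

Keeping the minimum, maximum, and count alongside the mean preserves some of the shape of the data, though any pattern finer than the bucket size is still lost.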
Reading the quote above, I am struck by how similar it is to the DIKW pyramid:
Data becomes Information. Information becomes Knowledge. Knowledge becomes Wisdom. Each step requires further filtering and manipulation to reach the final desired form, and it’s possible that stopping in the middle is perfectly acceptable, too!
A different view of the DIKW pyramid was proposed by Milan Zeleny in 1987 [Zeleny], mapping the knowledge hierarchy to: data as know-nothing, information as know-what, knowledge as know-how, and wisdom as know-why.
I really like this explanation — it stands on its own better than the DIKW pyramid — the terms are self-explanatory. It definitely describes stages in operational maturity, too!
In mapping Stiegler’s filters to IT/DevOps I would reverse their order: completeness, reliability, significance, then substance.
Completeness ties directly into the “Monitor all the Things!” ideal (as opposed to Big Brother’s “Monitor all the Thinks!”). It is difficult to achieve an accurate picture of a system without completeness. Granted, when there are many variables, some need to be held fixed in order to understand the system and draw conclusions, but that is what the other filters provide.
Reliability cuts across a number of areas. Can we reliably capture and store the right metrics? Are the metrics being masked by some other event? Are we causing a Heisenberg issue, where the very act of monitoring a system skews it badly? I’ve been doing some work recently building out an infrastructure on a Raspberry Pi cluster. One tool for monitoring the behavior of the machine takes ten percent of the CPU. It is, perhaps, reliably capturing and transmitting the metrics as it sees them, but it places a large load on the system to do so. That load, in turn, is likely affecting the behavior of the other parts of the system.
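A rough way to check whether a collector is skewing the system it watches is to compare the CPU time it burns against the wall-clock time that passes. The sketch below does this inside a single Python process; `collector_cpu_share` and both probe functions are hypothetical stand-ins, not the tool from the Raspberry Pi cluster.

```python
import time

def collector_cpu_share(collect, rounds=20, interval=0.01):
    """Estimate the fraction of one core spent running `collect`."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(rounds):
        collect()
        # Sleeping adds wall-clock time but almost no CPU time.
        time.sleep(interval)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used

def cheap_probe():
    # Stand-in for a lightweight metric read.
    return len("ok")

def expensive_probe():
    # Stand-in for a collector that does a lot of work per sample.
    return sum(i * i for i in range(200_000))

cheap = collector_cpu_share(cheap_probe)
costly = collector_cpu_share(expensive_probe)
print(f"cheap probe:  {cheap:.1%} of one core")
print(f"costly probe: {costly:.1%} of one core")
```

The exact numbers depend on the machine and on clock resolution, so they are only useful for comparing collectors on the same host.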
Significance is that which differentiates the needle from the haystack. However, there are many types and sizes of needles; just because a metric happens to fit a hypothesis does not mean that it is the root cause or even involved in an issue. Humans are very good at seeing patterns, especially those which do not exist.
Statistics can be made to prove anything – even the truth. — Author Unknown
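The tendency to see patterns that do not exist is easy to reproduce. As a small sketch with entirely made-up data, the code below generates 500 random metrics that have nothing to do with a random "incident" series, then picks the one that correlates best. The names, sizes, and seed are arbitrary assumptions.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rng = random.Random(42)  # fixed seed so the run is repeatable
points = 30

# A made-up series standing in for, say, latency during an incident.
incident = [rng.gauss(0, 1) for _ in range(points)]

# 500 metrics that are pure noise and unrelated to the incident.
metrics = {
    f"metric_{i}": [rng.gauss(0, 1) for _ in range(points)]
    for i in range(500)
}

best_name, best_r = max(
    ((name, pearson(incident, series)) for name, series in metrics.items()),
    key=lambda pair: abs(pair[1]),
)
print(f"{best_name} correlates at r = {best_r:+.2f}, purely by chance")
```

With enough candidate metrics and few enough data points, something will always appear to fit the hypothesis, which is why a strong correlation on its own says little about root cause.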
Substance goes hand-in-hand with significance. It refers to “[the actual matter of a thing, as opposed to the appearance or shadow; reality](http://dictionary.reference.com/browse/substance).” The conclusions which we draw from analysis of metrics need to be substantive; they need worth and meaning. The combination of substance and significance will help remove the causative/correlative fallacy.
Re-evaluating beliefs and viewing ideas in a different light is very useful. Taking the original quote and examining how it maps to another system, namely DevOps metrics and measurement, has been a very useful exercise for me. It has me wondering which one to visit next!
[Zeleny]: Zeleny, Milan (1987). “Management Support Systems: Towards Integrated Knowledge Management”. Human Systems Management 7 (1): 59–70.