A while ago I wrote on Facebook that I had started using InfluxDB and Grafana to analyze my heating control system. I facepalmed over the fact that I hadn't used them for this before, after all the time I had been in contact with both tools at customer sites for performance gigs. When you usually use the tools in that manner, you always think about IO rates or memory consumption, but not about valve states and buffer temperatures.

When customers ask me to support them through a change of their software, I usually tell them to record the behaviour before and after the change. "Record" is meant in a really loose manner here ... ranging from just dumping the output of a few commands into a directory before and after the change, to using tools like GUDS to record absolutely everything.

I'm quite a fan of proactively recording everything. You often don't know what will kill your performance before it kills your performance. Measuring only selected data is based on your assumptions about where a problem might happen. You can make professional guesses based on your experience and you may be right, but I've been doing this job for 16.5 years now and there was always a situation where I thought "If I had suggested to the customer to record the behavior of this value, everything would be much easier. Dang!".

In this article I want to write about an "almost everything" approach. As I mentioned in the past, kstat is the source of a lot of truth when working with Solaris. So what's more obvious than dumping all the data of kstat into some kind of database?

At a number of my customers the central repository of statistical information is an InfluxDB and they are using Grafana to visualize it. So I had to use it as well. No discussion. Yesterday evening I reproduced in Python the mechanism I've used at a customer, as far as I still had it in my head. You shouldn't need any additional packages or Python modules.

The script is really simple. It just takes the complete output of kstat -p and dumps it into an InfluxDB. It uses "kstat" as the name of the measurement, the name of the system as host, and tags according to the kstat naming scheme: module, instance, name and statistic. The value is either in value (numeric ones) or in valuestring (when the kstat value is a string). A number of lines are ignored, and tags are cleaned up by substituting a number of characters with a "_". The data is imported into the database in batches of 500.
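To make the mapping a bit more concrete, here is a minimal sketch of how one line of kstat -p output could be turned into InfluxDB line protocol and grouped into batches of 500. This is not the code from the repository. The function names, the character set in clean_tag, the sample data and the use of the InfluxDB 1.x /write endpoint are all assumptions made for illustration. The sketch only skips malformed lines; which lines the real script ignores is not reproduced here.

```python
import re
import urllib.request

# Hypothetical sample in the "module:instance:name:statistic<TAB>value"
# layout that kstat -p prints.
SAMPLE = (
    "cpu_info:0:cpu_info0:clock_MHz\t2400\n"
    "cpu_info:0:cpu_info0:brand\tIntel(r) Xeon(r) CPU\n"
    "unix:0:system_misc:avenrun_1min\t12\n"
)

NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def clean_tag(text):
    """Replace every character outside a conservative set with '_'.
    The exact set is an assumption, not the one used by the real script."""
    return re.sub(r"[^A-Za-z0-9_.-]", "_", text)


def to_line(raw, host, timestamp):
    """Turn one kstat -p line into one line of InfluxDB line protocol.
    Returns None for lines that do not have the expected layout."""
    key, sep, value = raw.partition("\t")
    parts = key.split(":", 3)
    if not sep or len(parts) != 4:
        return None
    module, instance, name, statistic = (clean_tag(p) for p in parts)
    tags = (
        f"kstat,host={clean_tag(host)},module={module},"
        f"instance={instance},name={name},statistic={statistic}"
    )
    if NUMERIC.fullmatch(value):
        # Numbers without a suffix are stored as floats by InfluxDB.
        field = f"value={value}"
    else:
        # String fields are double-quoted; escape backslashes and quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        field = f'valuestring="{escaped}"'
    return f"{tags} {field} {timestamp}"


def batches(lines, size=500):
    """Yield request bodies containing at most `size` lines each."""
    for start in range(0, len(lines), size):
        yield "\n".join(lines[start:start + size])


def write_batch(base_url, database, body):
    """POST one batch to an InfluxDB 1.x style /write endpoint
    (assumed setup: no authentication, timestamps in seconds)."""
    url = f"{base_url}/write?db={database}&precision=s"
    request = urllib.request.Request(
        url, data=body.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.status


if __name__ == "__main__":
    converted = [to_line(r, "myhost", 1700000000) for r in SAMPLE.splitlines()]
    for body in batches([line for line in converted if line]):
        print(body)
```

On a real system the input would come from running kstat -p instead of the SAMPLE string, and each batch would be handed to write_batch with the URL and database name from your own setup.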

Disclaimer: The script is horrible Python code, it may contain gaping security holes or accidentally trigger the destruction of the system. The script doesn't catch any errors. It's some kind of proof of concept. Do whatever you want to do with it. Before using it on your system, check it and adapt it to your own needs. But don't ask me.

You can download the code at GitHub. It needs a config file. A sample is available at GitHub as well. You use it like this: ./kstat2influx.py kstat2influx.cnf

You can run it periodically, for example every minute or every 10 minutes. However, keep in mind: it dumps 24000 values into your InfluxDB ... per execution of the script. You should plan accordingly.
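To get a feeling for what "plan accordingly" means, a quick back-of-the-envelope calculation helps. The sketch below just multiplies the 24000 values per run quoted above by the number of runs per day. The function name is made up for this example, and the real number of values will differ from system to system.

```python
VALUES_PER_RUN = 24000  # figure quoted in the text; varies per system


def points_per_day(interval_minutes, values_per_run=VALUES_PER_RUN):
    """Number of values written per day for a given run interval."""
    runs_per_day = 24 * 60 // interval_minutes
    return runs_per_day * values_per_run


for interval in (1, 10):
    print(f"every {interval} min: {points_per_day(interval):,} values per day")
```

With a run every minute that is 34,560,000 values per day, with a run every 10 minutes it is 3,456,000.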

Perhaps it's useful to someone as a starting point.

1 Comment

  • Richard Elling  
    I've been doing this for several years and it works reasonably well.

    One nice thing about kstats is that they tend to be counters. This makes downsampling very easy in InfluxDB continuous queries.