About tuning
Recently I was doing some work on tuning systems. There is one thing I really hate about this area of computing: tuning scripts. You find them easily on Google, and I find them on customer systems quite often. Simply said: I hate them.

The reasons are simple. For example, I recently found a system with a network tuning script dating back to 2003 or so. It was meant to increase some of the settings. However, many of them already have higher defaults in current Solaris 10 versions, so the tuning script effectively reduced those parameters and with them the performance.

Furthermore, tuning is a lot about understanding things. Understanding how things work together, on a systemic and on an architectural level. How an application puts load on all the other components. Just dropping a script found via Google into /etc/init.d is not about understanding things. You have to carefully consider the impact of each change from the default. You have to check whether each setting has been overtaken by the years. You have to recheck it with every major update of your environment. You have to recheck it with each new technology you use in your system. Network tuning scripts dating back to a time when 100 MBit/s was normal and 1 GBit/s was fast aren’t necessarily up to the task at a time when 10 GBit/s is fast and InfiniBand IPoIB networks deliver even more.

You had to turn different knobs at a time when CPU time was precious: you tuned for minimum CPU utilization. CPU isn’t a large factor today; you tune to minimize latency or to maximize throughput. You have to know what you are aiming for, because minimum latency and maximum throughput are often mutually exclusive. Do you want one extreme or a target in between? Just using a script to tune something doesn’t lead you through all the thinking needed to make really good tuning decisions. There are some basic rules from my point of view:
- For Solaris there is a document called “Oracle Solaris Tunable Parameters Reference Manual” that is updated with each major update. It’s available online. Check your tuning script against this document at each update to ensure that your settings still make sense. The 817-0404 is a great read anyway, as it gives you a condensed overview of the official knobs you can turn.
- Each tuning script has to be carefully designed for each system and each workload. An /etc/system for a database may make no sense for a webserver, or may even reduce its performance, and vice versa. That’s the reason why I don’t like tuning scripts as part of JumpStart configurations. An NFS server delivering many small files to many clients may need different tunings than a server delivering a few large files to a few clients.
- Understand! Understand how the components of your architecture interact. Understand the load. Monitor the load for a while. Tuning is first and foremost about doing a lot of research, and only at the very end about putting some lines into some scripts. That’s one of the reasons why I like tools like DTrace in general, and the scripts from the DTraceToolkit in particular. Those tools are like the professional tools of a mechanic. They can help you to understand the load. The equivalent of the basic set of screwdrivers, all the *stat tools, are helpful too. You need them to find out where to start with the more advanced tools, which are often too narrowly focused for a first look.
- Have a benchmark load. Have a mechanism to reproduce the load on the system. The more realistic and the less synthetic, the better. That way you can quantify the impact of each tuning.
- Change just one knob at a time. Otherwise you will never know which setting really helped and which setting hurt the performance.
- Of course, experience helps a lot in shortening the research phase, and sometimes there are some low-brainer settings. But even when you think “Nah … this is just a webserver. Let’s use script number 7b”, you may be surprised by interesting side effects. So there is no such thing as a no-brainer in tuning.
- Large standard applications often have a set of tuning parameters that the vendor suggests as a good baseline. First, look them up in the OS tuning guide to check whether the ISV's suggestions have been overtaken by the defaults in the OS. Second, check them against the vendor's manuals with each release. I once had a performance problem at a customer who used the tuning parameters suggested for, let's say, version 2, but they weren't that reasonable for version 5.
- It's good practice to create two baselines. First, check the service performance on a totally untuned system: modify only the settings you need to get the software up and running. Run your benchmark. Then configure the system with the vendor-suggested parameters. Rerun the benchmark. Now do your own tunings. Rerun the benchmark with each knob you turn. And ask yourself: is the performance impact worth the deviation from the standard parameters of a plain vanilla installation? With such baselines you can assess the impact and thus the value.
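
The check of an old tuning script against current defaults, as described in the rules above, can be sketched in a few lines of Python. This is only an illustration: the parameter names and all numbers are made-up placeholders, not real Solaris tunables or defaults, and the sketch assumes that a larger value is the "more tuned" direction, which is not true for every knob. The real defaults have to come from the reference manual for your release.

```python
# Illustrative sketch: names and numbers are placeholders, NOT real Solaris defaults.

def audit_tuning(script_values, current_defaults):
    """Classify each setting of an old tuning script against today's defaults.

    Assumes "larger is more tuned", which does not hold for every parameter.
    """
    findings = {}
    for name, tuned in script_values.items():
        default = current_defaults.get(name)
        if default is None:
            # The parameter may have been renamed or removed in the meantime.
            findings[name] = "unknown, check the reference manual"
        elif tuned < default:
            findings[name] = "lowers the default"
        elif tuned == default:
            findings[name] = "redundant"
        else:
            findings[name] = "raises the default"
    return findings


# Hypothetical values found in an old script.
old_script = {
    "send_buffer": 65536,
    "recv_buffer": 65536,
    "conn_queue": 1024,
    "legacy_knob": 1,
}

# Hypothetical defaults of the currently installed release.
defaults_today = {
    "send_buffer": 49152,
    "recv_buffer": 128000,
    "conn_queue": 1024,
}

report = audit_tuning(old_script, defaults_today)
for name, verdict in sorted(report.items()):
    print(f"{name}: {verdict}")
```

Everything flagged as "lowers the default" or "unknown" is a candidate for removal from the script, but only after you have understood why it was there in the first place.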
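
The baseline and one-knob-at-a-time rules above can also be written down as a small bookkeeping sketch. The assumptions are mine: the benchmark reports a throughput number where higher is better, each configuration is measured several times, the knob names and measurements are invented, and the 5 % threshold for "worth deviating from the defaults" is an arbitrary example, not a recommendation.

```python
from statistics import median

# Illustrative sketch: knob names, measurements and threshold are invented.

def relative_change(baseline_runs, candidate_runs):
    """Relative change of the median result compared to the baseline median."""
    base = median(baseline_runs)
    return (median(candidate_runs) - base) / base


def evaluate_knobs(baseline_runs, runs_per_knob, min_gain=0.05):
    """Judge each single-knob change against the same baseline.

    Returns {knob: (relative_change, worth_keeping)}.
    """
    results = {}
    for knob, runs in runs_per_knob.items():
        change = relative_change(baseline_runs, runs)
        results[knob] = (change, change >= min_gain)
    return results


# Hypothetical throughput measurements (requests per second).
untuned = [1000, 1010, 990]
single_knob_runs = {
    "knob_a": [1100, 1120, 1090],  # only knob_a changed
    "knob_b": [1010, 1020, 1000],  # only knob_b changed
    "knob_c": [900, 950, 940],     # only knob_c changed
}

results = evaluate_knobs(untuned, single_knob_runs)
for knob, (change, keep) in sorted(results.items()):
    verdict = "keep" if keep else "not worth the deviation"
    print(f"{knob}: {change:+.1%} -> {verdict}")
```

The same function can be run a second time with the vendor-suggested configuration as the baseline, which gives the second of the two baselines described above.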