“Just apply an additional index or some cool caching. That should do the trick.”
As a long-time software development geek, I notice that more and more junior developers are unaware of the fundamentals behind performance optimization strategies. Often they just throw in some caching, indexes, or better hardware and hope that this removes the bottleneck they just spotted. Most of the time this works as expected, and all is well. However, I often get the feeling that they lack, or have lost, the theoretical reasoning that each performance optimization technique is based upon.
During my studies (I’m talking pre-2000 here) we were taught that all performance optimizations can in fact be reduced to a rebalancing of a few correlated process parameters. I have been searching for a web page, paper, or reference of any kind that describes this rebalancing in an understandable way, but so far my search has not been successful.
That is the reason for writing this post. I call it the DOV throughput theorem because the idea is applicable not only to software development but to countless other domains where a process needs to be optimized. Please feel free to prove it using a scientific method, or to provide references to comparable and related theories.
Let’s keep it plain and simple and bring all performance optimizations back to the following one-liner:
“When a process is described in terms of Duration, Operation and Volume,
one of these three parameters can be improved by increasing or decreasing the other two.”
Of the three parameters in this statement, Duration is probably the easiest to understand: a process takes more or less time depending on the other factors. Operation specifies the number of operations available for processing.
This can be as broad as the number of robots and/or production lines when producing cars in a factory, the number of CPU cycles available to a software application, or even the number of people assigned to perform the same labor.
The last one, Volume, is the number of items that can be processed: cars when talking about a factory, kilo/mega/gigabytes when talking about a software application, or even the number of dollars a business is able to produce.
The theorem here is that it is impossible to change one of the DOV process parameters without changing at least one of the other two as well.
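To make the trade-off concrete, here is a minimal sketch in Python using the standard-library `zlib` module: by raising the compression level we spend more CPU effort (Operation) and time (Duration) in exchange for a smaller result (Volume). The payload is an arbitrary repetitive string chosen purely for illustration.

```python
import time
import zlib

# A hypothetical payload: ~1 MB of repetitive text (illustrative only).
payload = b"the quick brown fox jumps over the lazy dog " * 25_000

for level in (1, 6, 9):  # higher level = more Operation spent per byte
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start          # Duration
    print(f"level={level}  duration={elapsed:.4f}s  "
          f"volume={len(compressed)} bytes")        # Volume
```

On repetitive input like this, the higher levels produce output that is at least as small as level 1, but they take measurably longer to run: Duration and Operation go up so that Volume can come down.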
Below are a few examples that show how the DOV parameters can be spotted in everyday performance optimization processes:
- Video codecs: Increased CPU load (Operation) and additional time (Duration) to decrease the final size of a video file (Volume).
- File compression: Increased CPU load (Operation) and additional time (Duration) to decrease the size of a file (Volume) needed to transfer or store on disk.
- Web caching: Store additional copies (Volume) in order to reduce the total number of requests (Operation) and/or time it takes to execute a request (Duration).
- Indexes: Increased disk usage (Volume) to reduce the needed record comparisons (Operation) and time (Duration) to find matching records during database operations.
- Finance: Increase the amount of work (Operation) done during a work-day (Duration) to increase the amount of money (Volume) paid.
- Driving: Increase the distance (Volume) by driving longer (Duration) and/or faster (Operation).
- Drying clothing: Switch from hanging clothes outside to using a tumble dryer (Operation) to reduce the time (Duration) it takes for them to dry.
- Boiling water: Decrease the time (Duration) it takes for water to boil by decreasing the amount (Volume) of water or increasing the heat (Operation).
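The index example above can be sketched in a few lines of Python. The record layout and sizes here are made up for illustration: a plain dict stands in for a database index, trading extra memory (Volume) for far fewer record comparisons (Operation) per lookup.

```python
# Hypothetical dataset: 100,000 records with an "id" and a "name".
records = [{"id": i, "name": f"user{i}"} for i in range(100_000)]

def find_by_scan(target_id):
    """Without an index: a full scan, up to len(records) comparisons (Operation)."""
    for rec in records:
        if rec["id"] == target_id:
            return rec
    return None

# Build an "index": extra memory (Volume) spent once, up front.
index = {rec["id"]: rec for rec in records}

def find_by_index(target_id):
    """With the index: a single hash lookup, regardless of dataset size."""
    return index.get(target_id)
```

Both functions return the same record; the indexed version simply pays its cost in Volume (the stored mapping) instead of Operation and Duration on every query.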
I am not a scientist, so forgive the lack of evidence. This post is merely intended to offer a unified yet easy-to-understand way to describe the mechanisms that drive most known performance optimization techniques. Feel free to criticize (constructively), discuss, and/or provide references to websites with additional information about the topic.
Thanks in advance!