"Anatomy of a flawed microbenchmark" highlights some of the problems involved in creating microbenchmarks (Java-oriented).
The scary thing about microbenchmarks is that they always produce a number, even if that number is meaningless. They measure something; we’re just not sure what. Very often, they only measure the performance of the specific microbenchmark, and nothing more. But it is very easy to convince yourself that your benchmark measures the performance of a specific construct, and erroneously conclude something about the performance of that construct.
Even when you write an excellent benchmark, your results may be only valid on the system you ran it on. If you run your tests on a single-processor laptop system with a small amount of memory, you may not be able to conclude anything about the performance on a server system.
This is why I prefer to benchmark whole requests with ab (ApacheBench) rather than timing code snippets with microtime().
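To make the "always produces a number" point concrete, here is a minimal sketch of the usual timing-loop pattern. It is written in Python rather than PHP purely for illustration, and the names `time_loop`, `noop`, `concat` and `report` are hypothetical helpers invented for this example, not part of any library.

```python
import time


def time_loop(body, iterations):
    """Return wall-clock seconds spent calling body() `iterations` times."""
    start = time.perf_counter()
    for _ in range(iterations):
        body()
    return time.perf_counter() - start


def noop():
    """Empty body: timing this measures loop and call overhead only."""
    pass


def concat():
    """Hypothetical 'construct under test'.

    CPython typically folds this constant expression at compile time,
    so the loop may never perform a concatenation at all.
    """
    return "a" + "b"


def report(iterations=100_000):
    """Time the empty body and the construct; both always yield a number."""
    baseline = time_loop(noop, iterations)
    measured = time_loop(concat, iterations)
    # Share of the measured time that the empty loop alone accounts for.
    share = baseline / measured if measured > 0 else float("nan")
    return {"baseline": baseline, "measured": measured, "overhead_share": share}


if __name__ == "__main__":
    print(report())
```

Both timings come back as plausible-looking figures, yet nothing in the output says how much of the "measured" value is loop and function-call overhead, or whether the construct was executed at all. The exact values will differ from machine to machine and from run to run.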
Another good example is people measuring the performance overhead of parsing PEAR.php and then mentally multiplying this by the number of PEAR packages that use PEAR.php in their code. Obviously, adding infrastructure code in userland is going to incur overhead; however, the point of such code is to make your general life easier. Therefore you are likely to make heavy use of said infrastructure code, which means the parsing overhead diminishes relative to how much you use it. Obviously, though, you would want PEAR.php to have efficient code, since it's likely that the code contained therein is going to be called often.
Indeed. Good point.
One might make the same point about breaking large code bases up into multiple small files. Bad for microbenchmarking, good for flexibility, reuse and modularity.
[...] Microbenchmarks of single and double quoting. I wrote earlier about flawed microbenchmarks. Today on SitePoint, there was a post on the perf [...]