The Importance of Optimizing the Right Thing
I’m following up on my previous post because the Eigen maintainers gave me some feedback on the PR’s performance. You may recall it had two parts:
- Taking advantage of the identity right-hand side (RHS) when producing a Q matrix from \(Q \times I\)
- Switching the inner (result column \(j\)) and outer (Householder vector \(h_k\)) loops and precalculating \({\beta}_k h_k\) (sketched below)
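To make the second change concrete, here is a minimal sketch of the restructured loop. This is my own illustration rather than Eigen’s actual code: the function and variable names are mine, it assumes real-valued matrices, and it stores each reflector \(h_k\) as a full-length column of H.

```cpp
#include <Eigen/Dense>

// Compute A <- Q * A, where Q = H_0 H_1 ... H_{m-1} and each reflector is
// H_k = I - beta_k h_k h_k^T. The reflector loop is outermost, so
// beta_k * h_k is formed once per reflector instead of once per
// (reflector, column) pair.
void applyHouseholderSequence(const Eigen::MatrixXd& H,   // column k holds h_k
                              const Eigen::VectorXd& beta,
                              Eigen::MatrixXd& A) {
  for (Eigen::Index k = H.cols() - 1; k >= 0; --k) {      // outer: reflectors
    const Eigen::VectorXd bh = beta(k) * H.col(k);        // precalculate beta_k h_k
    for (Eigen::Index j = 0; j < A.cols(); ++j) {         // inner: result columns
      // A_j -= beta_k h_k (h_k^T A_j)
      A.col(j) -= bh * H.col(k).dot(A.col(j));
    }
  }
}
```

With a square RHS the inner loop visits every column, so each precalculated \({\beta}_k h_k\) is reused many times - exactly the case my benchmarks measured.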
The first of those was acknowledged and quickly taken into Eigen in a somewhat different form. The second did not do so well.
Reality Strikes
One maintainer ran my changes on a selection of “real world” matrices taken from the SuiteSparse Matrix Collection using his own benchmark code and found a moderate to significant slowdown, in contrast to my measured 2-3X speedup. So what went wrong?
The real-world matrices differed from mine in at least two significant ways:
- They were considerably sparser. My test matrices were 5%, 8%, and 20% dense, while theirs varied between 0.5% and 1.5%.
- Mine were uniformly random - that is, every entry had an equal, independent chance of being nonzero. Real-world matrices are often “lumpy”, with their nonzeros clustered rather than spread evenly (see the sketch below).
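To make “uniformly random” concrete, here is roughly how a test matrix like mine could be generated. This is a hypothetical reconstruction - the helper name, seed, and value range are mine - but it captures the key property: each entry is independently nonzero with probability equal to the target density.

```cpp
#include <random>
#include <Eigen/Dense>

// Build an n x n matrix in which every entry has the same independent
// chance of being nonzero, so nonzeros are spread evenly rather than
// clustered the way they often are in real-world matrices.
Eigen::MatrixXd uniformlyRandomMatrix(Eigen::Index n, double density) {
  std::mt19937 gen(12345);                         // fixed seed for repeatability
  std::bernoulli_distribution isNonzero(density);  // e.g. 0.05 for 5% dense
  std::uniform_real_distribution<double> value(-1.0, 1.0);

  Eigen::MatrixXd A = Eigen::MatrixXd::Zero(n, n);
  for (Eigen::Index i = 0; i < n; ++i)
    for (Eigen::Index j = 0; j < n; ++j)
      if (isNonzero(gen)) A(i, j) = value(gen);
  return A;
}
```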
In addition, solving against a single vector (which involves multiplying by \(Q^*\)) turned out to be an important metric. In that case there is only one result column, so \({\beta}_k h_k\) is used exactly once per reflector and precalculating it saves nothing. My tests always used square RHS matrices of the same size as \(Q\), which is the most favorable case for the precalculation.
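For concreteness, the single-vector case looks like this through Eigen’s public HouseholderQR API (these are standard Eigen calls; the sizes are arbitrary):

```cpp
#include <Eigen/Dense>

int main() {
  const Eigen::Index n = 1000;
  Eigen::MatrixXd A = Eigen::MatrixXd::Random(n, n);
  Eigen::VectorXd b = Eigen::VectorXd::Random(n);

  Eigen::HouseholderQR<Eigen::MatrixXd> qr(A);
  // solve() applies Q^* to the single-column b, then back-substitutes
  // with R. With one result column, beta_k * h_k is used exactly once
  // per reflector, so there is nothing for a precalculation to amortize.
  Eigen::VectorXd x = qr.solve(b);
  return 0;
}
```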
Conclusion
I’m going to take another look at this later - I still have some ideas that might help - but the lesson learned for now is to be sure you have the right metrics when you optimize.