After actually implementing affine decomposition, it's clear why the method is very expensive and not appropriate for a game's runtime. However, it's still useful for offline tools. An animation tool usually works only with what is being animated, so there are far fewer matrices to potentially decompose.
The underlying data and the matrices used for rendering don't have to match one to one. You can decompose a matrix without touching the data it was built from. Similarly, the output data does not need to match the input data. Affine decomposition is most helpful when you design animations with proper squash and stretch, then export touched-up data. The touched-up data is the decomposed result: it's not the same as the input data, but it is what looks correct.
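As a rough illustration of that export workflow, here is a minimal sketch. It is an assumption-heavy stand-in, not the decomposition developed in this article: it works in 2D, only handles matrices with a positive determinant, and uses the closed-form 2x2 polar decomposition to split a matrix into a rotation angle and a symmetric stretch. The names `polar_decompose_2d` and `export_animated`, and the layout of the input dictionary, are hypothetical.

```python
import math


def polar_decompose_2d(m):
    """Split a 2x2 matrix M (det > 0) into a rotation angle and a symmetric
    stretch matrix S such that M = R(theta) * S."""
    (a, b), (c, d) = m
    if a * d - b * c <= 0.0:
        raise ValueError("sketch only handles matrices with a positive determinant")
    # Closed-form rotation factor of the 2x2 polar decomposition.
    theta = math.atan2(c - b, a + d)
    cs, sn = math.cos(theta), math.sin(theta)
    # S = R^T * M, which is symmetric up to floating point error.
    stretch = [
        [cs * a + sn * c, cs * b + sn * d],
        [-sn * a + cs * c, -sn * b + cs * d],
    ]
    return theta, stretch


def export_animated(render_matrices, animated_names):
    """Decompose only the matrices of animated nodes (hypothetical layout:
    name -> (2x2 linear part, translation)). The input is never modified."""
    exported = {}
    for name in animated_names:
        linear, translation = render_matrices[name]
        theta, stretch = polar_decompose_2d(linear)
        exported[name] = {
            "translation": tuple(translation),
            "rotation": theta,
            "stretch": stretch,
        }
    return exported


# A 90 degree rotation applied after a squash/stretch of (2, 0.5).
render_matrices = {
    "arm": ([[0.0, -0.5], [2.0, 0.0]], (1.0, 3.0)),
    "static_prop": ([[1.0, 0.0], [0.0, 1.0]], (0.0, 0.0)),
}
exported = export_animated(render_matrices, ["arm"])
```

Only the animated node is decomposed, and `render_matrices` is left untouched, so the exported data can differ from whatever the tool stores internally.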
Finally, there is room for improvement! The algorithms presented here focus on being easy to implement, not on speed or efficiency. There are alternate algorithms for almost every step of the affine decomposition, and most of them are faster and more efficient.