I'm currently analysing a few such cases. The gains might not seem (or even be) large enough to be worth thinking about, but I'd still like to optimise this particular module as much as I reasonably can.
One thing I've just done is replace calls to math.atan2 with math.atan, which is noticeably faster. atan2 is the safer option and handles cases that atan can't (a zero denominator, and telling all four quadrants apart), so I've only swapped atan2 for atan in places where I know those cases are impossible.
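To make the "impossible scenarios" concrete, here is a minimal sketch of when the swap is safe. It assumes Python's math module (the same function names exist elsewhere), and the helper names angle_fast and angle_safe are made up for illustration. The point is that atan(dy / dx) only agrees with atan2(dy, dx) when dx is strictly positive.

```python
import math

def angle_safe(dy, dx):
    # atan2 works for any (dy, dx) except (0, 0) ambiguity and
    # returns an angle in (-pi, pi], covering all four quadrants.
    return math.atan2(dy, dx)

def angle_fast(dy, dx):
    # Hypothetical fast path: only valid when the caller can
    # guarantee dx > 0. dx == 0 raises ZeroDivisionError, and
    # dx < 0 gives an angle that is off by pi.
    return math.atan(dy / dx)

# With dx > 0 the two functions agree.
for dy, dx in [(1.0, 2.0), (-3.0, 0.5), (0.0, 1.0)]:
    assert math.isclose(angle_fast(dy, dx), angle_safe(dy, dx))

# With dx < 0 they differ by pi, which is why the swap is only
# safe where that case cannot occur.
difference = angle_safe(1.0, -1.0) - angle_fast(1.0, -1.0)
assert math.isclose(difference, math.pi)
```

If dx can ever be zero or negative at a given call site, atan2 should stay there.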
From my tests, atan seems to be about 40% to 50% faster, which is great, but when you consider that a single atan2 call only takes roughly 0.000125ms, the absolute gains don't seem particularly impressive. If I create dynamic shadows for 20 objects per frame, with roughly 12 uses of atan2 per object (240 calls in total), then using atan saves me about 0.0165ms per frame.
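As a rough sanity check on those numbers, the per-frame arithmetic can be sketched like this. The constants come from the figures above; reading "40% to 50% faster" as a 40% to 50% reduction in time per call is an assumption, and the function name saving_ms is just for illustration.

```python
import math

ATAN2_MS = 0.000125      # measured cost of one atan2 call, in ms
OBJECTS = 20             # shadow-casting objects per frame
CALLS_PER_OBJECT = 12    # atan2 uses per object

calls = OBJECTS * CALLS_PER_OBJECT    # 240 calls per frame
atan2_total_ms = calls * ATAN2_MS     # total atan2 cost per frame

def saving_ms(fraction):
    # Assumption: "X% faster" means each call takes X% less time.
    return atan2_total_ms * fraction

low, high = saving_ms(0.40), saving_ms(0.50)
print(f"atan2 per frame: {atan2_total_ms:.4f} ms")
print(f"estimated saving: {low:.4f} to {high:.4f} ms")
```

Under that reading the total atan2 cost is about 0.03ms per frame and the saving lands between 0.012ms and 0.015ms, the same order of magnitude as the figure quoted above.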