Thursday, June 19, 2008

Day 159: Profiling

Today was my first in-depth profiling. Yesterday I did some looking over, but even with ruby-prof's excellent HTML output, it was taking me a while to get my head around exactly where the bottlenecks in my code were.

Then I found out that ruby-prof can output in a format that KCacheGrind understands. I also found out that KCacheGrind exists, which was likewise helpful, and suddenly I had a tool which could walk me through parts that needed optimizing.

My first surprise was that Set::merge was called FOUR MILLION TIMES. No, I'm not making that number up. When you ask a KuiObject for its attributes, children, etc, it collects all of the ones it knows about in a set and then merges it with the ones its superclass knows about. Except there was no caching here, it was doing it every single time.

Even that wouldn't have been a problem, because you'd think I wouldn't be asking what every single attribute of an object is except when I'm loading/saving (and then time is not as important). But it turns out that it was heavily used in comparison. Two KuiObjects are == if all their children, attributes, etc, are equal, and this requires me to call Set::merge above.

So I fixed the caching problem, and that sped things up. Then I made it so that two KuiObjects are == if their tags are ==. This changed a dozen+ string comparisons and countless object comparisons into one string comparison. That sped things up a lot.

The beauty of using profiling is that I can tell whether harebrained ideas I have for optimization actually work. For instance, I thought that the string comparisons were likely to also be too slow, and I figured symbol comparisons would be faster. Turns out, no, they're not, converting the strings to symbols and comparing them is a little bit slower than just comparing the strings to begin with.

My other optimization idea, to take a line of the form (angle / 10).to_i and change it to (angle.to_i / 10), did speed things up a bit, due to the fortunate elimination of floating point division.

At this point the game is smooth again in sectors with lots of ships, but not when they start shooting the hell out of someone. There are a few other spots in the code I can optimize, fortunately. I would like to avoid having to drop down to C, as I know that this more likely indicates a failure to optimize on my part than it does a shortcoming on Ruby's part.

No comments: