A story about memory

Here’s an old story.  It takes place in the distant past, right about lunchtime yesterday.

I was working on getting MMORPG Tycoon 2’s PCs to draw using instancing. I had assumed that this would be an easy task (I’d already converted buildings, monsters, and a few other things, after all), but it turned out to be much more complicated than I’d expected. See, VectorStorm’s support for instanced rendering only works for single objects, and the game’s PCs are made up of a hierarchy of six objects (head, torso, and left and right arms and legs). PCs aren’t being animated yet, but the plan is that they will be, so I didn’t want to smash those pieces together into a single object just to draw them using instancing right now; I’d only have to separate them again later.

Instead, I started setting up a new system, one which will eventually allow different player classes to be represented by different models. (This system is being used in the RenderTech 5 build, but different player classes still use a single model right now.) In brief, the system defines a ‘skeleton’, made up of one or more ‘bones’, and different models can be hung off of those bones, while still sharing the same animation data.

But there was a bug in this system.

My code looked a little something like this:

[sourcecode language="cpp"]
// Creating a skeleton: allocate the array of bone pointers, then each bone.
m_bones = new Bone*[m_boneCount];
for ( int i = 0; i < m_boneCount; i++ )
    m_bones[i] = new Bone();
[/sourcecode]

[sourcecode language="cpp"]
// Destroying a skeleton (as originally written).
for ( int i = 0; i < m_boneCount; i++ )
    vsDelete( m_bones );
vsDeleteArray( m_bones );
[/sourcecode]

(‘vsDelete’ and ‘vsDeleteArray’ are custom macros which call delete or delete [], and then set the value to NULL. Just in case any programmers were wondering.)

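These two bits of code run when a skeleton is created and destroyed; they create and destroy the set of bones which make up the skeleton. Programmers are likely to spot the mistake immediately: when destroying the bones, I had written vsDelete( m_bones ); where I meant vsDelete( m_bones[i] );. Since vsDelete sets its argument to NULL, the first pass through the loop destroyed the array itself (with the wrong kind of delete), and every later pass, along with the final vsDeleteArray, was handed a NULL pointer and did nothing. None of the individual bones were ever destroyed.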
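For comparison, here is a minimal sketch of what the corrected teardown might look like. The macro definitions are stand-ins written from the description above (delete, then set to NULL), not VectorStorm’s actual macros, and the `Bone` and `Skeleton` types, along with the `s_liveCount` counter, are placeholders invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the vsDelete / vsDeleteArray macros, written
// from the description in the text: delete (or delete []), then set to NULL.
#define vsDelete(x)      do { delete (x);    (x) = NULL; } while (0)
#define vsDeleteArray(x) do { delete [] (x); (x) = NULL; } while (0)

// Placeholder bone type. It counts live instances so that a leak shows up.
struct Bone
{
    static int s_liveCount;
    Bone()  { s_liveCount++; }
    ~Bone() { s_liveCount--; }
};
int Bone::s_liveCount = 0;

// Placeholder skeleton type which owns an array of pointers to bones.
class Skeleton
{
    Bone **m_bones;
    int m_boneCount;

    // Copying would lead to a double delete, so it is disabled.
    Skeleton( const Skeleton & ) = delete;
    Skeleton &operator=( const Skeleton & ) = delete;

public:
    explicit Skeleton( int boneCount ):
        m_bones( NULL ),
        m_boneCount( boneCount )
    {
        assert( boneCount > 0 );
        m_bones = new Bone*[m_boneCount];
        for ( int i = 0; i < m_boneCount; i++ )
            m_bones[i] = new Bone();
    }

    ~Skeleton()
    {
        // Each bone came from scalar new, so it gets a scalar delete.
        for ( int i = 0; i < m_boneCount; i++ )
            vsDelete( m_bones[i] );
        // The array of pointers came from new [], so it gets delete [].
        vsDeleteArray( m_bones );
    }
};
```

Creating a `Skeleton( 6 )` in a scope and letting it fall out of scope should leave `Bone::s_liveCount` back at zero. With the original loop, the count would stay at six.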

But there were some interesting gotchas. In VectorStorm, I use a custom allocator. I use it for a couple of reasons, most notably that I’ve added extra memory sanity checking to it. I can verify that nothing is writing outside its bounds, I can make sure that the correct deallocator is called for each allocation, I can print out a list of every block of memory that’s been allocated (and what bit of code allocated it), and I can verify that all allocated memory has been correctly deallocated before the game closes down. What worried me most about the code above was that my memory allocator wasn’t warning me that I’d called vsDelete( m_bones );. Yes, I didn’t actually want to destroy m_bones there, but from the allocator’s point of view, m_bones was allocated with new [], so I should have called vsDeleteArray( m_bones ); instead. It should have been jumping up and down about that error, but there was nothing. What’s more, the leak detector didn’t report the m_bones[i] objects as leaking, which they definitely were, due to that same error.
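To make those checks a bit more concrete, here is a rough sketch of how an allocator could catch a mismatched deallocator and list leaked blocks. This is not VectorStorm’s allocator; `TrackingAllocator`, `AllocKind`, and every method name here are invented for illustration, and the sketch leaves out the bounds checking entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>
#include <vector>

// Which flavour of new produced a block (hypothetical names).
enum AllocKind { Kind_Scalar, Kind_Array };

class TrackingAllocator
{
    // What we remember about each live block.
    struct Record
    {
        std::size_t size;
        AllocKind kind;
        const char *file;
        int line;
    };
    std::map<void*, Record> m_live;
    std::size_t m_bytesInUse;
    int m_mismatchCount;

public:
    TrackingAllocator(): m_bytesInUse(0), m_mismatchCount(0) {}

    // Allocate a block, recording its kind and where it was requested.
    void *Alloc( std::size_t size, AllocKind kind, const char *file, int line )
    {
        assert( file != NULL );
        void *p = std::malloc( size );
        if ( p == NULL )
            return NULL;
        Record r = { size, kind, file, line };
        m_live[p] = r;
        m_bytesInUse += size;
        return p;
    }

    // Free a block. Returns false if the block is unknown to us, or if it
    // is being freed with the wrong kind of deallocator.
    bool Free( void *p, AllocKind kind )
    {
        std::map<void*, Record>::iterator it = m_live.find( p );
        if ( it == m_live.end() )
            return false;
        bool matched = ( it->second.kind == kind );
        if ( !matched )
            m_mismatchCount++;
        m_bytesInUse -= it->second.size;
        m_live.erase( it );
        std::free( p );
        return matched;
    }

    int MismatchCount() const { return m_mismatchCount; }
    std::size_t BytesInUse() const { return m_bytesInUse; }

    // One line per block which is still allocated: "file:line (N bytes)".
    std::vector<std::string> LeakReport() const
    {
        std::vector<std::string> report;
        std::map<void*, Record>::const_iterator it;
        for ( it = m_live.begin(); it != m_live.end(); ++it )
            report.push_back( std::string( it->second.file ) + ":" +
                std::to_string( it->second.line ) + " (" +
                std::to_string( it->second.size ) + " bytes)" );
        return report;
    }
};
```

In a scheme like this, freeing an array block as `Kind_Scalar` bumps the mismatch count, and any bone which is never freed stays in the leak report. Those are the two complaints I was expecting from my own allocator and not getting.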

When you’re getting ready to release a build, this sort of thing can very quickly throw you into a panic. You realise that the safety checks you’d built into your system aren’t working for some reason, and you wonder what else might be going wrong that they aren’t warning you about. It was only random chance that you noticed there was a problem at all. How long has it been going on? And how much other code is affected, which you’ll need to debug and fix once you manage to get your safety checks working again?

After a lot of testing, I eventually worked out that the custom allocator had been slightly broken when we updated to requiring OpenGL 2.1, back in January’s MS3 build. It still worked for everything inside the VectorStorm engine (which is likely why I hadn’t noticed the problem before), but individual games using the engine were using the normal system allocator. That meant that their allocations weren’t being tracked, and their leaks weren’t being detected.

So I fixed the problem, but now MMORPG Tycoon 2 wouldn’t load at all — it claimed to be running out of memory.

I’d had it in my head that MMORPG Tycoon 2 used about 300 megabytes of RAM while open. I thought that because that’s as much as I allowed my custom allocator to use. But since the game’s own code wasn’t allocating through the custom allocator, most of the memory used by MMORPG Tycoon 2 wasn’t being counted. Once I checked with the OS, it was clear that MMORPG Tycoon 2 is actually using nearly a gigabyte of RAM, which is far, far more than I’d intended! But I needed to get the build out, so I just increased the amount of memory that MMORPG Tycoon 2 is allowed to use, and released the build. And today, I started hunting down where all that memory was going.

So here’s an interesting tidbit. MMORPG Tycoon 2’s ground data takes up 88 megabytes in RAM. And I had to pack it very, very carefully in order to make it that small. But that’s a drop in the bucket of how much RAM we’re using at the moment.

Of that gigabyte, about 400 megabytes is a set of quadtrees which I use to speed up collision tests against the ground. Yep, 88 megabytes for the ground data, and 400 megabytes to make it faster to check for collisions against that ground data. After thinking about it overnight, I think I can get that 400 megabytes down to about 100 bytes in total, without losing much of the speed. And not needing to update those 400 megabytes of data will probably make it much faster to build ground mesh. Those would both be very nice savings.

Another 100 megabytes is taken up by the data which is used while rebuilding bits of the visible terrain models. It’s not used for anything most of the time; it just sits there taking up space. I should be able to cut back on that substantially.

Beyond that, it starts dropping off pretty quickly. There’s probably another 100 megabytes in total which could be saved, outside of the above two.

Things could have been a lot worse. Once I had the custom allocator working for the game again, it turned out that I only had one memory leak (and it only happened during shutdown, so the OS would clean up after that immediately, anyway), and I didn’t have any apparent memory trashing bugs. The biggest issue was that I’d been wrongly assuming for so long that I was using a surprisingly small amount of memory, when in fact, I wasn’t.