Posts by Ben Carter
  1. Speeding up sparse data access with bitfields ( Counting comments... )
  2. The API from the Black Lagoon ( Counting comments... )
  3. 60000 people in a Very Big Room - Comic Market and indie games in Japan ( Counting comments... )
  4. Networking, APIs and C preprocessor abuse ( Counting comments... )
  5. The blunt but useful instrument that is XML serialisation in C# ( Counting comments... )
  6. Please wait, pausing... ( Counting comments... )
  7. A bird's eye view of your memory map ( Counting comments... )
  8. Of Jelly Cars and Other Trifles ( Counting comments... )
  9. FileSystemWatcher: A cool thing I didn’t know about until recently ( Counting comments... )
  10. How not to Get Things Done ( Counting comments... )
  11. My Elven warrior has PTSD ( Counting comments... )
  12. Elegance in Failure: How and when to crash horribly ( Counting comments... )
  13. Slim and light multitasking with fibers ( Counting comments... )
  14. Why Names Matter ( Counting comments... )

Generic heaps are a pain. As soon as you step outside the world of fixed-size arrays and linear allocators, life becomes a painful whirlwind of fragmentation, leaks, and out-of-memory border cases. The standard remedy to this is either (a) to throw memory at the problem until it goes away, or (b) spend hours tracking down individual allocations that may be causing problems. Now, if that was the end of the story it might be fine, but in my experience memory bugs are a gift that keeps on giving, and as such what "should" be a one-off effort becomes something like a bad TV detective show - a weekly trawl through the seedy underbelly of the code in search of a scapegoat. Er. Culprit.

At this point I would like to present a universal cure-all for such problems... but since I have absolutely no clue what that would be, here's a faintly cute trick that can serve to dull the pain of the regular treasure hunt.

Basically, the idea is this - most memory tracing tools focus on taking precise snapshots of activity, allowing the tracking of a single rogue allocation with great accuracy. Which is very, very handy if that's what you're trying to do, but generally a bit like using an electron microscope to find an elephant when you're at the initial stages of "er, why did that allocation fail?" debugging. Just turning on the associated debugging functionality and collecting data can take a significant amount of time, even before you start analysis. So, how about something more rough-and-ready?

Rather than trying to track allocations individually, displaying the contents of memory in a visual form makes it easier to see at a glance what the state of play is. In addition, doing this in realtime means that it is possible to visualise changes over time - so patterns of allocation/deallocation can be observed, and the effect of larger state changes (for example, loading a new world zone) are instantly obvious. Doing this makes spotting the likely source of large leaks quite easy (since in general, if you know roughly when something was allocated, you can narrow down the list of candidate causes significantly), and also unusual patterns of usage that can result in fragmentation.

A simple way to achieve this create a texture (say 128x128 or 256x256), and use each pixel to represent a range of bytes in the heap. For example, a 256x256 texture has 65536 pixels, so if the heap is 32Mb then each pixel represents a 512 byte memory range. Then colour each pixel according to how many bytes within that range are actually allocated, and draw the texture as a 2D screen element each frame. The results look something like this (if you have an HTML5 browser, click through to try it out interactively):

This demo simply allocates bytes at a rate determined by "Allocation rate", with a random size in the range specified into a (virtual) 4Mb heap. Blocks are also randomly freed, so you can fill up or drain memory by adjusting the balance of the two rates. Selecting "Allocate large blocks high" adds a common memory management optimisation by allocating anything greater than 4K in size from the top down, seperating large and small allocations to reduce fragmentation.

Fragmentation, in particular, is an interesting problem because it is often hard to visualise with standard tools - even if there is a lot of free memory available, the placement (and lifespan) of a few small allocations can radically affect the largest contiguous block that is allocatable. A graphical display like this can make it much easier to spot such problematic patterns and identify accurately if a out-of-memory situation is the result of fragmentation or simply not having enough space.

Writing a debug code to produce a display like this efficiently turns out to be surprisingly simple - all this implementation does is keep a copy of the texture itself, and a seperate buffer of integer values that represent the number of bytes used in each "pixel" (in some cases these can be combined, but for simplicity I've kept them seperate). Whenever an allocation or free operation occurs, the code calculates which pixels are touched by it, and adds/subtracts the appropriate number of bytes from the running total. The bitmap is simultaneously updated - it's hard to spot on this demo, but there is actually a range of colours in use. Completely free blocks are black, and those containing data scale their red component from 127 to 255 (the scale skips the bottom half of the colour range simply because a pixel with R=0 is almost indistinguishable from one with R=1, but the different between "empty" and "nearly empty" is very significant for fragmentation tracking). Obviously, more complex colour-coding (by allocation type, etc) is also possible, although it can end up confusing the display more than it helps in some circumstances.

In Javascript, the resulting code looks like this (the full version is in the page source for the demo, if you want it):

this.updateAllocationMap = function(a_address, a_size, a_alloc)
{
	// Get image bitmap
 
	var image_data = g_context.getImageData(0, 0, g_map_width, g_map_height);
 
	// Calculate range of pixels touched by operation
 
	var first_pixel = Math.floor(a_address / this.m_bytes_per_map_pixel);
	var last_pixel = Math.floor((a_address + a_size - 1) / this.m_bytes_per_map_pixel);
 
	for (var i = first_pixel; i <= last_pixel; i++)
	{
		// Start/end of pixel range
 
		var pixel_start_address = i * this.m_bytes_per_map_pixel;
		var pixel_end_address = pixel_start_address + this.m_bytes_per_map_pixel;
 
		// Clip chunk to pixel
 
		var chunk_start = a_address;
		var chunk_end = chunk_start + a_size;
 
		chunk_start = Math.max(Math.min(chunk_start, pixel_end_address), pixel_start_address);
		chunk_end = Math.max(Math.min(chunk_end, pixel_end_address), pixel_start_address);
 
		// Calculate byte count
 
		var bytes_in_chunk = chunk_end - chunk_start;
 
		// Lazy initialisation of allocation map (because I'm lazy)
 
		if (!this.m_allocation_map[i])
		{
			this.m_allocation_map[i] = 0;
		}
 
		// Update allocation map
 
		if (a_alloc)
		{
			this.m_allocation_map[i] += bytes_in_chunk;
		}
		else
		{
			this.m_allocation_map[i] -= bytes_in_chunk;
		}
 
		// Update pixel
 
		var pixel_x = i % g_map_width;
		var pixel_y = Math.floor(i / g_map_width);
 
		if (this.m_allocation_map[i] > 0)
		{
			// Red for allocated
			image_data.data[(i * 4)] = 127 + (127 * (this.m_allocation_map[i] / this.m_bytes_per_map_pixel));
			image_data.data[(i * 4)+1] = 0;
			image_data.data[(i * 4)+2] = 0;
			image_data.data[(i * 4)+3] = 255;
		}
		else
		{	
			// Black for unallocated
 
			image_data.data[(i * 4)] = 0;
			image_data.data[(i * 4)+1] = 0;
			image_data.data[(i * 4)+2] = 0;
			image_data.data[(i * 4)+3] = 255;
		}
	}
 
	// Put image bitmap back
 
	g_context.putImageData(image_data, 0, 0);
}

A big benefit of this scheme is that whilst the overhead it adds to alloc/free operations is certainly significant, it is not crippling (as is often the case with tracking allocations "properly", where large buffers, file writes and the like become necessary). It is not unreasonable to keep it running in the background on debug builds, especially since updates can be supressed whilst the display is not visible if you don't mind iterating over all allocated blocks once to initialise the buffers when it does get turned on.

For us, being able to observe the memory behaviour of the game in realtime has turned out to be a big win in terms of quickly tracking down problems. Even in cases where more detailed analysis has been necessary, quickly narrowing the search area before moving to full captures makes life simpler, and by replicating the same "MS DOS defrag tool" display in our full-fat memory debugging tool, programmers can easily select and track blocks which looked suspect on the realtime view. Hopefully it's a quick and dirty hack that others will find useful too! (also, if you get bored, you can write a routine to draw pictures in the memory viewer with carefully-constructed sequences of allocs and frees... ^-^)