Founder and technical director of Id Software and Armadillo Aerospace

I used to Code Fearlessly all the time, tearing up everything whenever I had a thought about a better way of doing something.  There was even a bit of pride there -- "I'm not afraid to suffer consequences in the quest to Do The Right Thing!"  Of course, to be honest, the consequences usually fell on a more junior programmer who had to deal with an irate developer who had something unexpectedly stop working when I tore up the code to make it "better".

Sure, with everything in source control you can roll back the changes if it catastrophically breaks, but if you did succeed in making some aspect better, there is an incentive to keep pushing forward, even if there is a bit of suffering involved.  Somewhat more subtly, there are all sorts of opportunities to avoid making honest comparisons between the new way and the old way.  Rolling back code and rebuilding to run a test is a pain, and you aren’t going to do it very often, even if you have a suspicion that things aren’t working quite as well in a particular case you hadn’t considered during the rewrite.

What I try to do nowadays is to implement new ideas in parallel with the old ones, rather than mutating the existing code.  This allows easy and honest comparison between them, and makes it trivial to go back to the old reliable path when the spiffy new one starts showing flaws.  The difference between changing a console variable to get a different behavior versus running an old exe, let alone reverting code changes and rebuilding, is significant.
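As a rough illustration of the console variable idea, here is a minimal sketch. Everything in it is hypothetical: `r_useNewSort` stands in for whatever cvar system an engine provides, and the two sort routines are placeholders for an old path and its replacement. A third setting runs both paths and compares them, which is one way to keep the comparison honest.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a console variable. A real engine would expose
// this through its own cvar system so it can be changed while running.
//   0 = old path, 1 = new path, 2 = run both and check that they agree
static int r_useNewSort = 0;

// Old, trusted path: left untouched while the new idea is developed.
std::vector<int> SortKeys_Old(std::vector<int> keys) {
    for (std::size_t i = 1; i < keys.size(); ++i) {   // plain insertion sort
        int k = keys[i];
        std::size_t j = i;
        while (j > 0 && keys[j - 1] > k) {
            keys[j] = keys[j - 1];
            --j;
        }
        keys[j] = k;
    }
    return keys;
}

// New path: developed alongside the old one instead of replacing it.
std::vector<int> SortKeys_New(std::vector<int> keys) {
    std::sort(keys.begin(), keys.end());
    return keys;
}

// Callers only ever see this entry point; the cvar picks the implementation.
std::vector<int> SortKeys(const std::vector<int>& keys) {
    if (r_useNewSort == 2) {
        std::vector<int> oldResult = SortKeys_Old(keys);
        std::vector<int> newResult = SortKeys_New(keys);
        assert(oldResult == newResult);   // side-by-side comparison
        return newResult;
    }
    return r_useNewSort ? SortKeys_New(keys) : SortKeys_Old(keys);
}
```

Flipping `r_useNewSort` at runtime switches paths without a rebuild, and setting it to 2 checks the new path against the old one on every call.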

For some tasks, this is pretty obvious.  If you have a ray tracer, it isn't hard to see an interface that allows you to have the Trace() function use various kD tree / BVH / BSP back ends, and a similar case can be made for the processing code that builds accelerator structures for them.  Missing some pixels?  Change over to the other implementation and check it there.
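The interface might look something like the sketch below. It is not a real kD tree or BVH: `LinearBackend` tests every sphere, and `BoundedBackend` is a deliberately trivial stand-in for an acceleration structure (one bounding sphere around the whole scene). All of the type names, and the `Tracer::active` switch, are invented for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };
static Vec3 Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin, dir; };            // dir is assumed to be unit length
struct Sphere { Vec3 center; double radius; };

constexpr double kMiss = std::numeric_limits<double>::infinity();

// Distance along the ray to the nearest hit at t >= 0, or kMiss.
static double HitSphere(const Ray& r, const Sphere& s) {
    Vec3 oc = Sub(r.origin, s.center);
    double b = Dot(oc, r.dir);
    double disc = b * b - (Dot(oc, oc) - s.radius * s.radius);
    if (disc < 0.0) return kMiss;
    double root = std::sqrt(disc);
    double t = -b - root;
    if (t < 0.0) t = -b + root;              // ray origin is inside the sphere
    return t < 0.0 ? kMiss : t;
}

// Common interface that every back end implements.
class TraceBackend {
public:
    virtual ~TraceBackend() = default;
    virtual double Trace(const Ray& r) const = 0;
};

// Simplest possible back end: test every sphere.
class LinearBackend : public TraceBackend {
public:
    explicit LinearBackend(std::vector<Sphere> s) : spheres_(std::move(s)) {}
    double Trace(const Ray& r) const override {
        double best = kMiss;
        for (const Sphere& s : spheres_) best = std::min(best, HitSphere(r, s));
        return best;
    }
private:
    std::vector<Sphere> spheres_;
};

// Stand-in "accelerated" back end: reject rays that miss a bounding sphere
// around the whole scene before falling back to the linear scan.
class BoundedBackend : public TraceBackend {
public:
    explicit BoundedBackend(std::vector<Sphere> s) : scan_(s) {
        Vec3 c{0.0, 0.0, 0.0};
        for (const Sphere& sp : s) { c.x += sp.center.x; c.y += sp.center.y; c.z += sp.center.z; }
        if (!s.empty()) { double n = double(s.size()); c.x /= n; c.y /= n; c.z /= n; }
        double radius = 0.0;
        for (const Sphere& sp : s) {
            Vec3 d = Sub(sp.center, c);
            radius = std::max(radius, std::sqrt(Dot(d, d)) + sp.radius);
        }
        bound_ = {c, radius};
    }
    double Trace(const Ray& r) const override {
        if (HitSphere(r, bound_) == kMiss) return kMiss;
        return scan_.Trace(r);
    }
private:
    LinearBackend scan_;
    Sphere bound_{};
};

// The switch: change `active` to check a suspicious pixel on the other path.
struct Tracer {
    std::unique_ptr<TraceBackend> backends[2];
    int active = 0;
    double Trace(const Ray& r) const { return backends[active]->Trace(r); }
};
```

With both back ends built from the same scene, a missing pixel can be checked by changing `active` and tracing the same ray again.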

However, some of my most effective uses of this strategy have been more aggressive.  Over the years, I have done a number of hardware acceleration conversions from software rendering engines.  In the old days, I would basically start from scratch, first implementing the environment rendering, then the characters, then the special effects.  There were always lots of little features that got forgotten, and comparing against the original meant playing through the game on two systems at once.

The last two times I did this, I got the software rendering code running on the new platform first, so everything could be tested out at low frame rates, then implemented the hardware accelerated version in parallel, setting things up so you could instantly switch between the two at any time.  For a mobile OpenGL ES application being developed on a Windows simulator, I opened a completely separate window for the accelerated view, letting me see it simultaneously with the original software implementation.  This was a very significant development win.

If the task you are working on can be expressed as a pure function that simply processes input parameters into a return structure, it is easy to switch it out for different implementations.  If it is a system that maintains internal state or has multiple entry points, you have to be a bit more careful about switching it in and out.  If it is a gnarly mess with lots of internal callouts to other systems to maintain parallel state changes, then you have some cleanup to do before trying a parallel implementation.

There are two general classes of parallel implementations I work with:  the reference implementation, which is much smaller and simpler, but will be maintained continuously, and the experimental implementation, where you expect one version to “win” and consign the other implementation to source control in a couple of weeks after you have some confidence that it is both fully functional and a real improvement.
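To make the reference implementation idea concrete, here is a small sketch using range sums as a made-up example problem. `RangeSum_Reference` is the small, obviously correct version that stays in the codebase, `RangeSummer` is the faster version that ships, and `MatchesReference` is a hypothetical check that compares the two on every range. None of these names come from a real codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference implementation: small, simple, and kept around permanently
// so the fast path always has something to be checked against.
long long RangeSum_Reference(const std::vector<int>& v, std::size_t lo, std::size_t hi) {
    long long sum = 0;
    for (std::size_t i = lo; i < hi; ++i) sum += v[i];
    return sum;
}

// Faster implementation: precomputed prefix sums answer each query in O(1).
class RangeSummer {
public:
    explicit RangeSummer(const std::vector<int>& v) : prefix_(v.size() + 1, 0) {
        for (std::size_t i = 0; i < v.size(); ++i) prefix_[i + 1] = prefix_[i] + v[i];
    }
    // Sum of v[lo] .. v[hi - 1].
    long long Sum(std::size_t lo, std::size_t hi) const {
        assert(lo <= hi && hi < prefix_.size());
        return prefix_[hi] - prefix_[lo];
    }
private:
    std::vector<long long> prefix_;
};

// Compares the fast path against the reference on every possible range.
bool MatchesReference(const std::vector<int>& v) {
    RangeSummer fast(v);
    for (std::size_t lo = 0; lo <= v.size(); ++lo) {
        for (std::size_t hi = lo; hi <= v.size(); ++hi) {
            if (fast.Sum(lo, hi) != RangeSum_Reference(v, lo, hi)) return false;
        }
    }
    return true;
}
```

A check like `MatchesReference` is cheap enough to run from a test or a debug command, so the reference version keeps paying for itself after the fast one is in use.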

It is completely reasonable to violate some generally good coding rules while building an experimental implementation – copy, paste, and find-replace rename is actually a good way to start.  Code fearlessly on the copy, while the original remains fully functional and unmolested.  It is often tempting to shortcut this by passing in some kind of option flag to existing code, rather than enabling a full parallel implementation.  It is a grey area, but I have tended to find that the extra path complexity of the flag approach often leads to messing up both versions as you work, and you usually compromise both implementations to some degree.

Every single time I have undertaken a parallel implementation approach, I have come away feeling that it was beneficial, and I now tend to code in a style that favors it.  Highly recommended.