A San Francisco-based Software Engineer at Sony Network Entertainment. He primarily works in a generalist fashion, on everything from tools development to engine design to game programming, but his real passion is software architecture and optimization. He's also completed Deadly Premonition. You can find him online at http://donolmstead.me.

There is nothing that can break a programmer's concentration quite like a long compile time. As a project grows, a simple change can go from a one-second wait to a trip to the coffee pot; a complete rebuild becomes something that is only reasonable to do on the way out the door for lunch. And woe to those who forgot a semicolon in a header before they left, coming back to a "compilation stopped due to number of errors" message and another long wait.

From xkcd

When a file is included in a header it becomes a dependency. Any file that includes that header then becomes dependent on its dependencies, and so on down the line. This is expected behavior, but if the file isn't actually required then the compiler is doing more work than necessary and the build time will suffer. Once compile times get unbearable there are tools to help, such as our very own Niklas Frykholm's Header Hero. But rather than treat the disease, some preventive care can make all the difference in a code base. By following some simple guidelines compile times can be kept manageable.

Rule #1: Prefer forward declarations

To do its job the compiler needs to be aware of types. Some things it knows about implicitly, such as the built-in types: integers, pointers, and arrays. Others it needs to be informed of, such as functions, structures, and classes. However, the compiler can be kept on a need-to-know basis. It only needs full knowledge of a type when it actually has to generate code for it. This is where forward declarations come into play.

Assume we have two classes Foo and Bar. In this case Foo holds an instance of Bar within it.

#include <Bar.hpp>
 
class Foo
{
	public:
		Foo(const Bar& bar);
 
	private:
		Bar _bar;
};

This is all pretty standard stuff. The Foo class needs the full definition of Bar, so it includes Bar's header. In terms of our compile times, anything that needs access to Foo will end up including Bar, and anything Bar includes, and so on down the line.

Now if Foo contained a pointer to Bar internally we could rewrite the code to the following and use a forward declaration for Bar.

class Bar;
 
class Foo
{
	public:
		Foo(Bar* bar);
 
	private:
		Bar* _bar;
};

So now anyplace that includes Foo just includes Foo; it doesn't have to include Bar or any of Bar's dependencies. We simply move the include of Bar's header into the source file and this will compile without issue.
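To make the move concrete, here is a minimal self-contained sketch with file boundaries marked as comments; Bar's header contents are inlined so the snippet stands alone, and the bar() accessor is a hypothetical addition for illustration.

```cpp
// --- Foo.hpp: only a forward declaration, no #include <Bar.hpp> ---
class Bar;

class Foo
{
	public:
		Foo(Bar* bar);
		Bar* bar() const;   // hypothetical accessor for the sketch

	private:
		Bar* _bar;          // a pointer member only needs the type's name
};

// --- Foo.cpp: the full definition of Bar is only needed here ---
// (in a real project this would be: #include <Bar.hpp>)
class Bar { };

Foo::Foo(Bar* bar) : _bar(bar) { }
Bar* Foo::bar() const { return _bar; }
```

Only translation units that actually use Bar's definition pay for parsing it; everyone else gets away with the one-line forward declaration.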

Now what if we rewrote the first bit of code to use a forward declaration?

class Bar;
 
class Foo
{
	public:
		Foo(const Bar& bar);
 
	private:
		Bar _bar;
};

If we compile this we’ll get an error. Visual Studio outputs the following.

error C2079: 'Foo::_bar' uses undefined class 'Bar'

Why this occurs goes back to what the compiler implicitly knows about. The compiler knows what a pointer is, but has no idea what Bar is unless it's explicitly declared.

Now in this case we could wonder why this is actually an issue. Nothing is being implemented within the header so why does the compiler care?

Well, for one thing, what is the size of Foo? We have no idea, since we don't know the size of Bar; this is why compilation fails. The compiler does know the size of a pointer, though, so with a forward declaration there is enough information to determine the number of bytes in an instance of Foo.

This is also the reason why a forward declaration cannot be used for a base class. The size of the derived class is dependent on the size of the base class. Because of this an include is required for any base classes.
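A quick sketch of why: with only a forward declaration the compiler cannot lay out the derived class, so the commented-out version below would fail to compile, while the version with the full definition succeeds. The Base/Derived names and the value member are invented for illustration.

```cpp
class Base;   // forward declaration alone

// error: base class 'Base' has incomplete type -- the compiler cannot
// compute Derived's size or lay out its members:
// class Derived : public Base { };

// With the full definition visible (normally via #include <Base.hpp>):
class Base
{
	public:
		int value;
};

class Derived : public Base { };   // now the layout is known
```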

There is also the matter of references. C++ added reference types in addition to the value and pointer types that C supports. If you ask a C programmer about references they'd probably say they are nothing more than syntactic sugar for a pointer, and to a degree they're correct. Under the hood a reference type is a pointer that cannot be null and is treated with value semantics. With regard to forward declarations, reference types function the same as pointer types.

Templates also throw a wrench in the gears, but the same rules apply. If Bar were a templated class, Bar&lt;Imp&gt;, and Foo contained an instance of Bar&lt;BarImpl&gt;, then both Bar and BarImpl would need to be included by Foo. If instead Foo holds a reference to Bar&lt;BarImpl&gt;, we can use a forward declaration for both classes.

template <typename Imp>
class Bar;
 
class BarImpl;
 
class Foo
{
	public:
		Foo(const Bar<BarImpl>& bar);

	private:
		const Bar<BarImpl>& _bar;
};

The following table summarizes whether an include or forward declaration is required. The same rationale also applies to function and method declarations.

Case                          Include or Forward declaration
Foo contains Bar              Include Bar
Foo contains Bar*             Declare Bar
Foo contains Bar&             Declare Bar
Foo descends from Bar         Include Bar
Foo contains Bar<BarImpl>     Include Bar and BarImpl
Foo contains Bar<BarImpl*>    Include Bar and Declare BarImpl
Foo contains Bar<BarImpl>*    Declare Bar and BarImpl
Foo contains Bar<BarImpl>&    Declare Bar and BarImpl

Rationale

If there is one rule you should abide by, it's this one. By using forward declarations when possible, the number of included files goes down, resulting in reduced build times.

Rule #2: Include the associated header first

Header files should include all the files they depend on and nothing more. One way to ensure that all the dependencies are there is to always include the associated header first thing within the source file. By doing this the compiler will generate an error if the dependencies are not included.

If we go back to the example of the Foo class containing an instance of Bar there is a way to make the forward declaration work. The following code will compile without issue.

#include <Bar.hpp>
#include <Foo.hpp>
 
Foo::Foo(const Bar& bar)
: _bar(bar)
{ }
 
// Other method definitions

The reason this works is the include order: since Bar is included first, the compiler knows about it before it begins parsing Foo's header file. While this works, the effort required to manage such dependencies quickly becomes a nightmare. Because of this, include the associated header first to make sure all the dependencies are referenced within the header.
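The fragility can be seen by inlining the headers' contents into a single translation unit; swapping the two sections below reproduces the undefined-class error from Rule #1. The value member is invented for the sketch.

```cpp
// --- contents of Bar.hpp, "included" first ---
class Bar
{
	public:
		int value;
};

// --- contents of Foo.hpp, "included" second: its forward declaration is
// harmless because Bar's full definition already appeared above ---
class Bar;

class Foo
{
	public:
		Foo(const Bar& bar) : _bar(bar) { }
		int value() const { return _bar.value; }

	private:
		Bar _bar;   // requires Bar's full definition at this point
};
```

Every consumer of Foo.hpp has to remember this ordering, which is exactly the maintenance burden the rule avoids.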

Rationale

This rule ensures that a header file always includes all its dependencies.

Rule #3: Separate declaration and implementation

In C/C++, source code is organized into header files, which contain declarations, and source files, which contain definitions. If we restrict ourselves to these two file types, .h/.hpp and .c/.cpp, there are ways that definitions can start to creep into the declarations. This is especially true for templates, as the entirety of the implementation needs to be visible to the compiler, but it also applies to code that is marked inline. We can add another file type to the mix to handle these cases: the inline implementation file.

An inline implementation file acts the same as a source file but contains only those functions and methods that need to be visible to the compiler. It also contains dependencies to other inline files. These files are then included within the source files.

So if we go back to our Foo containing an instance of Bar example and each of the classes had a corresponding inline implementation file then Foo.ipp would look like the following.

#ifndef PROJECT_FOO_IPP_INCLUDED
#define PROJECT_FOO_IPP_INCLUDED
 
#include <Foo.hpp>
 
inline const Bar& Foo::getBar() const
{
	return _bar;
}
 
#endif

In terms of convention .inl and .ipp are typical file extensions for inline implementations. Another convention appends impl to the associated header file name, e.g. Foo_impl.hpp.
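A sketch of how the three files fit together, inlined into one snippet with file boundaries marked as comments; the constructor taking Bar by reference follows the earlier examples, and the exact names should be treated as assumptions.

```cpp
// --- Foo.hpp: declarations only ---
class Bar { };   // stands in for #include-ing Bar's real header

class Foo
{
	public:
		Foo(const Bar& bar);
		const Bar& getBar() const;   // declared here, defined in Foo.ipp

	private:
		const Bar& _bar;
};

// --- Foo.ipp: inline definitions (it would #include <Foo.hpp> itself) ---
inline const Bar& Foo::getBar() const
{
	return _bar;
}

// --- Foo.cpp: would #include <Foo.ipp>, and holds everything else ---
Foo::Foo(const Bar& bar) : _bar(bar) { }
```

Clients that need the inline definitions pull in Foo.ipp; everyone else includes the lean Foo.hpp.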

Rationale

This ensures the header file only contains declarations which will keep the file size down, which helps build time. It also has the added benefit of keeping interface and implementation completely separate.

Rule #4: Hide your includes

The include directive has two variants, #include <file> and #include "file", which have different behaviors. The form used specifies the order in which directories are searched by the compiler. Here's what MSDN has to say about the behavior of Visual Studio.

Quoted form: The preprocessor searches for include files in the following order:

  1. In the same directory as the file that contains the #include statement.
  2. In the directories of any previously opened include files in the reverse order in which they were opened. The search starts from the directory of the include file that was opened last and continues through the directory of the include file that was opened first.
  3. Along the path specified by each /I compiler option.
  4. Along the paths specified by the INCLUDE environment variable.
Angle-bracket form: The preprocessor searches for include files in the following order:

  1. Along the path specified by each /I compiler option.
  2. When compiling from the command line, along the paths that are specified by the INCLUDE environment variable.

So if you use #include "file" the compiler will end up finding the file no matter where it lives. The issue isn't the functionality, it's the semantics.

By using quotes the include functions as an internal definition. An excellent example of this brand of structuring is Lua. If you were to package Lua 5.1.4 for distribution as a binary library you would need to expose only four of the twenty-three header files in the source code for it to be usable by the client. The rest of the header files correspond to internal functionality that doesn't need to be accessed by the customer.

This plays into the pimpl (pointer to implementation) idiom. You may have noticed this idiom in use above during the forward declaration example, though it was never explicitly stated. To reiterate, and to save us from scrolling up: if we have a class Foo that contains a pointer to Bar we only require a forward declaration. If Bar is only needed by Foo and is not accessed otherwise it can be made private. So the source file would resemble this.

#include <Foo.hpp>
#include "Bar.hpp"
 
Foo::Foo(Bar* bar)
: _bar(bar)
{ }
 
// Other method definitions

And the library would look like this.

Source layout

In any case where code is only used internally, it should not be visible to external clients. If Bar were accessible and could be acted on by callers, then it would need to be moved into the include directory for the library.
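The resulting split can be sketched as follows; in a real library the three sections would live in include/Foo.hpp, src/Bar.hpp, and src/Foo.cpp respectively, and the impl() accessor is hypothetical.

```cpp
// --- include/Foo.hpp: the only header shipped to clients ---
class Bar;   // opaque: clients never see Bar.hpp

class Foo
{
	public:
		explicit Foo(Bar* bar);
		Bar* impl() const;   // hypothetical accessor for the sketch

	private:
		Bar* _bar;
};

// --- src/Bar.hpp: internal, never installed ---
class Bar { };

// --- src/Foo.cpp: the only place Bar.hpp is included (with quotes) ---
Foo::Foo(Bar* bar) : _bar(bar) { }
Bar* Foo::impl() const { return _bar; }
```

Because Bar.hpp never leaves the src directory, no client translation unit can accidentally include it, and Bar's internals can change without triggering client rebuilds.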

Rationale

This rule has to do with the structure of the project. By storing the include file within the source itself we are hiding its implementation. In terms of compile times, if it can’t be accessed then it can’t be included accidentally.

Rule #5: Use precompiled headers

To combat compile times some vendors offer an optimization technique known as precompiled headers. The compiler converts the specified header files into an “intermediate format” that is easier to parse. Since the file is easier for the compiler to parse this can result in drastically reduced compile times.

“Intermediate format” is a bit of a misnomer as the format the precompiled header is transformed into is dependent on the compiler. Within Visual Studio a precompiled header creates memory mapped files, which aren’t very intermediate at all. In recent versions these files are stored in the ipch directory by default.

Specifying a precompiled header is nothing more than creating a header file containing the common header files within the project. By convention this file is included first within each source file in the project, which means the header file associated with the source file gets bumped to the number two slot.

When creating a Visual Studio project the option to use precompiled headers is presented. If selected a file, stdafx.h, is generated. For a Win32 project it creates the following.

// stdafx.h : include file for standard system include files,
// or project specific include files that are used frequently, but
// are changed infrequently
//
 
#pragma once
 
#include "targetver.h"
 
#define WIN32_LEAN_AND_MEAN             // Exclude rarely-used stuff from Windows headers
// Windows Header Files:
#include <windows.h>
 
// C RunTime Header Files
#include <stdlib.h>
#include <malloc.h>
#include <memory.h>
#include <tchar.h>
 
// TODO: reference additional headers your program requires here

To get the best results from precompiled headers the care and feeding is very important. Misuse of the functionality can actually result in build times that are worse.

The most important guideline is to only specify files that never change, or change infrequently. If a file changes frequently, the compiler has to keep regenerating the precompiled header, which takes a considerable amount of time. By default windows.h is added to the precompiled header, which gives insight into what qualifies. Really, any third-party library used within a project passes the litmus test for inclusion. This includes the STL, which is also a prime candidate assuming you're using it.

The other guideline is to specify only files that are used frequently. If you were to include everything but the kitchen sink within the precompiled header, there is that much more for the compiler to look through. If we think of it like a cache, then by including too much we are pulling in more information than needed and causing misses, which makes performance tank.

It should be noted that not all compilers support precompiled headers. To work around this we can provide a guard around the headers to precompile. For compilers supporting precompiled headers, such as GCC and Visual Studio, we would just use a define to denote this, which would look like the following.

#ifndef PROJECT_PCH_HPP_INCLUDED
#define PROJECT_PCH_HPP_INCLUDED
 
#ifdef PROJECT_USE_PRECOMPILED_HEADER
 
// Insert include files
 
#endif
 
#endif

Rationale

Precompiled headers are such a performance gain that they should be used on any platform that supports them.

Rule #6: Prefer include guards over pragma once

Compiler makers often add features to improve performance and allow more control over their output. Some features get adopted by competing compilers until they end up being "standard", but without a stamp of approval from the language. The directive pragma once is one of those features. Its origins are murky, but legends speak of it being a GCC addition that was then co-opted by other vendors.

The intent of pragma once is the same as an include guard: to ensure that a header is only brought into a translation unit a single time. It has the added benefit of avoiding the naming collisions that can occur with include guards. As pragma once was created to help speed up compilation, it also has a reputation of being faster than an include guard.
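The two idioms side by side. To show the guard doing its job, the guarded block appears twice below, as if the header had been included twice; the preprocessor skips the second copy because the macro is already defined.

```cpp
// Include-guard form (standard C++). First "inclusion":
#ifndef PROJECT_FOO_HPP_INCLUDED
#define PROJECT_FOO_HPP_INCLUDED
class Foo { };
#endif

// Second "inclusion": PROJECT_FOO_HPP_INCLUDED is already defined, so the
// redefinition of Foo below is never seen by the compiler.
#ifndef PROJECT_FOO_HPP_INCLUDED
#define PROJECT_FOO_HPP_INCLUDED
class Foo { };
#endif

// The non-standard equivalent would simply be, at the top of Foo.hpp:
// #pragma once
```

The cost of the guard form is picking a macro name unique across the code base, which is the naming-collision risk mentioned above.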

Since its inception compilers have matured, and the validity of pragma once improving compile times has come into question. GCC deprecated the directive, but later reneged on the decision. Intel claims no noticeable performance gains from using it. And in an experiment Noel Llopis found no difference between the two in Visual Studio, minus some extreme cases.

Rationale

With include guards offering the same performance as pragma once it is recommended to stick with the standard. This is especially true if your code will be used externally as pragma once isn’t everywhere. Though feel free to disregard this tip if your code is only used internally and it's your experience that compile times are lessened via the pragma.

Rule #7: Don’t use redundant include guards

Another technique purported to speed up compilation is the redundant include guard. This looks exactly as one would expect, all includes are wrapped in an additional guard like the following.

#ifndef PROJECT_BAR_HPP_INCLUDED
#include <Bar.hpp>
#endif
 
class Foo
{
	public:
		Foo(const Bar& bar);
 
	private:
		Bar _bar;
};

The rationale for the redundant include guard is to prevent the compiler from opening the file to find out that it doesn't need to open the file. Since the guard contains the same logic as the header file it can save on the number of include files the compiler needs to parse.

This is another case of compiler technology advancing enough that the technique isn't the silver bullet it once was. Modern compilers have optimized for this case, so the redundant guard provides no additional speedup.

Rationale

Redundant include guards put an additional burden on the programmer and clutter code files. They also require the guards to be updated if the include guard for the file changes. As with pragma once, if your experience speaks otherwise feel free to use them, but modern compilers should perform the same with or without them.

Summary

Compile times can be managed before they become the bane of your existence as a programmer. Remember the following.

  1. Prefer forward declarations
  2. Include the associated header first
  3. Separate declaration and implementation
  4. Hide your includes
  5. Use precompiled headers
  6. Prefer include guards over pragma once
  7. Don’t use redundant include guards

By following these simple rules the amount of time spent waiting for the compiler will be minimized and build times will decrease. There is also the matter of the linker with regard to build time, but we'll save that discussion for another time.

Also posted to my personal blog.