Richard Fine is a code monkey from Oxford, UK. He worked for Rebellion and NaturalMotion, and received DirectX MVP status from Microsoft a few times, but you might know him from Gamedev.net, where he was webmaster and a prolific forum poster for several years. These days he's running his own consultancy, The Binary Refinery, and working on an as-yet-unannounced Unity3D title...

Many processes in games take place over the course of multiple frames. You've got 'dense' processes, like pathfinding, which work hard each frame but get split across multiple frames so as not to impact the framerate too heavily. You've got 'sparse' processes, like gameplay triggers, that do nothing most frames, but occasionally are called upon to do critical work. And you've got assorted processes between the two.

Whenever you're creating a process that will take place over multiple frames - without multithreading - you need to find some way of breaking the work up into chunks that can be run one-per-frame. For any algorithm with a central loop, it's fairly obvious: an A* pathfinder, for example, can be structured such that it maintains its node lists semi-permanently, processing only a handful of nodes from the open list each frame, instead of trying to do all the work in one go. There's some balancing to be done to manage latency - after all, if you're locking your framerate at 60 or 30 frames per second, then your process will only take 60 or 30 steps per second, and that might cause the process to just take too long overall. A neat design might offer the smallest possible unit of work at one level - e.g. process a single A* node - and layer on top a way of grouping work together into larger chunks - e.g. keep processing A* nodes for X milliseconds. (Some people call this 'timeslicing', though I don't).
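
The layering described above can be sketched in plain C#. This is only an illustration: IncrementalJob, Step and RunFor are hypothetical names, and the queue of integers is a stand-in for something like an A* open list.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for any incremental job, e.g. an A* search.
public class IncrementalJob
{
    private readonly Queue<int> _open = new Queue<int>();   // pending work items

    public int Processed { get; private set; }

    public IncrementalJob(int itemCount)
    {
        for (int i = 0; i < itemCount; i++) _open.Enqueue(i);
    }

    public bool IsDone { get { return _open.Count == 0; } }

    // Smallest unit of work: process exactly one item.
    public void Step()
    {
        if (IsDone) return;
        _open.Dequeue();
        Processed++;
    }

    // Layered on top: keep stepping until the budget is spent or the work is done.
    // Always attempts at least one step, so the job cannot stall on a tiny budget.
    public void RunFor(TimeSpan budget)
    {
        Stopwatch timer = Stopwatch.StartNew();
        do { Step(); }
        while (!IsDone && timer.Elapsed < budget);
    }
}
```

A game loop would then call something like job.RunFor(TimeSpan.FromMilliseconds(2)) once per frame until job.IsDone is true; the 2 ms figure is arbitrary.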

Still, allowing the work to be broken up in this way means you have to transfer state from one frame to the next. If you're breaking an iterative algorithm up, then you've got to preserve all the state shared across iterations, as well as a means of tracking which iteration is to be performed next. That's not usually too bad - the design of an 'A* pathfinder class' is fairly obvious - but there are other cases, too, that are less pleasant. Sometimes you'll be facing long computations that are doing different kinds of work from frame to frame; the object capturing their state can end up with a big mess of semi-useful 'locals,' kept for passing data from one frame to the next. And if you're dealing with a sparse process, you often end up having to implement a small state machine just to track when work should be done at all.
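
To make that bookkeeping concrete, here is a minimal sketch of a sparse process written the explicit way. CountdownThenFire and its members are hypothetical names; the point is only that the phase and the counter both have to live in fields so that they survive from one frame to the next.

```csharp
using System.Collections.Generic;

// Hypothetical sparse task written without coroutines:
// every piece of cross-frame state has to live in a field.
public class CountdownThenFire
{
    private enum Phase { Waiting, Firing, Done }

    private Phase _phase = Phase.Waiting;
    private int _framesLeft;                       // survives between Update() calls
    public readonly List<string> Log = new List<string>();

    public CountdownThenFire(int framesToWait) { _framesLeft = framesToWait; }

    public bool IsDone { get { return _phase == Phase.Done; } }

    // Called once per frame by the game loop.
    public void Update()
    {
        switch (_phase)
        {
            case Phase.Waiting:
                // Most frames do nothing except count down.
                if (--_framesLeft <= 0) _phase = Phase.Firing;
                break;
            case Phase.Firing:
                // The occasional frame that does the real work.
                Log.Add("fired");
                _phase = Phase.Done;
                break;
            case Phase.Done:
                break;
        }
    }
}
```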

Wouldn't it be neat if, instead of having to explicitly track all this state across multiple frames, and instead of having to multithread and manage synchronization and locking and so on, you could just write your function as a single chunk of code, and mark particular places where the function should 'pause' and carry on at a later time?

Unity - along with a number of other environments and languages - provides this in the form of Coroutines.

How do they look?

In "Unityscript" (Javascript):

function LongComputation()
{
    while(someCondition)
    {
        /* Do a chunk of work */
 
        // Pause here and carry on next frame
        yield;
    }
}

In C#:

IEnumerator LongComputation()
{
    while(someCondition)
    {
        /* Do a chunk of work */
 
        // Pause here and carry on next frame
        yield return null;
    }
}

How do they work?

Let me just say, quickly, that I don't work for Unity Technologies. I've not seen the Unity source code. I've never seen the guts of Unity's coroutine engine. However, if they've implemented it in a way that is radically different from what I'm about to describe, then I'll be quite surprised. If anyone from UT wants to chime in and talk about how it actually works, then that'd be great.

The big clues are in the C# version. Firstly, note that the return type for the function is IEnumerator. And secondly, note that one of the statements is yield return. This means that yield must be a keyword, and as Unity's C# support is vanilla C# 3.5, it must be a vanilla C# 3.5 keyword. Indeed, here it is in MSDN - talking about something called 'iterator blocks.' So what's going on?

Firstly, there's this IEnumerator type. The IEnumerator type acts like a cursor over a sequence, providing two significant members: Current, which is a property giving you the element the cursor is presently over, and MoveNext(), a function that moves to the next element in the sequence. Because IEnumerator is an interface, it doesn't specify exactly how these members are implemented; MoveNext() could just add one to Current, or it could load the new value from a file, or it could download an image from the Internet and hash it and store the new hash in Current... or it could even do one thing for the first element in the sequence, and something entirely different for the second. You could even use it to generate an infinite sequence if you so desired. MoveNext() calculates the next value in the sequence (returning false if there are no more values), and Current retrieves the value it calculated.
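
For comparison, this is roughly what implementing the interface by hand looks like. Countdown is a made-up example class; the interface also requires a Reset() member, which plays no part in this discussion.

```csharp
using System;
using System.Collections;

// A hand-written cursor over the sequence start, start-1, ..., 1.
public class Countdown : IEnumerator
{
    private readonly int _start;
    private int _current;

    public Countdown(int start)
    {
        _start = start;
        _current = start + 1;   // positioned before the first element
    }

    // The element the cursor is presently over.
    public object Current { get { return _current; } }

    // Advance the cursor; false means the sequence is exhausted.
    public bool MoveNext()
    {
        if (_current <= 1) return false;
        _current--;
        return true;
    }

    // Required by the interface; rewinds to before the first element.
    public void Reset() { _current = _start + 1; }
}
```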

Ordinarily, if you wanted to implement an interface, you'd have to write a class, implement the members, and so on. Iterator blocks are a convenient way of implementing IEnumerator without all that hassle - you just follow a few rules, and the IEnumerator implementation is generated automatically by the compiler.

An iterator block is a regular function that (a) returns IEnumerator, and (b) uses the yield keyword. So what does the yield keyword actually do? It declares what the next value in the sequence is - or that there are no more values. The point at which the code encounters a yield return X or yield break is the point at which IEnumerator.MoveNext() should stop; a yield return X causes MoveNext() to return true and Current to be assigned the value X, while a yield break causes MoveNext() to return false.
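
A small iterator block showing both forms of yield, as a sketch in plain C# with no Unity involved. IteratorDemo and CountUpTo are hypothetical names.

```csharp
using System.Collections;

public static class IteratorDemo
{
    // Yields 1, 2, 3, ... and ends the sequence once the limit has been passed.
    public static IEnumerator CountUpTo(int limit)
    {
        int n = 1;
        while (true)
        {
            if (n > limit)
                yield break;      // MoveNext() returns false from here on
            yield return n;       // MoveNext() returns true, Current == n
            n++;
        }
    }
}
```

Calling CountUpTo(2) runs none of the body; it only creates the compiler-generated IEnumerator. The first MoveNext() runs up to the first yield return and leaves 1 in Current, the second leaves 2, and the third hits yield break and returns false.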

Now, here's the trick. It doesn't have to matter what the actual values returned by the sequence are. You can call MoveNext() repeatedly, and ignore Current; the computations will still be performed. Each time MoveNext() is called, your iterator block runs to the next 'yield' statement, regardless of what expression it actually yields. So you can write something like:

IEnumerator TellMeASecret()
{
  PlayAnimation("LeanInConspiratorially");
  while(playingAnimation)
    yield return null;
 
  Say("I stole the cookie from the cookie jar!");
  while(speaking)
    yield return null;
 
  PlayAnimation("LeanOutRelieved");
  while(playingAnimation)
    yield return null;
}

and what you've actually written is an iterator block that generates a long sequence of null values, but what's significant are the side-effects of the work it does to calculate them. You could run this coroutine using a simple loop like this:

IEnumerator e = TellMeASecret();
while(e.MoveNext()) { }

Or, more usefully, you could mix it in with other work:

IEnumerator e = TellMeASecret();
while(e.MoveNext()) 
{ 
  // If they press 'Escape', skip the cutscene
  if(Input.GetKeyDown(KeyCode.Escape)) { break; }
}
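
Both loops above run the whole iterator in one go. To spread the work over frames, the caller makes one MoveNext() call per frame instead. A minimal sketch with a simulated game loop, where FrameLoopDemo, ThreeChunks and Run are hypothetical names and the "render" step is only a log entry:

```csharp
using System.Collections;
using System.Collections.Generic;

public static class FrameLoopDemo
{
    public static readonly List<string> Log = new List<string>();

    // Stand-in for a multi-frame task: does one chunk of work per resume.
    private static IEnumerator ThreeChunks()
    {
        for (int i = 1; i <= 3; i++)
        {
            Log.Add("chunk " + i);
            yield return null;    // pause until the next frame
        }
    }

    // Simulated game loop: one MoveNext() per frame, alongside other per-frame work.
    // Returns the number of frames in which the task was resumed and paused again.
    public static int Run(int maxFrames)
    {
        IEnumerator task = ThreeChunks();
        int frames = 0;
        while (frames < maxFrames && task.MoveNext())
        {
            frames++;
            Log.Add("render frame " + frames);
        }
        return frames;
    }
}
```

With a generous frame limit the log interleaves "chunk 1", "render frame 1", "chunk 2", and so on, which is the behaviour a coroutine scheduler provides.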

It's all in the timing

As you've seen, each yield return statement must provide an expression (like null) so that the iterator block has something to actually assign to IEnumerator.Current. A long sequence of nulls isn't exactly useful, but we're more interested in the side-effects. Aren't we?

There's something handy we can do with that expression, actually. What if, instead of just yielding null and ignoring it, we yielded something that indicated when we expect to need to do more work? Often we'll need to carry straight on the next frame, sure, but not always: there will be plenty of times where we want to carry on after an animation or sound has finished playing, or after a particular amount of time has passed. Those while(playingAnimation) yield return null; constructs are a bit tedious, don't you think?

Unity declares the YieldInstruction base type, and provides a few concrete derived types that indicate particular kinds of wait. You've got WaitForSeconds, which resumes the coroutine after the designated amount of time has passed. You've got WaitForEndOfFrame, which resumes the coroutine at a particular point later in the same frame. You've got the Coroutine type itself, which, when coroutine A yields coroutine B, pauses coroutine A until after coroutine B has finished.

What does this look like from a runtime point of view? As I said, I don't work for Unity, so I've never seen their code; but I'd imagine it might look a little bit like this:

List<IEnumerator> unblockedCoroutines;
List<IEnumerator> shouldRunNextFrame;
List<IEnumerator> shouldRunAtEndOfFrame;
SortedList<float, IEnumerator> shouldRunAfterTimes;
 
foreach(IEnumerator coroutine in unblockedCoroutines)
{
    if(!coroutine.MoveNext())
        // This coroutine has finished
        continue;
 
    if(!(coroutine.Current is YieldInstruction))
    {
        // This coroutine yielded null, or some other value we don't understand; run it next frame.
        shouldRunNextFrame.Add(coroutine);
        continue;
    }
 
    if(coroutine.Current is WaitForSeconds)
    {
        WaitForSeconds wait = (WaitForSeconds)coroutine.Current;
        shouldRunAfterTimes.Add(Time.time + wait.duration, coroutine);
    }
    else if(coroutine.Current is WaitForEndOfFrame)
    {
        shouldRunAtEndOfFrame.Add(coroutine);
    }
    else
    {
        /* similar stuff for other YieldInstruction subtypes */
    }
}
 
unblockedCoroutines = shouldRunNextFrame;
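
The same guess can be turned into something that runs outside Unity. The following is a sketch under stated assumptions, not Unity's implementation: WaitSeconds, MiniScheduler and SchedulerDemo are hypothetical types, game time is passed in by the caller, and only "wait for N seconds" and "resume next frame" are handled.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for a yield instruction; not Unity's WaitForSeconds.
public class WaitSeconds
{
    public readonly float Duration;
    public WaitSeconds(float duration) { Duration = duration; }
}

public class MiniScheduler
{
    private class Entry { public IEnumerator Routine; public float ResumeAt; }

    private List<Entry> _entries = new List<Entry>();

    public int ActiveCount { get { return _entries.Count; } }

    public void Start(IEnumerator routine)
    {
        // NegativeInfinity means "runnable at the next tick".
        _entries.Add(new Entry { Routine = routine, ResumeAt = float.NegativeInfinity });
    }

    // Call once per frame with the current game time in seconds.
    public void Tick(float now)
    {
        List<Entry> running = _entries;
        _entries = new List<Entry>();   // Start() calls made during the tick land here safely

        foreach (Entry entry in running)
        {
            if (now < entry.ResumeAt)
            {
                _entries.Add(entry);    // still waiting on its timer
                continue;
            }
            if (!entry.Routine.MoveNext())
                continue;               // finished: drop it

            // null or an unrecognised value means "resume next frame".
            WaitSeconds wait = entry.Routine.Current as WaitSeconds;
            entry.ResumeAt = (wait != null) ? now + wait.Duration : float.NegativeInfinity;
            _entries.Add(entry);
        }
    }
}

public static class SchedulerDemo
{
    public static readonly List<string> Log = new List<string>();

    public static IEnumerator Greeter()
    {
        Log.Add("hello");
        yield return new WaitSeconds(1.0f);
        Log.Add("goodbye");
    }
}
```

Starting SchedulerDemo.Greeter() and ticking at times 0, 0.5 and 1.0 logs "hello" on the first tick, nothing on the second, and "goodbye" on the third, after which the scheduler holds no active routines.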

It's not difficult to imagine how more YieldInstruction subtypes could be added to handle other cases - engine-level support for signals, for example, could be added, with a WaitForSignal("SignalName") YieldInstruction supporting it. By adding more YieldInstructions, the coroutines themselves can become more expressive - yield return new WaitForSignal("GameOver") is nicer to read than while(!Signals.HasFired("GameOver")) yield return null, if you ask me, quite apart from the fact that doing it in the engine could be faster than doing it in script.

A couple of non-obvious ramifications

There are a few useful things about all this that people sometimes miss, so I thought I should point them out.

Firstly, yield return is just yielding an expression - any expression - and YieldInstruction is a regular type. This means you can do things like:

YieldInstruction y;
 
if(something)
 y = null;
else if(somethingElse)
 y = new WaitForEndOfFrame();
else
 y = new WaitForSeconds(1.0f);
 
yield return y;

The specific lines yield return new WaitForSeconds(), yield return new WaitForEndOfFrame(), etc, are common, but they're not actually special forms in their own right.

Secondly, because these coroutines are just iterator blocks, you can iterate over them yourself if you want - you don't have to have the engine do it for you. I've used this for adding interrupt conditions to a coroutine before:

IEnumerator DoSomething()
{
  /* ... */
}
 
IEnumerator DoSomethingUnlessInterrupted()
{
  IEnumerator e = DoSomething();
  bool interrupted = false;
  // Stop when interrupted, or when DoSomething() itself has finished
  while(!interrupted && e.MoveNext())
  {
    yield return e.Current;
    interrupted = HasBeenInterrupted();
  }
}

Thirdly, the fact that you can yield on other coroutines can sort of allow you to implement your own YieldInstructions, albeit not as performantly as if they were implemented by the engine. For example:

IEnumerator UntilTrueCoroutine(Func<bool> fn)
{
   while(!fn()) yield return null;
}
 
Coroutine UntilTrue(Func<bool> fn)
{
  return StartCoroutine(UntilTrueCoroutine(fn));
}
 
IEnumerator SomeTask()
{
  /* ... */
  yield return UntilTrue(() => _lives < 3);
  /* ... */
}

however, I wouldn't really recommend this - the cost of starting a Coroutine is a little heavy for my liking.
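
One possible alternative, combining this with the previous point: iterate the helper by hand and pass its yields straight through, so no second coroutine is started. This is a sketch in plain C# with hypothetical names (InlineWaitDemo, Lives); the helper iterator is still allocated, and whether this is measurably cheaper in Unity is untested here.

```csharp
using System;
using System.Collections;

public static class InlineWaitDemo
{
    public static int Lives = 5;   // hypothetical game state

    // Same idea as UntilTrueCoroutine above: yields null until the predicate holds.
    public static IEnumerator UntilTrue(Func<bool> fn)
    {
        while (!fn()) yield return null;
    }

    public static IEnumerator SomeTask()
    {
        // Drive the helper by hand and forward whatever it yields,
        // rather than starting it as a separate coroutine.
        IEnumerator wait = UntilTrue(() => Lives < 3);
        while (wait.MoveNext())
            yield return wait.Current;

        yield return "lives are low";   // stand-in for the rest of the task
    }
}
```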

Conclusion

I hope this clarifies a little of what's really happening when you use a Coroutine in Unity. C#'s iterator blocks are a groovy little construct, and even if you're not using Unity, maybe you'll find it useful to take advantage of them in the same way.