Ghostbusters Challenges: Synchronization Between Threads in the Infernal Engine

Thread synchronization is a complicated problem and rarely discussed in practice. We came to our own conclusions via experimentation and what worked well for us during the production of Ghostbusters.

Ghostbusters used two kinds of synchronization primitives: ‘crude locks’ and ‘critical sections.’ A crude lock is the lowest form of synchronization. It lets one thread sit in a loop until the other thread lets it continue. Here is how a simple implementation of a crude lock class could look in C++:

class CCrudeLock {
volatile long value;
public:
CCrudeLock() { value = 1; }
void lock() {
for (;;) {
if (1 == InterlockedCompareExchangeAcquire(&value,0,1)) {
return;
}
}
}
void unlock() {
value = 1;
}
};

As you can see in the lock code, if one thread tries to acquire the lock and it is busy, it will sit in a tight loop burning CPU time forever until it gets the lock. The InterlockedCompareExchangeAcquire function is an atomic function that will synchronize the access of a variable across multiple threads. In assembly language, it could look like the following:

mov ecx,dword ptr [esp+4] ; Get the address of value into ecx
mov edx,dword ptr [esp+8] ; Get “0” into edx
mov eax,dword ptr [esp+12] ; Get “1” into eax
lock cmpxchg dword ptr [ecx],edx ; Compare the value of eax with the destination, and if equal, write edx into the destination
ret 12 ; eax = return value

So the guts of the InterlockedCompareExchangeAcquire function is the ‘cmpxchg’ instruction with the ‘lock’ prefix, which preserves the order of operations so that more than one processor does not interrupt this single instruction -- it functions atomically.

On a cache coherent multiprocessor, releasing the crude lock is as simple as writing back to the memory location. If another thread is attempting to acquire the lock, the atomic interlocked function will guarantee the order of operations.

The other type of synchronization primitive we used is the standard Windows CRITICAL_SECTION structure. This structure is well documented, although the implementation may not be, so we will discuss how it might work internally. A standard critical section is similar to the crude lock, with the loop being finite: If a certain amount of loop iterations happen and it cannot acquire the lock, then yield the thread for an amount of time and start over.

class CCriticalSection {
volatile int value;
public:
CCriticalSection() { value = 1; }
void lock() {
int loopCount = 0;
for (;;) {
if (1 == InterlockedCompareExchangeAcquire(&value,0,1)) {
return;
}
loopCount++;
if (loopCount > 400) {
Yield();
loopCount = 0;
}
}
void unlock() {
value = 1;
}
};

Note that the loopCount value is local to the lock function, so that more than two threads can attempt to access our critical section at any one time.

When is it appropriate to use a crude lock instead of a critical section?

Take a four-core processor with two threads each, or hyperthreading. Although there are two sets of register contexts for each thread, there is only one execution unit. So if thread one is blocked on a core, thread two can start to execute if possible. If two code threads try to acquire the lock in a crude lock and they are both executing on the same processor, you can have a live lock situation occur.

Ghostbusters was a multiplatform title, with the PC, Xbox 360 and PlayStation 3 supported. For the PC and PS3, we used critical sections to synchronize code. The PC could be hyperthreaded, and the PS3 is hyperthreaded. Although the Xbox 360 is hyperthreaded, you are able to select which thread on which core you can run your code on, so in the end we were able to use the lighter crude lock for thread synchronization because we could guarantee what hardware threads our game would run on.

Note that whenever you have a crude lock or critical section in your code, you need to guarantee that you will be accessing that resource for only a very short amount of time. Staying inside of a critical section of your code for a long time will have negative effects on performance.



Sponsoring tomorrow starts today. Learn More>