Wednesday, October 15, 2014

Benchmarking

One of the most common things that are done in game dev is looping over objects to do repetitive task every frame of the game. It is usually here where the bottle necks of game become evident. This happens to be more the case with Will of the Wisp, a game that's main mechanic is based on controlling swarms. The wrong algorithm for a minion or allocating more memory (increasing the Garbage Collector task) than necessary could lead to huge performance spikes that will devastate playability. In order to help swarm performance, two generic aspects of the swarm were tested, how C# allocates memory when using operators over inline methods to manipulate vectors. The other will be how the for loop vs. for each loop effects overall performance and if the iterator is too much overhead.

C# Memory Allocation

There are two ways to manipulate a vector. The first is to use operators which are static overloads that take in both vectors and return a new vector. It seems with all the use of creating new vectors, that it may allocate more memory in the heap than should be necessary. This will create a performance spike when the garbage collector deallocates the extraneous memory. An alternative would be to use methods to update the vectors that would have been updated with the operators. This would stop the extraneous allocation which should reduce both the overhead of initializing new memory and the cost of the garbage collector. 

Test that uses Vector operators:
var object1Pos = new Vector2(10, 5); 
var object2Pos = new Vector2(-5, 3); 
Vector2 object1Trajectory = (object2Pos - object1Pos).normalized; 
object1Trajectory *= 8; object1Pos += object1Trajectory;

Test that uses Vector methods:
var object1Pos = new Vector2(10, 5); 
var object2Pos = new Vector2(-5, 3); 
var object1Trajectory = new Vector2(object2Pos); 
object1Trajectory.Add(object1Pos * -1); 
object1Trajectory.Normalize(); 
object1Trajectory.Scale(8); 
object1Pos.Add(object1Trajectory);

The code above were used in the benchmarking tests that were able to get the average computational time while eliminating the testing overhead costs and initial performance cost of loading libraries. The benchmarking test also measured how much memory was allocated in the heap and how much would be collected by the garbage collector.


The memory is in kilobytes and the time is in milliseconds. As can be seen from the tests, using methods is 28% more efficient than operators, though the increase isn't that significant. This was to be expected since the methods don't have to deal with the overhead of allocation. What went against my initial hypothesis was that the amount of memory allocated to the heap was the same in both operator and method algorithms. Looking deeper into how C# compiles, the use of 'new' doesn't always allocate the object to the heap and set the variable to that memory address. The compiler will allocate local objects to the stack where deallocation isn't a concern. The take away is that even though using methods to update the vector will provide some increase in performance, the garbage collector will do the same amount of work which is more of a bottle neck than the overhead of allocating stack memory. The operator code above is more readable than the method code below it, so for the meantime it will be better to use vector operators over methods. If the performance of the code dictates that the bottleneck is really vector operators, then the code may be changed to use methods over the operators.

For vs. Foreach

When iterating over lists or array, there are two options: for or for each loop. The for each loop is faster to write and more readable; however, at first look, there is overhead cost of generating and iterating using an iterator which needs an IEnumerable. For each loops also has restrictions on deleting objects or replacing the object in the iterator. That's why it seems that a for loop which is properly optimized to only access memory once each iteration would be a better choice in regards to performance.

There will be five tests that will iterate over a list of vectors and an array of vectors. The simple loop will only call the object once while doing a simple method call. The complex loop will do several calls to the object and more advanced calculations. The for loop will test two different kinds of complex loops. The first kind will be when the object is called, it will grab the object from the array each time. The second time will save the object in a temp variable while the object is manipulated.

Simple For Loop:
for (int index = 0; index < list.Count; index++)
{
    list[index].Normalize();
}


Simple For Each Loop:
foreach (Vector2 vector in list)
{
    vector.Normalize();
}


Complex For Loop (1):
Vector2 position = new Vector2(10, 10);
for (int index = 0; index < list.Count; index++)
{
    list[index].Normalize();
    list[index] *= 10;
    position += list[index];
}


Complex For Loop (2):
Vector2 position = new Vector2(10, 10);
for (int index = 0; index < list.Count; index++)
{
    Vector2 vector = list[index];
    vector.Normalize();
    vector *= 10;
    position += vector;
}


Complex For Each Loop:
Vector2 position = new Vector2(10, 10);
foreach (Vector2 vector in list)
{
    vector.Normalize();
    vector.Scale(10);
    position += vector;
}


Each method was tested twice using a List and an Array. The size of the list grew by a factor of two until the final loop was over an array size of one million.

The time it takes to iterate each test for the list.

The time it takes to iterate each test for the array

As is seen from above, there is no determinable difference between a simple loop for both a for loop and a for each loop; however, there is a slight improvement of the performance of looping over an array over a list. Also the for loop has a slight advantage over the for each loop because the for each loop has an overhead cost that outweighs the cost of the basic operation. These overhead costs are not significant enough to be a bottleneck, so the for each loop for a simple loop will provide clarity and readability.

The surprise is in the complex loop. Unsurprisingly, the second complex for loop did much worse than the second complex for loop because the first for loop has to make more reads from memory which does impact performance. The surprise comes that the for each loop did better than the for loop over the long run. The overhead cost of the for each loop didn't have the loss in performance as my original hypothesis and the observations of other programmers online when talking about performance in Unity. The for each loop does have a slight advantage when iterating over an array than the List. I would have loved to do the performance and calculations on a GameObject to get a more accurate view of how the loop would act in game. 

Based on the calculations, the for each loop would be the better loop to go with for both performance and readability. Where it loses in performance, the loss isn't significant enough to be a bottleneck. 








No comments:

Post a Comment