For vs For Each

I’ve been curious for a while now about how Microsoft’s for each C++ extension performs compared to the traditional for loop. So a couple of weeks ago I dug up an old performance testing framework I’d put together and found out for myself.

The framework works by running a specified piece of code (through a function pointer) a specified number of times. It uses Microsoft’s QueryPerformanceCounter (QPC) to time each run and reports the average time as well as some other stats. A reset function can be specified to reset the data between test runs. I used it to run a number of different for and for each loops on the same data/container to see which one performs best. The full framework can be found here.
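To make the description above concrete, here is a minimal sketch of that kind of harness. It is not the actual framework: every name in it (TestFn, TimingStats, RunTimedTest and the example functions) is made up for illustration, and std::chrono::steady_clock stands in for QPC so that the sketch is not tied to Windows.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is a sketch, not the framework itself.
using TestFn = void (*)();

struct TimingStats {
    double averageMs;
    double minMs;
    double maxMs;
};

// Calls `reset` (untimed, optional) and then `test` (timed) `runs` times,
// and returns the average, minimum and maximum run time in milliseconds.
// std::chrono::steady_clock is an assumption standing in for QPC.
TimingStats RunTimedTest(TestFn test, TestFn reset, std::size_t runs) {
    assert(test != nullptr && runs > 0);
    using Clock = std::chrono::steady_clock;

    double totalMs = 0.0;
    double minMs = 0.0;
    double maxMs = 0.0;
    for (std::size_t run = 0; run < runs; ++run) {
        if (reset != nullptr) {
            reset();  // restore the test data outside the timed region
        }
        const auto start = Clock::now();
        test();
        const auto stop = Clock::now();

        const double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        totalMs += ms;
        if (run == 0 || ms < minMs) minMs = ms;
        if (run == 0 || ms > maxMs) maxMs = ms;
    }
    return {totalMs / static_cast<double>(runs), minMs, maxMs};
}

// Example test data plus a reset/test pair in the style described above.
std::vector<int> g_exampleVector;

void ResetExample() { g_exampleVector.assign(1000, 1); }

void ZeroExample() {
    for (std::size_t i = 0; i < g_exampleVector.size(); ++i)
        g_exampleVector[i] = 0;
}
```

A call such as RunTimedTest(ZeroExample, ResetExample, 1000) would then time the loop 1000 times, with the reset kept out of the measurement.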

I started out by looping through a std::vector<int> and setting each int to 0, see below:

// For Loop, Using Subscript Operator for container access:
for (int i = 0; i < g_iTestItrations; ++i)
    g_aTestVector[i] = 0;

// For Loop, Using Iterators for container access:
// Note that the Iterators are setup for each test by the reset function,
// so the overhead in declaring and initialising iterators is not tested.
for (; g_vItr < g_vEnd; ++g_vItr)
    *g_vItr = 0;

// For Each Loop:
for each (auto Int in g_aTestVector)
    Int = 0;

The test was run 1000 times on a vector containing 10,000,000 ints. The code was compiled with Visual Studio 2012 (Update 2) in Release mode.

The times for these loops are as follows:

            For Loop (Subscript)   For Loop (Iterator)   For Each Loop
Time (ms)   16.767141              24.818655             0.000046

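The for each loop is interesting: it takes a tiny fraction of a millisecond to set more than 38MB of memory to 0. That’s fast! I can only assume that the compiler has some way to optimise the for each loop in this case that it doesn’t have for the other for loops. One likely candidate: auto Int declares a copy of each element, so the assignment never reaches the vector and the optimiser may be free to discard the loop altogether.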
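How much the declaration of the loop variable matters is easy to check in isolation. The sketch below uses the standard range-based for as a portable stand-in for the MSVC for each extension (that swap is my assumption, and the function names are hypothetical). It contrasts a by-value loop variable, as in auto Int, with a by-reference one.

```cpp
#include <cassert>
#include <vector>

// Loop variable is a copy of each element: the vector is left untouched.
void ZeroByValue(std::vector<int>& values) {
    for (auto value : values) {
        value = 0;  // writes to the local copy only
    }
}

// Loop variable is a reference to each element: the vector is overwritten.
void ZeroByReference(std::vector<int>& values) {
    for (auto& value : values) {
        value = 0;
    }
}

// Small self-check showing the difference between the two loops.
void Demonstrate() {
    std::vector<int> values(4, 7);

    ZeroByValue(values);
    assert(values[0] == 7);  // still holds the original value

    ZeroByReference(values);
    assert(values[0] == 0);  // now actually zeroed
}
```

A loop whose body has no observable effect, like the one in ZeroByValue, is the kind of code an optimiser is allowed to remove entirely, which would fit a run time that is effectively zero.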

The other interesting thing to note about this result is how much slower it was to loop through the vector with iterators than with an integer index and the subscript operator. This was a continuing trend across all the vector tests.

In an attempt to change how the compiler optimises the for (each) loops, I changed the tests so that each int is set to a random number from rand() instead of zero. The results for these tests are below:

            For Loop (Subscript)   For Loop (Iterator)   For Each Loop
Time (ms)   157.770657             160.231407            151.305207

The rand() call has significantly increased the time taken for the test loops to complete, but we can see a trend starting to emerge. Once again the for each loop is the fastest, beating the faster of the two for loops by about 4.1%. The for loop using iterators was once again the slowest.
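The percentage here is the time saved relative to the slower loop. A small sketch of that arithmetic, using a hypothetical helper name:

```cpp
#include <cassert>

// Percentage by which the faster time beats the slower one,
// measured relative to the slower time. Hypothetical helper.
double PercentFaster(double slowerMs, double fasterMs) {
    assert(slowerMs > 0.0);
    return (slowerMs - fasterMs) / slowerMs * 100.0;
}
```

With the rand() figures above, PercentFaster(157.770657, 151.305207) comes out at roughly 4.1.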

One thing you’ll note is that the bodies of the loops differ. The for loops need to use the dereference or subscript operator to access the ints in the container, whereas the for each loop does not. Using the Visual Studio 2012 Profiler we can see how much time is spent in the bodies of the different loops as opposed to the loop overhead itself.

[Figure: For (each) Loops - Visual Studio Profiling Results]

Note: all the above results are from a single profiling session.

Not only does the profile confirm our results, it also shows that significantly more time is spent in the loop body for the two for loops than for the for each loop. This means that a significant part of the for each loop’s increased performance comes from faster access to the container. It can be assumed that the more often you access the container in the loop body, the bigger the performance increase offered by the for each loop. If you’re wondering about the missing for each loop in the image above, it’s not there because the profiler never sampled it and has no stats for it (not surprising if it really runs in 0.000046ms).

To confirm that these results were not a fluke, I re-ran the same tests using a std::list<int> as the container instead of a vector. The results are as follows:

Test            For Loop (ms)   For Each Loop (ms)
Set to 0        156.944983      116.675954
Set to rand()   261.813159      235.885277

Despite removing the overhead of declaring and initialising the iterators, the for loop is still slower than the for each loop. This time the difference is about 25.7% for the set to 0 test and 9.9% for the set to rand() test.
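The list versions of the loops are not shown in this post. As a rough guess at their shape only, the sketch below uses hypothetical function names, declares the iterators inside the loop for brevity (the tests set them up in the reset function), and uses the standard range-based for with a by-reference loop variable in place of the MSVC for each, so that the writes actually reach the list.

```cpp
#include <cassert>
#include <list>

// Iterator-based for loop over a list. std::list iterators are not
// random access, so the end test uses != rather than <.
void ZeroWithIteratorFor(std::list<int>& values) {
    for (std::list<int>::iterator it = values.begin(); it != values.end(); ++it)
        *it = 0;
}

// Range-based for, standing in for MSVC's for each (an assumption).
void ZeroWithRangeFor(std::list<int>& values) {
    for (auto& value : values)
        value = 0;
}

// Small self-check: both loops leave every element at zero.
void CheckListLoops() {
    std::list<int> first(5, 3);
    ZeroWithIteratorFor(first);
    assert(first.front() == 0 && first.back() == 0);

    std::list<int> second(5, 3);
    ZeroWithRangeFor(second);
    assert(second.front() == 0 && second.back() == 0);
}
```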

Once again the Visual Studio profiler confirms the results: the for each loops are faster and less time is spent executing the loop body.

[Figure: For (each) loops - Visual Studio Profile Results]

So, in addition to being easier to read and write and less error prone, the for each loop is faster too. It was measurably faster in all the tests I ran, and it should prove significantly faster the more you need to access the container.

Curiosity Satisfied.

