Paradox: Why is yield return faster than list here

  • A+
Category:Languages

People have proven countless times, that yield return is slower than list.

Example: Is 'yield return' slower than "old school" return?

However when I tried, a benchmark, I got the opposite results:

Results: TestYield: Time =1.19 sec TestList : Time =4.22 sec 

Here, List is 400% slower. This happens regardless size. This makes no sense.

IEnumerable<int> CreateNumbers() //for yield {     for (int i = 0; i < Size; i++) yield return i; }  IEnumerable<int> CreateNumbers() //for list {     var list = new List<int>();     for (int i = 0; i < Size; i++) list.Add(i);     return list; } 

Here is how I consume them:

foreach (var value in CreateNumbers()) sum += value; 

I use all the correct benchmark rules to avoid conflicting results so this is not the issue.

If you see the underlying code, yield return is a state machine abomination, yet it is faster. Why?

Edit: All answers replicated that indeed Yield is faster than list.

New Results With Size set on constructor: TestYield: Time =1.001 TestList: Time =1.403 From a 400% slower difference, down to 40% slower difference. 

However, the insights are mind breaking. It means that all those programmers from 1960 and later that used list as the default collection were wrong and should have been shot (fired), because they didn't use the best tool for the situation (yield).

The answers argued that yield should be faster because it is not materialized.

1) I do not accept this logic. Yield has internal logic behind the scene, it is not a "theoretical model" but a compiler construct. Therefore it automatically materialises on consumption. I do not accept the argument that it "didn't materialise", since the cost is already paid on USE.

2) If a boat can travel by sea, but an old woman can't, you cannot demand the boat to "move by land". As you did here with the list. If a list requires materialization, and yield doesn't, that is not a "problem of yield" but instead a "feature". Yield should not be penalized in the test, just because it has more uses.

3) What i am arguing here is that the purpose of the test was to find the "Fastest collection" to consume / return results returned by a method if you know that the ENTIRE SET will be consumed.

Does yield become the new "De facto standard" for returning list arguments from methods.

Edit2: if i use pure inline array, it obtains the same performance as a Yield.

Test 3: TestYield: Time =0.987 TestArray: Time =0.962 TestList: Time =1.516  int[] CreateNumbers() {     var list = new int[Size];     for (int i = 0; i < Size; i++) list[i] = i;     return list; } 

Therefore, yield is automatically inlined into an array. List isn't.

 


If you measure the version using yield without materializing the list, it will have an advantage over the other version as it won't have to allocate and resize a large list (as well as trigger GC).

Based on your edit I would like to add the following:

However, keep in mind that semantically you're looking at two different methods. One produces a collection. It is finite in size, you can store references to the collection, change its elements, and share it.

The other produces a sequence. It is potentially unbounded, you get a new copy each time you iterate over it, and there may or may not be a collection behind it.

They are not the same thing. The compiler doesn't create a collection to implement a sequence. If you implement a sequence by materializing a collection behind the scenes you will see similar performance as the version that uses a list.

BenchmarkDotNet doesn't allow you to time deferred execution by default so you have to construct a test that consumes the methods which is what I have done below. I ran this through BenchmarkDotNet and got the following.

       Method |     Mean |    Error |   StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op | ------------- |---------:|---------:|---------:|------------:|------------:|------------:|--------------------:|  ConsumeYield | 475.5 us | 7.010 us | 6.214 us |           - |           - |           - |                40 B |   ConsumeList | 958.9 us | 7.271 us | 6.801 us |    285.1563 |    285.1563 |    285.1563 |           1049024 B | 

Notice the allocations. For some scenarios this could make a difference.

We can offset some of the allocations by allocating the correct size list, but ultimately this is not an apples to apples comparison. Numbers below.

       Method |     Mean |     Error |    StdDev | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op | ------------- |---------:|----------:|----------:|------------:|------------:|------------:|--------------------:|  ConsumeYield | 470.8 us |  2.508 us |  2.346 us |           - |           - |           - |                40 B |   ConsumeList | 836.2 us | 13.456 us | 12.587 us |    124.0234 |    124.0234 |    124.0234 |            400104 B | 

Code below.

[MemoryDiagnoser] public class Test {     static void Main(string[] args)     {         var summary = BenchmarkRunner.Run<Test>();     }      public int Size = 100000;      [Benchmark]     public int ConsumeYield()     {         var sum = 0;         foreach (var x in CreateNumbersYield()) sum += x;         return sum;     }      [Benchmark]     public int ConsumeList()     {         var sum = 0;         foreach (var x in CreateNumbersList()) sum += x;         return sum;     }      public IEnumerable<int> CreateNumbersYield() //for yield     {         for (int i = 0; i < Size; i++) yield return i;     }      public IEnumerable<int> CreateNumbersList() //for list     {         var list = new List<int>();         for (int i = 0; i < Size; i++) list.Add(i);         return list;     } } 

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: