Why can't the compiler optimize closure variable by inlining?


I have a Main method like this:

    static void Main(string[] args)
    {
        var b = new byte[1024 * 1024];

        Func<double> f = () =>
        {
            new Random().NextBytes(b);
            // Widen each byte to int before averaging.
            return b.Select(x => (int)x).Average();
        };

        var avg = f();
        Console.WriteLine(avg);
    }

Because I access the local variable b inside the lambda, the compiler creates a class to capture it, and b becomes a field of that class. b then lives as long as the instance of the compiler-generated class, which can cause a memory leak. Even if b goes out of scope (maybe not in this situation, but imagine this code is inside another method rather than Main), the byte array won't be deallocated while the delegate is still referenced.

What I wonder is: since I am not accessing or modifying b anywhere after declaring the Func, why can't the compiler inline that local variable and not bother creating a class? Like this:

    Func<double> f = () =>
    {
        var b = new byte[1024 * 1024];
        new Random().NextBytes(b);
        // Widen each byte to int before averaging.
        return b.Select(x => (int)x).Average();
    };

I compiled this code in both Debug and Release modes; the DisplayClass is generated in both.


Is this just not implemented as an optimization or is there anything I am missing?

 


Is this just not implemented as an optimization or is there anything I am missing?

For the specific example you give, you'd probably not want to make that code transformation because it changes the semantics of the program. If the new expression throws an exception, the original program throws it before the delegate is executed; in your transformation, that side effect is deferred until the delegate is invoked. Whether that's an important property that should be preserved is debatable. (The transformation also creates problems for the debugger; the debugger already must pretend that elements of closure classes are locals of the containing method body, and this optimization might complicate that further.)
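
The ordering difference described above can be made observable with a small sketch. The `Allocate` helper and the `log` list are hypothetical names introduced only for illustration; `Allocate` stands in for the `new byte[...]` expression and records the moment the allocation runs.

```csharp
using System;
using System.Collections.Generic;

public static class AllocationTiming
{
    // Hypothetical helper: records when the allocation happens.
    static byte[] Allocate(List<string> log)
    {
        log.Add("allocate");
        return new byte[16];
    }

    // Original shape: the array is allocated before the delegate is created.
    public static List<string> Eager()
    {
        var log = new List<string>();
        var b = Allocate(log);
        Func<int> f = () => b.Length;
        log.Add("delegate created");
        f();
        log.Add("delegate invoked");
        return log;
    }

    // Proposed transformation: the allocation moves inside the lambda.
    public static List<string> Deferred()
    {
        var log = new List<string>();
        Func<int> f = () => Allocate(log).Length;
        log.Add("delegate created");
        f();
        log.Add("delegate invoked");
        return log;
    }
}
```

`AllocationTiming.Eager()` logs the allocation first, while `AllocationTiming.Deferred()` logs it only after the delegate has been created. If the allocation threw, the exception would surface at a different point in each version.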

However, the more general point is germane. There are many optimizations you can do if you know that a closed-over variable is only used for its value.

When I was on the compiler team -- I left in 2012 -- Neal Gafter and I considered implementing such optimizations, as well as a number of more complex optimizations designed to reduce the likelihood of an expensive object's lifetime being extended too long by accident.

Aside: The simplest of the more complex scenarios is: we have two lambdas converted to delegates; one is stored in a short-lived variable and is closed over a local that contains a reference to an expensive object; one is stored in a long-lived variable and is closed over a local that refers to a cheap object. The expensive object lives as long as the long-lived variable even though it is not used. More generally, multiple closures could be constructed as a partition based on the closed-over relation; at the time we only partitioned closures based on nesting; closures at the same nesting level were one closure. The given scenario is rare and there are obvious workarounds, but it would be nice if it didn't happen at all.
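
The scenario in the aside can be sketched by writing the shared closure by hand. This is only an approximation of what the compiler generates: the `Closure` class, its field names, and the helper methods are hypothetical, and the real generated names differ.

```csharp
using System;

public static class SharedClosureSketch
{
    // Hand-written stand-in for the single compiler-generated closure class
    // shared by two lambdas at the same nesting level. Names are hypothetical.
    public sealed class Closure
    {
        public byte[] expensive;  // used only by the short-lived lambda
        public int cheap;         // used only by the long-lived lambda

        public int ShortLived() => expensive.Length;
        public int LongLived() => cheap + 1;
    }

    // Returns only the long-lived delegate; the short-lived one is dropped.
    public static Func<int> MakeLongLived()
    {
        var closure = new Closure { expensive = new byte[1024 * 1024], cheap = 41 };
        Func<int> shortLived = closure.ShortLived;
        shortLived();
        return closure.LongLived;
    }

    // True if the delegate's target still references the expensive array.
    public static bool StillHoldsExpensive(Func<int> d) =>
        d.Target is Closure c && c.expensive != null;
}
```

Because both methods live on the same `Closure` instance, `StillHoldsExpensive(MakeLongLived())` is true: the long-lived delegate keeps the one-megabyte array reachable even though it never reads it. Partitioning the closure into one class per set of captured variables would avoid that.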

We did not do so because there were more important optimizations and features during the period that we were implementing Roslyn, and we did not want to add risk to an already-long schedule.

We could perform such optimizations confidently because in C# it is pretty easy to know when a local has been aliased, and so you can know for sure whether it is ever written to after the closure is created.
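
A minimal sketch of why the "written to after the closure is created" condition matters. C# lambdas capture variables rather than values, so a later write is visible through the delegate. The class and method names here are hypothetical.

```csharp
using System;

public static class CaptureByVariable
{
    // The local is written after the delegate is created, so the closure must
    // share the variable itself; copying its value at creation would be wrong.
    public static int WrittenAfterCapture()
    {
        int x = 1;
        Func<int> f = () => x;
        x = 2;
        return f();   // the lambda observes the later write
    }

    // The local is never written after the delegate is created, so capturing
    // the value alone would be indistinguishable from capturing the variable.
    public static int NeverWrittenAfterCapture()
    {
        int x = 1;
        Func<int> f = () => x;
        return f();
    }
}
```

`WrittenAfterCapture()` returns 2 and `NeverWrittenAfterCapture()` returns 1. Only the second shape is a candidate for the value-only optimizations discussed above.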

I do not know if those optimizations have been implemented in the meantime; likely not.

I also do not know if the compiler does such optimizations for C# 7 local functions, though I suspect the answer is "yes". See what happens if you try a local function!
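
A sketch of the suggested experiment, with the lambda from the question rewritten as a local function. The class and method names are hypothetical. Current Roslyn versions are generally understood to avoid a heap-allocated closure class when a capturing local function is never converted to a delegate, but that is an implementation detail and an assumption here; inspect the compiled output with a decompiler to confirm it for your compiler version.

```csharp
using System;

public static class LocalFunctionSketch
{
    public static double AverageOfRandomBytes()
    {
        var b = new byte[1024];

        // Local function that captures b but is never converted to a delegate.
        double Average()
        {
            new Random().NextBytes(b);
            long sum = 0;
            foreach (byte value in b) sum += value;
            return (double)sum / b.Length;
        }

        return Average();
    }
}
```

Calling `LocalFunctionSketch.AverageOfRandomBytes()` returns a value between 0 and 255. The interesting part is not the result but whether a DisplayClass appears in the generated code.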
