Why is the short-circuit logical 'and' operator not used when comparing two nullables for equality?


I have a method which compares two nullable ints and prints the result of the comparison to the console:

    static void TestMethod(int? i1, int? i2)
    {
        Console.WriteLine(i1 == i2);
    }

And this is the result of its decompilation:

    private static void TestMethod(int? i1, int? i2)
    {
        int? nullable = i1;
        int? nullable2 = i2;
        Console.WriteLine((nullable.GetValueOrDefault() == nullable2.GetValueOrDefault()) & (nullable.HasValue == nullable2.HasValue));
    }

The result is more or less what I expected but I wonder why the non-short-circuit version of the logical 'and' operator (&) is used instead of the short-circuit version (&&). It seems to me that the latter would be more efficient - if one side of the comparison is already known to be false then there is no need to evaluate the other side. Is the & operator necessary here or is this just an implementation detail that is not important enough to bother about?
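For reference, the decompiled expression can be checked against the built-in `==` by hand. The following is only a sketch: the class and method names (`NullableEqualityDemo`, `LoweredEquals`, `AgreesOnSamples`) are made up for illustration, and the sample values are arbitrary. It shows why both halves of the expression are needed: two nulls have equal default values and equal `HasValue`, while null and 0 have equal default values but different `HasValue`.

```csharp
using System;

public static class NullableEqualityDemo
{
    // Hand-written copy of the expression from the decompiled method above.
    public static bool LoweredEquals(int? a, int? b)
    {
        return (a.GetValueOrDefault() == b.GetValueOrDefault())
             & (a.HasValue == b.HasValue);
    }

    // Returns true if the lowered form agrees with the built-in lifted ==
    // for every pair drawn from a small, arbitrary set of sample values.
    public static bool AgreesOnSamples()
    {
        int?[] samples = { null, 0, 1, -1, int.MaxValue };
        foreach (int? a in samples)
        {
            foreach (int? b in samples)
            {
                if (LoweredEquals(a, b) != (a == b))
                {
                    return false;
                }
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(LoweredEquals(null, null)); // True: both sides are null
        Console.WriteLine(LoweredEquals(null, 0));    // False: same default value, different HasValue
        Console.WriteLine(AgreesOnSamples());         // True
    }
}
```

Neither operand of the `&` calls anything with side effects, so swapping in `&&` would not change any of these results; only the generated code would differ.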

 



That's a great question.

First off, I worked on the code generators for both the pre-Roslyn nullable lowering code, and for the original implementation in Roslyn. This is some tricky code with a lot of opportunities for mistakes and missed optimizations. I wrote a long series of blog articles on how the Roslyn nullable lowering optimizer works, which begins here:

https://ericlippert.com/2012/12/20/nullable-micro-optimizations-part-one/

If this subject interests you, this series of articles will probably be a big help. Part three is particularly germane because it discusses a related question: for nullable arithmetic, do we generate (x.HasValue & y.HasValue) ? new int?(x.Value + y.Value) : new int?(), or use &&, or use GetValueOrDefault, or what? (The answer, of course, is that I tried all of them and picked the one that makes the fastest, smallest code.) The series, however, does not consider your specific question, which is about nullable equality. Nullable equality has slightly different rules than ordinary lifted arithmetic.
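To make the candidate shapes for lifted addition concrete, here is a hand-written sketch of two of them. It is an illustration only, not the compiler's actual output, and the names `LiftedAdditionDemo`, `AddEager` and `AddShortCircuit` are invented for this example. Both methods should return the same result for every input; they differ only in the code they compile to.

```csharp
using System;

public static class LiftedAdditionDemo
{
    // Eager shape: both HasValue reads always happen, then a single branch
    // selects the result. GetValueOrDefault() reads the stored value without
    // the null check that the Value property performs.
    public static int? AddEager(int? x, int? y)
    {
        return (x.HasValue & y.HasValue)
            ? new int?(x.GetValueOrDefault() + y.GetValueOrDefault())
            : new int?();
    }

    // Short-circuit shape: y.HasValue is read only when x.HasValue is true,
    // at the cost of an extra branch. Value checks HasValue again and would
    // throw InvalidOperationException on a null operand.
    public static int? AddShortCircuit(int? x, int? y)
    {
        return (x.HasValue && y.HasValue)
            ? new int?(x.Value + y.Value)
            : new int?();
    }

    public static void Main()
    {
        Console.WriteLine(AddEager(2, 3));                   // 5
        Console.WriteLine(AddEager(null, 3).HasValue);       // False
        Console.WriteLine(AddShortCircuit(2, null).HasValue); // False
    }
}
```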

Of course, I have not been at Microsoft since 2012, and they may have changed it since then; I wouldn't know. (UPDATE: Looking at the linked issue in the comments above, it seems likely that I missed an optimization in my original implementation in 2011, and that this was fixed in 2017.)

To answer your specific question: the problem with && is that it is more expensive than & when the work being done on the right side of the operator is cheaper than the test-and-branch. Test-and-branch is not just more instructions. It is, obviously, a branch, which has lots of knock-on effects. At the processor level, branches require branch prediction, and branches can be predicted wrong. Branches mean more basic blocks, and remember, the jit optimizer runs at run time, which means the jit optimizer has to be fast. It is allowed to say "too many basic blocks in this method, I'm going to skip some optimizations", and so maybe adding more basic blocks unnecessarily is a bad thing.

To make a long story short, the C# compiler will generate "and" operations as eager if the right side has no side effect and the compiler believes that evaluating left & right will be faster and shorter than evaluating left ? right : false. Often evaluating right is so cheap that the branch costs more than just doing the eager arithmetic.

You can see this in areas other than the nullable optimizer; for example, if you have

    bool x = X();
    bool y = Y();
    bool z = x && y;

then that will be generated as z = x & y because the compiler knows that there is no expensive operation to be saved; Y() has already been called.
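The "no side effect" condition is what makes this substitution safe. As a rough sketch of the difference, the example below uses a hypothetical helper, `CountedTrue`, that counts its own calls; the class name and the other method names are likewise invented for illustration. When the right operand is a call like this, && and & are observably different, so a compiler could not swap one for the other. When both operands are plain locals, the two operators always produce the same value.

```csharp
using System;

public static class EagerAndDemo
{
    private static int calls;

    // Hypothetical helper with a visible side effect: it counts how often it runs.
    private static bool CountedTrue()
    {
        calls++;
        return true;
    }

    // Number of times the right operand runs when && is used.
    public static int CallsUsingShortCircuit(bool left)
    {
        calls = 0;
        bool unused = left && CountedTrue();
        return calls;
    }

    // Number of times the right operand runs when & is used.
    public static int CallsUsingEager(bool left)
    {
        calls = 0;
        bool unused = left & CountedTrue();
        return calls;
    }

    // With two locals there is nothing left to skip, so && and & always agree.
    public static bool LocalsAgree()
    {
        bool[] values = { false, true };
        foreach (bool x in values)
        {
            foreach (bool y in values)
            {
                if ((x && y) != (x & y))
                {
                    return false;
                }
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(CallsUsingShortCircuit(false)); // 0: right side skipped
        Console.WriteLine(CallsUsingEager(false));        // 1: right side always runs
        Console.WriteLine(LocalsAgree());                 // True
    }
}
```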
