Why is typeA == typeB slower than typeA == typeof(TypeB)?


I've been optimising/benchmarking some code recently and came across this method:

public void SomeMethod(Type messageType)
{
    if (messageType == typeof(BroadcastMessage))
    {
        // ...
    }
    else if (messageType == typeof(DirectMessage))
    {
        // ...
    }
    else if (messageType == typeof(ClientListRequest))
    {
        // ...
    }
}

This is called from a performance-critical loop elsewhere, so I naturally assumed all those typeof(...) calls were adding unnecessary overhead (a micro-optimisation, I know) and could be moved to private fields within the class. (I'm aware there are better ways to refactor this code; however, I'd still like to know what's going on here.)

According to my benchmark this isn't the case at all (using BenchmarkDotNet).

[DisassemblyDiagnoser(printAsm: true, printSource: true)]
[RyuJitX64Job]
public class Tests
{
    private Type a = typeof(string);
    private Type b = typeof(int);

    [Benchmark]
    public bool F1()
    {
        return a == typeof(int);
    }

    [Benchmark]
    public bool F2()
    {
        return a == b;
    }
}

Results on my machine (Windows 10 x64, .NET 4.7.2, RyuJIT, Release build):

[Benchmark results table for F1 and F2; image not preserved]

The functions compiled down to ASM:

F1

mov     rcx,offset mscorlib_ni+0x729e10
call    clr!InstallCustomModule+0x2320
mov     rcx,qword ptr [rsp+30h]
cmp     qword ptr [rcx+8],rax
sete    al
movzx   eax,al

F2

mov     qword ptr [rsp+30h],rcx
mov     rcx,qword ptr [rcx+8]
mov     rdx,qword ptr [rsp+30h]
mov     rdx,qword ptr [rdx+10h]
call    System.Type.op_Equality(System.Type, System.Type)
movzx   eax,al

I don't know how to interpret ASM, so I am unable to understand the significance of what's happening here. In a nutshell, why is F1 faster?

 


The assembly you posted shows that the comment of mjwills is, as expected, correct. As the linked article notes, the jitter can be smart about certain comparisons, and this is one of them.

Let's look at your first fragment:

mov     rcx,offset mscorlib_ni+0x729e10 

rcx is the "this pointer" of a call to a member function (on x64 Windows, the first argument of any call is passed in rcx). The "this pointer" in this case will be the address of some CLR pre-allocated object; what exactly, I do not know.

call    clr!InstallCustomModule+0x2320 

Now we call some member function on that object; I don't know what. The nearest public function that you have debug info for is InstallCustomModule, but plainly we are not calling InstallCustomModule here; we're calling the function that is 0x2320 bytes away from InstallCustomModule.

It would be interesting to see what the code at InstallCustomModule+0x2320 does.

Anyways, we make the call, and the return value goes in rax. Moving on:

mov     rcx,qword ptr [rsp+30h]
cmp     qword ptr [rcx+8],rax

This looks like it is fetching the value of a out of this and comparing it to whatever the function returned.

The rest of the code is just perfectly ordinary: moving the bool result of the comparison into the return register.

In short, the first fragment is equivalent to:

return ReferenceEquals(SomeConstantObject.SomeUnknownFunction(), this.a); 

Obviously an educated guess here is that the constant object and the unknown function are special-purpose helpers that rapidly fetch commonly-used type objects like typeof(int).

A second educated guess is that the jitter is deciding for itself that the pattern "compare a field of type Type to a typeof(something)" can best be made as a direct reference comparison between objects.

And now you can see for yourself what the second fragment does. It is just:

return Type.op_Equality(this.a, this.b); 

All it does is call a helper method that compares two types for value equality. Remember, the CLR does not guarantee reference equality for all equivalent type objects.

Now it should be clear why the first fragment is faster. The jitter knows hugely more about the first fragment. It knows, for instance, that typeof(int) will always return the same reference, and so you can do a cheap reference comparison. It knows that typeof(int) is never null. It knows the exact type of typeof(int) -- remember, Type is not sealed; you can make your own Type objects.
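The facts listed above can be checked from ordinary C#. This is only a small sketch: the class and method names (TypeofFactsDemo, SameReference, and so on) are made up for illustration and do not come from the question or from the CLR.

```csharp
using System;

public static class TypeofFactsDemo
{
    // Evaluating typeof(int) twice yields the very same runtime Type object,
    // which is what makes a plain reference comparison safe.
    public static bool SameReference()
    {
        return ReferenceEquals(typeof(int), typeof(int));
    }

    // typeof(...) never produces a null reference.
    public static bool IsNeverNull()
    {
        return !ReferenceEquals(typeof(int), null);
    }

    // System.Type is abstract and not sealed, so user code may derive from it.
    public static bool TypeIsUnsealed()
    {
        return typeof(Type).IsAbstract && !typeof(Type).IsSealed;
    }

    public static void Main()
    {
        Console.WriteLine(SameReference());   // True
        Console.WriteLine(IsNeverNull());     // True
        Console.WriteLine(TypeIsUnsealed());  // True
    }
}
```

None of this is available to the jitter when both operands are arbitrary fields of type Type, which is the situation in F2.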

In the second fragment, the jitter knows nothing other than that it has two operands of type Type. It doesn't know their runtime types, and it doesn't know their nullity; for all it knows, you subclassed Type yourself and made up two instances that are reference-unequal but value-equal. It has to fall back to the most conservative position and call a helper method that starts going down the list: Are they both null? Is one of them null and the other non-null? Are they reference equal? And so on.
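To make that checklist concrete, here is a hypothetical helper that walks through the same questions in the same order. It is an illustration only, not the CLR's actual Type.op_Equality; the names ConservativeEquality and TypesEqual are invented for this sketch.

```csharp
using System;

public static class ConservativeEquality
{
    // Hypothetical sketch of a "know nothing about the operands" comparison.
    // This is NOT the real implementation of Type.op_Equality.
    public static bool TypesEqual(Type left, Type right)
    {
        bool leftIsNull = ReferenceEquals(left, null);
        bool rightIsNull = ReferenceEquals(right, null);

        // Are they both null?
        if (leftIsNull && rightIsNull) return true;

        // Is one of them null and the other non-null?
        if (leftIsNull || rightIsNull) return false;

        // Are they reference equal?
        if (ReferenceEquals(left, right)) return true;

        // Otherwise defer to the virtual Equals, which a user-defined
        // subclass of Type is free to override.
        return left.Equals(right);
    }

    public static void Main()
    {
        Console.WriteLine(TypesEqual(typeof(int), typeof(int)));    // True
        Console.WriteLine(TypesEqual(typeof(int), typeof(string))); // False
        Console.WriteLine(TypesEqual(null, typeof(int)));           // False
    }
}
```

Every branch here is work that the reference comparison in F1 skips entirely.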

It looks like lacking that knowledge is costing you the enormous penalty of... half a nanosecond. I wouldn't worry about it.
