Testing string length generates more code than comparing to empty string?


In Delphi, string <> '' seems to generate less code than Length(string) > 0.

Comparing for empty string, defined in TMyClass.UpdateString(const strMyString : String):

MyClassU.pas.31: begin
005CE6A0 55               push ebp
005CE6A1 8BEC             mov ebp,esp
005CE6A3 83C4F8           add esp,-$08
005CE6A6 8955F8           mov [ebp-$08],edx
005CE6A9 8945FC           mov [ebp-$04],eax
MyClassU.pas.32: if (strMyString <> '') then
005CE6AC 837DF800         cmp dword ptr [ebp-$08],$00
005CE6B0 740E             jz $005ce6c0

As I understand it, this is comparing the address of the dynamically allocated string ([ebp-$08]) to zero. Makes sense, since empty strings point to nil.

Comparing for length, defined in TMyClass.UpdateString2(const strMyString : String):

MyClassU.pas.25: begin
005CE664 55               push ebp
005CE665 8BEC             mov ebp,esp
005CE667 83C4F4           add esp,-$0c
005CE66A 8955F8           mov [ebp-$08],edx
005CE66D 8945FC           mov [ebp-$04],eax
005CE670 8B45F8           mov eax,[ebp-$08]
MyClassU.pas.26: if (Length(strMyString) > 0) then
005CE673 8945F4           mov [ebp-$0c],eax
005CE676 837DF400         cmp dword ptr [ebp-$0c],$00
005CE67A 740B             jz $005ce687
005CE67C 8B45F4           mov eax,[ebp-$0c]
005CE67F 83E804           sub eax,$04
005CE682 8B00             mov eax,[eax]
005CE684 8945F4           mov [ebp-$0c],eax
005CE687 837DF400         cmp dword ptr [ebp-$0c],$00
005CE68B 7E0E             jle $005ce69b

What? Shouldn't it just be cmp dword ptr [ebp-$04],$00, as the string length is stored at offset -$04 within the string?

My guess is it's because optimizations were off and the compiler did not optimize Length (which boils down to PInteger(PByte(S) - 4)^), but I don't understand why there are two comparisons. In fact, both comparisons are present even with optimizations turned on:

MyClassU.pas.27: if (Length(strMyString) > 0) then
005CE6B1 8BC6             mov eax,esi
005CE6B3 85C0             test eax,eax
005CE6B5 7405             jz $005ce6bc
005CE6B7 83E804           sub eax,$04
005CE6BA 8B00             mov eax,[eax]
005CE6BC 85C0             test eax,eax
005CE6BE 7E0A             jle $005ce6ca

vs

MyClassU.pas.33: if (strMyString <> '') then
005CE6D9 85F6             test esi,esi
005CE6DB 740A             jz $005ce6e7

 


The second block of code does more work, and not surprisingly that takes more code.

In the first block of code you simply compare against the empty string. The compiler knows that is equivalent to comparing the pointer against nil and generates that code.
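As a rough illustration of that equivalence (a sketch, not the compiler's actual output), the hypothetical helper IsEmptyByPointer below looks only at the pointer behind the string. It assumes a long string type, where casting the string to Pointer yields the address of its character data, or nil when the string is empty:

```pascal
program EmptyIsNil;
{$APPTYPE CONSOLE}

// Hypothetical helper, not part of the RTL: reports whether S is empty
// by inspecting only the pointer stored in the string variable.
function IsEmptyByPointer(const S: string): Boolean;
begin
  Result := Pointer(S) = nil;
end;

begin
  Writeln(IsEmptyByPointer(''));     // empty string: pointer is nil
  Writeln(IsEmptyByPointer('abc'));  // non-empty string: pointer is not nil
end.
```

This is the single test that the test esi,esi / jz pair in the question performs.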

The second block of code first obtains the length of the string. That involves checking whether the pointer is nil. If it is, then the length is zero. Otherwise the length is read from the string metadata record. Only then is the result compared against zero, which is the second comparison you see.

The compiler simply does not know that whenever the pointer is not nil, the length must be positive, so it cannot collapse the two tests into one.

As for why Length doesn't read from the string record directly, that should be obvious now. An empty string is implemented as the nil pointer and so has no string record. In order to find the length you need to deal with two different cases:

  1. String is empty, length is 0.
  2. String is not empty, length is read from the string record.
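Those two cases can be sketched by hand as follows. LengthByHand is a hypothetical name, not an RTL routine, and the sketch assumes the layout described in the question (a 32-bit length field at offset -4 from the character data) as well as a Delphi version that allows arithmetic on PByte (Delphi 2009 or later, hence the POINTERMATH directive). It is meant to mirror the shape of the generated code, not to replace Length:

```pascal
program LengthSketch;
{$APPTYPE CONSOLE}
{$POINTERMATH ON}

// Hypothetical re-implementation of what Length has to do for a long string.
function LengthByHand(const S: string): Integer;
begin
  if Pointer(S) = nil then
    // Case 1: empty string, there is no string record to read.
    Result := 0
  else
    // Case 2: read the length field stored 4 bytes before the character data.
    Result := PInteger(PByte(S) - 4)^;
end;

begin
  Writeln(LengthByHand(''));
  Writeln(LengthByHand('abc'));
end.
```

Writing if LengthByHand(S) > 0 then adds the second comparison on top of the nil check, which matches the two tests in the disassembly.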
