In Swift, how bad is it to declare variables in loops?


I don't know all of Swift's internals, or how it handles variables.

I've always preferred to declare variables before entering a for or while loop, no matter the language, rather than declaring them inside the loop over and over.

But is it really that bad to re-declare variables? Would it affect performance over a very large number of iterations? How, specifically, does Swift handle this?

Example:

    // Version 1: declared inside the loop
    while i < 100 {
        let a = someFunc()
        i += 1
    }

    // Version 2: declared outside the loop
    var a: MyObj
    while i < 100 {
        a = someFunc()
        i += 1
    }


This would not impact performance, and version 1 is highly preferred. Even if it would impact performance, you would need to demonstrate that on your precise code before you would consider any other option but version 1. There are no universal performance answers when dealing with an optimizing compiler. Doing anything unusual "for performance" that you have not deeply explored with your code runs a high likelihood of making things worse. The normal cases are the most optimized cases.

(I know I'm overstating this. There are definitely ways to look at code and say "that's going to be horribly inefficient." And there are some quirky parts of Swift where things that look ok are in fact bad, most notably using + to combine strings, or using pre-Swift4 reduce to create an array. But in the cases that those matter, you're going to discover it really quickly because they're really bad when they matter.)
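To make that parenthetical concrete: the pre-Swift-4 pattern of building an array with reduce and + is quadratic, because each + copies the entire accumulator. This sketch (the names are mine, not from the question) puts the slow pattern next to the reduce(into:) form that replaced it:

```swift
// Quadratic: every `acc + [n * n]` allocates a new array and copies
// the whole accumulator into it, so the loop does O(n^2) work overall.
let slow = (1...1_000).reduce([Int]()) { acc, n in acc + [n * n] }

// Linear: reduce(into:) (Swift 4+) passes the accumulator inout and
// mutates it in place, so each step is amortized O(1).
let fast = (1...1_000).reduce(into: [Int]()) { acc, n in
    acc.append(n * n)
}
```

At a thousand elements the difference is already measurable; at a million it is the difference between instant and minutes. Which is the point: when these quirks matter, they are bad enough that you notice.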

But we don't have to guess about any of this. We can just ask the compiler.

    // inside.swift
    import Foundation

    func runme() {
        var i = 0
        while i < 100 {
            let a = Int.random(in: 0...10)
            print(a)
            i += 1
        }
    }

    // outside.swift
    import Foundation

    func runme() {
        var i = 0
        var a: Int
        while i < 100 {
            a = Int.random(in: 0...10)
            print(a)
            i += 1
        }
    }

First, note that I put these in a function. That's important. Putting them at the top level makes "a" a global in one case, and globals have special handling, including thread-safe initialization, which makes the "outside" case look more expensive and complicated than it would be in more normal usage. (It is very, very hard to correctly test micro-optimizations in such a way that you can draw general "this is faster" conclusions. There are so many factors.)

Second, notice the print. We need to make sure to use "a" in a side-effecty way, or else the optimizer might remove it entirely. print is pretty good for this, even though it's quite complicated. You can also use the result to modify a global, but the compiler could optimize that much more aggressively and might eliminate things we wanted to see. (You really, really have to test this stuff on the actual case you care about.)
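For completeness, here is a sketch of the "modify a global" alternative just mentioned, with the same caveat: the optimizer may see through this more aggressively than it can through print. The name sink is mine:

```swift
import Foundation

var sink = 0  // global accumulator; writing to it is the side effect

func runme() {
    var i = 0
    while i < 100 {
        let a = Int.random(in: 0...10)
        sink &+= a  // using `a` here keeps the work observable
        i += 1
    }
}
```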

Now we can see what Swift is going to do with each of these using swiftc -O -emit-sil. That -O is critical. So many people try to do performance testing without turning on the optimizer, and those results are beyond meaningless.
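Concretely, assuming the two files are saved as inside.swift and outside.swift, the comparison looks something like this. Diffing the SIL output is an easy way to check for yourself that the two versions compile to essentially the same thing:

```shell
swiftc -O -emit-sil inside.swift > inside.sil
swiftc -O -emit-sil outside.swift > outside.sil

# With -O, the two should differ only in trivia such as the
# debug_value hint discussed below.
diff inside.sil outside.sil
```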

So what's the SIL look like? (Swift Intermediate Language. This is the first big step towards turning your program into machine code. If two things generate the same SIL, they're going to generate the same machine code.)

The SIL is a little long (8000 lines), so I'm going to trim it a bit. My comments are in <>. This is going to get a little tedious, because exploring this stuff is very nitpicky. If you want to skip it, the TL;DR is: there's no difference between these two pieces of code. Not "a small difference that won't matter." Literally (except for a hint to the debugger), no difference.

    // runme()
    sil hidden @$S4main5runmeyyF : $@convention(thin) () -> () {
    bb0:
      ... <define a bunch of variables and function calls> ...

    <compute the random number and put it in %29>
    // %19                                            // user: %49
    bb1(%19 : $Builtin.Int64):                        // Preds: bb5 bb0
      %20 = alloc_stack $SystemRandomNumberGenerator  // users: %23, %30, %21
      store %2 to %20 : $*SystemRandomNumberGenerator // id: %21
      br bb2                                          // id: %22

    bb2:                                              // Preds: bb3 bb1
      %23 = apply %6<SystemRandomNumberGenerator>(%20, %5) : $@convention(method) <τ_0_0 where τ_0_0 : RandomNumberGenerator> (@inout τ_0_0, @thin UInt.Type) -> UInt // user: %24
      %24 = struct_extract %23 : $UInt, #UInt._value  // users: %28, %25
      %25 = builtin "cmp_ult_Int64"(%24 : $Builtin.Int64, %4 : $Builtin.Int64) : $Builtin.Int1 // user: %26
      cond_br %25, bb3, bb4                           // id: %26

    bb3:                                              // Preds: bb2
      br bb2                                          // id: %27

    bb4:                                              // Preds: bb2
      %28 = builtin "urem_Int64"(%24 : $Builtin.Int64, %3 : $Builtin.Int64) : $Builtin.Int64 // user: %29
      %29 = struct $Int (%28 : $Builtin.Int64)        // users: %42, %31
      dealloc_stack %20 : $*SystemRandomNumberGenerator // id: %30

      < *** Note that %29 is called "a" *** >
      debug_value %29 : $Int, let, name "a"           // id: %31

      ... < The print call. This is a lot more code than you think it is... > ...

      < Add one to i and check for overflow >
      %49 = builtin "sadd_with_overflow_Int64"(%19 : $Builtin.Int64, %8 : $Builtin.Int64, %13 : $Builtin.Int1) : $(Builtin.Int64, Builtin.Int1) // users: %51, %50
      %50 = tuple_extract %49 : $(Builtin.Int64, Builtin.Int1), 0 // users: %55, %53
      %51 = tuple_extract %49 : $(Builtin.Int64, Builtin.Int1), 1 // user: %52
      cond_fail %51 : $Builtin.Int1                   // id: %52

      < Loop if i < 100 >
      %53 = builtin "cmp_slt_Int64"(%50 : $Builtin.Int64, %1 : $Builtin.Int64) : $Builtin.Int1 // user: %54
      cond_br %53, bb5, bb6                           // id: %54

    bb5:                                              // Preds: bb4
      br bb1(%50 : $Builtin.Int64)                    // id: %55

    bb6:                                              // Preds: bb4
      %56 = tuple ()                                  // user: %57
      return %56 : $()                                // id: %57
    } // end sil function '$S4main5runmeyyF'

The "outside" code is almost identical. What's different? See the *** in the code above marking the debug_value call? That line is missing in "outside," because "a" is defined there as a function variable rather than a block variable.

Know what's missing in both of these? An alloc_stack call for "a". It's an integer; it can fit in a register. It's up to the lower level compiler whether it's stored in a register or the stack. The optimizer sees that "a" doesn't escape this region of the code, so it includes a hint for the debugger, but it doesn't actually bother to demand storage for it, not even on the stack. It can just take the return register of Random and move it to the parameter register for print. It's up to LLVM and its optimizer to decide all this.

The lesson from all this is that it literally doesn't matter for performance. In obscure cases where it might matter (such as when a is a global), version 1 would be more efficient, which I assume is the opposite of what you were expecting.

