
I need to speed up my script. It contains loops like this:

```r
DT <- data.frame(Index = 1:20, A = c(10:29))
cost1 <- 3
cost2 <- 0.05
cost3 <- 50
DT$S[1] <- cost1
for (j in 2:20) {
  DT$S[j] <- DT$S[j - 1] - cost3 + DT$S[j - 1] * cost2 / 12
}
```

Here cost1, cost2 and cost3 are constants. Is it possible to avoid writing the loop?

The main problem with your approach is that you repeatedly read and assign elements of a data.frame column (`DT$S`), which is not needed for this calculation. If we do the work on a plain vector and add the result to the data.frame at the end, it is much faster. We can also simplify the formula.

```r
n <- 1e4
DT <- data.frame(Index = 1:n, A = seq(10, by = 1, length.out = n))
cost1 <- 3
cost2 <- 0.05
cost3 <- 50

your <- function() {
  # DT is modified on a local copy inside the function, so return the column
  DT$S[1] <- cost1
  for (j in 2:n) {
    DT$S[j] <- DT$S[j - 1] - cost3 + DT$S[j - 1] * cost2 / 12
  }
  DT$S
}
DT$S <- your()
```

My function:

```r
my <- function() {
  cc <- (1 + cost2/12)
  r <- vector('numeric', length = n)
  r[1] <- cost1
  for (j in 2:(n)) {
    # original formula:
    # r[j] <- r[j - 1] - cost3 + r[j - 1] * cost2/12
    r[j] <- r[j - 1] * cc - cost3
  }
  r
}
DT$S2 <- my()

all.equal(DT$S, DT$S2)
# [1] TRUE

microbenchmark::microbenchmark(your(), my(), times = 2)
# Unit: milliseconds
#    expr        min         lq      mean    median         uq        max neval cld
#  your() 487.229621 487.229621 490.86917 490.86917 494.508715 494.508715     2   b
#    my()   1.515178   1.515178   1.59408   1.59408   1.672982   1.672982     2   a
```
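As for avoiding the loop entirely: the recurrence in `my()`, `r[j] <- r[j - 1] * cc - cost3`, is a first-order linear recurrence, so it should have a closed form via the geometric series, `r[j] = cost1 * cc^(j - 1) - cost3 * (cc^(j - 1) - 1) / (cc - 1)`. The sketch below is one possible vectorized version, not part of the benchmark above. The names `closed_form` and `loop_version` are made up for illustration, and the formula assumes `cost2 != 0` (otherwise `cc - 1` is zero).

```r
# Closed-form version of the recurrence r[j] = r[j - 1] * cc - cost3.
# Assumes cost2 != 0, so that cc != 1 and the geometric-sum formula applies.
closed_form <- function(n, cost1, cost2, cost3) {
  cc <- 1 + cost2 / 12
  p <- cc^(0:(n - 1))  # cc^(j - 1) for j = 1..n
  cost1 * p - cost3 * (p - 1) / (cc - 1)
}

# Reference loop, the same logic as my() but with inputs passed as arguments.
loop_version <- function(n, cost1, cost2, cost3) {
  cc <- 1 + cost2 / 12
  r <- numeric(n)
  r[1] <- cost1
  for (j in seq_len(n)[-1]) {
    r[j] <- r[j - 1] * cc - cost3
  }
  r
}

# The two should agree up to floating-point tolerance.
all.equal(closed_form(20, 3, 0.05, 50), loop_version(20, 3, 0.05, 50))
```

For very long series the powers of `cc` grow quickly, so it is worth checking the result against the loop on your real inputs before relying on it.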