# Dynamic variable names in R regressions

Being aware of the danger of using dynamic variable names, I am trying to loop over various regression models in which different variable specifications are chosen. Usually `!!rlang::sym()` solves this kind of problem for me just fine, but it somehow fails in regressions. A minimal example is the following:

```r
y  <- runif(1000)
x1 <- runif(1000)
x2 <- runif(1000)

df2 <- data.frame(y, x1, x2)
summary(lm(y ~ x1 + x2, data = df2))  # works

var <- "x1"
summary(lm(y ~ !!rlang::sym(var) + x2, data = df2))  # gives an error
```

My understanding was that `!!rlang::sym(var)` takes the value of `var` (namely `x1`) and inserts it into the code so that R treats it as a variable name rather than a character string. But I seem to be wrong. Can anyone enlighten me?
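One likely reason for the failure: `!!` is only treated as an unquoting operator by functions built on tidy evaluation, and `lm()` is a base R function that is not one of them. To plain R, `!!` is simply two logical negations. The following is a minimal base R sketch of that point; the object names `e` and `s` are chosen here for illustration only.

```r
# Outside tidy-evaluation functions, `!!x1` parses as two nested calls to `!`.
e <- quote(!!x1)
class(e)        # "call"
e[[1]]          # `!`
e[[2]][[1]]     # `!`

# Applied to numbers it is double logical negation, not unquoting.
!!c(0.2, 0)     # TRUE FALSE

# as.symbol() does turn the string into a variable name ...
var <- "x1"
s <- as.symbol(var)
is.symbol(s)    # TRUE

# ... and bquote() substitutes it into the expression before lm() ever sees it.
bquote(y ~ .(s) + x2)   # y ~ x1 + x2
```

So the symbol itself is fine; what is missing in `lm()` is a step that actually performs the substitution.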

Personally, I like to do this with some computing on the language. For me, a combination of `bquote` with `eval` is easiest (to remember).

```r
var <- as.symbol(var)
eval(bquote(summary(lm(y ~ .(var) + x2, data = df2))))
#Call:
#lm(formula = y ~ x1 + x2, data = df2)
#
#Residuals:
#     Min       1Q   Median       3Q      Max 
#-0.49298 -0.26248 -0.00046  0.24111  0.51988 
#
#Coefficients:
#            Estimate Std. Error t value Pr(>|t|)    
#(Intercept)  0.50244    0.02480  20.258   <2e-16 ***
#x1          -0.01468    0.03161  -0.464    0.643    
#x2          -0.01635    0.03227  -0.507    0.612    
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#Residual standard error: 0.2878 on 997 degrees of freedom
#Multiple R-squared:  0.0004708,    Adjusted R-squared:  -0.001534 
#F-statistic: 0.2348 on 2 and 997 DF,  p-value: 0.7908
```

I find this superior to any approach that doesn't show the same call as `summary(lm(y ~ x1+x2, data=df2))`.
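Since the question was about looping over several specifications, here is a rough sketch of how the same `bquote`/`eval` idea might be used in a loop, together with a contrast that illustrates the point about the recorded call. The vector `vars`, the list `fits`, the seed, and the use of `reformulate()` as the alternative are assumptions added for illustration and are not part of the original question.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
df2 <- data.frame(y = runif(1000), x1 = runif(1000), x2 = runif(1000))

vars <- c("x1", "x2")  # hypothetical specifications to loop over

# bquote() splices each symbol into the call before eval() runs it,
# so the call stored in the fitted model names the actual variable.
fits <- lapply(vars, function(v) {
  eval(bquote(lm(y ~ .(as.symbol(v)), data = df2)))
})
names(fits) <- vars

fits$x1$call
# lm(formula = y ~ x1, data = df2)

# For contrast: building the formula separately also fits the model,
# but the stored call shows the name of the formula object instead.
f <- reformulate("x1", response = "y")
lm(f, data = df2)$call
# lm(formula = f, data = df2)
```

Both variants estimate the same model; they differ only in what the `Call:` line of the printed output shows.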