
My dataset looks like the following `R` data frame:

```r
dat <- data.frame(z = seq(0.5, 1, 0.1), matrix(1:24, nrow = 6))
colnames(dat) <- c("z", "A", "B", "C", "D")
dat
#     z A  B  C  D
#   0.5 1  7 13 19
#   0.6 2  8 14 20
#   0.7 3  9 15 21
#   0.8 4 10 16 22
#   0.9 5 11 17 23
#   1.0 6 12 18 24
```

I would like to perform the same operation for each entry in columns `A`, `B`, `C` and `D`, and attach the results to `dat` as new columns. For each entry in one of these columns, I sum the entries in the same row of the remaining three columns, divide this sum by the standard deviation of those three entries, and multiply the ratio by the corresponding row value in column `z`. For example, for the first entry in column `A` the operation is `0.5 * (7 + 13 + 19) / sd(c(7, 13, 19))`. For the second entry in column `B`, it would be `0.6 * (2 + 14 + 20) / sd(c(2, 14, 20))`. These operations yield a `6 x 4` matrix, which I need to attach to `dat`.
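As a quick check of the two worked examples, here is a minimal sketch that computes them directly from `dat`. The variable names `a1` and `b2` are only illustrative.

```r
# Rebuild the example data from the question.
dat <- data.frame(z = seq(0.5, 1, 0.1), matrix(1:24, nrow = 6))
colnames(dat) <- c("z", "A", "B", "C", "D")

# First entry of column A: uses row 1 of the remaining columns B, C and D.
others_a1 <- unlist(dat[1, c("B", "C", "D")])
a1 <- dat$z[1] * sum(others_a1) / sd(others_a1)

# Second entry of column B: uses row 2 of the remaining columns A, C and D.
others_b2 <- unlist(dat[2, c("A", "C", "D")])
b2 <- dat$z[2] * sum(others_b2) / sd(others_b2)

a1  # 0.5 * 39 / 6 = 3.25
b2  # 0.6 * 36 / sd(c(2, 14, 20))
```

These two values should end up in positions `[1, 1]` and `[2, 2]` of the `6 x 4` result.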

My dataset is huge (and I would like the function to be fast enough to bootstrap), so I am wondering what the fastest way to do this is. A `for` loop is quite slow (and it would make bootstrapping a nightmare). I was thinking about the `dplyr` package, but I'm not very familiar with it. Thank you.

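One loop is enough for this (written here with `mapply`):

```r
# x: name of the column to leave out, y: row index
m <- function(x, y) {
  l <- unlist(dat[y, names(dat) != x])   # z followed by the remaining columns
  unname(l[1] * sum(l[-1]) / sd(l[-1]))
}

# Column names are recycled against the row indices, so results come out row by row
matrix(mapply(m, names(dat)[-1], t(row(dat[-1]))), nrow(dat), byrow = TRUE)
#      [,1]     [,2]     [,3] [,4]
# [1,] 3.25 1.800298 1.472971 1.75
# [2,] 4.20 2.356753 1.963961 2.40
# [3,] 5.25 2.978674 2.520417 3.15
# [4,] 6.40 3.666061 3.142338 4.00
# [5,] 7.65 4.418912 3.829724 4.95
# [6,] 9.00 5.237229 4.582576 6.00
```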

Using tidyverse:

```r
library(tidyverse)

dat %>%
  mutate(i = 1:nrow(dat)) %>%
  group_by(i) %>%
  gather(key, val, -i) %>%
  # val[1] is z; drop z and column .x, then apply the formula to the rest
  summarise(s = list(map_dbl(2:ncol(dat),
                             ~ val[1] * sum(val[-c(1, .x)]) / sd(val[-c(1, .x)])))) %>%
  pull(s) %>%
  invoke(rbind, .)
#      [,1]     [,2]     [,3] [,4]
# [1,] 3.25 1.800298 1.472971 1.75
# [2,] 4.20 2.356753 1.963961 2.40
# [3,] 5.25 2.978674 2.520417 3.15
# [4,] 6.40 3.666061 3.142338 4.00
# [5,] 7.65 4.418912 3.829724 4.95
# [6,] 9.00 5.237229 4.582576 6.00
```

You can also do:

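```r
# For each of the four columns, drop z and that column, transpose so each
# original row becomes a column, then use colSums and the diagonal of var()
sapply(1:4, function(x)
  dat[, 1] * colSums(s <- t(dat[-c(1, x + 1)])) / sqrt(diag(var(s))))
#      [,1]     [,2]     [,3] [,4]
# [1,] 3.25 1.800298 1.472971 1.75
# [2,] 4.20 2.356753 1.963961 2.40
# [3,] 5.25 2.978674 2.520417 3.15
# [4,] 6.40 3.666061 3.142338 4.00
# [5,] 7.65 4.418912 3.829724 4.95
# [6,] 9.00 5.237229 4.582576 6.00
```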
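Since speed was the concern, one more possibility is to avoid looping over columns altogether and derive the leave-one-out sum and standard deviation from row totals. The sketch below is not taken from the answers above. The helper name `loo_ratio` and its `z_col` argument are hypothetical, and it assumes every column other than `z` is numeric and takes part in the calculation. It relies on the sum-of-squares identity for the sample variance, which can lose precision when the values are very large relative to their spread.

```r
# Rebuild the example data from the question.
dat <- data.frame(z = seq(0.5, 1, 0.1), matrix(1:24, nrow = 6))
colnames(dat) <- c("z", "A", "B", "C", "D")

# Hypothetical helper: for every cell, compute z * sum(others) / sd(others),
# where "others" are the remaining columns in the same row.
loo_ratio <- function(dat, z_col = "z") {
  M  <- as.matrix(dat[setdiff(names(dat), z_col)])
  k  <- ncol(M) - 1                  # number of remaining columns per cell
  s1 <- rowSums(M) - M               # leave-one-out sums
  s2 <- rowSums(M^2) - M^2           # leave-one-out sums of squares
  v  <- (s2 - s1^2 / k) / (k - 1)    # leave-one-out sample variance
  dat[[z_col]] * s1 / sqrt(v)        # z is recycled down each column
}

res <- loo_ratio(dat)

# Attach to dat under new names so the original columns are not duplicated.
colnames(res) <- paste0(colnames(res), "_ratio")
dat_out <- cbind(dat, res)
```

On the example data this should reproduce the same `6 x 4` matrix as the answers above. Whether it is actually faster on a large dataset would need to be benchmarked.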