How to do operations on list columns in an R data.table to output another list column?


I still have a difficult time thinking about how one works with R data.table columns which are lists.

Here is an R data.table:

library(data.table)
dt = data.table(
    numericcol = rep(42, 8),
    listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)
> dt
   numericcol        listcol
1:         42        1,22, 3
2:         42              6
3:         42              1
4:         42             12
5:         42    5,   6,1123
6:         42              3
7:         42             42
8:         42              1

I would like to create a column holding the absolute differences between numericcol and each element of listcol, row by row:

> dt
   numericcol        listcol         absvals
1:         42        1,22, 3      41, 20, 39
2:         42              6              36
3:         42              1              41
4:         42             12              30
5:         42    5,   6,1123    37, 36, 1081
6:         42              3              39
7:         42             42               0
8:         42              1              41

So, my first thought would be to use sapply() as follows:

dt[, absvals := sapply(listcol, function(x) abs(x-numericcol))] 

This outputs the following:

> dt
   numericcol        listcol absvals
1:         42        1,22, 3      41
2:         42              6      20
3:         42              1      39
4:         42             12      41
5:         42    5,   6,1123      20
6:         42              3      39
7:         42             42      41
8:         42              1      20

So absvals is now a column of unlisted values, with a single element in each row, and it no longer matches the structure of listcol.

(1) How would one create absvals to retain the list structure of listcol?

(2) In cases like these, if I am only interested in a vector of the values, how do R data.table users create such a data structure?

Maybe

vec = as.vector(dt[, absvals := sapply(listcol, function(x) abs(x-numericcol))]) 

?


Another solution using mapply:

dt[, absvals := mapply(listcol, numericcol, FUN = function(x, y) abs(x-y))]

#output
dt
   numericcol        listcol        absvals
1:         42        1,22, 3       41,20,39
2:         42              6             36
3:         42              1             41
4:         42             12             30
5:         42    5,   6,1123   37,  36,1081
6:         42              3             39
7:         42             42              0
8:         42              1             41
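A note on why this works, and a slightly more defensive variant. In the sapply() attempt, numericcol inside the anonymous function is the whole column of 8 values, not the value for the current row, so the subtraction is not done row by row. mapply() pairs the two columns element by element, but it simplifies its result by default (SIMPLIFY = TRUE). It returns a list here only because the results have different lengths. The sketch below uses Map(), or mapply() with SIMPLIFY = FALSE, so that the result is always a list, and then uses unlist() for question (2). The names absvals2 and vec are only illustrative.

```r
library(data.table)

dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

# Map() always returns a list, so absvals stays a list column even if
# every element of listcol happened to have the same length.
dt[, absvals := Map(function(x, y) abs(x - y), listcol, numericcol)]

# Same idea with mapply(), switching off simplification explicitly.
# absvals2 is a hypothetical column name used only for comparison.
dt[, absvals2 := mapply(function(x, y) abs(x - y), listcol, numericcol,
                        SIMPLIFY = FALSE)]

# Question (2): flatten the list column into one plain numeric vector.
vec <- unlist(dt$absvals)
```

With the example data, vec has 12 elements, one for every value stored in listcol, so it is longer than the table has rows.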
