Randomise order of groups in R data table while preserving internal order of groups

In R, I have the following sample data table:

```
library(data.table)
x <- data.table(Group = c("d1", "d1", "d1", "d1", "d2", "d3", "d3", "d4", "d5", "d5", "d5", "d6", "d7", "d7", "d7", "d7", "d7"))
x[, InternalOrder := seq(.N), by = Group]
```

Which looks like this:

```
# Input:
#     Group InternalOrder
 1:    d1             1
 2:    d1             2
 3:    d1             3
 4:    d1             4
 5:    d2             1
 6:    d3             1
 7:    d3             2
 8:    d4             1
 9:    d5             1
10:    d5             2
11:    d5             3
12:    d6             1
13:    d7             1
14:    d7             2
15:    d7             3
16:    d7             4
17:    d7             5
```

My goal is to randomise the order of the groups in the data table `x` while preserving the internal order of each group.

I have already worked out a solution

```
# Get number of elements (= rows) for each group
groupsizes <- x[, .N, by = Group]$N

set.seed(10)

# Make new column with random ID for each group
x[, RandomGroupID := rep(sample(c(1:length(unique(x$Group))), replace = F), groupsizes)]

# Re-order data by random group ID and internal order
setorder(x, RandomGroupID, InternalOrder)
```

that gives the desired output:

```
# Output (as desired):
     Group InternalOrder RandomGroupID
 1:    d5             1             1
 2:    d5             2             1
 3:    d5             3             1
 4:    d2             1             2
 5:    d3             1             3
 6:    d3             2             3
 7:    d1             1             4
 8:    d1             2             4
 9:    d1             3             4
10:    d1             4             4
11:    d4             1             5
12:    d7             1             6
13:    d7             2             6
14:    d7             3             6
15:    d7             4             6
16:    d7             5             6
17:    d6             1             7
```

Since I am trying to improve my data.table skills, I would like to know whether there is a nicer, more idiomatic solution that does not need the intermediate vector `groupsizes` and instead assigns the new column with typical data.table syntax, using the `by` argument in combination with `.GRP`, `.I` or the like. I have thought of something like `x[, RandomGroupIDAlternative := rep(sample(c(1:length(unique(x$Group))), replace = F), .GRP), by = Group]`, which obviously does not give the desired output.

I am looking forward to your comments and to seeing alternative solutions to this problem.

You can also do it using `split` and `rbindlist`:

```
x_new <- rbindlist(sample(split(x, by='Group')))

     Group InternalOrder
 1:    d4             1
 2:    d1             1
 3:    d1             2
 4:    d1             3
 5:    d1             4
 6:    d5             1
 7:    d5             2
 8:    d5             3
 9:    d6             1
10:    d7             1
11:    d7             2
12:    d7             3
13:    d7             4
14:    d7             5
15:    d3             1
16:    d3             2
17:    d2             1
```
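
Since the question asks specifically about `by` and `.GRP`, here is a minimal sketch of one way that could look. It is not taken from the answers above. It assumes the permutation is drawn once outside the grouped call and then indexed with `.GRP`, and the names `perm` and `x_joined` are made up for illustration. A join on the shuffled group names is included as a second variant that needs no helper column.

```r
library(data.table)

x <- data.table(Group = c("d1", "d1", "d1", "d1", "d2", "d3", "d3", "d4", "d5",
                          "d5", "d5", "d6", "d7", "d7", "d7", "d7", "d7"))
x[, InternalOrder := seq_len(.N), by = Group]

set.seed(10)

# Variant 1 (join): look up the groups in shuffled order. Rows come back
# in the order of the shuffled keys, and matches within one key keep the
# order they have in x. `x_joined` is a hypothetical name.
x_joined <- x[.(Group = sample(unique(x$Group))), on = "Group"]

# Variant 2 (.GRP): draw the permutation once, outside the grouped call.
# Calling sample() inside j would re-draw it for every group.
# `perm` is a hypothetical helper vector.
perm <- sample(uniqueN(x$Group))

# .GRP is the running group counter (1, 2, ...), so each group picks up
# one element of the permutation as its random ID.
x[, RandomGroupID := perm[.GRP], by = Group]

# Re-order by random group ID, then by the original internal order.
setorder(x, RandomGroupID, InternalOrder)
```

Unlike the `rep(..., groupsizes)` version, the `.GRP` variant does not rely on the rows of each group being contiguous before the assignment, because the ID is assigned per group rather than by position.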