Suppose this is my dataset
(dput) dataset<-structure(list(group1 = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), .Label = c("b", "x"), class = "factor"), group2 = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), .Label = c("g", "y"), class = "factor"), var1 = c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L)), .Names = c("group1", "group2", "var1"), class = "data.frame", row.names = c(NA, -9L ))
I need calculate frequency for two groups
and for variable var1, calculate count of 1 value and 2 value. For each group. So the desired output
total_count_of_group var1-1 var1-2 x y 5 3 2 b g 4 2 2
This output mean that total_count_of_group x+y=5 obs. by this group. where 1 value meets 3 times and 2 value two times.
similary total_count_of_group b+g=4 obs. by this group. where 1 value meets 2 times and 2 value two times.
How to get such table?
This can be solved in two steps:
- Aggregate group totals and update
- Reshape from long to wide format
library(data.table) dcast(setDT(dataset)[, total_count_of_group := .N, by =. (group1, group2)], group1 + group2 + total_count_of_group~ paste0("var1=", var1), length)
group1 group2 total_count_of_group var1_1 var1_2 1: b g 4 2 2 2: x y 5 3 2
Please, note that this will work for an arbitrary number of different values in
var1 as well as an arbitrary number of groups.