performing frequency table by group with calculation count of value in R

  • A+
Category:Languages

Suppose this is my dataset

(dput) dataset<-structure(list(group1 = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L,  1L, 1L), .Label = c("b", "x"), class = "factor"), group2 = structure(c(2L,  2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L), .Label = c("g", "y"), class = "factor"),      var1 = c(2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L, 2L)), .Names = c("group1",  "group2", "var1"), class = "data.frame", row.names = c(NA, -9L )) 

I need calculate frequency for two groups

x+y b+g 

and for variable var1, calculate count of 1 value and 2 value. For each group. So the desired output

        total_count_of_group    var1-1  var1-2 x   y          5                                                  3         2 b   g          4              2         2 

This output mean that total_count_of_group x+y=5 obs. by this group. where 1 value meets 3 times and 2 value two times.

similary total_count_of_group b+g=4 obs. by this group. where 1 value meets 2 times and 2 value two times.

How to get such table?

 


This can be solved in two steps:

  1. Aggregate group totals and update dataset
  2. Reshape from long to wide format

using data.table:

library(data.table) dcast(setDT(dataset)[, total_count_of_group := .N, by =. (group1, group2)],        group1 + group2 + total_count_of_group~ paste0("var1=", var1), length) 
   group1 group2 total_count_of_group var1_1 var1_2 1:      b      g                    4      2      2 2:      x      y                    5      3      2 

Please, note that this will work for an arbitrary number of different values in var1 as well as an arbitrary number of groups.

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: