New vector with letters representing unique combinations of other vectors


I have

dat <- data.frame(study = letters[c(1,1,1,4,4,4,4,10,10)],
                  n1i = c(25,25,22,38,50,30,30,50,50))

I want

  study n1i grp
1     a  25   A
2     a  25   A
3     a  22   B
4     d  38   A
5     d  50   B
6     d  30   C
7     d  30   C
8     j  50   A
9     j  50   A

But this...

dat$grp <- as.vector(unlist(aggregate(dat$n1i, list(dat$study),
                                      function(x) LETTERS[1:length(x)])$x))

...gives me

> dat
  study n1i grp
1     a  25   A
2     a  25   B
3     a  22   C
4     d  38   A
5     d  50   B
6     d  30   C
7     d  30   D
8     j  50   A
9     j  50   B

In words: within each study, I want the "grp" letters to start at A and move to the next letter only when a new unique n1i value appears, so that rows sharing the same study*n1i combination get the same letter.

 


dat <- data.frame(study = letters[c(1,1,1,4,4,4,4,10,10)],
                  n1i = c(25,25,22,38,50,30,30,50,50))

library(dplyr)

dat %>%
  group_by(study) %>%                    # for each study
  mutate(id = row_number()) %>%          # get the number of row as an id
  group_by(study, n1i) %>%               # for each study and n1i combination
  transmute(grp = LETTERS[min(id)]) %>%  # add the letters based on the minimum id value of that combination, while removing the id column
  ungroup()                              # forget the grouping

# # A tibble: 9 x 3
#   study   n1i grp
#   <fct> <dbl> <chr>
# 1 a        25 A
# 2 a        25 A
# 3 a        22 C
# 4 d        38 A
# 5 d        50 B
# 6 d        30 C
# 7 d        30 C
# 8 j        50 A
# 9 j        50 A

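This approach takes each letter from the first row position of a study*n1i combination within its study. The letters are therefore consecutive only when no duplicated rows come before a new combination: in the output above, row 3 gets C, whereas the requested output has B.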
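If consecutive letters within each study are needed (A, A, B for study a, as in the requested output), one option is to number the distinct n1i values in order of first appearance with match(x, unique(x)). The following is a minimal base R sketch of that idea, not the dplyr pipeline from the answer; `idx` is a name introduced here for illustration.

```r
dat <- data.frame(study = letters[c(1,1,1,4,4,4,4,10,10)],
                  n1i = c(25,25,22,38,50,30,30,50,50))

# Within each study, give every distinct n1i value a running number
# in order of first appearance (1 for the first value seen, 2 for the next, ...).
idx <- ave(dat$n1i, dat$study, FUN = function(x) match(x, unique(x)))

# Map those numbers to letters.
dat$grp <- LETTERS[idx]

print(dat)
```

With dplyr, the same idea can be written as group_by(study) followed by mutate(grp = LETTERS[match(n1i, unique(n1i))]). Either way, a study with more than 26 distinct n1i values would run past the end of LETTERS and get NA.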
