Homogenize use of single and double digit numbers in string

I have a very large data.table in which (a large number of) items are defined by strings including text and numbers.

library(data.table)
dd <- data.table(x = c("A4","A4","A4","A14","A14","A14","B4","B4","B4"),
                 y = c("A4","A14","B4","A4","A14","B4","A4","A14","B4"),
                 z = c(1,2,3,4,5,6,7,8,9))

x   y   z
A4  A4  1
A4  A14 2
A4  B4  3
A14 A4  4
A14 A14 5
A14 B4  6
B4  A4  7
B4  A14 8
B4  B4  9

The numbers can have one or two digits. Because R sorts strings character by character, it orders them by the first digit of the number (A14 before A4). mixedsort from the gtools package can handle this. However, when I reshape the long data to wide format

wide <- dcast(dd, x ~ y, value.var = "z") 

R again applies the default string ordering to both rows and columns:

x    A14  A4  B4
A14  5    4   6
A4   2    1   3
B4   8    7   9

However, I need the original ordering for subsequent matrix calculations. Is there an efficient way to rename string + single digit to string + double digit (A4 -> A04), or is there another approach I have missed?

 


Another, and probably the easiest, option is to use mixedorder from the gtools package:

wide <- dcast(dd, x ~ y, value.var = "z")[gtools::mixedorder(x)] 

which gives:

> wide
     x A14 A4 B4
1:  A4   2  1  3
2: A14   5  4  6
3:  B4   8  7  9

If you also want to get the column order set the same way, you can additionally use setcolorder:

setcolorder(wide, c(1, gtools::mixedorder(names(wide)[-1]) + 1)) 

which then gives:

> wide
     x A4 A14 B4
1:  A4  1   2  3
2: A14  4   5  6
3:  B4  7   8  9
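
If you would rather go with the renaming idea from the question (A4 -> A04), here is a minimal base R sketch. The helper name pad_labels is made up for this example, and it assumes every label is a text prefix followed by a number with at most `width` digits:

```r
# Hypothetical helper (not part of data.table or gtools): zero-pad the
# trailing number of labels such as "A4" so that plain string sorting
# matches the numeric order.
# Assumption: each label is a non-digit prefix followed only by digits.
pad_labels <- function(labels, width = 2) {
  prefix <- sub("[0-9]+$", "", labels)                 # text part, e.g. "A"
  number <- as.integer(sub("^[^0-9]*", "", labels))    # numeric part, e.g. 4
  paste0(prefix, formatC(number, width = width, flag = "0"))
}

padded <- pad_labels(c("A4", "A14", "B4"))
print(padded)        # "A04" "A14" "B04"
print(sort(padded))  # default sorting now keeps A04 before A14
```

Applied to the example data before reshaping, this could look like dd[, c("x", "y") := lapply(.SD, pad_labels), .SDcols = c("x", "y")], after which dcast should return rows and columns in the intended order without any reordering step. The trade-off is that the labels themselves change, which may or may not be acceptable for the later matrix calculations.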
