Split string every n characters into new columns


Suppose I have a data frame like this, with a string vector var2:

var1  var2
1     abcdefghi
2     abcdefghijklmnop
3     abc
4     abcdefghijklmnopqrst

What is the most efficient way to split var2 every n characters into new columns, until the end of each string?

For example, when splitting every 4 characters, the output would look like this:

var1  var2                  new_var1  new_var2  new_var3  new_var4  new_var5
1     abcdefghi             abcd      efgh      i
2     abcdefghijklmnop      abcd      efgh      ijkl      mnop
3     abc                   abc
4     abcdefghijklmnopqrst  abcd      efgh      ijkl      mnop      qrst

Would the stringr package work here, for example using str_split_fixed?

Or using regular expressions:

gsub("(.{4})", "\\1 ", "abcdefghi")

The solution needs to create as many new columns (up to new_var_n) as the length of var2 requires, and var2 could be 10000 characters long, for example.
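
One way the regular-expression idea might be turned into columns is sketched below in base R. It assumes the data frame is called dtf and that var2 contains no NA values. The names n, pieces, new_cols and result are only illustrative. The pattern ".{1,4}" matches up to 4 characters at a time, so the last piece of each string may be shorter.

```r
# Example data from the question
dtf <- data.frame(
  var1 = 1:4,
  var2 = c("abcdefghi", "abcdefghijklmnop", "abc", "abcdefghijklmnopqrst"),
  stringsAsFactors = FALSE
)

n <- 4  # chunk width (assumed; change as needed)

# Extract consecutive runs of 1 to n characters from each string
pattern <- sprintf(".{1,%d}", n)
pieces <- regmatches(dtf$var2, gregexpr(pattern, dtf$var2))

# Pad each row with NA up to the longest row, then stack the rows into a matrix
k <- max(lengths(pieces))
new_cols <- do.call(
  rbind,
  lapply(pieces, function(p) c(p, rep(NA_character_, k - length(p))))
)
colnames(new_cols) <- paste0("new_var", seq_len(k))

result <- cbind(dtf, as.data.frame(new_cols, stringsAsFactors = FALSE))
print(result)
```

The number of columns follows from the longest string, so nothing has to be hard-coded for longer inputs. Whether this is fast enough for strings of 10000 characters would need to be measured.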

 


Alternatively, you can try read.fwf in base R. No special package is needed:

tmp <- read.fwf(
    textConnection(dtf$var2),
    widths = rep(4, ceiling(max(nchar(dtf$var2) / 4))),
    stringsAsFactors = FALSE)

cbind(dtf, tmp)

#   var1                 var2   V1   V2   V3   V4   V5
# 1    1            abcdefghi abcd efgh    i <NA> <NA>
# 2    2     abcdefghijklmnop abcd efgh ijkl mnop <NA>
# 3    3                  abc  abc <NA> <NA> <NA> <NA>
# 4    4 abcdefghijklmnopqrst abcd efgh ijkl mnop qrst
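
If the width should be a parameter rather than a fixed 4, the same fixed-width idea could be wrapped in a small function. The sketch below uses substring() instead of read.fwf. split_every is a hypothetical helper name, not an existing function, and it assumes x is a non-empty character vector without NA values.

```r
# Hypothetical helper: split each string in x into pieces of width n,
# returning a character matrix with one column per piece
split_every <- function(x, n) {
  k <- max(1, ceiling(max(nchar(x)) / n))    # number of columns needed
  starts <- seq(1, by = n, length.out = k)   # start position of each piece

  # For each start position, take the piece from every string at once
  out <- vapply(
    starts,
    function(s) substring(x, s, s + n - 1),
    character(length(x))
  )
  out <- matrix(out, nrow = length(x))       # keep matrix shape for one row
  out[out == ""] <- NA_character_            # past the end of a string
  colnames(out) <- paste0("new_var", seq_len(k))
  out
}

dtf <- data.frame(
  var1 = 1:4,
  var2 = c("abcdefghi", "abcdefghijklmnop", "abc", "abcdefghijklmnopqrst"),
  stringsAsFactors = FALSE
)

res <- cbind(dtf, as.data.frame(split_every(dtf$var2, 4), stringsAsFactors = FALSE))
print(res)
```

Because substring() is vectorised over the strings, the loop only runs once per output column, not once per row.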
