Getting the unique count of strings from a text string

I am wondering how to get the number of unique words matched in a text string. Say I am looking for occurrences of the words apples, bananas, pineapples, and grapes in this string.

 A <- c('I have a lot of pineapples, apples and grapes. One day the pineapples person gave the apples person two baskets of grapes')
 df <- data.frame(A)

I want the number of distinct fruits from that list that appear in the text.

 library(stringr)
 df$fruituniquecount <- str_count(df$A, "apples|pineapples|grapes|bananas")

I tried this, but it returns the overall count of matches (6). I would like the answer to be 3. Please suggest your ideas.

 


You could use str_extract_all and then calculate the length of the unique elements.

Input:

 A <- c('I have a lot of pineapples, apples and grapes. One day the pineapples person gave the apples person two baskets of grapes')
 fruits <- "apples|pineapples|grapes|bananas"

Result:

 length(unique(c(stringr::str_extract_all(A, fruits, simplify = TRUE))))
 # [1] 3
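Since the question stores the text in a data frame column, the same idea can be applied row by row. The following is only a sketch: the second row of the data frame is made up for illustration, and the `\\b` word-boundary anchors are an added assumption so that a word merely containing "apples" is not counted as "apples".

```r
library(stringr)

# Hypothetical two-row data frame; the second row is invented for illustration.
df <- data.frame(
  A = c(
    "I have a lot of pineapples, apples and grapes. One day the pineapples person gave the apples person two baskets of grapes",
    "bananas and more bananas"
  ),
  stringsAsFactors = FALSE
)

# Assumption: match whole words only, using \\b word boundaries.
fruits <- "\\b(apples|pineapples|grapes|bananas)\\b"

# str_extract_all() returns a list with one character vector of matches per row;
# count the distinct matches within each row.
matches <- str_extract_all(df$A, fruits)
df$fruituniquecount <- vapply(matches, function(m) length(unique(m)), integer(1))

df$fruituniquecount
# [1] 3 1
```

Without `simplify = TRUE` the matches stay separate for each row, so the unique count is computed per row rather than over the whole column.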
