

My data:

`data <- c(1,5,11,15,24,31,32,65) `

There are two neighbours: **31 and 32**. I want to remove them and keep only their mean (here **31.5**), so that the data becomes:

`data <- c(1,5,11,15,24,31.5,65) `

It seems simple, but I want to do it automatically, and sometimes with vectors that contain more neighbours. For instance:

`data_2 <- c(1,5,11,15,24,31,32,65,99,100,101,140) `

Here is my solution, which uses run-length encoding to identify groups:

```r
foo <- function(x) {
  y <- x - seq_along(x) # normalize to zero differences in groups
  ind <- rle(y) # run-length encoding
  ind$values <- ind$lengths != 1 # to find groups
  ind$values[ind$values] <- cumsum(ind$values[ind$values]) # group ids
  ind <- inverse.rle(ind)
  xnew <- x
  xnew[ind != 0] <- ave(x, ind, FUN = mean)[ind != 0] # calculate means
  xnew[!(duplicated(ind) & ind != 0)] # remove duplicates from groups
}

foo(data)
#[1]  1.0  5.0 11.0 15.0 24.0 31.5 65.0

foo(data_2)
#[1]   1.0   5.0  11.0  15.0  24.0  31.5  65.0 100.0 140.0

data_3 <- c(1, 2, 4, 1, 2)
foo(data_3)
#[1] 1.5 4.0 1.5
```
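To make the grouping step less opaque, here is a small sketch that repeats the first lines of `foo` outside the function and prints the intermediate values for `data_2`. The variable names `r` and `ind` are only for illustration. Note that, like `foo` itself, it assumes neighbours differ by exactly 1.

```r
x <- c(1, 5, 11, 15, 24, 31, 32, 65, 99, 100, 101, 140)

# Consecutive integers have the same offset from their position,
# so each run of neighbours becomes a run of equal values in y
y <- x - seq_along(x)
y
#  [1]   0   3   8  11  19  25  25  57  90  90  90 128

r <- rle(y)
r$lengths
# [1] 1 1 1 1 1 2 1 3 1

# Runs longer than 1 are groups; number them 1, 2, ... and leave
# singletons at 0
r$values <- r$lengths != 1
r$values[r$values] <- cumsum(r$values[r$values])

# Expand back to one group id per element of x
ind <- inverse.rle(r)
ind
#  [1] 0 0 0 0 0 1 1 0 2 2 2 0
```

Elements with `ind == 0` are kept as they are; each non-zero id marks one run that is then replaced by its mean.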

I assume that you don't need an extremely efficient solution. If you do, I'd recommend a simple C++ `for` loop in Rcpp.
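A minimal sketch of what such a loop could look like, compiled from R with `Rcpp::cppFunction`. The function name `collapse_runs` is hypothetical, I have not benchmarked it, and it assumes (like `foo`) that neighbours differ by exactly 1 and that the input has no `NA`s.

```r
library(Rcpp)

cppFunction('
NumericVector collapse_runs(NumericVector x) {
  int n = x.size();
  std::vector<double> out;
  out.reserve(n);
  // i marks the start of the current run; it is advanced inside the body
  for (int i = 0; i < n; ) {
    int j = i;
    double total = x[i];
    // extend the run while the next value is exactly one larger
    while (j + 1 < n && x[j + 1] - x[j] == 1) {
      ++j;
      total += x[j];
    }
    // a run of length 1 is just the value itself
    out.push_back(total / (j - i + 1));
    i = j + 1;
  }
  return wrap(out);
}
')

collapse_runs(c(1, 5, 11, 15, 24, 31, 32, 65))
# 1.0 5.0 11.0 15.0 24.0 31.5 65.0
collapse_runs(c(1, 2, 4, 1, 2))
# 1.5 4.0 1.5
```

It makes a single pass over the vector and should give the same results as `foo` on the examples above.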