# How to impute the distance to the nearest non-NA value

I'd like to fill missing values with a "row distance" to the nearest non-NA value. In other words, how would I convert column x in this sample dataframe into column y?

```r
#    x y
#1   0 0
#2  NA 1
#3   0 0
#4  NA 1
#5  NA 2
#6  NA 1
#7   0 0
#8  NA 1
#9  NA 2
#10 NA 3
#11 NA 2
#12 NA 1
#13  0 0
```

I can't seem to find the right combination of `dplyr` `group_by()`, `mutate()`, and `row_number()` calls to do the trick. The imputation packages I've investigated are designed for more complicated scenarios, where values are imputed from statistics and other variables.

```r
d <- data.frame(
  x = c(0, NA, 0, rep(NA, 3), 0, rep(NA, 5), 0),
  y = c(0, 1, 0, 1, 2, 1, 0, 1, 2, 3, 2, 1, 0)
)
```

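We can use `sapply` over the row indices and, for each row, take the minimum absolute distance to the rows where `x` is not `NA`: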

```r
d$z = sapply(seq_along(d$x), function(z) min(abs(z - which(!is.na(d$x)))))
#     x y z
# 1   0 0 0
# 2  NA 1 1
# 3   0 0 0
# 4  NA 1 1
# 5  NA 2 2
# 6  NA 1 1
# 7   0 0 0
# 8  NA 1 1
# 9  NA 2 2
# 10 NA 3 3
# 11 NA 2 2
# 12 NA 1 1
# 13  0 0 0
```
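
To see why this works, it may help to trace a single row by hand. The sketch below redoes the `sapply` computation for row 9 only, using the sample `x` from the question; the variable name `known` is just an illustrative choice and not part of the original answer.

```r
# Sample column from the question
x <- c(0, NA, 0, rep(NA, 3), 0, rep(NA, 5), 0)

# Row indices that hold a non-NA value
known <- which(!is.na(x))
known
# [1]  1  3  7 13

# Distances from row 9 to each of those rows
abs(9 - known)
# [1] 8 6 2 4

# The nearest non-NA row is 2 rows away, which is the value of z in row 9
min(abs(9 - known))
# [1] 2
```

The `sapply` call simply repeats this last step for every row index returned by `seq_along(d$x)`.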

If you want to do this in dplyr, you can just wrap the `sapply` part in a `mutate`.

```r
library(dplyr)

d %>%
  mutate(z = sapply(seq_along(x), function(z) min(abs(z - which(!is.na(x))))))
```

Or, with `library(purrr)` loaded as well (thanks to @Onyambu):

```r
library(dplyr)
library(purrr)

d %>%
  mutate(m = map_dbl(1:n(), ~ min(abs(.x - which(!is.na(x))))))
```
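
All of the versions above compare every row against every non-NA row, which is fine for small data. For long columns, a possible alternative is a two-pass approach in base R: find the nearest non-NA index at or before each row with `cummax`, the nearest at or after with a reversed `cummin`, and take the smaller distance. This is only a sketch, not part of the original answers; the function name `nearest_non_na_distance` is hypothetical, and returning `NA` for an all-NA input is an assumption about the desired behaviour.

```r
# Hypothetical helper: distance (in rows) to the nearest non-NA value
nearest_non_na_distance <- function(x) {
  idx <- seq_along(x)
  ok <- !is.na(x)
  # Assumption: with no non-NA values at all, the distance is undefined
  if (!any(ok)) return(rep(NA_real_, length(x)))

  # Index of the latest non-NA row at or before each row (0 if none yet)
  prev <- cummax(ifelse(ok, idx, 0L))
  # Index of the earliest non-NA row at or after each row (length + 1 if none)
  nxt <- rev(cummin(rev(ifelse(ok, idx, length(x) + 1L))))

  # Treat a missing neighbour on either side as infinitely far away
  d_prev <- ifelse(prev == 0L, Inf, idx - prev)
  d_next <- ifelse(nxt == length(x) + 1L, Inf, nxt - idx)

  pmin(d_prev, d_next)
}

x <- c(0, NA, 0, rep(NA, 3), 0, rep(NA, 5), 0)
nearest_non_na_distance(x)
# [1] 0 1 0 1 2 1 0 1 2 3 2 1 0
```

Inside a pipeline this could be used as `d %>% mutate(z = nearest_non_na_distance(x))`.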