Check if column value is in between (range) of two other column values

  • A+
Category:Languages

I have a data frame that looks like this (Dataframe X):

id  number  found 1   5225    NA 2   2222    NA 3   3121    NA 

I have another data frame that looks like this (Dataframe Y):

id  number1  number2     1   4000     6000 2   2500     3300 3   7000     8000 

What I want to do is this: For each value in the Dataframe X "number" column, search if it is equal to or between ANY of the "number1" and "number2" pair values of Dataframe Y. If it is, then I want to insert a "YES in the "found" column of the respective row in Dataframe X:

id  number  found 1   5225    YES 2   2222    NA 3   3121    YES 

How would I go about doing this? Thanks for the help.

 


Here is an option using fuzzy_join

library(fuzzy_join) library(dplyr) fuzzy_left_join(X, Y[-1], by = c("number" = "number1", "number" = "number2"),       match_fun  =list(`>=`, `<=`)) %>%      mutate(found = c(NA, "YES")[(!is.na(number1)) + 1]) %>%      select(names(X)) #    id number found #1  1   5225   YES #2  2   2222  <NA> #3  3   3121   YES 

Or another option is a non-equi join with data.table

library(data.table) setDT(X)[, found := NULL] X[Y, found := "YES", on = .(number >= number1, number <= number2)] X #   id number found #1:  1   5225   YES #2:  2   2222  <NA> #3:  3   3121   YES 

data

X <- structure(list(id = 1:3, number = c(5225L, 2222L, 3121L), found = c(NA,    NA, NA)), class = "data.frame", row.names = c(NA, -3L))  Y <- structure(list(id = 1:3, number1 = c(4000L, 2500L, 7000L), number2 = c(6000L,      3300L, 8000L)), class = "data.frame", row.names = c(NA, -3L)) 

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: