dataframe - In R: Replacing value of a data frame column by the value of another data frame when between condition is matched
I have two data frames:
    set.seed(343)
    testdf <- data.frame(score = sample(50, size = 50, replace = TRUE),
                         number = rep(letters[1:25], 2),
                         rev = rep(0, 50))
    sourcedf <- data.frame(min = c(1, 10, 20, 30, 40),
                           max = c(9, 19, 29, 39, 50),
                           rev = 1:5)
For each row of testdf where testdf$score is between sourcedf$min and sourcedf$max, I want to replace the value of testdf$rev with the corresponding sourcedf$rev.
I have this working with 2 loops and an if condition, but it is slow (my dataset has close to 1 million rows). I tried using findInterval without success.
Is there a better/more efficient way to do this?
First, see my comment on how to improve your question and make it reproducible. Second, here's a possible approach that runs an overlapping join using data.table::foverlaps:
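    library(data.table)
    # Create a second bound (score2 = score) and key testdf on the resulting [score, score2] interval
    setkey(setDT(testdf)[, score2 := score], score, score2)
    # Key sourcedf on its [min, max] interval
    setkey(setDT(sourcedf), min, max)
    # Run foverlaps; which = TRUE returns the matching row indices (xid for sourcedf, yid for testdf)
    indx <- foverlaps(sourcedf, testdf, nomatch = 0L, which = TRUE)
    # Update testdf in place with the corresponding values from sourcedf
    testdf[indx$yid, rev := sourcedf[indx$xid, rev]]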
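Since the question mentions findInterval, a base-R sketch along those lines might also work. It assumes the rows of sourcedf are sorted by min and that the intervals do not overlap; the names idx and hit are only illustrative. The data is recreated so the snippet runs on its own.

```r
set.seed(343)
testdf <- data.frame(score = sample(50, size = 50, replace = TRUE),
                     number = rep(letters[1:25], 2),
                     rev = rep(0, 50))
sourcedf <- data.frame(min = c(1, 10, 20, 30, 40),
                       max = c(9, 19, 29, 39, 50),
                       rev = 1:5)

# Assumption: sourcedf$min is sorted increasingly and intervals do not overlap.
# idx is the row of the last interval whose lower bound is <= score (0 if none).
idx <- findInterval(testdf$score, sourcedf$min)

# A row is a hit only if it has a candidate interval and the score
# does not exceed that interval's upper bound.
hit <- idx > 0
hit[hit] <- testdf$score[hit] <= sourcedf$max[idx[hit]]

# Overwrite rev for matched rows only; unmatched rows keep their old value.
testdf$rev[hit] <- sourcedf$rev[idx[hit]]
```

The upper-bound check matters when the intervals leave gaps between one max and the next min; with the example data there are no gaps for integer scores, so every row is matched.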