dataframe - In R: Replacing the value of a data frame column with the value from another data frame when a between condition is matched


I have 2 data frames:

    set.seed(343)
    testdf <- data.frame(score = sample(50, size = 50, replace = TRUE),
                         number = rep(letters[1:25], 2),
                         rev = rep(0, 50))
    sourcedf <- data.frame(min = c(1, 10, 20, 30, 40),
                           max = c(9, 19, 29, 39, 50),
                           rev = 1:5)

For each row of testdf where testdf$score is between sourcedf$min and sourcedf$max, I want to replace the value of testdf$rev with the corresponding sourcedf$rev.

I have this working with 2 loops and an if condition, but it is slow (my dataset has close to 1 million rows). I tried using findInterval without success.
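The loop code itself is not shown in the question. As a rough sketch only, assuming inclusive bounds on both ends, a nested-loop version of the rule described above might look like the following. It is a hypothetical reconstruction, useful mainly as a reference result to check faster approaches against.

```r
# Hypothetical reconstruction of the slow nested-loop approach
# (the original loop code is not shown in the question).
set.seed(343)
testdf <- data.frame(score = sample(50, size = 50, replace = TRUE),
                     number = rep(letters[1:25], 2),
                     rev = rep(0, 50))
sourcedf <- data.frame(min = c(1, 10, 20, 30, 40),
                       max = c(9, 19, 29, 39, 50),
                       rev = 1:5)

for (i in seq_len(nrow(testdf))) {
  for (j in seq_len(nrow(sourcedf))) {
    # Inclusive bounds are an assumption: min <= score <= max
    if (testdf$score[i] >= sourcedf$min[j] && testdf$score[i] <= sourcedf$max[j]) {
      testdf$rev[i] <- sourcedf$rev[j]
    }
  }
}
```

This does nrow(testdf) * nrow(sourcedf) comparisons and modifies the data frame one element at a time, which is why it scales poorly to around a million rows.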

Is there a better/more efficient way to do this?

First, see the comment on how to improve your question and make it reproducible. Second, here's a possible approach showing how to run overlapping joins using data.table::foverlaps

    library(data.table)
    setkey(setDT(testdf)[, score2 := score], score, score2) # create bounds and key
    setkey(setDT(sourcedf), min, max) # key by min, max
    indx <- foverlaps(sourcedf, testdf, nomatch = 0L, which = TRUE) # run foverlaps
    testdf[indx$yid, rev := sourcedf[indx$xid, rev]] # update in place with corresponding values
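Since the question mentions trying findInterval, here is a minimal base R sketch of that route as an alternative. It is not part of the answer above, and it assumes that sourcedf is sorted by min and that its intervals do not overlap, which holds for the example data. The helper names idx and ok are just illustrative.

```r
# Base R alternative using findInterval; assumes sourcedf is sorted by min
# and its [min, max] intervals do not overlap.
set.seed(343)
testdf <- data.frame(score = sample(50, size = 50, replace = TRUE),
                     number = rep(letters[1:25], 2),
                     rev = rep(0, 50))
sourcedf <- data.frame(min = c(1, 10, 20, 30, 40),
                       max = c(9, 19, 29, 39, 50),
                       rev = 1:5)

# Index of the last interval whose min is <= score (0 if score is below every min)
idx <- findInterval(testdf$score, sourcedf$min)

# Keep only rows that found an interval and do not exceed that interval's max
ok <- idx > 0
ok[ok] <- testdf$score[ok] <= sourcedf$max[idx[ok]]

# Vectorised update; rows with no matching interval keep their original rev
testdf$rev[ok] <- sourcedf$rev[idx[ok]]
```

findInterval does a binary search over the sorted min values, so this stays vectorised and avoids the explicit loops. If the intervals could overlap or leave gaps in ways this check does not cover, the foverlaps approach is the safer choice.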
