R: Insert multiple rows (variable number) in a data frame
I have a data frame with, say, 5 rows and 2 observables. I need to insert "dummy" or "zero" rows into the data frame so that the number of rows per observable is the same (and it can be bigger than the number of rows of the longer one). E.g.:
    # I have:
    x = c("a","a","b","b","b")
    y = c(2,4,5,2,6)
    dft = data.frame(x,y)
    print(dft)
    #   x y
    # 1 a 2
    # 2 a 4
    # 3 b 5
    # 4 b 2
    # 5 b 6
Here's what I'd like to get, i.e. pad the number of rows per observable to 4. A mock-up data frame:
    x1 = c("a","a","a","a","b","b","b","b")
    y1 = c(2,4,0,0,5,2,6,0)
    dft1 = data.frame(x1,y1)
    print(dft1)
    #   x1 y1
    # 1  a  2
    # 2  a  4
    # 3  a  0
    # 4  a  0
    # 5  b  5
    # 6  b  2
    # 7  b  6
    # 8  b  0
I started by getting the number of rows per observable in the original data frame with ddply, so I know how many rows I need to add for each observable.
    library(plyr)
    # count the rows per observable
    nr = ddply(dft,.(x),summarise,val=length(x))
    print(nr)
    #   x val
    # 1 a   2
    # 2 b   3

    # number of extra rows needed: 2 and 1, to reach 4 per observable
    repl = 4 - nr$val
    repl_name = nr$x
    repl_x = rep(repl_name,repl)
    print(repl_x)
    # [1] a a b
    # Levels: a b

    # build the dummy rows and append them to the original data frame
    dfa = matrix("-",nrow=sum(repl),ncol=1)
    dff = data.frame(repl_x,as.data.frame(dfa))
    names(dff) <- names(dft)
    dft = rbind(dft,dff)
    dft = dft[order(as.character(dft$x)),]
    print(dft)
    #   x y
    # 1 a 2
    # 2 a 4
    # 6 a -
    # 7 a -
    # 3 b 5
    # 4 b 2
    # 5 b 6
    # 8 b -
I did achieve my goal, but in quite a few operations and transformations.
So, the question: is there a simpler and faster way to insert an arbitrary number of empty/dummy rows in several places in a data frame? The number of columns and rows can be arbitrary.
Note: the code above works, so I believe this is not a "review my code" question but a genuine "how to do it better" question. Thank you!
You can try using the "data.table" package and let it use "length<-" to expand out your rows.
Demo:
    library(data.table)
    # pad every column of each group (.SD) to length 4, grouping by x
    as.data.table(dft)[, lapply(.SD, `length<-`, 4), by = x]
    ##    x  y  z
    ## 1: a  2  2
    ## 2: a  4  3
    ## 3: a NA NA
    ## 4: a NA NA
    ## 5: b  5  4
    ## 6: b  2  5
    ## 7: b  6  6
    ## 8: b NA NA
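To see why this works, it may help to look at what "length<-" does on a single vector. The snippet below is only an illustrative sketch in base R (the variable names y and padded are made up for the example): calling the replacement function directly returns a copy of the vector resized to the requested length, with NA filling the new positions.

```r
# `length<-` is the replacement form of length(); called directly,
# it returns the vector resized to the requested length.
y <- c(2, 4)
padded <- `length<-`(y, 4)
print(padded)
# [1]  2  4 NA NA

# Requesting a shorter length truncates instead of padding.
print(`length<-`(c(5, 2, 6), 2))
# [1] 5 2
```

In the data.table call, this is applied to every non-grouping column of each group, while the grouping column x is repeated for all rows of the group, which is why x has no NA values in the result. Note the truncation case: a group that already has more rows than the target would lose rows.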
Update
Upon provocation from Thela-the-taunter™, if you want to stick with base R, perhaps you can create a function like the following:
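    narowsbygroup <- function(indf, group, rowsneeded) {
      # split by group, pad each piece, then stack the pieces again
      do.call(rbind, lapply(split(indf, indf[[group]]), function(x) {
        # extend every column to rowsneeded, filling with NA
        x <- data.frame(lapply(x, `length<-`, rowsneeded))
        # restore the group label in the padded rows
        x[group] <- x[[group]][1]
        x
      }))
    }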
Usage would be:
    narowsbygroup(dft, 1, 4)
    #   x  y  z
    # 1 a  2  2
    # 2 a  4  3
    # 3 a NA NA
    # 4 a NA NA
    # 5 b  5  4
    # 6 b  2  5
    # 7 b  6  6
    # 8 b NA NA
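The question asked for zero rows rather than NA rows. One possible way to get there, sketched below under a few assumptions, is to pad with NA as above and then replace the NA values with a fill value. The helper name pad_groups and its fill argument are hypothetical, introduced only for this example. The sketch also assumes the original data contains no NA values of its own, since those would be overwritten too.

```r
# Sample data from the answer.
dft <- data.frame(x = c("a", "a", "b", "b", "b"),
                  y = c(2, 4, 5, 2, 6),
                  z = c(2, 3, 4, 5, 6))

# pad_groups is a hypothetical helper: pad every group to rowsneeded
# rows with NA, then optionally replace the NA values with `fill`.
pad_groups <- function(indf, group, rowsneeded, fill = NA) {
  pieces <- lapply(split(indf, indf[[group]]), function(g) {
    # `length<-` extends each column with NA up to rowsneeded
    g <- data.frame(lapply(g, `length<-`, rowsneeded))
    # restore the group label in the padded rows
    g[[group]] <- g[[group]][1]
    g
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the per-group row names added by rbind
  # assumption: every NA in `out` comes from padding, not from the input
  if (!is.na(fill)) out[is.na(out)] <- fill
  out
}

padded <- pad_groups(dft, "x", 4, fill = 0)
print(padded)
```

With fill = 0, the y column should match the mock-up in the question (2, 4, 0, 0, 5, 2, 6, 0). Leaving fill at its default keeps the NA rows, which is often the safer choice because NA cannot be mistaken for a real measurement.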
Sample data:
    x = c("a","a","b","b","b")
    y = c(2,4,5,2,6)
    z = c(2,3,4,5,6)
    dft = data.frame(x,y,z)