Merging in R without writing multiple merge statements
I have one base data set, and there are 7 other datasets for 7 different years and 3 different regions. These datasets share the columns amount, region and year with the base data.
However, I need to merge the 7 data sets into the one base dataset. How can I achieve that?
Base dataset:

    company_region  raised_amount_usd  year
    sf bay area     1000050            2011
    sf bay area     2520000            2011
    sf bay area     15000              2010
    singapore       615000             2011
For year 2007:

    raised_amount_usd  z  e  year  company_region
    1.00e+06           5  0  2007  singapore
    8.00e+06           6  1  2007  singapore
    50000              3  0  2007  singapore
    35000              3  0  2007  singapore
I have the same kind of data for the other years, 2008-2012. I need the columns z and e in the base data set. Instead of writing 7 merge statements, how can this be done through a function?
It would be great if anyone can help out. Thanks in advance!
If you want to keep the columns z and e, bind_rows() from the dplyr package seems to be the answer (see the question "Combine two data frames by rows (rbind) when they have different sets of columns").
    # create example data
    a <- c(rep("sf bay area", 3), "singapore")
    b <- c(1000050, 2520000, 15000, 615000)
    c <- c(2011, 2010, 2011, 2011)
    base <- cbind.data.frame(a, b, c, stringsAsFactors = FALSE)
    colnames(base) <- c("company_region", "raised_amount_usd", "year")

    a <- c(rep("germany", 4))
    b <- c(100055, 2524400, 150020, 68880)
    c <- c(2007, 2007, 2007, 2007)
    e <- c(1, 1, 1, 1)
    z <- c(1, 1, 1, 1)
    data_germany <- cbind.data.frame(a, b, c, e, z, stringsAsFactors = FALSE)
    colnames(data_germany) <- c("company_region", "raised_amount_usd", "year", "e", "z")

    a <- c(rep("italy", 4))
    b <- c(100055, 2524400, 150020, 68880)
    c <- c(2007, 2007, 2007, 2007)
    e <- c(1, 1, 1, 1)
    z <- c(1, 1, 1, 1)
    data_italy <- cbind.data.frame(a, b, c, e, z, stringsAsFactors = FALSE)
    colnames(data_italy) <- c("company_region", "raised_amount_usd", "year", "e", "z")

    # bind the German and Italian data at once with dplyr
    library(dplyr)
    base %>% bind_rows(data_germany) %>% bind_rows(data_italy) -> base
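With seven yearly data frames, the chain of bind_rows() calls can probably be shortened by collecting the data frames in a list and binding them in a single call. A minimal sketch, assuming the yearly data sit in a list (called yearly_data here, a made-up name) and using a couple of invented rows in place of the real data:

```r
library(dplyr)

# base data without the e and z columns (values as in the question)
base <- data.frame(
  company_region    = c("sf bay area", "sf bay area", "sf bay area", "singapore"),
  raised_amount_usd = c(1000050, 2520000, 15000, 615000),
  year              = c(2011, 2011, 2010, 2011),
  stringsAsFactors  = FALSE
)

# hypothetical list of yearly data frames; only two small invented ones are shown
yearly_data <- list(
  data.frame(company_region = "singapore", raised_amount_usd = c(1e6, 8e6),
             year = 2007, z = c(5, 6), e = c(0, 1), stringsAsFactors = FALSE),
  data.frame(company_region = "singapore", raised_amount_usd = 50000,
             year = 2008, z = 3, e = 0, stringsAsFactors = FALSE)
)

# bind everything in one call; z and e are filled with NA for the base rows
combined <- bind_rows(c(list(base), yearly_data))
```

The list can hold any number of data frames, so the call stays the same whether there are two or seven yearly data sets.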
If you don't want to keep z and e, you can do this:
    # function to extend the base dataframe
    # base_df = base dataframe to extend
    # add_df = dataframe that should be added to the base dataframe
    fun_extent_data <- function(base_df, add_df) {
      library(dplyr)
      # choose the necessary columns
      add_df %>% select(company_region, raised_amount_usd, year) -> add_df_light
      # bind the data to the base dataframe
      rbind.data.frame(base_df, add_df_light, stringsAsFactors = FALSE) -> base_df
      return(base_df)
    }

    # use the function (base must hold only the three shared columns)
    fun_extent_data(base, data_germany) -> base

    # or use the function for the German and Italian data at once with dplyr
    library(dplyr)
    base %>% fun_extent_data(., data_germany) %>% fun_extent_data(., data_italy) -> base
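To avoid writing one call per yearly data set, a two-argument function like the one above can be folded over a list with base R's Reduce(). A rough sketch, assuming the yearly data frames are stored in a list (yearly_data is a made-up name, and the rows are invented); append_shared is a hypothetical helper that mirrors fun_extent_data:

```r
library(dplyr)

# hypothetical helper: keep only the three shared columns, then append the rows
append_shared <- function(base_df, add_df) {
  add_df_light <- select(add_df, company_region, raised_amount_usd, year)
  rbind.data.frame(base_df, add_df_light, stringsAsFactors = FALSE)
}

# base data with only the three shared columns (values as in the question)
base <- data.frame(
  company_region    = c("sf bay area", "sf bay area", "sf bay area", "singapore"),
  raised_amount_usd = c(1000050, 2520000, 15000, 615000),
  year              = c(2011, 2011, 2010, 2011),
  stringsAsFactors  = FALSE
)

# hypothetical list of yearly data frames with the extra columns z and e
yearly_data <- list(
  data.frame(company_region = "singapore", raised_amount_usd = c(1e6, 8e6),
             year = 2007, z = c(5, 6), e = c(0, 1), stringsAsFactors = FALSE),
  data.frame(company_region = "singapore", raised_amount_usd = 50000,
             year = 2008, z = 3, e = 0, stringsAsFactors = FALSE)
)

# start from base and apply append_shared to each list element in turn
combined <- Reduce(append_shared, yearly_data, base)
```

Reduce() passes the running result as the first argument and the next list element as the second, which matches the (base_df, add_df) argument order used above.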