python - Pandas dataframe: generate a column from other rows' info, without the apply function


Maybe the question title is not accurate (sorry, I couldn't find accurate words to describe the question...), so let me give an example:

The following dataframe holds income per "week_id" and "user":

week_id  user  income
1        1     100
1        2     50
2        1     200
2        2     30
2        3     150
3        1     100
3        2     150
....

I want to add a new column that contains the "income" of the previous week, so it looks like:

week_id  user  income  previous_week_income
1        1     100     0
1        2     50      0
2        1     200     100
2        2     30      50
2        3     150     0
3        1     100     200
3        2     150     30
....

It amounts to generating a new column from information in other rows rather than the current row.

I know of a solution using the apply function, but it works row by row and seems slow in my case (the original dataframe may have tens of millions of rows). Is there a faster solution that gives the same result?

The background is generating a factor for predictive analysis: I want to use the previous week's income as one variable when predicting the current week's income.

Thanks in advance :)

I think you need DataFrameGroupBy.shift with fillna, if each week_id has unique users:

df['previous_week_income'] = df.groupby('user')['income'].shift().fillna(0)
print (df)
   week_id  user  income  previous_week_income
0        1     1     100                   0.0
1        1     2      50                   0.0
2        2     1     200                 100.0
3        2     2      30                  50.0
4        2     3     150                   0.0
5        3     1     100                 200.0
6        3     2     150                  30.0
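For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the shift approach. Note that shift takes the previous row per user, which is the previous week only if the rows are sorted by week_id and no user skips a week. The second half is a possible alternative under different assumptions, not part of the answer above: a self-merge on (user, week_id + 1), which looks up the previous week explicitly. The column name prev_income_by_merge is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "week_id": [1, 1, 2, 2, 2, 3, 3],
    "user":    [1, 2, 1, 2, 3, 1, 2],
    "income":  [100, 50, 200, 30, 150, 100, 150],
})

# Approach from the answer: previous row within each user's group.
# Assumes one row per (week_id, user), rows sorted by week_id,
# and that users do not skip weeks.
df = df.sort_values(["week_id", "user"]).reset_index(drop=True)
df["previous_week_income"] = df.groupby("user")["income"].shift().fillna(0)

# Alternative sketch (an assumption, not from the answer): move every
# row one week forward and left-merge it back, so each row finds the
# income of exactly week_id - 1 for the same user, if it exists.
prev = df[["week_id", "user", "income"]].copy()
prev["week_id"] = prev["week_id"] + 1
prev = prev.rename(columns={"income": "prev_income_by_merge"})  # hypothetical name

df = df.merge(prev, on=["week_id", "user"], how="left")
df["prev_income_by_merge"] = df["prev_income_by_merge"].fillna(0)

print(df)
```

On this sample both columns agree. They would differ if a user were missing for a week: shift would carry over the income from that user's last recorded week, while the merge would give 0 for the missing week. Both are vectorized, so neither goes row by row the way apply does.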
