python - Groupby and any() | all()
I have the following pd.DataFrame:
    In [155]: df1
    Out[155]:
       order_id    acq        date  uid
    2         3  False  2014-01-03    1
    3         4   True  2014-01-04    2
    4         5  False  2014-01-05    3
    6         7   True  2014-01-08    5
    7         8  False  2014-01-08    5
    9        10  False  2014-01-10    6
    0        11  False  2014-01-11    6
where each entry is an order, with values for order_id, date, uid, and acq (which indicates whether this is the first order associated with that uid in the dataset).
I am trying to filter the data and keep only the orders placed by users who made their first order inside the time period covered by the dataset (i.e. at least one of the orders of such a user satisfies acq == True).
So, the desired output would be:
       order_id    acq        date  uid
    3         4   True  2014-01-04    2
    6         7   True  2014-01-08    5
    7         8  False  2014-01-08    5
and I have managed to reach it with:
    In [156]: df1.groupby('uid').filter(lambda x: x.acq.any() == True)
    Out[156]:
       order_id    acq        date  uid
    3         4   True  2014-01-04    2
    6         7   True  2014-01-08    5
    7         8  False  2014-01-08    5
However, when I try to find the orders placed by users who made their first order outside the time period covered by the dataset (i.e. all of their orders should satisfy acq == False), I seem to get lost. I have tried this:
    In [159]: df1.groupby('uid').filter(lambda x: x.acq.all() == False)
    Out[159]:
       order_id    acq        date  uid
    2         3  False  2014-01-03    1
    4         5  False  2014-01-05    3
    6         7   True  2014-01-08    5   ## <- this order is an acquisition, therefore the orders with uid == 5 should be filtered out
    7         8  False  2014-01-08    5
    9        10  False  2014-01-10    6
    0        11  False  2014-01-11    6
How should I go about selecting only the orders placed by users whose orders all satisfy acq == False?
Any ideas are appreciated, thanks!
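You need to apply the condition first and then add all. The expression x.acq.all() == False is true as soon as any single order in the group has acq == False, which is why uid 5 slips through; (x.acq == False).all() is true only when every order in the group has acq == False: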
    print(df1.groupby('uid').filter(lambda x: (x.acq == False).all()))
       order_id    acq        date  uid
    2         3  False  2014-01-03    1
    4         5  False  2014-01-05    3
    9        10  False  2014-01-10    6
    0        11  False  2014-01-11    6
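For anyone who wants to try this locally, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and runs the attempted filter next to the corrected one. The variable names (attempt, answer, no_acq, via_transform) are made up for illustration, and the last two variants are not from the answer above; they are just alternative spellings that should select the same rows on this data.

```python
import pandas as pd

# Sample data copied from the question (index values included).
df1 = pd.DataFrame(
    {
        "order_id": [3, 4, 5, 7, 8, 10, 11],
        "acq": [False, True, False, True, False, False, False],
        "date": pd.to_datetime(
            ["2014-01-03", "2014-01-04", "2014-01-05", "2014-01-08",
             "2014-01-08", "2014-01-10", "2014-01-11"]
        ),
        "uid": [1, 2, 3, 5, 5, 6, 6],
    },
    index=[2, 3, 4, 6, 7, 9, 0],
)

# Attempt from the question: keeps groups where NOT every order is an
# acquisition, so uid 5 (one True, one False) is kept as well.
attempt = df1.groupby("uid").filter(lambda x: x.acq.all() == False)

# Corrected filter: keeps groups where EVERY order has acq == False.
answer = df1.groupby("uid").filter(lambda x: (x.acq == False).all())

# Alternative spelling (an assumption, not from the answer):
# "no order in the group is an acquisition".
no_acq = df1.groupby("uid").filter(lambda x: not x.acq.any())

# Alternative without a Python lambda per group: broadcast the per-group
# any() back to each row, then invert it to get a boolean row mask.
mask = ~df1.groupby("uid")["acq"].transform("any")
via_transform = df1[mask]

print(attempt["uid"].unique().tolist())
print(answer["uid"].unique().tolist())
```

The transform variant tends to be the faster option when there are many groups, since it avoids calling a Python function once per uid, though for a frame this small the difference is irrelevant.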