Python pandas: remove rows where multiple conditions are not met

Let's say I have a dataframe like this:

       id  num
    0   1    1
    1   2    2
    2   3    1
    3   4    2
    4   1    1
    5   2    2
    6   3    1
    7   4    2

The above can be generated for testing purposes with:

    import numpy as np
    import pandas as pd

    test = pd.DataFrame({
        'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
        'num': np.array([1, 2] * 4, dtype='int32'),
    })

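Now, I want to keep only the rows where this condition is met: id is not 1 AND num is not 1. In other words, I want to remove the rows with index 0 and 4. For my actual dataset it is easier to remove the rows I don't want than to specify the rows I do want.
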
I have tried this:

    test = test[(test['id'] != 1) & (test['num'] != 1)]

However, it gives me this:

       id  num
    1   2    2
    3   4    2
    5   2    2
    7   4    2

It seems to have removed every row where id is 1 OR num is 1.

I've seen a number of other questions where the answer is the one I used above, but it doesn't seem to be working out in my case.

If you change the boolean conditions to equality and invert the combined condition by enclosing both in an additional pair of parentheses, you get the desired behaviour:

    In [14]: test = test[~((test['id'] == 1) & (test['num'] == 1))]
             test
    Out[14]:
       id  num
    1   2    2
    2   3    1
    3   4    2
    5   2    2
    6   3    1
    7   4    2

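The same idea can be written with the mask given a name, which may make the intent easier to read. This is only a sketch: `drop_mask` and `kept` are names invented here for illustration, and the frame is rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Same test frame as in the question.
test = pd.DataFrame({
    'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
    'num': np.array([1, 2] * 4, dtype='int32'),
})

# Describe the rows to drop: id is 1 AND num is 1.
drop_mask = (test['id'] == 1) & (test['num'] == 1)

# Invert the mask with ~ to keep every other row.
kept = test[~drop_mask]

print(drop_mask.tolist())   # True only at positions 0 and 4
print(kept.index.tolist())  # [1, 2, 3, 5, 6, 7]
```

Writing the condition for the rows you do not want and then inverting it matches the way the question is phrased, where the unwanted rows are easier to describe than the wanted ones.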
I think your understanding of the boolean syntax is incorrect; what you want is to OR the conditions:

    In [22]: test = test[(test['id'] != 1) | (test['num'] != 1)]
             test
    Out[22]:
       id  num
    1   2    2
    2   3    1
    3   4    2
    5   2    2
    6   3    1
    7   4    2

If you think about what this means, the first condition on its own excludes any row where 'id' is equal to 1, and the second does the same for the 'num' column:

    In [24]: test[test['id'] != 1]
    Out[24]:
       id  num
    1   2    2
    2   3    1
    3   4    2
    5   2    2
    6   3    1
    7   4    2

    In [25]: test[test['num'] != 1]
    Out[25]:
       id  num
    1   2    2
    3   4    2
    5   2    2
    7   4    2

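To see why `&` removes too much, it can help to lay the two masks and their combinations side by side. The snippet below is a rough illustration on the original, unfiltered frame; the `masks` frame and its column labels are made up for this example.

```python
import numpy as np
import pandas as pd

# Original, unfiltered test frame from the question.
test = pd.DataFrame({
    'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
    'num': np.array([1, 2] * 4, dtype='int32'),
})

# One column per individual condition.
masks = pd.DataFrame({
    'id != 1': test['id'] != 1,
    'num != 1': test['num'] != 1,
})

# AND keeps a row only if both conditions hold;
# OR keeps it if at least one of them holds.
masks['and'] = masks['id != 1'] & masks['num != 1']
masks['or'] = masks['id != 1'] | masks['num != 1']

print(masks)
```

Rows 2 and 6 (id 3, num 1) pass the 'id' test but fail the 'num' test, so the `and` column is False for them while the `or` column is True. Only rows 0 and 4 fail both tests.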
So what you really wanted was to OR (|) the above conditions.
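The inverted AND of the equality tests and the OR of the inequality tests are the same mask by De Morgan's law, which is why both forms give the same rows. A quick check, with `not_both`, `either_differs` and `same` as illustrative names only:

```python
import numpy as np
import pandas as pd

test = pd.DataFrame({
    'id': np.array([1, 2, 3, 4] * 2, dtype='int32'),
    'num': np.array([1, 2] * 4, dtype='int32'),
})

# NOT (id == 1 AND num == 1)
not_both = ~((test['id'] == 1) & (test['num'] == 1))

# (id != 1) OR (num != 1)
either_differs = (test['id'] != 1) | (test['num'] != 1)

# Element-wise comparison of the two boolean masks.
same = bool((not_both == either_differs).all())
print(same)  # True
```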