hadoop - Compare tuples on the basis of a field in Pig
I have the following dataset in Pig. This is the input data:
(abc,****,tool1,12)
(abc,****,tool1,10)
(abc,****,tool1,13)
(abc,****,tool2,101)
(abc,****,tool3,11)
Schema: username, ip, tool, duration
I want to add up the durations of tuples that have the same tool.
Expected output:
(abc,****,tool1,35)
(abc,****,tool2,101)
(abc,****,tool3,11)
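Use GROUP on the key fields, then apply SUM to duration within each group.
    -- Function names such as PigStorage and SUM are case-sensitive in Pig.
    a = LOAD 'data.csv' USING PigStorage(',') AS (username:chararray, ip:chararray, tool:chararray, duration:int);
    b = GROUP a BY (username, ip, tool);
    c = FOREACH b GENERATE FLATTEN(group) AS (username, ip, tool), SUM(a.duration);
    DUMP c;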
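To see why the grouping step in the answer above works, it may help to look at the intermediate relation. The following is a minimal sketch, not a tested script: it assumes data.csv holds plain comma-separated fields (no surrounding parentheses), reuses the relation names a, b and c from the answer, and introduces the alias total_duration and the relation d purely for illustration.

```pig
-- Sketch only: assumes 'data.csv' contains lines such as abc,****,tool1,12
a = LOAD 'data.csv' USING PigStorage(',')
    AS (username:chararray, ip:chararray, tool:chararray, duration:int);

-- GROUP produces two fields per output row:
--   group : a tuple holding the key (username, ip, tool)
--   a     : a bag of every input tuple that shares that key
b = GROUP a BY (username, ip, tool);
DESCRIBE b;  -- prints the schema of b so the nesting is visible

-- FLATTEN(group) unpacks the key tuple back into three columns;
-- a.duration projects the duration column of the bag, and SUM adds it up.
-- total_duration is a hypothetical alias, not part of the original answer.
c = FOREACH b GENERATE FLATTEN(group) AS (username, ip, tool),
                       SUM(a.duration) AS total_duration;

-- Relations are unordered, so sort by tool if a stable order is wanted.
-- d is a hypothetical extra relation added for this sketch.
d = ORDER c BY tool;
DUMP d;
```

Grouping on all three fields rather than on tool alone keeps username and ip in the output, which is what the expected result in the question shows.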