Best match algorithm in Python
What would be the optimal implementation of a best-match lookup in Python?
I have a txt file that holds a mapping of country codes to names, e.g.
codes name
123 abc
1234 def
1235 ghi
124 jkl
1241 mno
This txt file is big (13500 records); I'm only showing a sample.
Further, I have CDR files with a numeric country code in each record (row), which I want to convert to a country name.
By best match I mean the following: if a CDR record contains country code "1234", the country name is "def"; if it contains "1235", the country name is "ghi"; if the country code is "1236", the perfect match fails, so it should fall back to "abc", since "123" is available.
I don't know whether there is a standard name for this kind of search. It is something like greedy search in regular expressions.
What would be the best implementation of this kind of search, given that the CDR files are big (up to 25 GB)?
Dictionaries are the easiest way to implement this. See the solution below:
- Convert
123 abc
1234 def
1235 ghi
124 jkl
1241 mno
to {1241: 'mno', 1234: 'def', 123: 'abc', 124: 'jkl', 1235: 'ghi'}
- Read the country codes from the CDR file and search for each one in the dictionary.
- If a code is not found, remove the units digit and search again.
- If it is still not found once no digits are left, print 'no match found'.
The code below does the same:
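country_name = {}
# Build the lookup table: {numeric code: country name}
with open(r'u:\countrynames.csv', 'r') as f:
    for line in f:
        linesplit = line.split()
        country_name[int(linesplit[0])] = linesplit[1]

with open(r'u:\countrycodescdr.csv', 'r') as f:
    for line in f:
        country_code = int(line.strip())
        # Drop the last digit until a known prefix is found
        while country_code != 0:
            if country_code in country_name:
                print(country_name[country_code])
                break
            else:
                country_code //= 10
        else:
            # The loop ended without a break: no prefix matched
            print('no match found')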
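This kind of lookup is usually called longest-prefix matching. As a rough sketch of the same idea in reusable form, the version below keeps the codes as strings (so leading zeros would survive, which differs from the integer keys above) and caches codes it has already resolved, on the assumption that a big CDR file repeats the same codes many times. The names `best_match`, `load_names` and `resolve` are made up for this illustration, and the sample list stands in for the real files.

```python
def best_match(code, names):
    """Return the name for the longest prefix of `code` found in `names`, or None."""
    code = code.strip()
    # Try the full code first, then drop one trailing digit at a time.
    for end in range(len(code), 0, -1):
        name = names.get(code[:end])
        if name is not None:
            return name
    return None


def load_names(lines):
    """Build a {code string: name} dict from 'code name' lines.

    Rows whose first field is not numeric (such as a header) are skipped;
    that is an assumption about the file layout.
    """
    names = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            names[parts[0]] = parts[1].strip()
    return names


def resolve(codes, names):
    """Yield a name (or None) for each code, remembering codes already seen."""
    cache = {}
    for code in codes:
        code = code.strip()
        if code not in cache:
            cache[code] = best_match(code, names)
        yield cache[code]


# Sample data from the question, used in place of the real files.
sample = ["codes name", "123 abc", "1234 def", "1235 ghi", "124 jkl", "1241 mno"]
names = load_names(sample)
print(list(resolve(["1234", "1235", "1236", "999"], names)))
# prints ['def', 'ghi', 'abc', None]
```

Because `resolve` is a generator, an open file object can be passed in as `codes` and it will be read line by line rather than loaded into memory at once.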