Python - How to loop through an XML file and save the result into a DataFrame
I have a big XML file and I do not know its exact structure. I am trying to loop through the XML hierarchy, extract the results, and put them into a DataFrame. However, the code breaks down after it has parsed some of the text. The error returned is:
File "C:\Python34\lib....\frame.py", line 4327, in append
    elif isinstance(other, list) and not isinstance(other[0], DataFrame):
IndexError: list index out of range
Here is the code (Python 3.4):
import xml.etree.ElementTree as ET
import pandas as pd

def getelements(fn):
    df = pd.DataFrame()
    index = 0
    context = iter(ET.iterparse(fn, events=('start', 'end')))
    _, root = next(context)
    for event, elem in context:
        if elem.tag == 'row':
            alist = []
            index += 1
            c = elem.findall('column')
            for cl in iter(c):
                pair = (str(cl.get('name')), str(cl.text), index)
                alist.append(pair)
            df = df.append(alist)
            # print out during processing
            print(df[:index])
            print("index=%s" % index)
    df.to_csv('jv2015.csv', sep=',', quotechar="'")
    return df
The XML file seems to have "row" elements, and under each "row" there is a varying number of "column" elements. Each "column" element has a "name" attribute and a value. For example:
<row>
    <column name="xxx">value</column>
    <column name="yyy">value</column>
    ...
</row>
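One possible way around the IndexError, sketched under assumptions rather than offered as a definitive fix: the traceback shows `append` indexing `other[0]`, which fails when it is handed an empty list. With `iterparse`, an element's children are not guaranteed to be present yet when its 'start' event fires, so `findall('column')` can come back empty there; a "row" with no "column" children would do the same. The sketch below listens for 'end' events only, collects plain tuples, and builds the DataFrame once at the end. The function name `rows_to_dataframe`, the output column names, and the `<table>` root in the sample are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

import pandas as pd


def rows_to_dataframe(source):
    """Stream <row> elements and return one record per <column> child."""
    records = []
    row_index = 0
    # Only 'end' events: a row's children are complete once its end tag is seen.
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "row":
            continue
        row_index += 1
        for col in elem.findall("column"):
            records.append((col.get("name"), col.text, row_index))
        elem.clear()  # drop the processed children to keep memory use low
    # Column names here are hypothetical; an empty list simply gives an empty frame.
    return pd.DataFrame(records, columns=["name", "value", "row"])


# Tiny in-memory sample (the <table> root is an assumption, not from the real file).
sample = io.BytesIO(
    b"<table>"
    b'<row><column name="xxx">1</column><column name="yyy">2</column></row>'
    b"<row></row>"
    b'<row><column name="xxx">3</column></row>'
    b"</table>"
)
df = rows_to_dataframe(sample)
print(df)
```

For the real file, the same function could be called with the file path and followed by `df.to_csv('jv2015.csv', sep=',', quotechar="'")` as in the original code. Building the frame once from a list also avoids calling `DataFrame.append` inside the loop, which copies the data on every call and was removed in pandas 2.0.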