mysql - Java - Having a huge ArrayList (1 million +), how to create a String of it in a acceptable amount of time? -
as mentioned, have huge arraylists in format:
list<list<string>> alist;
i lists out of .csv wich on extern databasesystem. (i wrote visual objects script export data need calculations)
after have .csv, load content list this:
string line; alist = new arraylist<list<string>>(); int i=0; // datei laden, und anschließend die zeilen der csv in eine arraylist speichern try { br = new bufferedreader(new filereader(path)); while((line = br.readline()) != null) { line = line.replace(",", "."); if(line.endswith(delimitter)) line = line + " "; alist.add(arrays.aslist(line.split(delimitter))); i++; if(i==10000000) break; } }
it takes me ~15900 ms store data in arraylist (~1,1 million rows , 11 columns). pretty okay guess. now, have list in format need, create insert out of it, can import data in our database. i'm creating insert this:
public string getinsertstring() { // tabelle ergibt sich aus dem dateinamen, ohne das .csv string insert="replace "+tablename + " ("; // spaltennamen aus array auslesen, immer die erste zeile des csv for(int i=0; i< alist.get(0).size();i++) { if(i==0) insert = insert + alist.get(0).get(i).trim(); else insert = insert + " ,"+ alist.get(0).get(i).trim(); } insert= insert + ") \rvalues"; // values der spalten in den insert schreiben + korreckte syntax des bfehels sicherstellen usw. for(int i=1;i < alist.size();i++) // size nach "unten" in der 2d liste { insert= insert +"("; for(int j=0; j < alist.get(0).size();j++) // size nach "rechts" in der 2d liste { // bei dem ersten ohne "," starten, damit die syntax stimmt // sollte der aktuelle wert eine zahl oder "null" sein, keine "'" setzen. ansonsten "'" setzen fuer den insert in die db if(j==0) { if((stringutils.isstrictlynumeric(alist.get(i).get(j).trim())) || alist.get(i).get(j).trim().contains("null")) insert = insert + alist.get(i).get(j).trim(); else insert = insert + "'" + alist.get(i).get(j).trim() + "'"; } else { if(((stringutils.isstrictlynumeric(alist.get(i).get(j).trim()))) || (alist.get(i).get(j).trim().contains("null"))) insert = insert +","+ alist.get(i).get(j).trim(); else insert = insert + ",'" + alist.get(i).get(j).trim() + "'"; } } if(i < alist.size()-1) insert= insert +"),"; else insert= insert +")"; insert = insert +"\r"; } //system.out.println(insert); return insert; }
here go on whole list , add values string can use string insert. use filenames table names , first row of file columns in insert. other rows values.
after step done, string "replace tablename (column1,column2 ... columnx) values(value1, value2 ... valuex), (value1, value2 ... valuex), ...."
now excecute insertindb class string , yeah, thats it.
but second step takes way time. (i wait hour now) there smarter way want do? (insert csv automaticly in our database)
would biglist maybe improve speed? https://dzone.com/articles/biglist-scalable-high (can't test right now)
edit: did solve problem:
i created class sql statement -->
public class buildinsert { private string insertstring; public string getinsertstring() { return insertstring; } buildinsert(string tablename, list<string> alist ) { int size = alist.size(); this.insertstring = "replace " + tablename + "("; // insert "header" for(int j=1; j <= size ;j++) { if(j < size) this.insertstring = this.insertstring + alist.get(j-1)+","; else this.insertstring = this.insertstring + alist.get(j-1)+")\n"; } this.insertstring = this.insertstring +"values("; // insert values for(int j=1; j <= size ;j++) { if(j < size) this.insertstring = this.insertstring + "?,"; else this.insertstring = this.insertstring + "?)"; } } }
and tock string batch insert / prepared statement mike mentioned:
connection con; statement stmt; final int batchsize = 1000; int count = 0; int sizeh = alist.size(); int sizel = alist.get(0).size(); try { // connection oeffnen und prepared statment vorbereiten system.out.println("connecting database..."); con = drivermanager.getconnection(db_url,user,pass); con.setautocommit(false); ps = con.preparestatement(insertstring); stmt = con.createstatement(); //< alist.size() for(int i=1;i < sizeh ;i++) // size nach "unten" { for(int j=0; j < sizel;j++) // size nach "rechts" { ps.setstring(j+1, alist.get(i).get(j)); } ps.addbatch(); if(++count % batchsize == 0){ ps.executebatch(); con.commit(); } } ps.executebatch(); con.commit(); }
the insert faster now. (~230 sek 1,1 mil rows)
thank guys ;-)
now, have list in format need, create insert out of it, can import data in our database.
don't that! jdbc batch insert instead. (see this on how preparedstatement save lot of trouble on formatting insert statements)
Comments
Post a Comment