mysql - Java - Having a huge ArrayList (1 million +), how to create a String of it in a acceptable amount of time? -


as mentioned, have huge arraylists in format:

list<list<string>> alist; 

i lists out of .csv wich on extern databasesystem. (i wrote visual objects script export data need calculations)

after have .csv, load content list this:

string line;        alist = new arraylist<list<string>>();        int i=0;     // datei laden, und anschließend die zeilen der csv in eine arraylist speichern          try {                           br = new bufferedreader(new filereader(path));             while((line = br.readline()) != null)             {                 line = line.replace(",", ".");                 if(line.endswith(delimitter))                     line = line + " ";                 alist.add(arrays.aslist(line.split(delimitter)));                 i++;                 if(i==10000000)                     break;             }                 } 

it takes me ~15900 ms store data in arraylist (~1,1 million rows , 11 columns). pretty okay guess. now, have list in format need, create insert out of it, can import data in our database. i'm creating insert this:

public string getinsertstring()     {         // tabelle ergibt sich aus dem dateinamen, ohne das .csv         string insert="replace "+tablename + " (";          // spaltennamen aus array auslesen, immer die erste zeile des csv         for(int i=0; i< alist.get(0).size();i++)         {             if(i==0)                 insert = insert + alist.get(0).get(i).trim();             else                 insert = insert + " ,"+ alist.get(0).get(i).trim();         }         insert= insert + ") \rvalues";          // values der spalten in den insert schreiben + korreckte syntax des bfehels sicherstellen usw.         for(int i=1;i < alist.size();i++) // size nach "unten" in der 2d liste         {              insert= insert +"(";             for(int j=0; j < alist.get(0).size();j++) // size nach "rechts" in der 2d liste             {                 // bei dem ersten ohne "," starten, damit die syntax stimmt                 // sollte der aktuelle wert eine zahl oder "null" sein, keine "'" setzen. ansonsten "'" setzen fuer den insert in die db                 if(j==0)                 {                                if((stringutils.isstrictlynumeric(alist.get(i).get(j).trim())) || alist.get(i).get(j).trim().contains("null"))                         insert = insert + alist.get(i).get(j).trim();                     else                          insert = insert + "'" + alist.get(i).get(j).trim() + "'";                 }                                    else                 {                     if(((stringutils.isstrictlynumeric(alist.get(i).get(j).trim()))) || (alist.get(i).get(j).trim().contains("null")))                         insert = insert +","+ alist.get(i).get(j).trim();                     else                          insert = insert + ",'" + alist.get(i).get(j).trim() + "'";                       }              }             if(i < alist.size()-1)                 insert= insert +"),";             else                  insert= insert +")";             insert = insert +"\r";          }          //system.out.println(insert);         return insert;      } 

here go on whole list , add values string can use string insert. use filenames table names , first row of file columns in insert. other rows values.

after step done, string "replace tablename (column1,column2 ... columnx) values(value1, value2 ... valuex), (value1, value2 ... valuex), ...."

now excecute insertindb class string , yeah, thats it.

but second step takes way time. (i wait hour now) there smarter way want do? (insert csv automaticly in our database)

would biglist maybe improve speed? https://dzone.com/articles/biglist-scalable-high (can't test right now)

edit: did solve problem:

i created class sql statement -->

public class buildinsert {       private string insertstring;      public string getinsertstring()     {         return insertstring;     }      buildinsert(string tablename, list<string> alist )     {            int size = alist.size();          this.insertstring = "replace " + tablename + "(";          // insert "header"         for(int j=1; j <= size ;j++)          {             if(j < size)                 this.insertstring = this.insertstring + alist.get(j-1)+",";             else                 this.insertstring = this.insertstring + alist.get(j-1)+")\n";          }          this.insertstring = this.insertstring +"values(";          // insert values         for(int j=1; j <= size ;j++)          {             if(j < size)                 this.insertstring = this.insertstring + "?,";             else                 this.insertstring = this.insertstring + "?)";          }     }  } 

and tock string batch insert / prepared statement mike mentioned:

connection con;         statement stmt;          final int batchsize = 1000;         int count = 0;          int sizeh = alist.size();         int sizel = alist.get(0).size();          try {               // connection oeffnen und prepared statment vorbereiten             system.out.println("connecting database...");             con = drivermanager.getconnection(db_url,user,pass);             con.setautocommit(false);              ps = con.preparestatement(insertstring);              stmt = con.createstatement();            //< alist.size()             for(int i=1;i < sizeh ;i++) // size nach "unten"              {                 for(int j=0; j < sizel;j++) // size nach "rechts"                  {                     ps.setstring(j+1, alist.get(i).get(j));                 }                  ps.addbatch();                  if(++count % batchsize == 0){                     ps.executebatch();                     con.commit();                 }             }                  ps.executebatch();             con.commit();          } 

the insert faster now. (~230 sek 1,1 mil rows)

thank guys ;-)

now, have list in format need, create insert out of it, can import data in our database. 

don't that! jdbc batch insert instead. (see this on how preparedstatement save lot of trouble on formatting insert statements)


Comments

Popular posts from this blog

Spring Boot + JPA + Hibernate: Unable to locate persister -

go - Golang: panic: runtime error: invalid memory address or nil pointer dereference using bufio.Scanner -

c - double free or corruption (fasttop) -