apache spark - PySpark 2.0 throws AlreadyExistsException(message:Database default already exists) when interacting with Hive


I've upgraded Spark from 1.3.1 to 2.0.0 and wrote a simple program that interacts with Hive (1.2.1) through Spark SQL. I've put hive-site.xml in Spark's conf directory, and the SQL returns the expected results, but it throws a weird AlreadyExistsException(message:Database default already exists). How can I get rid of this?

【code】

from pyspark.sql import SparkSession

ss = SparkSession.builder.appName("test").master("local") \
    .config("spark.ui.port", "4041") \
    .enableHiveSupport() \
    .getOrCreate()
ss.sparkContext.setLogLevel("INFO")
ss.sql("show tables").show()

【log】

Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/08/08 19:41:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/08/08 19:41:24 INFO execution.SparkSqlParser: Parsing command: show tables
16/08/08 19:41:25 INFO hive.HiveUtils: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/08/08 19:41:26 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/08/08 19:41:26 INFO metastore.ObjectStore: ObjectStore, initialize called
16/08/08 19:41:26 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/08/08 19:41:26 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/08/08 19:41:26 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/08/08 19:41:27 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/08/08 19:41:27 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/08/08 19:41:27 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/08/08 19:41:27 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/08/08 19:41:27 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
16/08/08 19:41:27 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
16/08/08 19:41:27 INFO metastore.ObjectStore: Initialized ObjectStore
16/08/08 19:41:27 INFO metastore.HiveMetaStore: Added admin role in metastore
16/08/08 19:41:27 INFO metastore.HiveMetaStore: Added public role in metastore
16/08/08 19:41:27 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
16/08/08 19:41:27 INFO metastore.HiveMetaStore: 0: get_all_databases
16/08/08 19:41:27 INFO HiveMetaStore.audit: ugi=felix  ip=unknown-ip-addr  cmd=get_all_databases
16/08/08 19:41:28 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
16/08/08 19:41:28 INFO HiveMetaStore.audit: ugi=felix  ip=unknown-ip-addr  cmd=get_functions: db=default pat=*
16/08/08 19:41:28 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/08/08 19:41:28 INFO session.SessionState: Created local directory: /usr/local/Cellar/hive/1.2.1/libexec/conf/tmp/3fbc3578-fdeb-40a9-8469-7c851cb3733c_resources
16/08/08 19:41:28 INFO session.SessionState: Created HDFS directory: /tmp/hive/felix/3fbc3578-fdeb-40a9-8469-7c851cb3733c
16/08/08 19:41:28 INFO session.SessionState: Created local directory: /usr/local/Cellar/hive/1.2.1/libexec/conf/tmp/felix/3fbc3578-fdeb-40a9-8469-7c851cb3733c
16/08/08 19:41:28 INFO session.SessionState: Created HDFS directory: /tmp/hive/felix/3fbc3578-fdeb-40a9-8469-7c851cb3733c/_tmp_space.db
16/08/08 19:41:28 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is /user/hive/warehouse
16/08/08 19:41:28 INFO session.SessionState: Created local directory: /usr/local/Cellar/hive/1.2.1/libexec/conf/tmp/8eaa63ec-9710-499f-bd50-6625bf4459f5_resources
16/08/08 19:41:28 INFO session.SessionState: Created HDFS directory: /tmp/hive/felix/8eaa63ec-9710-499f-bd50-6625bf4459f5
16/08/08 19:41:28 INFO session.SessionState: Created local directory: /usr/local/Cellar/hive/1.2.1/libexec/conf/tmp/felix/8eaa63ec-9710-499f-bd50-6625bf4459f5
16/08/08 19:41:28 INFO session.SessionState: Created HDFS directory: /tmp/hive/felix/8eaa63ec-9710-499f-bd50-6625bf4459f5/_tmp_space.db
16/08/08 19:41:28 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.1) is /user/hive/warehouse
16/08/08 19:41:28 INFO metastore.HiveMetaStore: 0: create_database: Database(name:default, description:default database, locationUri:hdfs://localhost:9900/user/hive/warehouse, parameters:{})
16/08/08 19:41:28 INFO HiveMetaStore.audit: ugi=felix  ip=unknown-ip-addr  cmd=create_database: Database(name:default, description:default database, locationUri:hdfs://localhost:9900/user/hive/warehouse, parameters:{})
16/08/08 19:41:28 ERROR metastore.RetryingHMSHandler: AlreadyExistsException(message:Database default already exists)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:891)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
    at com.sun.proxy.$Proxy22.create_database(Unknown Source)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createDatabase(HiveMetaStoreClient.java:644)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
    at com.sun.proxy.$Proxy23.createDatabase(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createDatabase(Hive.java:306)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply$mcV$sp(HiveClientImpl.scala:291)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:291)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$createDatabase$1.apply(HiveClientImpl.scala:291)
    at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:262)
    at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:209)
    at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:208)
    at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:251)
    at org.apache.spark.sql.hive.client.HiveClientImpl.createDatabase(HiveClientImpl.scala:290)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply$mcV$sp(HiveExternalCatalog.scala:99)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
    at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$createDatabase$1.apply(HiveExternalCatalog.scala:99)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:72)
    at org.apache.spark.sql.hive.HiveExternalCatalog.createDatabase(HiveExternalCatalog.scala:98)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.createDatabase(SessionCatalog.scala:147)
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.<init>(SessionCatalog.scala:89)
    at org.apache.spark.sql.hive.HiveSessionCatalog.<init>(HiveSessionCatalog.scala:51)
    at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:49)
    at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
    at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
    at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
    at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:582)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:237)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:280)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:128)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:211)
    at java.lang.Thread.run(Thread.java:745)
16/08/08 19:41:28 INFO metastore.HiveMetaStore: 0: get_database: default
16/08/08 19:41:28 INFO HiveMetaStore.audit: ugi=felix  ip=unknown-ip-addr  cmd=get_database: default
16/08/08 19:41:28 INFO metastore.HiveMetaStore: 0: get_database: default
16/08/08 19:41:28 INFO HiveMetaStore.audit: ugi=felix  ip=unknown-ip-addr  cmd=get_database: default
16/08/08 19:41:28 INFO metastore.HiveMetaStore: 0: get_tables: db=default pat=*
16/08/08 19:41:28 INFO HiveMetaStore.audit: ugi=felix  ip=unknown-ip-addr  cmd=get_tables: db=default pat=*
16/08/08 19:41:28 INFO spark.SparkContext: Starting job: showString at NativeMethodAccessorImpl.java:-2
16/08/08 19:41:28 INFO scheduler.DAGScheduler: Got job 0 (showString at NativeMethodAccessorImpl.java:-2) with 1 output partitions
16/08/08 19:41:28 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (showString at NativeMethodAccessorImpl.java:-2)
16/08/08 19:41:28 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/08/08 19:41:28 INFO scheduler.DAGScheduler: Missing parents: List()
16/08/08 19:41:28 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[2] at showString at NativeMethodAccessorImpl.java:-2), which has no missing parents
16/08/08 19:41:28 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.9 KB, free 366.3 MB)
16/08/08 19:41:29 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.4 KB, free 366.3 MB)
16/08/08 19:41:29 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.68.80.25:58224 (size: 2.4 KB, free: 366.3 MB)
16/08/08 19:41:29 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
16/08/08 19:41:29 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at showString at NativeMethodAccessorImpl.java:-2)
16/08/08 19:41:29 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
16/08/08 19:41:29 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0, PROCESS_LOCAL, 5827 bytes)
16/08/08 19:41:29 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
16/08/08 19:41:29 INFO codegen.CodeGenerator: Code generated in 152.42807 ms
16/08/08 19:41:29 INFO executor.Executor: Finished task 0.0 in stage 0.0 (TID 0). 1279 bytes result sent to driver
16/08/08 19:41:29 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 275 ms on localhost (1/1)
16/08/08 19:41:29 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/08/08 19:41:29 INFO scheduler.DAGScheduler: ResultStage 0 (showString at NativeMethodAccessorImpl.java:-2) finished in 0.288 s
16/08/08 19:41:29 INFO scheduler.DAGScheduler: Job 0 finished: showString at NativeMethodAccessorImpl.java:-2, took 0.538913 s
16/08/08 19:41:29 INFO codegen.CodeGenerator: Code generated in 13.588415 ms
+-------------------+-----------+
|          tableName|isTemporary|
+-------------------+-----------+
|      app_visit_log|      false|
|        cms_article|      false|
|                 p4|      false|
|              p_bak|      false|
+-------------------+-----------+

16/08/08 19:41:29 INFO spark.SparkContext: Invoking stop() from shutdown hook

PS: everything works fine when I test the same thing in Java.

Any help is highly appreciated.

As the log shows, this message does not mean anything bad happened: Spark simply tries to create the default database at session startup, and the metastore logs the exception when that database already exists. In fact, these exception logs arguably should not be displayed at all when the default database already exists.
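To see why the exception is benign, here is a minimal sketch of the "ensure database exists" pattern the Spark/Hive client effectively performs at startup. The names (FakeMetastore, ensure_database) are illustrative stand-ins, not Spark's actual API; the point is that create-then-swallow is idempotent, even though the server side may log the exception before raising it:

```python
# Illustrative sketch only -- FakeMetastore and ensure_database are
# hypothetical names, not part of Spark or Hive.

class AlreadyExistsException(Exception):
    pass

class FakeMetastore:
    def __init__(self):
        self.databases = set()

    def create_database(self, name):
        # A real metastore would also log the error here, which is
        # exactly the noise seen in the log above.
        if name in self.databases:
            raise AlreadyExistsException(f"Database {name} already exists")
        self.databases.add(name)

def ensure_database(metastore, name):
    # Try to create; if it already exists, swallow the error.
    try:
        metastore.create_database(name)
    except AlreadyExistsException:
        pass

ms = FakeMetastore()
ensure_database(ms, "default")  # creates it
ensure_database(ms, "default")  # second call is a harmless no-op
print(sorted(ms.databases))     # -> ['default']
```

The client ends up in the same state either way, which is why your query still returns the correct tables despite the scary stack trace.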

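If you just want the stack trace out of your console, one option (a sketch, assuming Spark 2.0's stock log4j 1.x setup; the logger name is taken from the `ERROR metastore.RetryingHMSHandler` line in the log above) is to raise that logger's threshold in conf/log4j.properties:

```properties
# conf/log4j.properties -- hide the benign AlreadyExistsException
# logged by the Hive metastore handler at session startup.
log4j.logger.org.apache.hadoop.hive.metastore.RetryingHMSHandler=FATAL
```

This only suppresses the log output; the exception is still thrown and swallowed internally exactly as before.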
