Saturday, May 5, 2018

HiveOnSpark Series: metadata.HiveException: java.util.concurrent.TimeoutException


Running a Hive on Spark SQL job over a very large dataset fails with the following TimeoutException:



ERROR : Failed to monitor Job[0] with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(java.util.concurrent.TimeoutException)'
org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.TimeoutException
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:174) ~[hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getState(RemoteSparkJobStatus.java:81) ~[hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.spark.status.RemoteSparkJobMonitor.startMonitor(RemoteSparkJobMonitor.java:82) [hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobRef.monitorJob(RemoteSparkJobRef.java:60) [hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:116) [hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:199) [hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100) [hive-exec-2.3.2.jar:2.3.2]
at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:79) [hive-exec-2.3.2.jar:2.3.2]
Caused by: java.util.concurrent.TimeoutException
at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:56) ~[netty-all-4.0.52.Final.jar:4.0.52.Final]
at org.apache.hadoop.hive.ql.exec.spark.status.impl.RemoteSparkJobStatus.getSparkJobInfo(RemoteSparkJobStatus.java:171) ~[hive-exec-2.3.2.jar:2.3.2]
... 7 more
Tracing the source code from the stack trace (entry point: RemoteSparkJobStatus.java:174) turns up a configurable timeout, described as "Timeout for requests from Hive client to remote Spark driver". The default is 60s; this application is complex enough that the request takes longer than that, and raising the timeout to 600s resolves the error.
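The call that actually blows up is a blocking get on a Netty future bounded by exactly this timeout (see AbstractFuture.get in the stack trace). Here is a minimal sketch of that pattern, using illustrative names (JobStatusFetcher, await) rather than the actual Hive classes:

import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of the timeout pattern behind the stack trace above.
class JobStatusFetcher {
    private final long timeoutSeconds; // hive.spark.client.future.timeout, default 60s

    JobStatusFetcher(long timeoutSeconds) {
        this.timeoutSeconds = timeoutSeconds;
    }

    <T> T await(Future<T> future)
            throws InterruptedException, ExecutionException, TimeoutException {
        // If the remote Spark driver does not reply within the window, this
        // throws java.util.concurrent.TimeoutException, which Hive then wraps
        // in a HiveException -- the error shown above.
        return future.get(timeoutSeconds, TimeUnit.SECONDS);
    }
}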
// In the Hive client code (hive-exec 2.3.2), the timeout is read from configuration
// (the misspelled identifier "sparkClientTimtout" is as it appears in the Hive source):
sparkClientTimtout = hiveConf.getTimeVar(HiveConf.ConfVars.SPARK_CLIENT_FUTURE_TIMEOUT, TimeUnit.SECONDS);

// The option's definition in HiveConf.ConfVars:
SPARK_CLIENT_FUTURE_TIMEOUT("hive.spark.client.future.timeout",
    "60s", new TimeValidator(TimeUnit.SECONDS),
    "Timeout for requests from Hive client to remote Spark driver.")
