
Hive on CDH4 YARN

There was a claim that it doesn't work, so I gave it a try. I got stuck in a different place, but I think it worked in the end. Incidentally, this was my first time running Hive in a fully distributed environment, and since I haven't looked into Hive very seriously, I'm not sure these settings are actually correct — the metastore is still local, for example.

Anyway, after confirming that the example jobs could be submitted to the resource manager, I threw a query at Hive and got yelled at:

WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Logging initialized using configuration in jar:file:/usr/local/share/hive-0.8.1-cdh4.0.0/lib/hive-common-0.8.1-cdh4.0.0.jar!/hive-log4j.properties
Hive history file=/tmp/marblejenka/hive_job_log_marblejenka_201207021832_657273834.txt
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/share/hadoop-2.0.0-cdh4.0.0/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/share/hive-0.8.1-cdh4.0.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
hive> select * from words where word = 'Hello';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
12/07/02 18:32:37 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
12/07/02 18:32:37 WARN conf.Configuration: mapred.system.dir is deprecated. Instead, use mapreduce.jobtracker.system.dir
12/07/02 18:32:37 WARN conf.Configuration: mapred.local.dir is deprecated. Instead, use mapreduce.cluster.local.dir
12/07/02 18:32:37 WARN conf.HiveConf: hive-site.xml not found on CLASSPATH
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/marblejenka/marblejenka_20120702183232_4d2d358e-5ebe-4372-9b1f-b429acb4d8fc.log
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/share/hadoop-2.0.0-cdh4.0.0/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/share/hive-0.8.1-cdh4.0.0/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 1; number of reducers: 0
2012-07-02 18:34:31,211 null map = 0%, reduce = 0%
Ended Job = job_1341216118328_0010 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1341216118328_0010_m_000000 (and more) from job job_1341216118328_0010
Exception in thread "Thread-27" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:206)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:147)
at org.apache.hadoop.hive.ql.exec.JobTrackerURLResolver.getURL(JobTrackerURLResolver.java:42)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:198)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:83)
at java.lang.Thread.run(Thread.java:680)
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:
Stage-1

Logs:

/tmp/marblejenka/hive.log
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
hive>

Wondering what on earth "Does not contain a valid host:port authority: local" was supposed to mean, I poked around, and it looked like the JobTracker URL needs to be specified in hive-site.xml, so I did:

<property>
  <name>hadoop.config.dir</name>
  <value>/usr/local/etc/hadoop/conf</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>fc5</value>
</property>

Something like that. You could object that it's not a JobTracker, but I'm specifying the hostname where the ResourceManager is running. The configuration files I'm pointing at are the YARN ones, not the MRv1 ones.
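To check that the value is actually being picked up, you can print the effective setting from the Hive CLI. A quick sketch; the hostname below matches this post's setup and is not guaranteed for other layouts:

```shell
# Print the effective value of mapred.job.tracker as Hive sees it.
# If hive-site.xml is on the classpath, this should echo the hostname
# configured above, something like: mapred.job.tracker=fc5
hive -e 'set mapred.job.tracker;'
```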

Also, near the bottom of the installation guide it says to set HADOOP_MAPRED_HOME, but things went through even without it.
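For reference, if you do want to follow the guide, the instruction amounts to something like this (the path below is simply this post's install location, not a verified requirement):

```shell
# HADOOP_MAPRED_HOME tells clients where the MRv2 (YARN) MapReduce
# install lives; this path matches the hadoop-2.0.0-cdh4.0.0 layout used here.
export HADOOP_MAPRED_HOME=/usr/local/share/hadoop-2.0.0-cdh4.0.0
```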

hive> select * from words where word = 'Hello';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1341216118328_0024, Tracking URL = http://fc5:8088/proxy/application_1341216118328_0024/
Kill Command = /usr/local/share/hadoop-2.0.0-cdh4.0.0/bin/hadoop job -Dmapred.job.tracker=fc5 -kill job_1341216118328_0024
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2012-07-02 21:34:09,707 Stage-1 map = 0%, reduce = 0%
2012-07-02 21:34:13,856 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.84 sec
MapReduce Total cumulative CPU time: 840 msec
Ended Job = job_1341216118328_0024
MapReduce Jobs Launched:
Job 0: Map: 1 Accumulative CPU: 0.84 sec HDFS Read: 0 HDFS Write: 0 SUCESS
Total MapReduce CPU Time Spent: 840 msec
OK
Hello
Hello
Time taken: 9.078 seconds

hive> select word, count(*) from words group by word;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
Starting Job = job_1341216118328_0025, Tracking URL = http://fc5:8088/proxy/application_1341216118328_0025/
Kill Command = /usr/local/share/hadoop-2.0.0-cdh4.0.0/bin/hadoop job -Dmapred.job.tracker=fc5 -kill job_1341216118328_0025
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2012-07-02 21:34:23,012 Stage-1 map = 0%, reduce = 0%
2012-07-02 21:34:27,158 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.76 sec
2012-07-02 21:34:28,201 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.76 sec
2012-07-02 21:34:29,246 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.26 sec
MapReduce Total cumulative CPU time: 2 seconds 260 msec
Ended Job = job_1341216118328_0025
MapReduce Jobs Launched:
Job 0: Map: 1 Reduce: 1 Accumulative CPU: 2.26 sec HDFS Read: 0 HDFS Write: 0 SUCESS
Total MapReduce CPU Time Spent: 2 seconds 260 msec
OK
Hello 2
Hive 1
World 1
Time taken: 10.19 seconds
hive>

So it seems to have worked. For the sample data, I used the ever-reliable Hishidama memo.
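The sample can be recreated with something like the following — a reconstruction guessed from the query results above (Hello 2, Hive 1, World 1); the actual file and DDL in the referenced memo may differ:

```shell
# Build a four-line word list matching the counts seen in the output above
# and load it into a one-column Hive table named "words".
printf 'Hello\nHello\nHive\nWorld\n' > /tmp/words.txt
hive -e "CREATE TABLE words (word STRING);
LOAD DATA LOCAL INPATH '/tmp/words.txt' OVERWRITE INTO TABLE words;"
```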


That said, while it does seem to work, the container logs contain:

java.lang.InterruptedException
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1961)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2038)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
at org.apache.hadoop.mapreduce.v2.app.speculate.DefaultSpeculator$1.run(DefaultSpeculator.java:189)
at java.lang.Thread.run(Thread.java:662)

which looks plenty suspicious.


Impressions
・I don't really understand Hive, but it seems popular and has some impressive-looking machinery around joins, so it might be fun to play with
・Pig had none of this hassle and just ran in fully distributed mode
・Partly because I haven't seriously gone through every setting that needs configuring, but one way or another YARN still gives off a shaky vibe
・Shaky or not, it looks like migrating from CDH3/MRv1 will require a fair amount of validation time
・About that issue at Morisu-san's place: it mentions java.security.AccessController.doPrivileged and smells like HDFS permissions, but who knows