```
###-- hiveserver2(metastore) belongs to user 'supertool' --
K1201:~>ps aux | grep -v grep | grep metastore.HiveMetaStore --color
500 30320 0.0 0.5 1209800 263548 ? Sl Jan28 59:29 /usr/java/jdk1.7.0_11//bin/java -Xmx10000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/workspace/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/workspace/hadoop -Dhadoop.id.str=supertool -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/workspace/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/workspace/hive-0.13.0-bin/lib/hive-service-0.13.0.jar org.apache.hadoop.hive.metastore.HiveMetaStore
K1201:~>cat /etc/passwd | grep 500
supertool:x:500:500:supertool:/home/supertool:/bin/bash

###-- invoke hive command as user 'withdata' and create a database and table --
114:~>whoami
withdata
114:~>hive
hive> create database test_db;
OK
Time taken: 1.295 seconds
hive> use test_db;
OK
Time taken: 0.031 seconds
hive> create table test_tbl(id int);
OK
Time taken: 0.864 seconds

###-- the newly-created database and table belong to user 'supertool' --
114:~>hadoop fs -ls /user/supertool/hive/warehouse | grep test_db
drwxrwxr-x - supertool supertool 0 2015-07-08 15:13 /user/supertool/hive/warehouse/test_db.db
114:~>hadoop fs -ls /user/supertool/hive/warehouse/test_db.db
Found 1 items
drwxrwxr-x - supertool supertool 0 2015-07-08 15:13 /user/supertool/hive/warehouse/test_db.db/test_tbl
```
This behavior is explained by Hive user impersonation. By default, HiveServer2 performs query processing as the user who submitted the query. However, if the related parameters below are set incorrectly, queries run as the user that the hiveserver2 (metastore) process itself runs as. The correct configuration is:
```xml
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
  <description>Set this property to enable impersonation in Hive Server 2</description>
</property>
<property>
  <name>hive.metastore.execute.setugi</name>
  <value>true</value>
  <description>Set this property to enable Hive Metastore service impersonation in unsecure mode. In unsecure mode, setting this property to true will cause the metastore to execute DFS operations using the client's reported user and group permissions. Note that this property must be set on both the client and server sides. If the client sets it to true and the server sets it to false, the client setting will be ignored.</description>
</property>
```
The settings above are well explained by their own descriptions. We therefore need to rectify our hive-site.xml and restart the hiveserver2 (metastore) service.
At this point I hit a puzzling problem: no matter how I changed HIVE_HOME/conf/hive-site.xml, the corresponding setting was not altered at runtime. Eventually I found another hive-site.xml under the HADOOP_HOME/etc/hadoop directory, which was shadowing my changes. Consequently, it is advisable not to put any Hive-related configuration files under the HADOOP_HOME directory, to avoid this kind of confusion. The official order of precedence for loading configuration files can be found at REFERENCE_5.
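When in doubt about which copy is in effect, a quick filesystem check helps. Below is a minimal sketch; the default paths are the ones from this post's environment and are assumptions, so override HIVE_HOME/HADOOP_HOME for other layouts:

```shell
#!/bin/sh
# List every hive-site.xml that could shadow the one in HIVE_HOME/conf.
# The fallback paths below match this post's environment (assumptions).
HIVE_HOME="${HIVE_HOME:-/home/workspace/hive-0.13.0-bin}"
HADOOP_HOME="${HADOOP_HOME:-/home/workspace/hadoop}"
for dir in "$HIVE_HOME/conf" "$HADOOP_HOME/etc/hadoop"; do
    if [ -f "$dir/hive-site.xml" ]; then
        echo "found: $dir/hive-site.xml"
    fi
done
```

If this prints more than one path, delete (or merge) the copy under HADOOP_HOME before restarting the services.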
After revising HIVE_HOME/conf/hive-site.xml, the following commands confirm that the preceding problem is properly addressed.
```
###-- check runtime hive parameters related to hive user impersonation --
k1227:/home/workspace/hive-0.13.0-bin>hive
hive> set system:user.name;
system:user.name=hadoop
hive> set hive.server2.enable.doAs;
hive.server2.enable.doAs=true
hive> set hive.metastore.execute.setugi;
hive.metastore.execute.setugi=true

###-- start hiveserver2(metastore) again --
k1227:/home/workspace/hive-0.13.0-bin>hive --service metastore
Starting Hive Metastore Server
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
15/07/08 14:28:59 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
15/07/08 14:28:59 WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
15/07/08 14:28:59 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/workspace/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/workspace/hive-0.13.0-bin/lib/jud_test.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
^Z
[1]+  Stopped                 hive --service metastore
k1227:/home/workspace/hive-0.13.0-bin>bg 1
[1]+ hive --service metastore &
k1227:/home/workspace/hive-0.13.0-bin>ps aux | grep metastore
hadoop 6597 26.6 0.4 1161404 275564 pts/0 Sl 14:28 0:14 /usr/java/jdk1.7.0_11//bin/java -Xmx20000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/home/workspace/hadoop/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/home/workspace/hadoop -Dhadoop.id.str=hadoop -Dhadoop.root.logger=INFO,console -Djava.library.path=/home/workspace/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx512m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar /home/workspace/hive-0.13.0-bin/lib/hive-service-0.13.0.jar org.apache.hadoop.hive.metastore.HiveMetaStore
hadoop 11936 0.0 0.0 103248 868 pts/0 S+ 14:29 0:00 grep metastore
```
Here, `set system:user.name` displays the user currently executing the hive command, and `set [parameter]` displays a specific parameter's value at runtime. Alternatively, we can list all runtime parameters via the `set` command in hive, or from the command line: `hive -e "set;" > hive_runtime_parameters.txt`.
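To make this verification scriptable, the parameter dump can simply be grepped for the two impersonation settings. A minimal sketch follows; the `check_doas` helper is my own name, not a Hive utility, and it reads the `key=value` lines that `set` prints:

```shell
#!/bin/sh
# check_doas: succeed only when both impersonation parameters are true
# in a "key=value" dump as produced by `hive -e "set;"`.
check_doas() {
    grep -qx 'hive.server2.enable.doAs=true' "$1" &&
    grep -qx 'hive.metastore.execute.setugi=true' "$1"
}

# Typical use on a box with the hive CLI (not run here):
#   hive -e "set;" > hive_runtime_parameters.txt
#   check_doas hive_runtime_parameters.txt && echo "impersonation enabled"
```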
A possible exception, 'TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083', may be thrown when launching the metastore service. According to REFERENCE_6, this happens because another metastore (or some other service) is already occupying port 9083, the default port for the Hive metastore. Kill it beforehand:
```
k1227:/home/workspace/hive-0.13.0-bin>lsof -i:9083
COMMAND  PID  USER   FD   TYPE DEVICE     SIZE/OFF NODE NAME
java     3499 hadoop 236u IPv4 3913377019 0t0      TCP *:9083 (LISTEN)
k1227:/home/workspace/hive-0.13.0-bin>kill -9 3499
```
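It can also be worth checking that the port is free before launching the metastore at all, so the service never dies with this exception. A hedged sketch using bash's `/dev/tcp` pseudo-device (9083 is the default; change PORT if hive.metastore.port is customized):

```shell
#!/bin/bash
# Report whether the metastore port is already bound on this host.
# Uses bash's /dev/tcp redirection, so this needs bash, not plain sh.
PORT=9083
if (exec 3<>"/dev/tcp/127.0.0.1/$PORT") 2>/dev/null; then
    echo "port $PORT is busy -- stop the listener before 'hive --service metastore'"
else
    echo "port $PORT is free"
fi
```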
With these fixes in place, when we create a database or table again, the owner of the corresponding HDFS files/directories is the user invoking the hive command, as expected.
REFERENCE:
1. Setting Up HiveServer2 - Impersonation
2. hive-default.xml.template [hive.metastore.execute.setugi]
3. Hive User Impersonation -mapr
4. Configuring User Impersonation with Hive Authorization - drill
5. AdminManual Configuration - hive [order of precedence]
6. TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:9083 - cloudera community