
Hive 2.3.7 Installation

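Introduction to Hive

Official site: hive.apache.org/

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command-line tool and a JDBC driver are provided to connect users to Hive.

Hive provides the SQL query layer, while HDFS provides the distributed storage.

In essence, Hive translates HQL into MapReduce programs.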

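Hive installation prerequisites

  1. Start the HDFS cluster
  2. Start the YARN cluster
  3. Start MySQL

Hive runs on top of Hadoop, so a Hadoop cluster must be installed and deployed before Hive can be used.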
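
One way to confirm the Hadoop prerequisites before continuing is to look at the Java processes reported by jps. The sketch below assumes a single-node layout in which NameNode, DataNode, ResourceManager and NodeManager all run on the current host; on a multi-node cluster the expected list differs per host. The function name missing_daemons is hypothetical and not part of Hadoop.

```shell
# Report which expected Hadoop daemons are absent from `jps` output.
# Reads jps-style lines ("<pid> <ClassName>") on stdin and prints one
# missing daemon name per line; prints nothing when all are present.
# Assumption: a single-node setup where all four daemons run locally.
missing_daemons() {
  jps_output=$(cat)
  for daemon in NameNode DataNode ResourceManager NodeManager; do
    # Compare the class name in the second column exactly, so that
    # SecondaryNameNode is not mistaken for NameNode.
    if ! printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

It would typically be used as `jps | missing_daemons`. MySQL is not a Java process, so it does not appear in jps output and has to be checked separately.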

Installing Hive

Create a symbolic link so that the unpacked distribution is available at /usr/local/hive:

ln -s /opt/apache-hive-2.3.7-bin /usr/local/hive

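Edit the environment variables

vi /etc/profile

# Append to /etc/profile; keep the existing PATH instead of overwriting it
HIVE_HOME=/usr/local/hive
export HIVE_HOME
PATH=$PATH:$JAVA_HOME/bin:$HIVE_HOME/bin
export PATH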
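
Because /etc/profile can be sourced more than once in a session, an unconditional append may leave duplicate entries on PATH. A minimal sketch of a guard, assuming a POSIX shell; the helper name path_append is hypothetical.

```shell
# Append a directory to PATH only if it is not already present.
path_append() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$PATH:$1" ;;
  esac
}

HIVE_HOME=/usr/local/hive
export HIVE_HOME
path_append "$HIVE_HOME/bin"
export PATH
```

Wrapping PATH in colons before matching means an entry is recognised whether it sits at the start, middle, or end of the list.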

Configure hive-site.xml

cd /usr/local/hive
vi conf/hive-site.xml

<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
</configuration>
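
To double-check what was written to the file, a value can be read back from the shell. The sketch below is a hypothetical helper (hive_site_get is not part of Hive) and assumes the simple layout used in this guide, with each <name> and <value> element on its own line. It is not a general XML parser.

```shell
# Print the value of one property from a hive-site.xml style file.
# $1 = path to the file, $2 = property name.
# Assumption: <name>…</name> and <value>…</value> each sit on a single line.
# The value is printed as raw text, so entities such as &amp; are not decoded.
hive_site_get() {
  awk -v want="$2" '
    /<name>/ {
      name = $0
      sub(/.*<name>/, "", name); sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      val = $0
      sub(/.*<value>/, "", val); sub(/<\/value>.*/, "", val)
      print val
      exit
    }
  ' "$1"
}
```

For example, `hive_site_get /usr/local/hive/conf/hive-site.xml javax.jdo.option.ConnectionUserName` would print the configured metastore user. An unknown property name prints nothing.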

Omitting the following properties causes errors. Add them inside the <configuration> element of hive-site.xml as well:

<property>
<name>hive.exec.local.scratchdir</name>
<value>/usr/local/hive</value>
<description>Local scratch space for Hive jobs</description>
</property>

<property>
<name>hive.downloaded.resources.dir</name>
<value>/usr/local/hive/hive-downloaded-addDir/</value>
<description>Temporary local directory for added resources in the remote file system. </description>
</property>

<property>
<name>hive.querylog.location</name>
<value>/usr/local/hive/querylog-location-addDir/</value>
<description>Location of Hive run time structured log file</description>
</property>

<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/usr/local/hive/hive-logging-operation-log-addDir/</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>

<!-- HiveServer2 settings -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>localhost</value>
</property>
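
The directories named in these properties are local filesystem paths. Creating them up front is a reasonable precaution. Whether Hive would create them on demand depends on the permissions of the parent directory, so treat this as an assumption, not a documented requirement. The function name make_hive_local_dirs is hypothetical.

```shell
# Create the local working directories referenced in hive-site.xml.
# $1 = base directory (this guide uses /usr/local/hive).
make_hive_local_dirs() {
  base=${1:?usage: make_hive_local_dirs <base-dir>}
  for sub in hive-downloaded-addDir \
             querylog-location-addDir \
             hive-logging-operation-log-addDir; do
    # -p makes the call safe to repeat when the directory already exists.
    mkdir -p "$base/$sub" || return 1
  done
}
```

With the paths from this guide it would be called as `make_hive_local_dirs /usr/local/hive`, run as a user that can write to that directory.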

Configure hive-env.sh

# Add the following lines to conf/hive-env.sh
export JAVA_HOME=/usr/local/java
export HIVE_HOME=/usr/local/hive
export HADOOP_HOME=/usr/local/hadoop
export HIVE_CONF_DIR=/usr/local/hive/conf

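Modify hive-log4j2.properties

Hive 2.x logs through Log4j 2, so the logging configuration is conf/hive-log4j2.properties (copied from hive-log4j2.properties.template), not hive-log4j.properties.

vi conf/hive-log4j2.properties

property.hive.log.dir = <custom directory>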

Download and install the MySQL driver

cp /mnt/share/mysql-connector-java-5.1.47.jar /usr/local/hive/lib/

Initialize the metadata

/usr/local/hive/bin/schematool -dbType mysql -initSchema
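
Running -initSchema against a database that already holds the metastore tables is likely to fail, so it can help to check first. The sketch below assumes that `schematool -info` exits with a non-zero status when no schema version can be read; verify that against your installation before relying on it. The function name init_schema_if_needed and the SCHEMATOOL variable are hypothetical.

```shell
# Initialize the metastore schema only when `-info` cannot read a version.
# SCHEMATOOL may be overridden; it defaults to the path used in this guide.
SCHEMATOOL=${SCHEMATOOL:-/usr/local/hive/bin/schematool}

init_schema_if_needed() {
  # Assumption: -info returns non-zero when the schema is missing.
  if "$SCHEMATOOL" -dbType mysql -info >/dev/null 2>&1; then
    echo "metastore schema already present, skipping -initSchema"
  else
    "$SCHEMATOOL" -dbType mysql -initSchema
  fi
}
```

Calling `init_schema_if_needed` after MySQL is up and the driver jar is in place would then be safe to repeat.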

Start Hive

/usr/local/hive/bin/hive

Test

create table users(user_id int,username varchar(20),pwd varchar(20),email varchar(30),grade int);
insert into users(user_id,username,pwd,email,grade)values(1,'admin','1234','admin@qq.com',2);
insert into users(user_id,username,pwd,email,grade)values(2,'admin2','1234','admin2@qq.com',2);
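
After the two inserts, the table should hold two rows. A small sketch of a non-interactive check, using the hive CLI's -S (silent) and -e (inline query) options; the helper name expect_count is hypothetical, and it assumes the count is the last line the CLI prints to standard output.

```shell
# Compare a count read from stdin with the expected value.
# Intended to sit at the end of a pipeline such as:
#   /usr/local/hive/bin/hive -S -e 'select count(*) from users;' | expect_count 2
expect_count() {
  # Take the last line of input and strip surrounding whitespace.
  actual=$(tail -n 1 | tr -d '[:space:]')
  if [ "$actual" = "$1" ]; then
    echo "ok: $actual row(s)"
  else
    echo "expected $1 row(s), got '$actual'" >&2
    return 1
  fi
}
```

A non-zero exit status from the pipeline then signals that the installation did not return the expected data.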