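Hive is a tool for analyzing data of any size in a SQL style, using SQL commands similar to those of a relational database. It processes big data stored in Hadoop through SQL; the data volume can scale to 100 PB and beyond, and the data may be structured or unstructured.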
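Compared with a traditional relational database, Hive has the following characteristics:
>It focuses on analysis rather than real-time online transaction processing, and has no transaction mechanism;
>Unlike a relational database, it cannot perform arbitrary row-level insert or update;
>It processes data in a distributed way through Hadoop's map/reduce, which a traditional database does not have;
>A traditional relational database typically scales out to about 20 servers at most, whereas Hive can scale to hundreds of servers.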
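Hive compared with MapReduce
With Hive, a few SQL statements do the work that would otherwise require developing a MapReduce program in Java. This cuts the amount of code and improves development efficiency, which is a great help to anyone who is unfamiliar with Java but still needs to develop MapReduce jobs.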
The HIVE_HOME/conf/hive-env.sh file
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/xusy/share/cdh5.3.6/hadoop-2.5.0-cdh5.3.6
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/xusy/share/cdh5.3.6/hive-0.13.1-cdh5.3.6/conf
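Before starting Hive it can help to confirm that the two directories named in hive-env.sh actually exist. The following is a minimal sketch, not part of Hive itself: `check_dir` is a hypothetical helper, and the default paths are simply the ones from the settings above, so adjust them to your own layout.

```shell
#!/bin/sh
# Hypothetical helper: report whether the directory behind a setting exists.
check_dir() {
    name=$1
    path=$2
    if [ -d "$path" ]; then
        echo "ok: $name=$path"
    else
        echo "missing: $name=$path"
        return 1
    fi
}

# Defaults copied from the hive-env.sh settings; override via the environment.
HADOOP_HOME=${HADOOP_HOME:-/home/xusy/share/cdh5.3.6/hadoop-2.5.0-cdh5.3.6}
HIVE_CONF_DIR=${HIVE_CONF_DIR:-/home/xusy/share/cdh5.3.6/hive-0.13.1-cdh5.3.6/conf}

# Report both settings; "|| true" keeps the script going after a miss.
check_dir HADOOP_HOME "$HADOOP_HOME" || true
check_dir HIVE_CONF_DIR "$HIVE_CONF_DIR" || true
```

A "missing" line usually means a typo in the path or an install in a different location.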
$ sbin/hadoop-daemon.sh start namenode
$ sbin/hadoop-daemon.sh start datanode
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemon.sh start nodemanager
$ sbin/mr-jobhistory-daemon.sh start historyserver
$ bin/hadoop fs -mkdir -p /tmp
$ bin/hadoop fs -mkdir -p /user/hive/warehouse
$ bin/hadoop fs -chmod g+w /tmp
$ bin/hadoop fs -chmod g+w /user/hive/warehouse
hive> drop table if exists student;
hive> CREATE TABLE student(
    > id int,
    > name string)
    > ROW FORMAT DELIMITED
    > FIELDS TERMINATED BY ',';
stu.txt
01,xusy
02,liyj
03,liujl
04,hunl
05,zhaoyy
06,zhoujq
07,yuang
08,zhangz
// Upload the file to /user/xusy/data on HDFS through Hue; if Hue is not installed, upload it from the command line.
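For the command-line route, a sketch along these lines should work. It assumes the `hadoop` client is on the PATH (the commands elsewhere in this post call it as `bin/hadoop` from the Hadoop directory instead) and that you may write to /user/xusy/data; the guard simply skips the upload on a machine without Hadoop.

```shell
#!/bin/sh
# Write the eight sample records, one "id,name" pair per line.
cat > stu.txt <<'EOF'
01,xusy
02,liyj
03,liujl
04,hunl
05,zhaoyy
06,zhoujq
07,yuang
08,zhangz
EOF

# Upload only when a hadoop client is available (assumption: it is on PATH).
if command -v hadoop >/dev/null 2>&1; then
    # -p creates missing parent directories; -f overwrites an existing copy.
    hadoop fs -mkdir -p /user/xusy/data
    hadoop fs -put -f stu.txt /user/xusy/data/
else
    echo "hadoop not found; skipping upload"
fi
```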
The data file is stored on HDFS at /user/xusy/data/stu.txt and is loaded from there:
hive>load data inpath '/user/xusy/data/stu.txt' overwrite into table student ;
hive>select * from student ;
hive>select count(1) from student ;
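The same statements can also be kept in a script file and run non-interactively with `hive -f`. This is only a sketch: the file name student.hql is an arbitrary choice, and the `hive` launcher is assumed to be on the PATH. Note that `LOAD DATA INPATH` without `LOCAL` moves the HDFS file into the table's directory, so the source file has to be uploaded again before a second run.

```shell
#!/bin/sh
# Collect the table definition, the load and the two queries in one script.
cat > student.hql <<'EOF'
DROP TABLE IF EXISTS student;
CREATE TABLE student(
  id int,
  name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
LOAD DATA INPATH '/user/xusy/data/stu.txt' OVERWRITE INTO TABLE student;
SELECT * FROM student;
SELECT count(1) FROM student;
EOF

# Run the script only when the hive launcher is available (assumption: on PATH).
if command -v hive >/dev/null 2>&1; then
    hive -f student.hql
else
    echo "hive not found; wrote student.hql only"
fi
```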
$sudo yum -y install mysql-server
Switch to the root user:
# service mysqld start
# chkconfig mysqld on
# chkconfig --list | grep mysqld
$mysqladmin -uroot password root
$mysql -uroot -proot
mysql> SET PASSWORD = PASSWORD('123456');
mysql> use mysql;
mysql> select Password,Host,User from user;
mysql> update user set Host='%' where User='root' and Host='localhost';
mysql> select Password,Host,User from user;
mysql> delete from user where User='root' and Host='127.0.0.1';
mysql> select Password,Host,User from user;
mysql> delete from user where User='root' and Host='xuxudede.com';
mysql> select Password,Host,User from user;
mysql> delete from user where Host='xuxudede.com';
mysql> select Password,Host,User from user;
mysql> delete from user where Host='localhost';
mysql> select Password,Host,User from user;
mysql>flush privileges ;
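Note: if hive-site.xml does not exist under HIVE_HOME/conf, create it yourself.
Configure it as follows; the connection user name and password must match the MySQL account set up earlier: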
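<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://xuxudede.com/metadata?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>holystar</value>
  </property>
</configuration>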
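A forgotten property is an easy mistake when hive-site.xml is written by hand. The sketch below checks that all four metastore properties are named in the file. `check_hive_site` is a hypothetical helper, and it assumes the usual Hadoop-style layout in which each setting appears as `<name>key</name>` with no whitespace inside the tag; it does not validate the values.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every metastore property is named in the file.
check_hive_site() {
    file=$1
    for key in \
        javax.jdo.option.ConnectionURL \
        javax.jdo.option.ConnectionDriverName \
        javax.jdo.option.ConnectionUserName \
        javax.jdo.option.ConnectionPassword
    do
        # -F matches the key literally, so the dots are not treated as wildcards.
        if ! grep -qF "<name>$key</name>" "$file"; then
            echo "missing property: $key"
            return 1
        fi
    done
    echo "all metastore properties present"
}
```

It could be called as `check_hive_site "$HIVE_CONF_DIR/hive-site.xml"` before starting bin/hive.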
$ tar -zxvf mysql-connector-java-5.1.27.tar.gz
$ cp mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /hive-0.13.1-cdh5.3.6/lib/
$bin/hive
mysql> use metadata;
mysql> select * from TBLS;
At this point, Hive is installed and MySQL is configured to store Hive's metadata!