Steps to Integrate Spark SQL with Hive

Integrating Spark with Hive. The Hive version used here is 1.2.1, chosen for compatibility (Spark 2.2's built-in Hive support is based on Hive 1.2.1). Installing Hive itself is optional; Spark SQL only needs to reach the Hive metastore.

1. Install MySQL, then create a regular user and grant it privileges on the metastore database:
	CREATE USER 'bigdata'@'%' IDENTIFIED BY '123456'; 
	GRANT ALL PRIVILEGES ON hivedb.* TO 'bigdata'@'%' IDENTIFIED BY '123456' WITH GRANT OPTION;
	FLUSH PRIVILEGES;
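
To verify that the grant took effect, you can optionally check it from the MySQL client:

	SHOW GRANTS FOR 'bigdata'@'%';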

2. Add a hive-site.xml with the metastore connection settings (note that the username and password must match the MySQL user created in step 1):

<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://hadoop01:3306/hivedb?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>bigdata</value>
    <description>username to use against metastore database</description>
  </property>

  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>123456</value>
    <description>password to use against metastore database</description>
  </property>
</configuration>


3. Copy the configured hive-site.xml into the $SPARK_HOME/conf directory.
4. Copy Hadoop's core-site.xml and hdfs-site.xml into Spark's conf directory as well.

Alternatively, point Spark at the Hadoop configuration directory through an environment variable:

vi /etc/profile
# directory containing the Hadoop configuration files
export HADOOP_CONF_DIR=/usr/local/hadoop-2.7.3/etc/hadoop

Run source /etc/profile afterwards so the change takes effect in the current shell.

5. Start HDFS first, then start Spark (for example, with Hadoop's sbin/start-dfs.sh followed by Spark's sbin/start-all.sh).
6. Run spark-sql, specifying the location of the MySQL JDBC driver:
bin/spark-sql --master spark://hadoop01:7077 --jars /usr/local/spark-2.2.2-bin-hadoop2.7/jars/mysql-connector-java-5.1.39.jar 
7. Run Hive SQL; see the examples below.
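
At the spark-sql> prompt you can now issue ordinary Hive SQL. A minimal sketch, using a hypothetical person table and HDFS path:

	CREATE TABLE person (id INT, name STRING, age INT)
	ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
	LOAD DATA INPATH '/data/person.txt' INTO TABLE person;
	SELECT name, age FROM person WHERE age >= 18;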

When creating the SparkSession in your own application, enable Hive support. A minimal sketch (the UDF class name below is a placeholder):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .enableHiveSupport()   // enable Hive integration
      .getOrCreate()

    // register a Hive UDF by its fully qualified class name (placeholder below)
    spark.sql("CREATE TEMPORARY FUNCTION ip2Long AS 'com.example.udf.IpToLong'")
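
Once registered, the function can be used from SQL like any built-in function (assuming a hypothetical logs table with a string ip column):

	SELECT ip, ip2Long(ip) AS ip_num FROM logs;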
