Installing Presto on a Single Node and Integrating It with MySQL and Hive

1. Installing the server and CLI
1.1 Installing the server

wget -c https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.226/presto-server-0.226.tar.gz
tar zxvf presto-server-0.226.tar.gz -C ~/app
cd ~/app/presto-server-0.226
mkdir etc

1.2 Configuring the coordinator node
etc/node.properties

node.environment=production
node.id=ffffffff-ffff-ffff-ffff-ffffffffffff-hadoop002
node.data-dir=/home/hadoop/data/presto

etc/config.properties

coordinator=true
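# Single-node setup: the coordinator must also run query work itself, otherwise no worker is available.
node-scheduler.include-coordinator=true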
http-server.http.port=8090
query.max-memory=8GB
query.max-memory-per-node=1GB
query.max-total-memory-per-node=2GB
discovery-server.enabled=true
discovery.uri=http://hadoop002:8090

etc/jvm.config

-server
-Xmx8G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
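
Before the first start it can be worth checking that the etc directory really contains the three files above. The following is only a sketch: check_presto_etc is a hypothetical helper name, and the list of keys it looks for is simply taken from the example files in this post, not from any official validation rule.

```shell
#!/bin/sh
# check_presto_etc DIR
# Hypothetical helper: verifies that DIR holds node.properties,
# config.properties and jvm.config, and that the properties files
# define the keys used in the examples above.
check_presto_etc() {
  etc_dir="$1"
  rc=0

  # 1. The three configuration files must exist.
  for f in node.properties config.properties jvm.config; do
    if [ ! -f "$etc_dir/$f" ]; then
      echo "missing file: $etc_dir/$f"
      rc=1
    fi
  done

  # 2. Each "file:key" pair must appear as "key=..." at the start of a line.
  for pair in \
    node.properties:node.environment \
    node.properties:node.id \
    node.properties:node.data-dir \
    config.properties:coordinator \
    config.properties:http-server.http.port \
    config.properties:discovery.uri
  do
    file="${pair%%:*}"
    key="${pair#*:}"
    if [ -f "$etc_dir/$file" ] && ! grep -q "^${key}=" "$etc_dir/$file"; then
      echo "missing key: $key in $file"
      rc=1
    fi
  done

  return "$rc"
}
```

A possible use is check_presto_etc ~/app/presto-server-0.226/etc && bin/launcher run, so that the server is only started when the check passes.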

1.3 Common commands
# Start Presto
bin/launcher start

# Stop Presto
bin/launcher stop

# Run Presto in the foreground. This is recommended for a first start: configuration errors show up on the console immediately, which makes debugging easier.
bin/launcher run
 

The remaining launcher commands can be listed with:
[hadoop@hadoop002 presto-server-0.226]$ bin/launcher --help
Web UI: http://hadoop002:8090/ui/
[Image 1: screenshot of the Presto Web UI]
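
When the server is started in the background, it is not obvious when the Web UI becomes reachable. A minimal sketch of a polling loop follows. It assumes that curl is installed and that the coordinator answers on the /v1/info REST endpoint; presto_info_url and wait_for_presto are hypothetical helper names, and the loop only checks that the endpoint responds, not that the server has finished initializing.

```shell
#!/bin/sh
# presto_info_url HOST PORT
# Builds the URL of the coordinator's /v1/info endpoint (assumed to exist).
presto_info_url() {
  echo "http://$1:$2/v1/info"
}

# wait_for_presto HOST PORT [TRIES]
# Polls the endpoint once per second, up to TRIES times (default 30).
# Returns 0 as soon as an HTTP request succeeds, 1 otherwise.
wait_for_presto() {
  url="$(presto_info_url "$1" "$2")"
  tries="${3:-30}"
  while [ "$tries" -gt 0 ]; do
    # -s: silent, -f: treat HTTP errors as failures
    if curl -sf "$url" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

For the setup in this post the call would be wait_for_presto hadoop002 8090 && echo "Presto is answering".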
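1.4 Installing the client
Download presto-cli-0.226-executable.jar from https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.226/presto-cli-0.226-executable.jar. The Presto CLI provides an interactive terminal for running queries. It is a self-executing JAR file, so it can be run like any other UNIX executable. After downloading, rename the file to presto and make it executable with chmod +x.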
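
The download, rename and chmod steps can be wrapped in a small function. This is a sketch rather than an official installer: install_presto_cli is a hypothetical name, the target directory is whatever you pass in, and wget is assumed to be available.

```shell
#!/bin/sh
# Version used throughout this post; the URL follows the Maven Central
# path given in the text above.
CLI_VERSION="0.226"
CLI_JAR="presto-cli-${CLI_VERSION}-executable.jar"
CLI_URL="https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/${CLI_VERSION}/${CLI_JAR}"

# install_presto_cli DIR
# Hypothetical helper: downloads the jar into DIR, renames it to "presto"
# and marks it executable. Stops at the first step that fails.
install_presto_cli() {
  target_dir="$1"
  mkdir -p "$target_dir" || return 1
  # -c resumes a partial download, -P sets the download directory
  wget -c "$CLI_URL" -P "$target_dir" || return 1
  mv "$target_dir/$CLI_JAR" "$target_dir/presto" || return 1
  chmod +x "$target_dir/presto"
}
```

For example, install_presto_cli ~/app/cli would leave an executable ~/app/cli/presto (the directory name is only an illustration), which can then be checked with ~/app/cli/presto --version.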
2. Configuring the Hive connector
2.1 Create the hive.properties file in the /home/hadoop/app/presto-server-0.226/etc/catalog directory

connector.name=hive-hadoop2
hive.metastore.uri=thrift://hadoop002:9083
hive.config.resources=/home/hadoop/app/hadoop/etc/hadoop/core-site.xml, /home/hadoop/app/hadoop/etc/hadoop/hdfs-site.xml
hive.allow-drop-table=true

# Parameter notes
# connector.name: must name the connector for a specific Hadoop version (hive-hadoop2 here) so that Presto loads the matching adapter. With the bare name hive, Presto fails with:
#   ERROR   main    com.facebook.presto.server.PrestoServer No factory for connector hive
#   java.lang.IllegalArgumentException: No factory for connector hive
# hive.metastore.uri: URI of the Hive metastore
# hive.config.resources: paths to core-site.xml and hdfs-site.xml

2.2 Start the Hive metastore before querying Hive data
Ways to start it:

bin/hive --service metastore
# Or start it in the background:
bin/hive --service metastore >> /var/log.log 2>&1 &
# Start it in the background so that it keeps running after the shell session is closed:
nohup bin/hive --service metastore >> /var/log.log 2>&1 &
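
One detail worth knowing when logging the metastore to a file is that the shell applies redirections from left to right, so the position of 2>&1 decides whether error output reaches the log. The following self-contained demo illustrates this with a stand-in function (emit is made up for the demo and has nothing to do with Hive); it writes to two temporary files.

```shell
#!/bin/sh
# emit writes one line to stdout and one line to stderr,
# standing in for a service that logs on both streams.
emit() {
  echo "message on stdout"
  echo "message on stderr" >&2
}

stdout_only_log="$(mktemp)"
both_streams_log="$(mktemp)"

# "2>&1" first: stderr is pointed at the current stdout (the terminal),
# and only afterwards is stdout sent to the file. The file gets one line.
emit 2>&1 >> "$stdout_only_log"

# File redirection first, then "2>&1": stderr follows stdout into the file.
# The file gets both lines.
emit >> "$both_streams_log" 2>&1

echo "2>&1 >> file : $(wc -l < "$stdout_only_log") line(s) logged"
echo ">> file 2>&1 : $(wc -l < "$both_streams_log") line(s) logged"
```

For a background service, the form that puts the file redirection first is usually the one you want, since startup errors are typically written to stderr.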

If the metastore fails to start, check that hive-site.xml sets the hive.metastore.uris property:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://ip:9083</value>
</property>

2.3 Start the Presto CLI against the Hive catalog

[hadoop@hadoop002 cli]$ ./presto --server localhost:8090 --catalog hive --schema default --debug
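--server localhost:8090 is the address and port of the Presto coordinator
--catalog hive selects the catalog defined by /home/hadoop/app/presto-server-0.226/etc/catalog/hive.properties (the catalog name is the file name without .properties)
--schema default selects Hive's default database
--debug enables debug mode, which makes the logs easier to inspect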
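
The same options also work for one-off, non-interactive queries through the CLI's --execute option. Below is a minimal sketch of a wrapper: presto_exec, PRESTO_CLI and PRESTO_SERVER are hypothetical names introduced for illustration, and CSV_HEADER is just one of the formats accepted by --output-format.

```shell
#!/bin/sh
# Location of the CLI and the coordinator; both can be overridden
# from the environment (hypothetical variable names).
PRESTO_CLI="${PRESTO_CLI:-./presto}"
PRESTO_SERVER="${PRESTO_SERVER:-localhost:8090}"

# presto_exec CATALOG SCHEMA SQL
# Runs a single statement and prints the result as CSV with a header row.
presto_exec() {
  catalog="$1"
  schema="$2"
  sql="$3"
  "$PRESTO_CLI" \
    --server "$PRESTO_SERVER" \
    --catalog "$catalog" \
    --schema "$schema" \
    --execute "$sql" \
    --output-format CSV_HEADER
}
```

With the server from this post running, presto_exec hive default "show tables" would print the table list and return to the shell, which is convenient in scripts.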

2.4 Run some queries

presto:default> show tables;
     Table      
----------------
 hive_rownumber 
(1 row)

Query 20191020_162125_00003_gqcnh, FINISHED, 1 node
http://hadoop002:8090/ui/query.html?20191020_162125_00003_gqcnh
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total,    21 rows/s,   659B/s, 17% active
Per Node: 0.0 parallelism,     0 rows/s,    11B/s
Parallelism: 0.0
Peak Memory: 82.4KB
0:03 [1 rows, 31B] [0 rows/s, 11B/s]

presto:default> show schemas from hive;
       Schema       
--------------------
 default            
 information_schema 
 test               
(3 rows)

Query 20191020_162627_00004_gqcnh, FINISHED, 1 node
http://hadoop002:8090/ui/query.html?20191020_162627_00004_gqcnh
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total,   333 rows/s, 4.77KB/s, 6% active
Per Node: 0.0 parallelism,     9 rows/s,   134B/s
Parallelism: 0.0
Peak Memory: 82.4KB
0:00 [3 rows, 44B] [9 rows/s, 134B/s]

presto:default> show tables;
     Table      
----------------
 hive_rownumber 
(1 row)

Query 20191020_163547_00005_gqcnh, FINISHED, 1 node
http://hadoop002:8090/ui/query.html?20191020_163547_00005_gqcnh
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total,   111 rows/s, 3.36KB/s, 5% active
Per Node: 0.0 parallelism,     3 rows/s,   109B/s
Parallelism: 0.0
Peak Memory: 0B
0:00 [1 rows, 31B] [3 rows/s, 109B/s]

presto:default> show tables from hive.test;
         Table         
-----------------------
 dept                  
 emp                   
 emp_dynamic_partition 
 hive_array            
 hive_map              
 hive_struct           
 rating_json           
 rating_width          
 wc                    
(9 rows)

Query 20191020_163808_00006_gqcnh, FINISHED, 1 node
http://hadoop002:8090/ui/query.html?20191020_163808_00006_gqcnh
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total,   600 rows/s, 13.5KB/s, 15% active
Per Node: 0.0 parallelism,    27 rows/s,   632B/s
Parallelism: 0.0
Peak Memory: 0B
0:00 [9 rows, 208B] [27 rows/s, 632B/s]

presto:default> select * from hive.test.dept;
 deptno |   dname    |   loc    
--------+------------+----------
     10 | ACCOUNTING | NEW YORK 
     20 | RESEARCH   | DALLAS   
     30 | SALES      | CHICAGO  
     40 | OPERATIONS | BOSTON   
(4 rows)

Query 20191020_163820_00007_gqcnh, FINISHED, 1 node
http://hadoop002:8090/ui/query.html?20191020_163820_00007_gqcnh
Splits: 17 total, 17 done (100.00%)
CPU Time: 0.2s total,    20 rows/s,   414B/s, 37% active
Per Node: 0.1 parallelism,     1 rows/s,    27B/s
Parallelism: 0.1
Peak Memory: 0B
0:03 [4 rows, 82B] [1 rows/s, 27B/s]

presto:default> 

2.5 Exit the CLI with quit or exit
3. Configuring the MySQL connector
3.1 Create the mysql.properties file in the /home/hadoop/app/presto-server-0.226/etc/catalog/ directory

connector.name=mysql
connection-url=jdbc:mysql://<IP-or-hostname>:3306
connection-user=root
connection-password=123456

3.2 Start the Presto CLI against the MySQL catalog

[hadoop@hadoop002 cli]$ ./presto --server localhost:8090 --catalog mysql --schema test --debug

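## Option notes
--server localhost:8090  is the address and port of the Presto coordinator
--catalog mysql  selects the catalog defined by /home/hadoop/app/presto-server-0.226/etc/catalog/mysql.properties (the catalog name is the file name without .properties)
--schema test  selects MySQL's test database
--debug  enables debug mode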

3.3 Common queries against the MySQL catalog:
SHOW SCHEMAS FROM mysql; -- list the databases
SHOW TABLES FROM mysql.test; -- list the tables in a given database
SELECT * FROM mysql.test.emp; -- read the rows of a given table

presto:test> SHOW SCHEMAS FROM mysql;
       Schema       
--------------------
 hadoop_hive        
 information_schema 
 maxwell            
 performance_schema 
 test               
(5 rows)

Query 20191020_152414_00008_if8r6, FINISHED, 1 node
http://localhost:8090/ui/query.html?20191020_152414_00008_if8r6
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total,   555 rows/s, 9.01KB/s, 6% active
Per Node: 0.0 parallelism,    25 rows/s,   427B/s
Parallelism: 0.0
Peak Memory: 0B
0:00 [5 rows, 83B] [25 rows/s, 427B/s]

presto:test> SHOW TABLES FROM mysql.test;
 Table 
-------
 dept  
 emp   
 user  
(3 rows)

Query 20191020_152432_00009_if8r6, FINISHED, 1 node
http://localhost:8090/ui/query.html?20191020_152432_00009_if8r6
Splits: 19 total, 19 done (100.00%)
CPU Time: 0.0s total,   200 rows/s, 3.45KB/s, 10% active
Per Node: 0.1 parallelism,    13 rows/s,   231B/s
Parallelism: 0.1
Peak Memory: 0B
0:00 [3 rows, 53B] [13 rows/s, 231B/s]

presto:test> SELECT * FROM mysql.test.user;
 id | name  | age 
----+-------+-----
  2 | jerry |   5 
(1 row)

Query 20191020_152453_00010_if8r6, FINISHED, 1 node
http://localhost:8090/ui/query.html?20191020_152453_00010_if8r6
Splits: 17 total, 17 done (100.00%)
CPU Time: 0.0s total,    62 rows/s,     0B/s, 15% active
Per Node: 0.1 parallelism,     3 rows/s,     0B/s
Parallelism: 0.1
Peak Memory: 0B
0:00 [1 rows, 0B] [3 rows/s, 0B/s]

presto:test> 

3.4 Exit the CLI with quit or exit
4. Cross-catalog queries between MySQL and Hive

presto:test> select hd.deptno,hd.dname,hd.loc,me.ename, me.job from  mysql.test.emp me join hive.test.dept hd on hd.deptno = me.deptno;
 deptno |   dname    |   loc    | ename  |    job    
--------+------------+----------+--------+-----------
     20 | RESEARCH   | DALLAS   | SMITH  | CLERK     
     30 | SALES      | CHICAGO  | ALLEN  | SALESMAN  
     30 | SALES      | CHICAGO  | WARD   | SALESMAN  
     20 | RESEARCH   | DALLAS   | JONES  | MANAGER   
     30 | SALES      | CHICAGO  | MARTIN | SALESMAN  
     30 | SALES      | CHICAGO  | BLAKE  | MANAGER   
     10 | ACCOUNTING | NEW YORK | CLARK  | MANAGER   
     20 | RESEARCH   | DALLAS   | SCOTT  | ANALYST   
     10 | ACCOUNTING | NEW YORK | KING   | PRESIDENT 
     30 | SALES      | CHICAGO  | TURNER | SALESMAN  
     20 | RESEARCH   | DALLAS   | ADAMS  | CLERK     
     30 | SALES      | CHICAGO  | JAMES  | CLERK     
     20 | RESEARCH   | DALLAS   | FORD   | ANALYST   
     10 | ACCOUNTING | NEW YORK | MILLER | CLERK     
(14 rows)

Query 20191020_165817_00010_gqcnh, FINISHED, 1 node
http://localhost:8090/ui/query.html?20191020_165817_00010_gqcnh
Splits: 66 total, 66 done (100.00%)
CPU Time: 0.1s total,   310 rows/s, 1.38KB/s, 11% active
Per Node: 0.1 parallelism,    23 rows/s,   107B/s
Parallelism: 0.1
Peak Memory: 0B
0:01 [18 rows, 82B] [23 rows/s, 107B/s]

presto:test> 

5. Viewing the query in the Web UI
[Images 2 and 3: screenshots of the Presto Web UI]
6. To configure SSL, see the following link
https://mintopsblog.com/2018/04/07/prestodb-basic-installation-with-https-ssl-configuration/
