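Implementing Load Balancing for Impala

Load balancing Impala with HAProxy

The CDH documentation describes this approach: put HAProxy in front of the Impala Daemons and let it balance the load. For our current query volume, HAProxy runs on a single node. If your query volume is higher, you can make HAProxy highly available (HAProxy + keepalived).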

Deploying HAProxy

Install HAProxy

yum -y install haproxy

Configure HAProxy

vim /etc/haproxy/haproxy.cfg

#---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------

#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     5000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           3m
    timeout connect         5000ms
    timeout client          3600s
    timeout server          3600s
    timeout http-keep-alive 10s
    # health check timeout
    timeout check           10s
    maxconn                 3000
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
    balance     roundrobin
    server      static 127.0.0.1:4331 check

#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
    balance     roundrobin
    server  app1 127.0.0.1:5001 check
    server  app2 127.0.0.1:5002 check

#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend  main *:5000
    acl url_static       path_beg       -i /static /images /javascript /stylesheets
    acl url_static       path_end       -i .jpg .gif .png .css .js

    use_backend static          if url_static
    default_backend             app


#-------------- Impala JDBC ----------------------------------------------------------
listen impala_jdbc :25001
    balance leastconn
    option tcplog
    mode tcp
    #bind 0.0.0.0:21051
    #listen impalajdbc
    server impala_jdbc_01 host01:21050 check
    server impala_jdbc_02 host02:21050 check
    server impala_jdbc_03 host03:21050 check
    server impala_jdbc_04 host04:21050 check
    server impala_jdbc_05 host05:21050 check
    server impala_jdbc_06 host06:21050 check
    server impala_jdbc_07 host07:21050 check
    server impala_jdbc_08 host08:21050 check

#-------------- Impala for Hue -------------------------------------------------------
# The balance algorithm is set to source so that queries issued from Hue
# do not fail with "Invalid query handle"
listen impala_hue :25002
    balance source
    option tcplog
    mode tcp
    server impala_hue_01 host01:21050 check
    server impala_hue_02 host02:21050 check
    server impala_hue_03 host03:21050 check
    server impala_hue_04 host04:21050 check
    server impala_hue_05 host05:21050 check
    server impala_hue_06 host06:21050 check
    server impala_hue_07 host07:21050 check
    server impala_hue_08 host08:21050 check

#-------------- impala-shell ---------------------------------------------------------
listen impala_shell :25003
    balance leastconn
    option tcplog
    mode tcp
    #listen impalashell
    server impala_shell_01 host01:21000 check
    server impala_shell_02 host02:21000 check
    server impala_shell_03 host03:21000 check
    server impala_shell_04 host04:21000 check
    server impala_shell_05 host05:21000 check
    server impala_shell_06 host06:21000 check
    server impala_shell_07 host07:21000 check
    server impala_shell_08 host08:21000 check


#-------------- Hive JDBC ------------------------------------------------------------



#----- Web UI (statistics page) ---------------------------------------------
listen stats :1080
    balance
    stats  uri /stats
    stats refresh 30s
    # IP and port for the admin UI
    #bind 0.0.0.0:1080
    mode http
    # define the admin UI
    #listen status


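Check that the configuration is valid

/usr/sbin/haproxy -c -f /etc/haproxy/haproxy.cfg
Start the HAProxy service
Start: service haproxy start
Stop: service haproxy stop
Restart: service haproxy restart
Enable at boot: chkconfig haproxy on
Open the statistics page
http://{hostname}:1080/stats    (for example: http://192.168.xx.xxx:1080/stats). You should see a page like the following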

[Image 1: HAProxy statistics page]
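
The same statistics can also be read by a script: HAProxy serves a CSV export when `;csv` is appended to the stats URI. The following is only a rough sketch; the function names are my own, and the URL has to be adapted to your host. It lists the status of every backend server:

```python
import csv
import io
import urllib.request


def fetch_stats(url, timeout=5):
    """Download the raw CSV text from the HAProxy stats page."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_haproxy_csv(text):
    """Return {(proxy, server): status} for every real server row."""
    # The header line starts with "# "; strip it so the column names are clean.
    reader = csv.DictReader(io.StringIO(text.lstrip("# ")))
    statuses = {}
    for row in reader:
        # FRONTEND and BACKEND rows are per-proxy aggregates, not servers.
        if row["svname"] in ("FRONTEND", "BACKEND"):
            continue
        statuses[(row["pxname"], row["svname"])] = row["status"]
    return statuses
```

For example, `parse_haproxy_csv(fetch_stats("http://192.168.xx.xxx:1080/stats;csv"))` returns a dict keyed by (proxy, server). A server whose status is not `UP` is probably worth a closer look.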

Configure Impala load balancing in Hue, then restart the affected services

hue_safety_valve.ini

server_host: hostname or IP of the HAProxy service
server_port: the port on which HAProxy listens for Impala

[impala]
server_host=host
server_port=25002


Note: if the balance algorithm of the impala-hue section is not set to source, Impala queries issued from Hue will frequently fail with "Invalid query handle". For details see: https://docs.gethue.com/administrator/administration/reference/#impala-and-hive-ha
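
To see why the algorithm matters, here is a toy simulation. It is not HAProxy's real hashing and not real Impala code; every class and function name is made up for illustration. It assumes that the state of a query lives only on the impalad that accepted it, and that a follow-up request from Hue may arrive on a new connection. A picker that ignores the client address (leastconn behaves roughly like this under even load) can send the second connection to a different impalad, while a source-based picker cannot:

```python
import zlib
from itertools import cycle

BACKENDS = ["host01", "host02", "host03"]  # stand-ins for the impalad hosts


class ToyImpalad:
    """Keeps query handles in local memory only, like a coordinator does."""

    def __init__(self, name):
        self.name = name
        self.handles = set()

    def execute(self, handle):
        self.handles.add(handle)

    def fetch(self, handle):
        if handle not in self.handles:
            raise LookupError("Invalid query handle")
        return "rows from " + self.name


def make_source_picker(backends):
    """Same client IP always maps to the same backend (toy hash, not HAProxy's)."""
    def pick(client_ip):
        return backends[zlib.crc32(client_ip.encode()) % len(backends)]
    return pick


def make_rotating_picker(backends):
    """Ignores the client IP and simply rotates through the backends."""
    ring = cycle(backends)

    def pick(client_ip):
        return next(ring)
    return pick


def run_query(pick, client_ip="10.0.0.7"):
    """Submit a query on one connection, fetch its results on a second one."""
    cluster = {name: ToyImpalad(name) for name in BACKENDS}
    cluster[pick(client_ip)].execute("q1")        # first connection
    return cluster[pick(client_ip)].fetch("q1")   # second connection
```

With `make_source_picker` both connections land on the same toy impalad and the fetch succeeds. With `make_rotating_picker` the second connection lands on the next host and `fetch` raises the "Invalid query handle" error.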

Testing with impala-shell
impala-shell -i 192.168.xx.xxx:25003
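
Before testing with real clients, it can help to confirm that the HAProxy listeners accept TCP connections at all. A minimal sketch, with helper names of my own choosing and the ports taken from the configuration above:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def check_listeners(host, ports):
    """Map each port to whether it currently accepts connections."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_listeners("192.168.xx.xxx", [25001, 25002, 25003])` with your own HAProxy address should report `True` for all three ports. This only shows that HAProxy is listening, not that the impalad behind it is healthy.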


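Testing Impala JDBC

Write a small test case of your own that connects through the HAProxy JDBC port (25001 in the configuration above). The details are omitted here.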
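
If you would rather not write Java for a quick check, a Python client can exercise the same listener, because the JDBC driver and the impyla library both talk to the HiveServer2 port (21050) of the impalad. The sketch below is not a JDBC test. It assumes impyla is installed (`pip install impyla`) and that the cluster has no Kerberos or LDAP authentication; `smoke_test` and its `connect_fn` parameter are names I made up:

```python
def smoke_test(host, port=25001, connect_fn=None):
    """Run SELECT 1 through the load balancer and return the fetched rows."""
    if connect_fn is None:
        # Imported lazily so the module loads even without impyla installed.
        from impala.dbapi import connect as connect_fn
    conn = connect_fn(host=host, port=port)
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        return cur.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

Run `smoke_test("192.168.xx.xxx")` with your HAProxy address a few times, then check the statistics page to confirm that the sessions were spread across the impalad hosts.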
