Log collection system: ELK + Kafka deployment


1. System preparation

1.1 Deployment overview

Background: we had no dedicated log collection system for gathering the test app logs from our Linux and Windows hosts, the MySQL database logs, and the Huawei firewall logs, so this stack was built for that purpose. Because the host running Elasticsearch has limited local capacity (its array is four SSDs in RAID 0 — risky for production, since a single disk failure takes the array down, but occasional downtime for maintenance is acceptable here, the logs are also kept on the auxiliary storage, and worst case the system can simply be redeployed, so performance was the priority), an iSCSI volume was attached as auxiliary storage on a dedicated data network. [If the host has enough local capacity, iSCSI as external storage is not recommended; its performance has not been very satisfactory.]

Deployment diagram:

[Figure 1: ELK + Kafka log collection system deployment diagram]

1.2 Upgrade the kernel
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum makecache && yum update -y
rpm -Uvh https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml -y
grubby --default-kernel

grub2-set-default 0
grub2-mkconfig -o /boot/grub2/grub.cfg
awk -F\' '$1=="menuentry " {print i++ " : " $2}' /boot/grub2/grub.cfg
reboot

yum update -y && yum makecache

Note: if the kernel upgrade fails, enter the BIOS and set Secure Boot to Disabled.
Note: a DELL server with an outdated BIOS once failed to boot after upgrading to the latest Linux kernel; updating the DELL BIOS fixed it.
1.3 Kernel tuning
cat >> /etc/sysctl.d/elk.conf << EOF
net.ipv4.tcp_fin_timeout = 10
net.ipv4.ip_forward = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_keepalive_time = 600
net.ipv4.ip_local_port_range = 30000   65000
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_max_tw_buckets = 1048576
net.ipv4.route.gc_timeout = 100
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_synack_retries = 1
net.core.somaxconn = 32768
net.core.netdev_max_backlog = 16384
net.core.rmem_default=262144
net.core.wmem_default=262144
net.core.rmem_max=16777216
net.core.wmem_max=16777216
net.core.optmem_max=16777216
net.netfilter.nf_conntrack_max=2097152
net.nf_conntrack_max=2097152
net.netfilter.nf_conntrack_tcp_timeout_fin_wait=30
net.netfilter.nf_conntrack_tcp_timeout_time_wait=30
net.netfilter.nf_conntrack_tcp_timeout_close_wait=15
net.netfilter.nf_conntrack_tcp_timeout_established=300
net.ipv4.tcp_max_orphans = 524288
fs.file-max=2097152
fs.nr_open=2097152
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/elk.conf
sysctl --system



cat >> /etc/systemd/system.conf << EOF
DefaultLimitNOFILE=2097152
DefaultLimitNPROC=2097152
EOF

cat /etc/systemd/system.conf


cat >> /etc/security/limits.conf << EOF
* soft nofile 2097152
* hard nofile 2097152
* soft nproc  2097152
* hard nproc  2097152
EOF

cat /etc/security/limits.conf

reboot
It is best to reboot and then verify the settings.
Note: on hosts with little memory, watch the soft nofile value; setting it too high can prevent the system from booting.
1.4 Install an NTP service (chrony)
vim /etc/chrony.conf 
Edit the server lines:
server 192.168.20.4 iburst
server time4.aliyun.com iburst

systemctl enable chronyd
systemctl start chronyd
chronyc sources
1.5 Disable the swap partition
swapoff -a
vim /etc/fstab    # comment out the swap line

Note: the physical servers have plenty of memory, so swap is disabled here; whether to disable it depends on your situation.
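
For reference, a non-interactive way to do the same thing; a sketch that assumes the standard CentOS fstab layout, so check the file afterwards:

swapoff -a
cp /etc/fstab /etc/fstab.bak
# comment out any uncommented line that contains a "swap" field
sed -ri '/^[^#].*\sswap\s/ s/^/#/' /etc/fstab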

2. Deploy iSCSI

IP-SAN runs in a client/server model; the default port is 3260.

2.1 Configure the data network - iSCSI server
[root@iscsi:/etc/sysconfig/network-scripts]# vim ifcfg-em2
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=em2
UUID=d5e88453-1701-4b15-91c1-125eae09b260
DEVICE=em2
ONBOOT=yes
IPADDR=192.168.13.101
PREFIX=24
GATEWAY=192.168.13.1
DNS1=223.5.5.5
IPV4_ROUTE_METRIC=105

Save and exit (:wq), then restart the network: systemctl restart network
2.2 Configure the data network - iSCSI client
[root@elk:/etc/sysconfig/network-scripts]# vim ifcfg-em2
TYPE=Ethernet
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=em2
UUID=f84b0810-5a08-4f7d-9b78-74610707b315
DEVICE=em2
ONBOOT=yes
IPADDR=192.168.13.100
PREFIX=24
GATEWAY=192.168.13.1
DNS1=223.5.5.5
IPV4_ROUTE_METRIC=105

Save and exit (:wq), then restart the network: systemctl restart network
2.3 Install target - iSCSI server

Since this volume only stores logs (which already have backups), redundancy takes a back seat and the array uses RAID 0 for performance.

[root@iscsi:/root]# yum install targetcli -y
[root@iscsi:/root]# systemctl enable target
[root@iscsi:/root]# systemctl start target
[root@iscsi:/root]# systemctl status target
Note: //enter the interactive shell to configure the target
[root@iscsi:/root]# targetcli
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
targetcli shell version 2.1.51
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.
/> backstores/block create iscsi /dev/sdb
Created block storage object iscsi using /dev/sdb.
/> iscsi/
/iscsi> create iqn.2020-12.com.lowan:iscsi    //naming format: iqn.yyyy-mm.<reversed domain name>:<custom name>
Created target iqn.2020-12.com.lowan:iscsi.
Created TPG 1.
Global pref auto_add_default_portal=true
Created default portal listening on all IPs (0.0.0.0), port 3260.   //shows the listening IP address and port
Note: //create the ACL; clients can only connect to the target with an initiator name in this list
/iscsi> iqn.2020-12.com.lowan:iscsi/tpg1/acls create iqn.2020-12.com.lowan:elk
Created Node ACL for iqn.2020-12.com.lowan:elk 
Note: //expose the block storage object "iscsi" as the target's logical unit, LUN 0
/iscsi> iqn.2020-12.com.lowan:iscsi/tpg1/luns create /backstores/block/iscsi
Created LUN 0.
Created LUN 0->0 mapping in node ACL iqn.2020-12.com.lowan:elk
Note: //delete the default listening IP address and port
/> iscsi/iqn.2020-12.com.lowan:iscsi/tpg1/portals/ delete 0.0.0.0 3260
Deleted network portal 0.0.0.0:3260
Note: //add a new listening IP address and port
/> iscsi/iqn.2020-12.com.lowan:iscsi/tpg1/portals create 192.168.13.101 3260
Using default IP port 3260
Created network portal 192.168.13.101:3260.   //note: this is the server's data-network IP address and port

/> ls
o- / ....................................................................................... [...]
  o- backstores ............................................................................ [...]
  | o- block ................................................................ [Storage Objects: 1]
  | | o- iscsi .................................................. [/dev/sdb (10.9TiB) write-thru activated]
  | |   o- alua ................................................................ [ALUA Groups: 1]
  | |     o- default_tg_pt_gp ..................................... [ALUA state: Active/optimized]
  | o- fileio ............................................................. [Storage Objects: 0]
  | o- pscsi ................................................................. [Storage Objects: 0]
  | o- ramdisk ............................................................... [Storage Objects: 0]
  o- iscsi ........................................................................... [Targets: 1]
  | o- iqn.2020-12.com.lowan:iscsi ...................................................... [TPGs: 1]
  |   o- tpg1 .............................................................. [no-gen-acls, no-auth]
  |     o- acls ........................................................................ [ACLs: 1]
  |     | o- iqn.2020-12.com.lowan:elk ......................................... [Mapped LUNs: 1]
  |     |   o- mapped_lun0 ............................................. [lun0 block/iscsi (rw)]
  |     o- luns ....................................................................... [LUNs: 1]
  |     | o- lun0 .................................. [block/iscsi (/dev/sdb) (default_tg_pt_gp)]
  |     o- portals ............................................................... [Portals: 1]
  |       o- 192.168.13.101:3260 ......................................................... [OK]
  o- loopback ..................................................................... [Targets: 0]


Note: //set the username and password for the ACL entry
/> cd iscsi/iqn.2020-12.com.lowan:iscsi/tpg1/acls/iqn.2020-12.com.lowan:elk/
/iscsi/iqn.20...com.lowan:elk> set auth userid=lowaniotiscsi
Parameter userid is now 'lowaniotiscsi'.
/iscsi/iqn.20...com.lowan:elk> set auth password=ysyhl9T098
Parameter password is now 'ysyhl9T098'.
Note: //return to the root of the hierarchy and save; saveconfig must be run from the root
/iscsi/iqn.20...com.lowan:elk> cd /
/> saveconfig
Configuration saved to /etc/target/saveconfig.json
/> exit
Global pref auto_save_on_exit=true
Last 10 configs saved in /etc/target/backup/.
Configuration saved to /etc/target/saveconfig.json

Note: //restart the target service and open the firewall port
[root@iscsi:/root]# systemctl restart target
[root@iscsi:/root]# firewall-cmd --permanent --add-port=3260/tcp
success
[root@iscsi:/root]# firewall-cmd --reload
success

2.4 Install iSCSI - client
[root@elk:/root]# yum install iscsi-initiator-utils -y
Note: //edit initiatorname.iscsi
[root@elk:/root]# vim /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.2020-12.com.lowan:elk
Note: //edit iscsid.conf
[root@elk:/root]# vim /etc/iscsi/iscsid.conf
node.session.auth.authmethod = CHAP    			//line 57, uncomment
node.session.auth.username = lowaniotiscsi		//line 61, uncomment and set to the username from the ACL
node.session.auth.password = ysyhl9T098			//line 62, uncomment and set to that user's password

Note: //restart the iscsid service
[root@elk:/root]# systemctl restart iscsid
[root@elk:/root]# systemctl enable iscsid
2.5 iSCSI client login
[root@elk:/root]# iscsiadm --mode discoverydb --type sendtargets --portal 192.168.13.101 --discover
192.168.13.101:3260,1 iqn.2020-12.com.lowan:iscsi
[root@elk:/root]# iscsiadm --mode node --targetname iqn.2020-12.com.lowan:iscsi --portal 192.168.13.101:3260 --login
Logging in to [iface: default, target: iqn.2020-12.com.lowan:iscsi, portal: 192.168.13.101,3260] (multiple)
Login to [iface: default, target: iqn.2020-12.com.lowan:iscsi, portal: 192.168.13.101,3260] successful.

2.6 Mount the storage - client
[root@elk:/root]# lsblk
NAME            MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb               8:16   0  10.9T  0 disk   <------ this is the iSCSI LUN
sr0              11:0    1  1024M  0 rom  
sda               8:0    0 929.5G  0 disk 
├─sda2            8:2    0     5G  0 part /boot
├─sda3            8:3    0 924.5G  0 part 
│ ├─centos-swap 253:1    0    64G  0 lvm  [SWAP]
│ ├─centos-data 253:2    0 710.5G  0 lvm  /data
│ └─centos-root 253:0    0   150G  0 lvm  /
└─sda1            8:1    0     2M  0 part 
[root@elk:/root]# mkdir /iscsi
[root@elk:/root]# blkid /dev/sdb
/dev/sdb: UUID="726bc132-0bbf-4288-b8d7-8673e41fb8b4" TYPE="xfs" 
[root@elk:/root]# mount UUID=726bc132-0bbf-4288-b8d7-8673e41fb8b4 /iscsi
[root@elk:/root]# df -TH
Filesystem              Type      Size  Used Avail Use% Mounted on
devtmpfs                devtmpfs   34G     0   34G   0% /dev
tmpfs                   tmpfs      34G     0   34G   0% /dev/shm
tmpfs                   tmpfs      34G  9.4M   34G   1% /run
tmpfs                   tmpfs      34G     0   34G   0% /sys/fs/cgroup
/dev/mapper/centos-root xfs       161G  3.0G  159G   2% /
/dev/sda2               xfs       5.4G  228M  5.2G   5% /boot
/dev/mapper/centos-data xfs       763G  571M  762G   1% /data
tmpfs                   tmpfs     6.8G     0  6.8G   0% /run/user/0
/dev/sdb                xfs        12T   35M   12T   1% /iscsi    <------ mounted successfully

Note: the mount is added at boot via an init script; mounting through /etc/fstab did not work here (at boot the iSCSI service is not up yet, so the mount fails).
[root@elk:/etc/init.d]# vim /etc/init.d/mountIscsi

#!/bin/bash
# chkconfig: 35 10 90 
sleep 60
mount UUID=726bc132-0bbf-4288-b8d7-8673e41fb8b4 /iscsi


----------------------------------------------------------------------------
[root@elk:/etc/init.d]# chkconfig --add mountIscsi 
[root@elk:/etc/init.d]# chkconfig --level 35 mountIscsi on
[root@elk:/etc/init.d]# chmod +x /etc/init.d/mountIscsi
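
An alternative that may be worth testing is a normal /etc/fstab entry with the _netdev option, which makes systemd defer the mount until the network and the iSCSI login are available (untested in this setup; it requires the iscsi/iscsid services to be enabled and the node set to log in automatically at boot):

UUID=726bc132-0bbf-4288-b8d7-8673e41fb8b4  /iscsi  xfs  defaults,_netdev  0 0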

3. Deploy Elasticsearch

3.1 Download the package
wget -c https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.10.1-linux-x86_64.tar.gz
3.2 Install JDK 11

Note: Elasticsearch 7.x requires JDK 11.

JDK download: https://www.oracle.com/java/technologies/javase-jdk11-downloads.html

[root@elk:/opt/src]# tar -xf jdk-11.0.9_linux-x64_bin.tar.gz -C /usr/local/
[root@elk:/opt/src]# cd /usr/local/  && cd jdk-11.0.9/  && ls
bin  conf  include  jmods  legal  lib  README.html  release
3.3 Install Elasticsearch
[root@elk:/opt/src]# tar -xf elasticsearch-7.10.1-linux-x86_64.tar.gz -C /data/
[root@elk:/opt/src]# cd /data/ && ls
elasticsearch-7.10.1
[root@elk:/data]# mv elasticsearch-7.10.1 elasticsearch
[root@elk:/iscsi]# mkdir -p /iscsi/elasticsearch/{data,logs}

3.4 Configure Elasticsearch
3.4.1 Modify elasticsearch/bin/elasticsearch
[root@elk:/data/elasticsearch/bin]# vim elasticsearch
Add the following:
#!/bin/bash

# CONTROLLING STARTUP:
#
# This script relies on a few environment variables to determine startup
# behavior, those variables are:
#
#   ES_PATH_CONF -- Path to config directory
#   ES_JAVA_OPTS -- External Java Opts on top of the defaults set
#
# Optionally, exact memory values can be set using the `ES_JAVA_OPTS`. Example
# values are "512m", and "10g".
#
#   ES_JAVA_OPTS="-Xms8g -Xmx8g" ./bin/elasticsearch
# point to our own JDK 11
export JAVA_HOME=/usr/local/jdk-11.0.9
export PATH=$JAVA_HOME/bin:$PATH

# pick the JDK: prefer JAVA_HOME, fall back to java on PATH
if [ -x "$JAVA_HOME/bin/java" ]; then
        JAVA="/usr/local/jdk-11.0.9/bin/java"
else
        JAVA=`which java`
fi

source "`dirname "$0"`"/elasticsearch-env
........(rest of the script unchanged)........


3.4.2 Modify elasticsearch/config/elasticsearch.yml
[root@elk:/data/elasticsearch/config]# vim elasticsearch.yml 
Change the following settings:
cluster.name: es.lowan.com
node.name: elk
path.data: /iscsi/elasticsearch/data
path.logs: /iscsi/elasticsearch/logs
bootstrap.memory_lock: true
network.host: 192.168.20.100
http.port: 9200


Note: for a single-node deployment, the discovery settings below must also be changed,
otherwise startup fails with: [1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
ERROR: Elasticsearch did not exit normally - check the logs at /data/elasticsearch/logs/es.lowan.com.log

# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
cluster.initial_master_nodes: ["elk"]
#cluster.initial_master_nodes: ["node-1", "node-2"]
3.4.3 Tune jvm.options
[root@elk:/data/elasticsearch/config]# vim jvm.options
Change the following:
# with enough memory the heap can be raised; 31g stays just under the 32 GB compressed-oops threshold
-Xms31g
-Xmx31g

## GC configuration
## the original CMS lines are commented out and G1 is enabled instead, since G1 is the recommended collector on JDK 11
#8-13:-XX:+UseConcMarkSweepGC
#8-13:-XX:CMSInitiatingOccupancyFraction=75
#8-13:-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseG1GC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
3.4.4 Create a non-root user

Note: Elasticsearch refuses to start as root.

[root@elk:/data/elasticsearch]# useradd -s /bin/bash -M elasticsearch
[root@elk:/data/elasticsearch]# chown -R elasticsearch:elasticsearch /data/elasticsearch
[root@elk:/data/elasticsearch]# chown -R elasticsearch:elasticsearch /iscsi/elasticsearch
[root@elk:/data/elasticsearch]# id elasticsearch
uid=1002(elasticsearch) gid=1002(elasticsearch) groups=1002(elasticsearch)

3.4.5 Raise the file descriptor limits
[root@elk:/data/elasticsearch]# vim /etc/security/limits.d/elasticsearch.conf
elasticsearch soft nofile 2097152
elasticsearch soft fsize unlimited
elasticsearch hard memlock unlimited
elasticsearch soft memlock unlimited
3.4.6 Kernel tuning
[root@elk:/data/elasticsearch]# sysctl -w vm.max_map_count=262144
vm.max_map_count = 262144
[root@elk:/data/elasticsearch]# echo "vm.max_map_count=262144" >> /etc/sysctl.conf
[root@elk:/data/elasticsearch]# sysctl -p
vm.max_map_count = 262144

3.5 Start Elasticsearch
[root@elk:/data/elasticsearch]# su -c "/data/elasticsearch/bin/elasticsearch -d" elasticsearch
[root@elk:/data/elasticsearch]# ps -ef | grep elasticsearch
elastic+  2219     1 99 16:08 ?        00:00:12 /usr/local/jdk-11.0.9/bin/java -Xshare:auto -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.numDirectArenas=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Djava.locale.providers=SPI,COMPAT -Xms32g -Xmx32g -XX:+UseG1GC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.io.tmpdir=/tmp/elasticsearch-8878251938516401686 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -XX:MaxDirectMemorySize=17179869184 -Des.path.home=/data/elasticsearch -Des.path.conf=/data/elasticsearch/config -Des.distribution.flavor=default -Des.distribution.type=tar -Des.bundled_jdk=true -cp /data/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch -d
root      2247  1779  0 16:08 pts/0    00:00:00 grep --color=auto elasticsearch
root@elk:/data/elasticsearch]# netstat -luntp | grep 9200
tcp6       0      0 192.168.20.100:9200     :::*                    LISTEN      2219/java  
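
A quick sanity check against the HTTP API; the cluster name and node name should match elasticsearch.yml:

curl http://192.168.20.100:9200/
curl http://192.168.20.100:9200/_cluster/health?pretty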

3.6 Open the firewall
[root@elk:/data/elasticsearch]# firewall-cmd --permanent --add-port=9200/tcp
success
[root@elk:/data/elasticsearch]# firewall-cmd --reload
success
[root@elk:/data/elasticsearch]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: em1 em2
  sources: 
  services: dhcpv6-client ssh
  ports: 9200/tcp 9100/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 

firewall-cmd --permanent --remove-port=9100/tcp
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=192.168.20.123 port protocol=tcp port=9100 accept'
3.7 Adjust the Elasticsearch index template
[root@elk:/data/elasticsearch]# 
Note: because this is a single node, replicas are set to 0 with the template below; on a three-node cluster this step is unnecessary.
curl -H "Content-Type:application/json" -XPUT http://192.168.20.100:9200/_template/lowan -d '{
	"template": "lowan*",
	"index_patterns": ["lowan*"],
	"settings": {
		"number_of_shards": 5,
		"number_of_replicas": 0,
		"max_shards_per_node":10000
	}
}'
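
To confirm the template was stored, it can be read back:

curl http://192.168.20.100:9200/_template/lowan?pretty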

4. Deploy ZooKeeper

A single node is used here, deployed on the iscsi server.

Note: ZooKeeper depends on Java, so install a JDK first.

For a clustered deployment, see my other post: 05 Deploying a ZooKeeper and Kafka cluster https://blog.csdn.net/weixin_43667733/article/details/117267272#comments_16648904

4.1 Download and extract the package
wget -c https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.6.2/apache-zookeeper-3.6.2-bin.tar.gz

tar xf apache-zookeeper-3.6.2-bin.tar.gz -C /usr/local/
cd /usr/local/ && mv apache-zookeeper-3.6.2-bin/ zookeeper
mkdir /usr/local/zookeeper/{data,logs}

4.2 Install the JDK
tar xf jdk-8u251-linux-x64.tar.gz -C /usr/local/

cat >> /etc/profile << 'EOF'    # quote EOF so that $JAVA_HOME and $PATH are written literally instead of being expanded now
export JAVA_HOME=/usr/local/jdk1.8.0_251
export CLASSPATH=.:$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
EOF
source /etc/profile
4.3 Configure zoo.cfg
cd /usr/local/zookeeper/conf && cp -p zoo_sample.cfg zoo.cfg
[root@iscsi:/usr/local/zookeeper/conf]# vim zoo.cfg
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
# the port at which the clients will connect
clientPort=2181

4.4 Start ZooKeeper
[root@iscsi:/usr/local/zookeeper/bin]# /usr/local/zookeeper/bin/zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED

ps -ef | grep zookeeper
4.5 Add a systemd service
A unit file along the lines of the kafka unit in section 5.4 can be used; a sketch, using the JDK and ZooKeeper paths from the steps above:
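cat >> /etc/systemd/system/zookeeper.service << EOF
[Unit]
Description=zookeeper server
Requires=network.target
After=network.target

[Service]
# zkServer.sh start daemonizes, hence Type=forking
Type=forking
User=root
Environment=JAVA_HOME=/usr/local/jdk1.8.0_251
ExecStart=/usr/local/zookeeper/bin/zkServer.sh start
ExecStop=/usr/local/zookeeper/bin/zkServer.sh stop
Restart=on-failure
RestartSec=2s

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable zookeeper.service
systemctl start zookeeper.service
systemctl status zookeeper.service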
4.6 Open the firewall
[root@iscsi:/usr/local/kafka/config]# firewall-cmd --permanent --add-port=2181/tcp
success
[root@iscsi:/usr/local/kafka/config]# firewall-cmd --reload
success
// this policy was later tightened to allow only specific host:port pairs (see the rich rules in the postscript)

5. Deploy Kafka

A single node is used here, deployed on the iscsi server.

5.1 Download and extract the package
wget -c https://mirrors.tuna.tsinghua.edu.cn/apache/kafka/2.6.0/kafka_2.13-2.6.0.tgz
tar xf kafka_2.13-2.6.0.tgz -C /usr/local/
cd /usr/local && mv kafka_2.13-2.6.0 kafka
mkdir -p /data/kafka/logs

5.2 Configure server.properties
[root@iscsi:/root]# vim /usr/local/kafka/config/server.properties 
---------------------------------------->
broker.id=0
listeners=PLAINTEXT://:9092
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
log.dirs=/data/kafka/logs
num.partitions=1
num.recovery.threads.per.data.dir=1
offsets.topic.replication.factor=1
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
zookeeper.connect=localhost:2181
zookeeper.connection.timeout.ms=18000
group.initial.rebalance.delay.ms=0
auto.create.topics.enable=true
delete.topic.enable=true
5.3 Start Kafka
# start
/usr/local/kafka/bin/kafka-server-start.sh -daemon /usr/local/kafka/config/server.properties
# check the Java processes
jps

ps -ef | grep kafka
# to stop
/usr/local/kafka/bin/kafka-server-stop.sh
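
A quick smoke test with the bundled CLI tools; the topic name elk-test is only an example (it matches the elk-.* pattern Logstash subscribes to later, so delete it afterwards if it should not be indexed):

/usr/local/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic elk-test --partitions 1 --replication-factor 1
/usr/local/kafka/bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic elk-test
/usr/local/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic elk-test --from-beginning
/usr/local/kafka/bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic elk-test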
5.4 Add a systemd service
cat >> /etc/systemd/system/kafka.service << EOF
[Unit]
Description=kafka server
Requires=zookeeper.service
After=zookeeper.service

[Service]
Type=simple
User=root
Environment=JAVA_HOME=/usr/local/jdk1.8.0_251
ExecStart=/bin/sh -c '/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties'
ExecStop=/bin/sh -c '/usr/local/kafka/bin/kafka-server-stop.sh'
Restart=on-failure
RestartSec=2s
SuccessExitStatus=0  143
[Install]
WantedBy=multi-user.target

EOF


systemctl daemon-reload
systemctl enable kafka.service 
systemctl start kafka.service 
systemctl status kafka.service 
5.5 Open the firewall
[root@iscsi:/usr/local/kafka/config]# firewall-cmd --permanent --add-port=9092/tcp
success
[root@iscsi:/usr/local/kafka/config]# firewall-cmd --reload
success
[root@iscsi:/usr/local/kafka/config]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: em1 em2
  sources: 
  services: dhcpv6-client ssh
  ports: 9100/tcp 3260/tcp 9092/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 

6. Deploy Logstash

6.1 Download and extract
[root@elk:/opt/src]# tar xf logstash-7.10.1-linux-x86_64.tar.gz -C /data/
[root@elk:/opt/src]# cd /data/ && mv logstash-7.10.1 logstash
6.2 Configure logstash.conf
[root@elk:/data/logstash/config]# mv logstash-sample.conf logstash.conf
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  kafka {
    bootstrap_servers => "iscsi:9092"
    group_id => "logstash"
    auto_offset_reset => "latest"
    consumer_threads => 2
    decorate_events => true
    #codec => "json"
    topics_pattern => "elk-.*"
  }
}
filter {
  json {
    source => "message"
 }
}
output {
  elasticsearch {
    hosts => "192.168.20.100:9200"
    index => "%{[@metadata][beat]}-%{+YYYY.MM}"
                }
}
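
Before wiring Logstash into systemd, the pipeline configuration can be syntax-checked:

/data/logstash/bin/logstash -f /data/logstash/config/logstash.conf --config.test_and_exit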

6.3 Additional Logstash instances on other servers - supplement

This step is for reference only: it deploys Logstash on other servers as well, i.e. multiple Logstash instances are used.

[root@mysqltest:/data/logstash/config]# vim logstash.conf
# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  kafka {
    bootstrap_servers => "iscsi:9092"
    group_id => "logstashmysql"
    auto_offset_reset => "latest"
    consumer_threads => 3
    decorate_events => true
    topics_pattern => "elkmysql-.*"
  }
}

filter {
  json {
    source => "message"
 }
  date {
    match => ["timestamp_mysql","UNIX"]
    target => "@timestamp"
 }
}

output {
  elasticsearch {
    hosts => ["192.168.20.100:9200"]
    index => "mysql-slowlog-%{+YYYY.MM}"
    #user => "elastic"
    #password => "changeme"
  }
}

6.4 Receiving Huawei firewall logs directly with Logstash - supplement

The Huawei firewall is configured to send its logs (syslog) straight to Logstash.

[root@lo:/data/logstash/config]# vim logstash.conf 

# Sample Logstash configuration for creating a simple
# Beats -> Logstash -> Elasticsearch pipeline.

input {
  udp {
    port => "514"
    type => "syslog"
  }
}

output {
  elasticsearch {
    hosts => ["192.168.20.100:9200"]
    index => "fwlogstash_syslog-%{+YYYY.MM}"
    #user => "elastic"
    #password => "changeme"
  }
}

6.5 Start Logstash
# test run in the foreground
[root@elk:/data/logstash]# ./bin/logstash -f /data/logstash/config/logstash.conf

[root@elk:/data/elasticsearch]# ps -ef | grep logstash
root      3470  1779 99 18:17 pts/0    00:03:00 /data/logstash/jdk/bin/java -Xms12g -Xmx12g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djruby.compile.invokedynamic=true -Djruby.jit.threshold=0 -Djruby.regexp.interruptible=true -XX:+HeapDumpOnOutOfMemoryError -Djava.security.egd=file:/dev/urandom -Dlog4j2.isThreadContextMapInheritable=true -cp /data/logstash/logstash-core/lib/jars/animal-sniffer-annotations-1.14.jar:/data/logstash/logstash-core/lib/jars/checker-compat-qual-2.0.0.jar:/data/logstash/logstash-core/lib/jars/commons-codec-1.14.jar:/data/logstash/logstash-core/lib/jars/commons-compiler-3.1.0.jar:/data/logstash/logstash-core/lib/jars/commons-logging-1.2.jar:/data/logstash/logstash-core/lib/jars/error_prone_annotations-2.1.3.jar:/data/logstash/logstash-core/lib/jars/google-java-format-1.1.jar:/data/logstash/logstash-core/lib/jars/gradle-license-report-0.7.1.jar:/data/logstash/logstash-core/lib/jars/guava-24.1.1-jre.jar:/data/logstash/logstash-core/lib/jars/j2objc-annotations-1.1.jar:/data/logstash/logstash-core/lib/jars/jackson-annotations-2.9.10.jar:/data/logstash/logstash-core/lib/jars/jackson-core-2.9.10.jar:/data/logstash/logstash-core/lib/jars/jackson-databind-2.9.10.4.jar:/data/logstash/logstash-core/lib/jars/jackson-dataformat-cbor-2.9.10.jar:/data/logstash/logstash-core/lib/jars/janino-3.1.0.jar:/data/logstash/logstash-core/lib/jars/javassist-3.26.0-GA.jar:/data/logstash/logstash-core/lib/jars/jruby-complete-9.2.13.0.jar:/data/logstash/logstash-core/lib/jars/jsr305-1.3.9.jar:/data/logstash/logstash-core/lib/jars/log4j-api-2.13.3.jar:/data/logstash/logstash-core/lib/jars/log4j-core-2.13.3.jar:/data/logstash/logstash-core/lib/jars/log4j-jcl-2.13.3.jar:/data/logstas/logstash-core/lib/jars/log4j-slf4j-impl-2.13.3.jar:/data/logstash/logstash-core/lib/jars/logstash-core.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.commands-3.6.0.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.contenttype-3.4.100.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.expressions-3.4.300.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.filesystem-1.3.100.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.jobs-3.5.100.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.resources-3.7.100.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.core.runtime-3.7.0.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.equinox.app-1.3.100.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.equinox.common-3.6.0.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.equinox.preferences-3.4.1.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.equinox.registry-3.5.101.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.jdt.core-3.10.0.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.osgi-3.7.1.jar:/data/logstash/logstash-core/lib/jars/org.eclipse.text-3.5.101.jar:/data/logstash/logstash-core/lib/jars/reflections-0.9.11.jar:/data/logstash/logstash-core/lib/jars/slf4j-api-1.7.25.jar org.logstash.Logstash -f /data/logstash/config/logstash.conf
root      3670  1857  0 18:20 pts/1    00:00:00 grep --color=auto logstash



6.6 Add a systemd service
This unit still needs refinement.
cat >> /etc/systemd/system/logstash.service << EOF
[Unit]
Description=logstash server
Requires=network.target
After=network.target

[Service]
Type=simple
User=root
ExecStart=/bin/sh -c '/data/logstash/bin/logstash -f /data/logstash/config/logstash.conf'
ExecStop=/bin/kill -s TERM \$MAINPID
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=multi-user.target

EOF
6.7 Enable and start the service
systemctl daemon-reload
systemctl enable logstash.service
systemctl start logstash.service
systemctl status logstash.service

7. Deploy Kibana

Kibana is used mainly for visualizing and presenting the data.

7.1 Download and extract
tar -xf kibana-7.10.1-linux-x86_64.tar.gz -C /data/
cd /data/ && mv kibana-7.10.1-linux-x86_64 kibana

7.2 Configure kibana.yml
[root@elk:/data/kibana/config]# vim kibana.yml
elasticsearch.hosts: ["http://es.lowan.com:9200"]
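
The line above only sets the Elasticsearch address. For the nginx reverse proxy in section 10 to reach Kibana at 192.168.20.100:5601, the listen address presumably has to be changed too, since Kibana binds to localhost by default; something along these lines:

server.port: 5601
server.host: "192.168.20.100"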

7.3 Start Kibana
[root@elk:/data]# chown -R elasticsearch:elasticsearch kibana
[root@elk:/data/kibana/bin]# su -c "/data/kibana/bin/kibana" elasticsearch
[root@elk:/data/logstash/bin]# ps -ef | grep kibana
root      4167  1857  0 18:54 pts/1    00:00:00 su -c /data/kibana/bin/kibana elasticsearch
elastic+  4168  4167 34 18:54 ?        00:00:20 /data/kibana/bin/../node/bin/node /data/kibana/bin/../src/cli/dist
root      4251  3716  0 18:55 pts/2    00:00:00 grep --color=auto kibana

7.4 Open the firewall
[root@elk:/data/logstash/bin]# firewall-cmd --permanent --add-port=5601/tcp
success
[root@elk:/data/logstash/bin]# firewall-cmd --reload
success
[root@elk:/data/logstash/bin]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: em1 em2
  sources: 
  services: dhcpv6-client ssh
  ports: 9200/tcp 9100/tcp 5601/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: 
  rich rules: 

7.5 Add a systemd service
This unit also still needs refinement.
cat >> /etc/systemd/system/kibana.service << EOF
[Unit]
Description=kibana server
Requires=network.target
After=logstash.service

[Service]
Type=simple
User=elasticsearch
ExecStart=/bin/sh -c "/data/kibana/bin/kibana"
ExecStop=/bin/kill -s TERM \$MAINPID
Restart=on-failure
RestartSec=2s
[Install]
WantedBy=multi-user.target

EOF



8. Deploy Filebeat - Linux

8.1 Download and extract
tar -xf filebeat-7.10.1-linux-x86_64.tar.gz -C /data
cd /data && mv filebeat-7.10.1-linux-x86_64 filebeat
8.2 Edit filebeat.yml
[root@test156:/data/filebeat]# vim filebeat.yml
# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /data/app/P005/log/error/P005-error.log
    - /data/app/P005/log/info/P005-info.log
    - /data/app/P005/log/warn/P005-warn.log
  fields_under_root: true
  fields:
    logtopic: p005logs
  scan_frequency: 120s

- type: log
  enabled: true
  paths:
    - /data/app/P011/log/error.log
    - /data/app/P011/log/log.log
  fields_under_root: true
  fields:
    logtopic: p011logs
  json.keys_under_root: true
  json.overwrite_keys: true
  scan_frequency: 120s

- type: log
  enabled: true
  paths:
    - /data/app/P010/log/hes-plan-error.log
    - /data/app/P010/log/hes-plan-info.log
  fields_under_root: true
  fields:
    logtopic: p010logs
  scan_frequency: 120s

- type: log
  enabled: true
  paths:
    - /data/app/P022/log/error/P022-error.log
    - /data/app/P022/log/info/P022-info.log
    - /data/app/P022/log/warn/P022-warn.log
  fields_under_root: true
  fields:
    logtopic: p022logs
  scan_frequency: 120s

- type: log
  enabled: true
  paths:
    - /data/app/P003/log/ami-info.log
    - /data/app/P003/log/ami-error.log
  fields_under_root: true
  fields:
    logtopic: p003logs
  scan_frequency: 120s

- type: log
  enabled: true
  paths:
    - /data/tools/nginx/logs/access.log
    - /data/tools/nginx/logs/error.log
  fields_under_root: true
  fields:
    logtopic: p003weblogs
  scan_frequency: 120s
# ======================= Elasticsearch template setting =======================
setup.template.settings:
  index.number_of_shards: 3
# =================================== Kibana ===================================
setup.kibana:
# ================================== Outputs ===================================
output.kafka:
  enabled: true
  hosts: "iscsi:9092"
  topic: 'elk-%{[logtopic]}'
  version: 2.0.0
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 10485760
# ================================== Logging ===================================
logging.level: info
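
Filebeat can verify the configuration file and the Kafka output before being run as a service:

/data/filebeat/filebeat test config -c /data/filebeat/filebeat.yml
/data/filebeat/filebeat test output -c /data/filebeat/filebeat.yml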

Supplement: filebeat.yml used for collecting MySQL slow logs

[root@mysql:/data/tools/filebeat]# vim filebeat.yml 

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /data/tools/mysql/data/mysql-slow.log
  fields_under_root: true
  fields:
    logtopic: slowlogs
  exclude_lines: ['^\# Time']
  multiline.pattern: '^\# Time|^\# User'
  multiline.negate: true
  multiline.match: after
  tail_files: true
  #json.keys_under_root: true
  #json.overwrite_keys: true
  scan_frequency: 60s
# ======================= Elasticsearch template setting =======================
setup.template.settings:
  index.number_of_shards: 3
setup.kibana:
# ================================== Outputs ===================================
output.kafka:
  enabled: true
  hosts: "192.168.20.101:9092"
  topic: 'elkmysql-%{[logtopic]}'
  version: 2.0.0
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 10485760
# ================================== Logging ===================================
logging.level: debug

Supplement: the MySQL statements used at the time to generate a slow query for testing
MySQL [(none)]> select now(); select sleep(8); select now();
+---------------------+
| now()               |
+---------------------+
| 2021-02-23 12:33:02 |
+---------------------+
1 row in set (0.00 sec)

+----------+
| sleep(8) |
+----------+
|        0 |
+----------+
1 row in set (8.00 sec)

+---------------------+
| now()               |
+---------------------+
| 2021-02-23 12:33:10 |
+---------------------+
1 row in set (0.00 sec)

8.3 Add a systemd service
cat >> /etc/systemd/system/filebeat.service << EOF
[Unit]
Description=Filebeat sends log files to Logstash or directly to Elasticsearch.
Documentation=https://www.elastic.co/products/beats/filebeat
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=root
ExecStart=/bin/sh -c "/data/filebeat/filebeat -c /data/filebeat/filebeat.yml"
Restart=always
[Install]
WantedBy=multi-user.target

EOF

9. Deploy Filebeat - Windows

9.1 Download and extract
Download:
https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.10.1-windows-x86_64.zip
Extract the downloaded archive to "C:\Program Files\Filebeat".

9.2 Edit filebeat.yml
Edit the filebeat.yml configuration file in that directory:

# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - D:\Reallin\VirtualMeters\Logs\Infos\2020-12-30.log
    #- c:\programdata\elasticsearch\logs\*
  fields_under_root: true
  fields:
    logtopic: pc174logs
  scan_frequency: 120s
# ======================= Elasticsearch template setting =======================
setup.template.settings:
  index.number_of_shards: 3
# =================================== Kibana ===================================
setup.kibana:
# ================================== Outputs ===================================
output.kafka:
  hosts: ["192.168.20.101:9092"]
  enabled: true
  topic: 'elk-%{[logtopic]}'
  version: 2.0.0
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 10485760

# ================================== Logging ===================================
logging.level: debug
9.3 Start Filebeat

Open PowerShell as Administrator, change into the Filebeat directory "C:\Program Files\Filebeat", and start Filebeat with the configuration file edited above:

C:\Program Files\Filebeat> .\filebeat -e -c filebeat.yml
Note: if PowerShell reports that running scripts is disabled on this system, change the execution policy first:
C:\Program Files\Filebeat> Set-ExecutionPolicy RemoteSigned
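
To run Filebeat as a Windows service instead of a foreground process, the install script bundled in the zip can be used (assuming the default "C:\Program Files\Filebeat" path):

C:\Program Files\Filebeat> .\install-service-filebeat.ps1
C:\Program Files\Filebeat> Start-Service filebeat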

Supplement: install elasticsearch-head (optional; of limited use for now)

Install Node.js (optional; of limited use for now)

wget -c https://nodejs.org/dist/v14.15.3/node-v14.15.3-linux-x64.tar.xz
tar -xJf node-v14.15.3-linux-x64.tar.xz -C /usr/local/
cd /usr/local/ && mv node-v14.15.3-linux-x64 node

vim /etc/profile
append:
export NODE_HOME=/usr/local/node
export PATH=$PATH:$NODE_HOME/bin

source /etc/profile
node -v
npm -v

elasticsearch-head (optional; of limited use for now)

cd /usr/local/elasticsearch-head
npm install
If the following error occurs:
-->npm ERR! code ELIFECYCLE
-->npm ERR! errno 1
-->npm ERR! phantomjs-prebuilt@2.1.16 install: `node install.js`
-->npm ERR! Exit status 1
-->npm ERR! 
-->npm ERR! Failed at the phantomjs-prebuilt@2.1.16 install script.
-->npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

-->npm ERR! A complete log of this run can be found in:
-->npm ERR!     /root/.npm/_logs/2020-12-24T01_40_55_983Z-debug.log

Run:
npm install phantomjs-prebuilt@2.1.16 --ignore-scripts
npm install

Port 9100 is already taken by the node_exporter monitoring agent, so change the port to 9111.

vim /usr/local/elasticsearch-head/Gruntfile.js
Around line 97, change port: 9100 to port: 9111
... (omitted)
                connect: {
                        server: {
                                options: {
                                        port: 9111,
                                        hostname: '*',
                                        base: '.',
                                        keepalive: true
                                }
                        }
                }
... (omitted)

Open the firewall
firewall-cmd --permanent --add-port=9111/tcp
firewall-cmd --reload

Append the following to elasticsearch.yml:
http.cors.enabled: true
http.cors.allow-origin: "*"
then restart Elasticsearch.

Test run:
[root@iscsi:/usr/local/elasticsearch-head]# npm run start

> elasticsearch-head@0.0.0 start /usr/local/elasticsearch-head
> grunt server

Running "connect:server" (connect) task
Waiting forever...
Started connect web server on http://localhost:9111

Test page:
Open http://192.168.20.100:9111



10. Deploy nginx

nginx reverse-proxies Kibana and adds username/password authentication in front of it.

10.1 Prepare the environment
wget -c http://nginx.org/download/nginx-1.19.6.tar.gz
yum -y install gcc zlib zlib-devel pcre-devel openssl openssl-devel
mkdir -p /data/nginx/logs && cd /data/nginx
10.2 Install nginx
tar -xf /opt/src/nginx-1.19.6.tar.gz -C /data/nginx/
cd nginx-1.19.6/

Configure and build:
 ./configure --prefix=/data/nginx --with-http_realip_module --with-http_stub_status_module --with-http_ssl_module --with-http_addition_module --with-http_flv_module --with-http_gzip_static_module --with-http_sub_module --with-http_dav_module --with-http_v2_module
 make && make install
 echo 'export PATH=/data/nginx/sbin:$PATH' >> /etc/profile
 source /etc/profile
10.3 Start nginx and set up the configuration
Start:
nginx


mkdir /data/nginx/conf/conf.d
Edit the main configuration file:
vim /data/nginx/conf/nginx.conf
include /data/nginx/conf/conf.d/*.conf;    # add this line in the http block, right after default_type

Check the syntax and reload:
nginx -t
nginx -s reload


10.4 Add a systemd service
[root@elk:/etc/systemd/system]# vim nginx.service
[Unit]
Description=nginx - high performance web server
Documentation=http://nginx.org/en/docs/
After=network-online.target remote-fs.target nss-lookup.target

[Service]
Type=forking
ExecStart=/data/nginx/sbin/nginx -c /data/nginx/conf/nginx.conf
ExecReload=/data/nginx/sbin/nginx -s reload
ExecStop=/data/nginx/sbin/nginx -s stop
PrivateTmp=true
Restart=on-failure

[Install]
WantedBy=multi-user.target

systemctl daemon-reload
systemctl enable nginx.service
systemctl start nginx.service
systemctl status nginx.service
10.5 Create the HTTP basic-auth username and password
mkdir -p /etc/nginx/passwd/
cd /etc/nginx/passwd/
touch kibana.passwd
yum -y install httpd-tools
htpasswd -c -b /etc/nginx/passwd/kibana.passwd kibana mapple
10.6 Modify the nginx configuration
[root@elk:/data/nginx/conf]# vim nginx.conf
#user  nobody;
worker_processes  4;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid        logs/nginx.pid;


events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;
    include /data/nginx/conf/conf.d/*.conf;
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    server {
        listen       80;
        server_name  localhost;

        charset utf-8;

        access_log  /data/nginx/logs/access.log  main;
        error_log   /data/nginx/logs/error.log;

        auth_basic "Kibana Auth";
        auth_basic_user_file /etc/nginx/passwd/kibana.passwd;

        location / {
            proxy_pass  http://192.168.20.100:5601;
            proxy_redirect      off;
        }

        #error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }

        # proxy the PHP scripts to Apache listening on 127.0.0.1:80
        #
        #location ~ \.php$ {
        #    proxy_pass   http://127.0.0.1;
        #}

        # pass the PHP scripts to FastCGI server listening on 127.0.0.1:9000
        #
        #location ~ \.php$ {
        #    root           html;
        #    fastcgi_pass   127.0.0.1:9000;
        #    fastcgi_index  index.php;
        #    fastcgi_param  SCRIPT_FILENAME  /scripts$fastcgi_script_name;
        #    include        fastcgi_params;
        #}

        # deny access to .htaccess files, if Apache's document root
        # concurs with nginx's one
        #
        #location ~ /\.ht {
        #    deny  all;
        #}
    }

    # another virtual host using mix of IP-, name-, and port-based configuration
    #
    #server {
    #    listen       8000;
    #    listen       somename:8080;
    #    server_name  somename  alias  another.alias;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}


    # HTTPS server
    #
    #server {
    #    listen       443 ssl;
    #    server_name  localhost;

    #    ssl_certificate      cert.pem;
    #    ssl_certificate_key  cert.key;

    #    ssl_session_cache    shared:SSL:1m;
    #    ssl_session_timeout  5m;

    #    ssl_ciphers  HIGH:!aNULL:!MD5;
    #    ssl_prefer_server_ciphers  on;

    #    location / {
    #        root   html;
    #        index  index.html index.htm;
    #    }
    #}

}
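
After reloading, a quick check that the proxy and the basic auth behave as expected, using the kibana account created in 10.5:

nginx -t && nginx -s reload
curl -I http://192.168.20.100/                     # expect HTTP 401 without credentials
curl -I -u kibana:mapple http://192.168.20.100/    # expect a response from Kibana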



11. ELK troubleshooting notes

11.1 ELK issue note 1

The following error appeared:

[2021-01-06T14:53:52,124][WARN ][logstash.outputs.elasticsearch][main][69436b672085ad5f978150d20dc8e1d78f9e6c726213022bea68bc207e9d0256]
Could not index event to Elasticsearch. {:status=>400,
:action=>["index", {:_id=>nil, :_index=>"filebeat-2021.01",
:routing=>nil, :_type=>"_doc"}, #LogStash::Event:0x22e04167],
:response=>{"index"=>{"_index"=>"filebeat-2021.01", "_type"=>"_doc",
"_id"=>"4SF51nYBpgdcB08MJLE5", "status"=>400,
"error"=>{"type"=>"illegal_argument_exception", "reason"=>"Number of
documents in the index can't exceed [2147483519]"}}}}

Cause: the number of documents in a shard exceeded the roughly 2.1 billion limit.

This is a Lucene limit: each shard can hold at most about 2^31 documents (2,147,483,519 in practice).

The per-shard document counts can be checked via http://192.168.20.100:9200/_cat/shards

Analysis: the limit applies per shard, not per index, so hitting it usually means the index's shard settings were not chosen well.

Fix: switch writes to a new index, and adjust the index template so the number of primary shards is set sensibly.
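
The check mentioned above, as a command (docs is the per-shard document count, sorted descending):

curl 'http://192.168.20.100:9200/_cat/shards?v&h=index,shard,prirep,docs,store&s=docs:desc'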

11.2 ELK issue note 2

On Elasticsearch memory usage growing continuously

Scenario: after running for a long time, the process RES keeps growing. Elasticsearch maps Lucene index files into memory with mmap, so when the ES process reads the index files, that data occupies memory outside the heap. The memory used for these Lucene files belongs neither to the JVM heap nor to the off-heap memory capped by MaxDirectMemorySize, so it is not bounded by the ES process settings, which is why RES keeps climbing. If this is a concern, the store type can be switched from mmapfs to niofs, although the official recommendation for 64-bit Linux is still mmapfs. The change looks like this:

vim config/elasticsearch.yml
index.store.type: niofs
node.store.allow_mmap: false

Analysis

Open the memory map file /proc/<ES pid>/smaps

[root@elk:/proc/2216]# less smaps
... (omitted)

1000800000-17c0a80000 rw-p 00000000 00:00 0
Size:           32508416 kB      (32508416kb/1024/1024=31G)
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:            32508416 kB
Pss:            32508416 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:  32508416 kB
Referenced:     32508416 kB
Anonymous:      32508416 kB
LazyFree:              0 kB
AnonHugePages:  32507904 kB
ShmemPmdMapped:        0 kB
FilePmdMapped:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:         32508416 kB
THPeligible:    1
VmFlags: rd wr mr mw me lo ac

7e45639e8000-7e45a1953000 r--s 100000000 08:10 2147484179                /iscsi/elasticsearch/data/nodes/0/indices/WpsCXutCTgizaTUPcEw2QQ/0/index/_pvow.cfs
Size:            1015212 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:                6644 kB
Pss:                6644 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:      6644 kB
Private_Dirty:         0 kB
Referenced:         6644 kB
Anonymous:             0 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
FilePmdMapped:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
Locked:                0 kB
THPeligible:    0
VmFlags: rd mr me ms
... (omitted)

Explanation of the fields:
1. 7e45639e8000-7e45a1953000 - start and end addresses of the virtual memory segment
2. r--s - permissions of the segment; the last character is p for private or s for shared
3. 100000000 - offset of the segment within the mapped file
4. 08:10 - major and minor device numbers of the file
5. 2147484179 - inode number of the mapped file
6. /iscsi/elasticsearch/data/nodes/0/indices/WpsCXutCTgizaTUPcEw2QQ/0/index/_pvow.cfs - name of the mapped file; entries suffixed with (deleted) are in-memory data that can be reclaimed
7. Size - virtual size used by the process, not necessarily backed by physical memory (VSS)
8. Rss - memory actually resident (usable without a page fault)
9. Pss - proportionally shared memory (memory shared with other processes, e.g. mmapped files, is divided among them)
10. Shared_Clean - unmodified pages shared with other processes
11. Shared_Dirty - modified pages shared with other processes
12. Private_Clean - unmodified private pages
13. Private_Dirty - modified private pages
14. Referenced - memory marked as accessed or referenced
15. Anonymous - memory not backed by a file
16. Swap - data sitting in the swap partition (with limited physical memory, part may be in RAM and part in swap); note: avoid swap on ELK hosts, it hurts performance
17. KernelPageSize - kernel page size
18. MMUPageSize - MMU page size, normally the same as the kernel page size


From the fields above, the RES column in top is the memory the process actually occupies; it is the sum of all the Rss values in smaps,
which can be computed with: grep Rss smaps |awk 'BEGIN {sum = 0;} {sum += $2} END{print sum}'
[root@elk:/proc/2216]# grep Rss smaps |awk 'BEGIN {sum = 0;} {sum += $2} END{print sum}'
38282436

In the smaps file, the mapping for the 31 GB heap looks like this:
1000800000-17c0a80000 rw-p 00000000 00:00 0
Size:           32508416 kB      (32508416kb/1024/1024=31G)
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:            32508416 kB

Its Rss is exactly the 31 GB heap.

The remainder is the off-heap memory capped by MaxDirectMemorySize plus the memory used for mapped Lucene files: (38282436 - 32508416) / 1024 / 1024 ≈ 5.5 GB

Postscript: refined per-host system firewall rules (optional)


Example: the iscsi host
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.20.100" port port="9092" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.20.100" port port="2181" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.10.5" port port="2181" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.10.5" port port="9092" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.7.150" port port="9092" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.7.156" port port="9092" protocol="tcp" accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address="192.168.7.162" port port="9092" protocol="tcp" accept'


Note: remove the previously opened ports/services:
firewall-cmd --permanent --remove-port=9092/tcp
firewall-cmd --permanent --remove-port=2181/tcp

firewall-cmd --reload

[root@iscsi:/usr/local/kafka]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: em1 em2
  sources: 
  services: dhcpv6-client ssh
  ports: 3260/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: time-exceeded
  rich rules: 
	rule family="ipv4" source address="192.168.20.123" port port="9100" protocol="tcp" accept
	rule family="ipv4" source address="192.168.20.100" port port="9092" protocol="tcp" accept
	rule family="ipv4" source address="192.168.20.100" port port="2181" protocol="tcp" accept
	rule family="ipv4" source address="192.168.10.5" port port="2181" protocol="tcp" accept
	rule family="ipv4" source address="192.168.10.5" port port="9092" protocol="tcp" accept
	rule family="ipv4" source address="192.168.7.150" port port="9092" protocol="tcp" accept
	rule family="ipv4" source address="192.168.7.156" port port="9092" protocol="tcp" accept


Example: the elk host
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=192.168.20.100 port protocol=tcp port=9200 accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=192.168.100.10 port protocol=tcp port=9200 accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=192.168.10.5 port protocol=tcp port=9200 accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=192.168.20.4 port protocol=tcp port=9200 accept'
firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address=192.168.10.5 port protocol=tcp port=5601 accept'

Note: remove the previously opened ports/services:
firewall-cmd --permanent --remove-port=9200/tcp
firewall-cmd --permanent --remove-port=5601/tcp
firewall-cmd --reload

[root@elk:/data/nginx/conf]# firewall-cmd --list-all
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: em1 em2
  sources: 
  services: dhcpv6-client http ssh
  ports: 9111/tcp
  protocols: 
  masquerade: no
  forward-ports: 
  source-ports: 
  icmp-blocks: time-exceeded
  rich rules: 
	rule family="ipv4" source address="192.168.20.123" port port="9100" protocol="tcp" accept
	rule family="ipv4" source address="192.168.20.100" port port="9200" protocol="tcp" accept
	rule family="ipv4" source address="192.168.100.10" port port="9200" protocol="tcp" accept
	rule family="ipv4" source address="192.168.10.5" port port="9200" protocol="tcp" accept
	rule family="ipv4" source address="192.168.20.4" port port="9200" protocol="tcp" accept
	rule family="ipv4" source address="192.168.10.5" port port="5601" protocol="tcp" accept


For deploying a logging system on Kubernetes, see my other post:

06 Deploying a logging system
