Integrating Huawei Cloud / Alibaba Cloud RDS Monitoring Metrics into Zabbix/Grafana

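I. Requirements analysis

1. Monitoring requirement: integrate the monitoring metrics of the cloud vendors' RDS databases into the company's own monitoring system.
2. Integration method: call the APIs provided by the vendors.
3. Similarities and differences between the Huawei Cloud and Alibaba Cloud RDS monitoring metrics.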

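## Note:
The API supports two authentication methods; choose either one.
Token authentication: requests are authenticated with a token.
AK/SK authentication: requests are signed with an AK (Access Key ID) and an SK (Secret Access Key). AK/SK authentication is recommended because it is more secure than token authentication. [This article uses AK/SK in its examples.]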
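
Whichever method is chosen, the AK/SK pair is best kept out of the script itself. A minimal sketch, assuming the keys are supplied through two environment variables; the variable names HUAWEICLOUD_AK and HUAWEICLOUD_SK and the helper load_credentials are this example's own invention, not something the SDK requires:

```python
import os


def load_credentials(environ=os.environ):
    """Read the AK/SK pair from the environment (variable names are hypothetical)."""
    ak = environ.get("HUAWEICLOUD_AK")
    sk = environ.get("HUAWEICLOUD_SK")
    if not ak or not sk:
        raise RuntimeError("set HUAWEICLOUD_AK and HUAWEICLOUD_SK before running")
    return ak, sk


# Demonstration with a stand-in environment; real code would call load_credentials().
ak, sk = load_credentials({"HUAWEICLOUD_AK": "xxxxxxxx", "HUAWEICLOUD_SK": "xxxxxxxx"})
print("AK loaded:", bool(ak), "SK loaded:", bool(sk))
```

The returned pair can then be assigned to whatever object signs the requests, so the keys never appear in version control.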

II. Huawei Cloud RDS monitoring

1. Supported monitoring metrics: https://support.huaweicloud.com/usermanual-rds/rds_06_0001.html
	For MySQL, dim.0=rds_cluster_id; for PostgreSQL, dim.0=postgresql_cluster_id
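
Because the dimension name differs per engine, it can help to build the dim.0 value in one place. A small sketch: the two dimension names come from the note above, while the engine labels used as dictionary keys and the helper name build_dimension are assumptions of this example.

```python
# Dimension names as given in the note above; the engine labels are this sketch's convention.
DIMENSION_NAMES = {
    "mysql": "rds_cluster_id",
    "postgresql": "postgresql_cluster_id",
}


def build_dimension(engine, instance_id):
    """Return the value for the dim.0 query parameter, in the form '<name>,<id>'."""
    try:
        name = DIMENSION_NAMES[engine.lower()]
    except KeyError:
        raise ValueError("unsupported engine: %r" % engine)
    return "%s,%s" % (name, instance_id)


print(build_dimension("mysql", "xxxxxxxxxxx"))
print(build_dimension("postgresql", "xxxxxxxxxxx"))
```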

2. Services that support monitoring: https://support.huaweicloud.com/api-ces/ces_03_0059.html

3. Download the Python SDK: https://obs.cn-north-1.myhuaweicloud.com/apig-sdk/APIGW-python-sdk.zip

4. Monitoring API description: https://support.huaweicloud.com/api-ces/ces_03_0033.html

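5. Selected metrics worth monitoring:
    rds001_cpu_util           CPU usage
    rds002_mem_util           Memory usage
    rds003_iops               IOPS: average number of I/O requests processed per unit of time
    rds006_conn_count         Total number of database connections
    rds007_conn_active_count  Number of currently active connections
    rds008_qps                QPS: number of SQL queries, including stored procedures, in queries per second
    rds009_tps                TPS: number of transactions executed, both committed and rolled back, in transactions per second
    rds039_disk_util          Disk utilization
    rds072_conn_usage         Connection usage
    rds074_slow_queries       Number of slow query log entries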
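
A stray space or a typo in a metric name only shows up as an empty or failed API response, so it may be worth checking names locally first. A sketch, with the table copied from the list above; the names METRICS and check_metric are hypothetical:

```python
# Metric names and descriptions copied from the list above.
METRICS = {
    "rds001_cpu_util": "CPU usage",
    "rds002_mem_util": "Memory usage",
    "rds003_iops": "IOPS",
    "rds006_conn_count": "Total number of database connections",
    "rds007_conn_active_count": "Number of currently active connections",
    "rds008_qps": "QPS",
    "rds009_tps": "TPS",
    "rds039_disk_util": "Disk utilization",
    "rds072_conn_usage": "Connection usage",
    "rds074_slow_queries": "Number of slow query log entries",
}


def check_metric(metric_name):
    """Return the metric name without surrounding whitespace, or raise ValueError."""
    name = metric_name.strip()
    if name not in METRICS:
        raise ValueError("unknown metric: %r" % metric_name)
    return name


for name in sorted(METRICS):
    print("%-26s %s" % (name, METRICS[name]))
```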


6. A simple Python script (main section)
# coding=utf-8
import requests
import time
import sys
from apig_sdk import signer

if __name__ == '__main__':
    sig = signer.Signer()
    # Set the AK/SK used to sign and authenticate the request.
    sig.Key = "xxxxxxxx"     # AK (Access Key ID)
    sig.Secret = "xxxxxxxx"  # SK (Secret Access Key)
    project_id = "xxxxxxxx"  # project ID, obtained from the console

    # Set the namespace and the metric name.
    name_space = "SYS.RDS"
    metic_name = "rds001_cpu_util"
    rds_cluster_id = "xxxxxxxxxxx"  # obtained from the console

    # Query window in milliseconds: from 20 minutes ago until now.
    ts_last = int(time.time()) * 1000
    ts_first = ts_last - 1200 * 1000

    # Build the Cloud Eye (CES) metric-data query from the variables defined above.
    r = signer.HttpRequest("GET", "https://ces.cn-east-2.myhuaweicloud.com/V1.0/%s/metric-data?namespace=%s&metric_name=%s&dim.0=rds_cluster_id,%s&from=%s&to=%s&period=1200&filter=average" % (project_id, name_space, metic_name, rds_cluster_id, ts_first, ts_last))
    #r = signer.HttpRequest("GET", "https://rds.cn-east-2.myhuaweicloud.com/v3/826025ab494546f9bafbca01a811a9ef/datastores/mysql")
    r.headers = {"content-type": "application/json"}
    r.body = ""
    sig.Sign(r)

    #print(r.headers["X-Sdk-Date"])
    #print(r.headers["Authorization"])
    resp = requests.request(r.method, r.scheme + "://" + r.host + r.uri, headers=r.headers, data=r.body)

    print(resp.content.decode("utf-8"))


    # Parse the JSON response and take the most recent datapoint.
    data_dic = resp.json()["datapoints"][-1]
    print(data_dic, type(data_dic))
    print(data_dic['average'])
    print("%s current value: %s %s" % (metic_name, data_dic['average'], data_dic['unit']))

    # Output:
    # {"datapoints":[{"average":4.6,"timestamp":1609741200000,"unit":"%"},{"average":4.8,"timestamp":1609742400000,"unit":"%"}],"metric_name":"rds001_cpu_util"}
    # {'average': 4.8, 'timestamp': 1609742400000, 'unit': '%'} <class 'dict'>
    # 4.8
    # rds001_cpu_util current value: 4.8 %
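
The time-window arithmetic and the response handling can also live in two small helpers, which makes them easy to check offline. A sketch using only the standard library and the sample response printed above; query_window and latest_datapoint are names chosen for this example, not part of the SDK:

```python
import json
import time


def query_window(minutes=20, now=None):
    """Return (from_ms, to_ms) covering the last `minutes` minutes."""
    now = int(time.time()) if now is None else int(now)
    return (now - minutes * 60) * 1000, now * 1000


def latest_datapoint(body):
    """Return the newest datapoint of a metric-data response body, or None if empty."""
    datapoints = json.loads(body).get("datapoints", [])
    if not datapoints:
        return None
    return max(datapoints, key=lambda point: point["timestamp"])


# Sample response taken from the output shown above.
sample = ('{"datapoints":[{"average":4.6,"timestamp":1609741200000,"unit":"%"},'
          '{"average":4.8,"timestamp":1609742400000,"unit":"%"}],'
          '"metric_name":"rds001_cpu_util"}')
point = latest_datapoint(sample)
print(point["average"], point["unit"])
```

Selecting by timestamp instead of by position keeps the result correct however many datapoints the window happens to return.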


7. Integrate the SDK code into Zabbix/Grafana
Pass the metric name in as a variable by setting metic_name=sys.argv[1]
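
One possible shape for this, as a rough sketch: wrap the request from step 6 in a function and print nothing but the value, since a Zabbix item expects a single value on standard output. The function fetch_average, the script name huawei_rds.py, and the item key huawei.rds[*] are all hypothetical.

```python
import sys


def fetch_average(metric_name):
    """Hypothetical stub standing in for the signed request from step 6."""
    raise NotImplementedError("send the signed metric-data request here")


def main(argv, fetch=fetch_average):
    """Print the value of the metric named in argv[1]; return an exit code."""
    if len(argv) != 2:
        sys.stderr.write("usage: %s <metric_name>\n" % argv[0])
        return 2
    print(fetch(argv[1].strip()))
    return 0


# Demonstration with a fake fetcher. A real script would instead end with
#     sys.exit(main(sys.argv))
# and could be registered in zabbix_agentd.conf along the lines of
#     UserParameter=huawei.rds[*],/usr/bin/env python3 /etc/zabbix/script/huawei_rds.py $1
main(["huawei_rds.py", "rds001_cpu_util"], fetch=lambda name: 4.8)
```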

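III. Alibaba Cloud RDS monitoring

1. Purpose: Zabbix automatically discovers and monitors Alibaba Cloud RDS MySQL databases through the Alibaba Cloud API.
2. Notes:
  The script collects the RDS aliases.
  Do not leave the default alias.
  Do not use Chinese characters in aliases (Zabbix does not recognize them).
  Be sure to use aliyun-python-sdk-core==2.3.5; newer versions of the SDK have a bug.

3. Environment: Python 2.7
Install the modules (datetime is part of the standard library and needs no installation):
/usr/bin/env pip2.7 install aliyun-python-sdk-core==2.3.5 aliyun-python-sdk-rds==2.1.4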

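4. Usage:

    Obtain an AccessKey from the Alibaba Cloud console and set ID and Secret in the scripts.
    Set the region, RegionId.
    Place both scripts in the following directory and make them executable:
    /etc/zabbix/script
    chmod +x /etc/zabbix/script/*

    Edit zabbix_agentd.conf and add the following:
  #rds
  UserParameter=rds.discovery,/usr/bin/env python2.7 /etc/zabbix/script/discovery_rds.py
  UserParameter=check.rds[*],/usr/bin/env python2.7 /etc/zabbix/script/check_rds.py $1 $2 $3
    Restart zabbix-agent.
    Import the template in the Zabbix console and link it to the host.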
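
The check.rds[*] entry is a flexible user parameter: Zabbix replaces $1, $2 and $3 with the comma-separated parameters given in the item key. The sketch below only imitates that substitution, to show which arguments check_rds.py ends up receiving. The instance ID rm-xxxxxxxx is a placeholder, the helper name is made up, and Zabbix's quoting rules are ignored.

```python
def expand_user_parameter(command, item_key):
    """Substitute $1, $2, ... in `command` with the parameters of `item_key`.

    Simplified: parameters are split on commas; quoted parameters are not handled.
    """
    start = item_key.index("[")
    end = item_key.rindex("]")
    params = [p.strip() for p in item_key[start + 1:end].split(",")]
    for position, value in enumerate(params, 1):
        command = command.replace("$%d" % position, value)
    return command


command = "/usr/bin/env python2.7 /etc/zabbix/script/check_rds.py $1 $2 $3"
print(expand_user_parameter(command, "check.rds[Performance,rm-xxxxxxxx,MySQL_QPS]"))
```

Running the expanded command by hand on the agent host is a quick way to verify the script before importing the template.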


5. Code:

#check_rds.py

#coding=utf-8
from aliyunsdkcore import client
from aliyunsdkrds.request.v20140815 import DescribeResourceUsageRequest,DescribeDBInstancePerformanceRequest
import json,sys,datetime

ID = 'ID'
Secret = 'Secret'
RegionId = 'cn-shenzhen'

clt = client.AcsClient(ID,Secret,RegionId)

Type = sys.argv[1]
DBInstanceId = sys.argv[2]
Key = sys.argv[3]

# The Alibaba Cloud API works in UTC, so build the query window from the current UTC time.
UTC_End = datetime.datetime.utcnow()
UTC_Start = UTC_End - datetime.timedelta(minutes=25)

StartTime = datetime.datetime.strftime(UTC_Start, '%Y-%m-%dT%H:%MZ')
EndTime = datetime.datetime.strftime(UTC_End, '%Y-%m-%dT%H:%MZ')

def GetResourceUsage(DBInstanceId,Key):
    ResourceUsage = DescribeResourceUsageRequest.DescribeResourceUsageRequest()
    ResourceUsage.set_accept_format('json')
    ResourceUsage.set_DBInstanceId(DBInstanceId)
    ResourceUsageInfo = clt.do_action_with_exception(ResourceUsage)
    #print ResourceUsageInfo
    Info = (json.loads(ResourceUsageInfo))[Key]
    print Info

def GetPerformance(DBInstanceId,MasterKey,IndexNum,StartTime,EndTime):
    Performance = DescribeDBInstancePerformanceRequest.DescribeDBInstancePerformanceRequest()
    Performance.set_accept_format('json')
    Performance.set_DBInstanceId(DBInstanceId)
    Performance.set_Key(MasterKey)
    Performance.set_StartTime(StartTime)
    Performance.set_EndTime(EndTime)
    PerformanceInfo = clt.do_action_with_exception(Performance)
    #print PerformanceInfo
    Info = (json.loads(PerformanceInfo))
    Value = Info['PerformanceKeys']['PerformanceKey'][0]['Values']['PerformanceValue'][-1]['Value']
    print str(Value).split('&')[IndexNum]


# Maps each item key to (MasterKey, IndexNum): the performance key to request from
# Alibaba Cloud and the position of the wanted value in its '&'-separated result.
PERFORMANCE_KEYS = {
    "MySQL_NetworkTraffic_In": ("MySQL_NetworkTraffic", 0),       # average inbound traffic per second
    "MySQL_NetworkTraffic_Out": ("MySQL_NetworkTraffic", 1),      # average outbound traffic per second
    "MySQL_QPS": ("MySQL_QPSTPS", 0),                             # SQL statements executed per second
    "MySQL_TPS": ("MySQL_QPSTPS", 1),                             # average transactions per second
    "MySQL_Sessions_Active": ("MySQL_Sessions", 0),               # current active connections
    "MySQL_Sessions_Totle": ("MySQL_Sessions", 1),                # current total connections
    "ibuf_read_hit": ("MySQL_InnoDBBufferRatio", 0),              # InnoDB buffer pool read hit ratio
    "ibuf_use_ratio": ("MySQL_InnoDBBufferRatio", 1),             # InnoDB buffer pool utilization
    "ibuf_dirty_ratio": ("MySQL_InnoDBBufferRatio", 2),           # percentage of dirty blocks in the InnoDB buffer pool
    "inno_data_read": ("MySQL_InnoDBDataReadWriten", 0),          # InnoDB average data read per second
    "inno_data_written": ("MySQL_InnoDBDataReadWriten", 1),       # InnoDB average data written per second
    "ibuf_request_r": ("MySQL_InnoDBLogRequests", 0),             # average reads from the InnoDB buffer pool per second
    "ibuf_request_w": ("MySQL_InnoDBLogRequests", 1),             # average writes to the InnoDB buffer pool per second
    "Innodb_log_write_requests": ("MySQL_InnoDBLogWrites", 0),    # average log write requests per second
    "Innodb_log_writes": ("MySQL_InnoDBLogWrites", 1),            # average physical writes to the log file per second
    "Innodb_os_log_fsyncs": ("MySQL_InnoDBLogWrites", 2),         # average fsync() writes completed to the log file per second
    "tb_tmp_disk": ("MySQL_TempDiskTableCreates", 0),             # temporary tables created on disk while executing statements
    "Key_usage_ratio": ("MySQL_MyISAMKeyBufferRatio", 0),         # MyISAM average Key Buffer utilization per second
    "Key_read_hit_ratio": ("MySQL_MyISAMKeyBufferRatio", 1),      # MyISAM average Key Buffer read hit ratio per second
    "Key_write_hit_ratio": ("MySQL_MyISAMKeyBufferRatio", 2),     # MyISAM average Key Buffer write hit ratio per second
    "myisam_keyr_r": ("MySQL_MyISAMKeyReadWrites", 0),            # MyISAM average reads from the buffer pool per second
    "myisam_keyr_w": ("MySQL_MyISAMKeyReadWrites", 1),            # MyISAM average writes to the buffer pool per second
    "myisam_keyr": ("MySQL_MyISAMKeyReadWrites", 2),              # MyISAM average reads from disk per second
    "myisam_keyw": ("MySQL_MyISAMKeyReadWrites", 3),              # MyISAM average writes to disk per second
    "com_delete": ("MySQL_COMDML", 0),                            # average Delete statements per second
    "com_insert": ("MySQL_COMDML", 1),                            # average Insert statements per second
    "com_insert_select": ("MySQL_COMDML", 2),                     # average Insert_Select statements per second
    "com_replace": ("MySQL_COMDML", 3),                           # average Replace statements per second
    "com_replace_select": ("MySQL_COMDML", 4),                    # average Replace_Select statements per second
    "com_select": ("MySQL_COMDML", 5),                            # average Select statements per second
    "com_update": ("MySQL_COMDML", 6),                            # average Update statements per second
    "inno_row_readed": ("MySQL_RowDML", 0),                       # average rows read from InnoDB tables per second
    "inno_row_update": ("MySQL_RowDML", 1),                       # average rows updated in InnoDB tables per second
    "inno_row_delete": ("MySQL_RowDML", 2),                       # average rows deleted from InnoDB tables per second
    "inno_row_insert": ("MySQL_RowDML", 3),                       # average rows inserted into InnoDB tables per second
    "Inno_log_writes": ("MySQL_RowDML", 4),                       # average physical writes to the log file per second
    "cpuusage": ("MySQL_MemCpuUsage", 0),                         # instance CPU usage (share of the operating system total)
    "memusage": ("MySQL_MemCpuUsage", 1),                         # instance memory usage (share of the operating system total)
    "io": ("MySQL_IOPS", 0),                                      # instance IOPS (I/O requests per second)
    "ins_size": ("MySQL_DetailedSpaceUsage", 0),                  # total space used by the instance
    "data_size": ("MySQL_DetailedSpaceUsage", 1),                 # data space
    "log_size": ("MySQL_DetailedSpaceUsage", 2),                  # log space
    "tmp_size": ("MySQL_DetailedSpaceUsage", 3),                  # temporary space
    "other_size": ("MySQL_DetailedSpaceUsage", 4),                # system space
}

if Type == "Disk":
    GetResourceUsage(DBInstanceId, Key)

elif Type == "Performance" and Key in PERFORMANCE_KEYS:
    MasterKey, IndexNum = PERFORMANCE_KEYS[Key]
    GetPerformance(DBInstanceId, MasterKey, IndexNum, StartTime, EndTime)
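
GetPerformance relies on one performance key returning several values joined by '&', with IndexNum selecting one of them. A sketch of that extraction on a made-up response: only the nesting follows what check_rds.py indexes into, while the dates, numbers, and the helper name extract_value are invented for illustration.

```python
import json


def extract_value(response_body, index):
    """Return field `index` of the newest '&'-separated performance value."""
    info = json.loads(response_body)
    values = info['PerformanceKeys']['PerformanceKey'][0]['Values']['PerformanceValue']
    return str(values[-1]['Value']).split('&')[index]


# Made-up response in the shape that check_rds.py indexes into.
sample = json.dumps({"PerformanceKeys": {"PerformanceKey": [{
    "Key": "MySQL_QPSTPS",
    "Values": {"PerformanceValue": [
        {"Date": "2021-01-04T06:35:00Z", "Value": "10.2&1.5"},
        {"Date": "2021-01-04T06:40:00Z", "Value": "12.5&3.2"},
    ]},
}]}})

print(extract_value(sample, 0))  # first field, requested as MySQL_QPS
print(extract_value(sample, 1))  # second field, requested as MySQL_TPS
```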

#discovery_rds.py

#coding=UTF-8
from aliyunsdkcore import client
from aliyunsdkrds.request.v20140815 import DescribeDBInstancesRequest
import json

ID = 'ID'
Secret = 'Secret'
RegionId = 'cn-shenzhen'

clt = client.AcsClient(ID,Secret,RegionId)

DBInstanceIdList = []
DBInstanceIdDict = {}
ZabbixDataDict = {}
def GetRdsList():
    RdsRequest = DescribeDBInstancesRequest.DescribeDBInstancesRequest()
    RdsRequest.set_accept_format('json')
    #RdsInfo = clt.do_action(RdsRequest)
    RdsInfo = clt.do_action_with_exception(RdsRequest)
    for RdsInfoJson in (json.loads(RdsInfo))['Items']['DBInstance']:
        DBInstanceIdDict = {}
        try:
            DBInstanceIdDict["{#DBINSTANCEID}"] = RdsInfoJson['DBInstanceId']
            DBInstanceIdDict["{#DBINSTANCEDESCRIPTION}"] = RdsInfoJson['DBInstanceDescription']
            DBInstanceIdList.append(DBInstanceIdDict)
        except Exception as e:
            print("%s: %s" % (type(e).__name__, e))
            print("Please check the RDS alias! The alias must not be the same as DBInstanceId!")
            


GetRdsList()
ZabbixDataDict['data'] = DBInstanceIdList
print json.dumps(ZabbixDataDict)
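
The discovery script prints a JSON document whose data list carries one object of low-level discovery macros per instance. A sketch of that transformation on invented instance data (the IDs and descriptions are placeholders, and build_discovery is a name chosen for this example):

```python
import json


def build_discovery(instances):
    """Turn a list of RDS instance records into Zabbix low-level discovery JSON."""
    data = []
    for instance in instances:
        data.append({
            "{#DBINSTANCEID}": instance["DBInstanceId"],
            "{#DBINSTANCEDESCRIPTION}": instance["DBInstanceDescription"],
        })
    return json.dumps({"data": data}, sort_keys=True)


# Invented records holding the two fields the discovery script reads.
instances = [
    {"DBInstanceId": "rm-xxxxxxxx", "DBInstanceDescription": "orders-db"},
    {"DBInstanceId": "rm-yyyyyyyy", "DBInstanceDescription": "users-db"},
]
print(build_discovery(instances))
```

Each {#DBINSTANCEID} value can then fill the second parameter of a check.rds[...] item prototype.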

Acknowledgements: thanks to https://github.com/XWJR-Ops/zabbix-RDS-monitor for the Alibaba Cloud code, kept here for future use with Alibaba Cloud.
