[Carefully Compiled] DB2 + HADR + TSA High-Availability Configuration (Practical)

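Contents

Preface

I. Concept Overview

II. Configuration Procedure

2.1. Server environment

2.2. Installing TSA

2.3. Configuring TSA

2.3.1. Running db2haicu on the standby

2.3.2. Running db2haicu on the primary

2.4. Checking the TSA cluster status

III. Common Issues

Issue 1: How do DB2 HADR and TSAMP failover work?

Issue 2: db2haicu fails at the end of resource creation on the standby (hostname mismatch)

Issue 3: db2haicu fails at the end of resource creation on the standby (HA scripts not found)

Issue 4: preprpnode reports an error

Issue 5: Abnormal cluster node state (usually after an abnormal failover)

IV. Common Commands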


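Preface

I registered on CSDN back in 2012, mainly to download resources while I was a student. After that I left the account idle, because I had no technical depth to speak of and nothing worth writing. Now, after ten years in IT operations, I have accumulated at least a little. I looked around several platforms and still find CSDN's technical atmosphere the most genuine: a community of people who love technology. So I have come back to CSDN to share what I know and improve together with everyone. Another year has flown by and 2023 is almost over. When I saw this topic campaign today, I wanted to share something. Below is a technical document on the DB2 database that I hope will be useful. With the move away from IOE (IBM, Oracle, EMC) and the strong push for domestic IT innovation, DB2 is in gradual decline, but I would still like to share it; we are only here to talk technology.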

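I. Concept Overview

TSA (Tivoli System Automation for Multiplatforms, also abbreviated SA MP or TSAMP) is high-availability cluster software built on IBM's RSCT technology. It ships as the SA MP component on the DB2 installation media and can be installed together with DB2. TSA is usually paired with DB2 HADR to provide automatic failover and a service IP address.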

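II. Configuration Procedure

2.1. Server environment

Operating system: RedHat 6.5

Database: DB2 10.5.0.7

Host A: 192.168.0.11  Database: mytsa  Hostname: tsa-svra

Host B: 192.168.0.12  Database: mytsa  Hostname: tsa-svrb

Service IP: 192.168.0.10

Quorum (tiebreaker) device IP: 192.168.0.254

No dedicated quorum (tiebreaker) device is available, so the gateway is used here for now.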

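2.2. Installing TSA

1. The DB2 installation package bundles TSA (/opt/server_t/db2/linuxamd64/tsamp). Before installing, run prereqSAM to check prerequisites. It reports any missing dependency packages, which can be installed from the ISO of the matching OS release. Then run installSAM to install TSA.

2. When the installation finishes, it prompts you to add a setting to the profile

installSAM: Warning: Must set CT_MANAGEMENT_SCOPE=2
Add export CT_MANAGEMENT_SCOPE=2 to /root/.bash_profile (important!)
Run source .bash_profile to apply the setting

3. Import the TSA license

# samlicm -i xxx.lic
# samlicm -s    (shows the license status)

Note: With DB2 10.5.0.8 and later, TSA is installed by default when the DB2 server is installed. With versions earlier than DB2 10.5.0.8, TSA must be installed separately by hand.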
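The profile step is easy to repeat by accident when the installation is rerun. A minimal sketch, assuming a helper named ensure_ct_scope (my own name, not part of TSA), that adds the export line only when it is missing:

```shell
# Append the export line to a profile file only if it is not already there.
# Argument 1: path to the profile file (the article uses /root/.bash_profile).
# ensure_ct_scope is a hypothetical helper name used for illustration.
ensure_ct_scope() {
    profile=$1
    line='export CT_MANAGEMENT_SCOPE=2'
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF "$line" "$profile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$profile"
    fi
}

# Usage (as root):
#   ensure_ct_scope /root/.bash_profile
#   . /root/.bash_profile
#   echo "$CT_MANAGEMENT_SCOPE"
```

Running it twice leaves a single export line in the file, so it is safe to keep in a setup script.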

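2.3. Configuring TSA

1. Add the hostnames to /etc/hosts on both the primary and the standby (each entry is the IP address followed by the hostname)

echo "192.168.0.11   tsa-svra" >> /etc/hosts
echo "192.168.0.12   tsa-svrb" >> /etc/hosts

2. Prepare the cluster nodes

# /usr/sbin/rsct/install/bin/recfgct    (reinitializes ct_node_id to avoid duplicate node IDs caused by cloning virtual machines; duplicates also affect dynamic logical partitioning (DLPAR))

# /usr/sbin/rsct/bin/preprpnode tsa-svra tsa-svrb    (sets up communication between the primary and the standby; if this step is skipped the nodes cannot communicate)

3. Install the TSA failover scripts

# $DB2DIR/install/tsamp/db2cptsa    (installs the scripts found under /opt/ibm/db2/V10.5/ha/tsa)

4. As the database instance user, configure the cluster with db2haicu. Run it on the standby first and then on the primary (remember: standby before primary).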
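The /etc/hosts step in item 1 can be made repeatable. The sketch below assumes a helper named ensure_hosts_entry (a name I made up) that writes the entry in "IP hostname" order and skips hostnames that already have one:

```shell
# Add "IP hostname" to a hosts file unless the hostname already has an entry.
# Arguments: hosts file, IP address, hostname.
# ensure_hosts_entry is a hypothetical helper name used for illustration.
ensure_hosts_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for the hostname as a whole field (2nd or later) on a non-comment line.
    if awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }' "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s   %s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage (as root, on both machines):
#   ensure_hosts_entry /etc/hosts 192.168.0.11 tsa-svra
#   ensure_hosts_entry /etc/hosts 192.168.0.12 tsa-svrb
```

Matching on the hostname rather than the whole line means an existing entry with a different IP is left alone, so check the file by hand if an address has changed.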

2.3.1. Running db2haicu on the standby

Welcome to the DB2 High Availability Instance Configuration Utility (db2haicu).
 
You can find detailed diagnostic information in the DB2 server diagnostic log file called db2diag.log. Also, you can use the utility called db2pd to query the status of the cluster domains you create.
 
For more information about configuring your clustered environment using db2haicu, see the topic called 'DB2 High Availability Instance Configuration Utility (db2haicu)' in the DB2 Information Center.
 
db2haicu determined the current DB2 database manager instance is 'tsainst'. The cluster configuration that follows will apply to this instance.
 
db2haicu is collecting information on your current setup. This step may take some time as db2haicu will need to activate all databases for the instance to discover all paths ...
When you use db2haicu to configure your clustered environment, you create cluster domains. For more information, see the topic 'Creating a cluster domain with db2haicu' in the DB2 Information Center. db2haicu is searching the current machine for an existing active cluster domain ...
db2haicu did not find a cluster domain on this machine. db2haicu will now query the system for information about cluster nodes to create a new cluster domain ...
 
db2haicu did not find a cluster domain on this machine. To continue configuring your clustered environment for high availability, you must create a cluster domain; otherwise, db2haicu will exit.
 
Create a domain and continue? [1]
1. Yes
2. No
1
Create a unique name for the new domain:   # Name the domain; choose a name that suits your environment
db2_tsainst_mytsa
Nodes must now be added to the new domain.
How many cluster nodes will the domain 'db2_tsainst_mytsa' contain?  # Number of nodes in the domain
2
Enter the host name of a machine to add to the domain:  # Enter the node name; it must match the name used with preprpnode in section 2.3 and the hostname in the HADR configuration
tsa-svra
Enter the host name of a machine to add to the domain:
tsa-svrb   
db2haicu can now create a new domain containing the 2 machines that you specified. If you choose not to create a domain now, db2haicu will exit.
 
Create the domain now? [1]
1. Yes
2. No
1
Creating domain 'db2_tsainst_mytsa' in the cluster ...
Creating domain 'db2_tsainst_mytsa' in the cluster was successful.
You can now configure a quorum device for the domain. For more information, see the topic "Quorum devices" in the DB2 Information Center. If you do not configure a quorum device for the domain, then a human operator will have to manually intervene if subsets of machines in the cluster lose connectivity.
 
Configure a quorum device for the domain called 'db2_tsainst_mytsa'? [1]  # Create the quorum (tiebreaker) device
1. Yes
2. No
 
The following is a list of supported quorum device types:
  1. Network Quorum
Enter the number corresponding to the quorum device type to be used: [1]
1
Specify the network address of the quorum device:   # Quorum device IP; the gateway IP is used here
192.168.0.254
Configuring quorum device for domain 'db2_tsainst_mytsa' ...
Configuring quorum device for domain 'db2_tsainst_mytsa' was successful.
The cluster manager found the following total number of network interface cards on the machines in the cluster domain: '2'.  You can add a network to your cluster domain using the db2haicu utility.
 
Create networks for these network interface cards? [1]   # Add the NICs to the cluster for communication and heartbeat detection
1. Yes
2. No
1
Enter the name of the network for the network interface card: 'eth1' on cluster node: 'tsa-svra'
1. Create a new public network for this network interface card.
2. Create a new private network for this network interface card.
3. Skip this step.
Enter selection:
1
Are you sure you want to add the network interface card 'eth1' on cluster node 'tsa-svra' to the network 'db2_public_network_0'? [1]
1. Yes
2. No
1
Adding network interface card 'eth1' on cluster node 'tsa-svra' to the network 'db2_public_network_0' ...
Adding network interface card 'eth1' on cluster node 'tsa-svra' to the network 'db2_public_network_0' was successful.
Enter the name of the network for the network interface card: 'eth1' on cluster node: 'tsa-svrb'
1. db2_public_network_0
2. Create a new public network for this network interface card.
3. Create a new private network for this network interface card.
4. Skip this step.
Enter selection:  # Select 1 again here so that this NIC joins the network created above
1
Are you sure you want to add the network interface card 'eth1' on cluster node 'tsa-svrb' to the network 'db2_public_network_0'? [1]
1. Yes
2. No
1
Adding network interface card 'eth1' on cluster node 'tsa-svrb' to the network 'db2_public_network_0' ...
Adding network interface card 'eth1' on cluster node 'tsa-svrb' to the network 'db2_public_network_0' was successful.
Retrieving high availability configuration parameter for instance 'tsainst' ...
The cluster manager name configuration parameter (high availability configuration parameter) is not set. For more information, see the topic "cluster_mgr - Cluster manager name configuration parameter" in the DB2 Information Center. Do you want to set the high availability configuration parameter?
The following are valid settings for the high availability configuration parameter:
  1.TSA
  2.Vendor
Enter a value for the high availability configuration parameter: [1]
1
Setting a high availability configuration parameter for instance 'tsainst' to 'TSA'.
Adding DB2 database partition '0' to the cluster ...
Adding DB2 database partition '0' to the cluster was successful.
Do you want to validate and automate HADR failover for the HADR database 'TSADB'? [1]
1. Yes
2. No
1
Adding HADR database 'TSADB' to the domain ...
HADR database 'TSADB' has been determined to be valid for high availability. However, the database cannot be added to the cluster from this node because db2haicu detected this node is the standby for HADR database 'TSADB'. Run db2haicu on the primary for HADR database 'TSADB' to configure the database for automated failover.
All cluster configurations have been completed successfully. db2haicu exiting ...
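The transcript above ends with db2haicu noticing that this node is the standby. Before running db2haicu on either node it can help to confirm the role yourself. A small sketch, assuming the "HADR_ROLE = VALUE" line layout that db2pd prints on this DB2 release; hadr_role is my own helper name:

```shell
# Print the HADR role (for example PRIMARY or STANDBY) found in
# `db2pd -hadr` output supplied on stdin.
# hadr_role is a hypothetical helper name; the "HADR_ROLE = VALUE" layout
# is an assumption about the db2pd output format.
hadr_role() {
    # Print the value of the first matching line, then stop reading.
    awk '$1 == "HADR_ROLE" && $2 == "=" { print $3; exit }'
}

# Usage (as the instance user; TSADB is the database name from the transcript):
#   db2pd -db TSADB -hadr | hadr_role
```

If the function prints nothing, the database is probably not activated or HADR is not started, which is worth fixing before continuing.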

2.3.2. Running db2haicu on the primary

Welcome to the DB2 High Availability Instance Configuration Utility (db2haicu).

You can find detailed diagnostic information in the DB2 server diagnostic log file called db2diag.log. Also, you can use the utility called db2pd to query the status of the cluster domains you create.

For more information about configuring your clustered environment using db2haicu, see the topic called 'DB2 High Availability Instance Configuration Utility (db2haicu)' in the DB2 Information Center.

db2haicu determined the current DB2 database manager instance is 'tsainst'. The cluster configuration that follows will apply to this instance.

db2haicu is collecting information on your current setup. This step may take some time as db2haicu will need to activate all databases for the instance to discover all paths ...
When you use db2haicu to configure your clustered environment, you create cluster domains. For more information, see the topic 'Creating a cluster domain with db2haicu' in the DB2 Information Center. db2haicu is searching the current machine for an existing active cluster domain ...
db2haicu found a cluster domain called 'db2_tsainst_TSADB' on this machine. The cluster configuration that follows will apply to this domain.

Retrieving high availability configuration parameter for instance 'tsainst' ...
The cluster manager name configuration parameter (high availability configuration parameter) is not set. For more information, see the topic "cluster_mgr - Cluster manager name configuration parameter" in the DB2 Information Center. Do you want to set the high availability configuration parameter?
The following are valid settings for the high availability configuration parameter:
  1.TSA
  2.Vendor
Enter a value for the high availability configuration parameter: [1]
1
Setting a high availability configuration parameter for instance 'tsainst' to 'TSA'.
Adding DB2 database partition '0' to the cluster ...
Adding DB2 database partition '0' to the cluster was successful.
Do you want to validate and automate HADR failover for the HADR database 'TSADB'? [1]
1. Yes
2. No
1
Adding HADR database 'TSADB' to the domain ...
Adding HADR database 'TSADB' to the domain was successful.
Do you want to configure a virtual IP address for the HADR database 'TSADB'? [1]   # Add the virtual IP, that is, the service address
1. Yes
2. No
1
Enter the virtual IP address:
192.168.0.10
Enter the subnet mask for the virtual IP address '192.168.0.10': [255.255.255.0]
255.255.255.0
Select the network for the virtual IP '192.168.0.10':
1. db2_public_network_0
Enter selection:
1
Adding virtual IP address 192.168.0.10 to the domain ... 
Adding virtual IP address 192.168.0.10 to the domain was successful. 
All cluster configurations have been completed successfully. db2haicu exiting ...

2.4. Checking the TSA cluster status

[root@xsdev-008 tmp]# lssam

Online IBM.ResourceGroup:db2_tsainst_tsa-svra_0-rg Nominal=Online
        '- Online IBM.Application:db2_tsainst_tsa-svra_0-rs
                '- Online IBM.Application:db2_tsainst_tsa-svra_0-rs:tsa-svra
Online IBM.ResourceGroup:db2_tsainst_tsa-svrb_0-rg Nominal=Online
        '- Online IBM.Application:db2_tsainst_tsa-svrb_0-rs
                '- Online IBM.Application:db2_tsainst_tsa-svrb_0-rs:tsa-svrb
Online IBM.ResourceGroup:db2_tsainst_tsainst_TSADB-rg Nominal=Online
        '- Online IBM.Application:db2_tsainst_tsainst_TSADB-rs
                |- Online IBM.Application:db2_tsainst_tsainst_TSADB-rs:tsa-svra
                '- Offline IBM.Application:db2_tsainst_tsainst_TSADB-rs:tsa-svrb
Online IBM.Equivalency:db2_public_network_0
        |- Online IBM.NetworkInterface:eth1:tsa-svra
        '- Online IBM.NetworkInterface:eth1:tsa-svrb
Online IBM.Equivalency:db2_tsainst_tsa-svra_0-rg_group-equ
        '- Online IBM.PeerNode:tsa-svra:tsa-svra
Online IBM.Equivalency:db2_tsainst_tsa-svrb_0-rg_group-equ
        '- Online IBM.PeerNode:tsa-svrb:tsa-svrb
Online IBM.Equivalency:db2_tsainst_tsainst_TSADB-rg_group-equ
        |- Online IBM.PeerNode:tsa-svra:tsa-svra
        '- Online IBM.PeerNode:tsa-svrb:tsa-svrb

At this point the DB2 + HADR + TSA high-availability deployment is complete.
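Reading lssam output by eye gets tedious once there are several resource groups. The following sketch filters the output down to lines that usually mean trouble. The list of states is my own assumption about what is worth flagging, and lssam_problems is a made-up helper name; an Offline member on the standby is normal and is deliberately not flagged.

```shell
# Read lssam output on stdin and print only the lines that suggest a problem.
# Exit status follows grep: 0 if at least one such line was found, 1 if none.
# lssam_problems is a hypothetical helper name; the pattern list is an
# assumption, not an official list of states.
lssam_problems() {
    grep -E 'Failed offline|Pending|Stuck|Unknown|Control='
}

# Usage (as root):
#   lssam | lssam_problems || echo "no problem states found"
```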

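III. Common Issues

Issue 1: How do DB2 HADR and TSAMP failover work?

A: When the HADR primary fails, the "hadrV97_monitor.ksh" script (under /opt/ibm/db2/V10.5/ha/tsa) detects the failure and reports it to TSAMP. TSAMP calls "hadrV97_stop.ksh" on the failed primary and, once that call returns, calls "hadrV97_start.ksh" on the HADR standby. "hadrV97_start.ksh" invokes the db2gcf program, which attempts a failover so that the standby takes over the primary's work. The takeover command used is "TAKEOVER HADR BY FORCE PEER WINDOW ONLY". When the old primary recovers from the failure, TSAMP automatically starts the DB2 instance through the "db2V97_start.ksh" script. As part of this operation, the HADR database is brought back into the HADR pair with the command "start hadr on db as standby". The script names carry a DB2 release suffix, so the names on your system may differ from the V97 names quoted here.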
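Because the automated takeover uses PEER WINDOW ONLY, it can only succeed when a peer window is configured, so HADR_PEER_WINDOW should not be 0. A minimal sketch for reading the value, assuming the usual "(HADR_PEER_WINDOW) = N" line in the database configuration output; hadr_peer_window is my own helper name:

```shell
# Print the HADR_PEER_WINDOW value (in seconds) from `db2 get db cfg`
# output supplied on stdin.
# hadr_peer_window is a hypothetical helper name; the line layout
# "... (HADR_PEER_WINDOW) = 120" is an assumption about the cfg output.
hadr_peer_window() {
    sed -n 's/.*(HADR_PEER_WINDOW) = *\([0-9][0-9]*\).*/\1/p'
}

# Usage (as the instance user):
#   db2 get db cfg for TSADB | hadr_peer_window
```

A result of 0 means the peer window is disabled and is worth correcting on both nodes before relying on automatic failover.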

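Issue 2: db2haicu fails at the end of resource creation on the standby (hostname mismatch)

Error message: Creating resources for the instance '' has failed.

A: Check that the node names entered at the db2haicu prompt "Enter the host name of a machine to add to the domain:" match the node names passed to preprpnode and the hostname in the HADR configuration (HADR_LOCAL_HOST). The db2diag diagnostic log helps pinpoint the cause: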

2023-12-01-15.06.57.052079-240 E1413216E647          LEVEL: Error
PID     : 9438                 TID : 139919062497056 PROC : db2haicu
INSTANCE: tsainst              NODE : 000
HOSTNAME: tsa-svrb
FUNCTION: DB2 UDB, high avail services, sqlhaUICreateHADR, probe:7029
RETCODE : ECF=0x90000531=-1879046863=ECF_SQLHA_INVALID_PARAM
          Invalid parameter to SQLHA call
MESSAGE : There is a mismatch between the local hostname value and the HADR DB
          CFG value (HADR_LOCAL_HOST or localHost). Correct this and re-run
          db2haicu.
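The log message above points at HADR_LOCAL_HOST, so comparing it with the local hostname is a quick first check. A sketch under the assumption that the database configuration output contains a "(HADR_LOCAL_HOST) = name" line; check_hadr_local_host is a name I invented:

```shell
# Compare HADR_LOCAL_HOST from `db2 get db cfg` output (stdin) with the
# expected hostname given as argument 1. Returns 1 on mismatch.
# check_hadr_local_host is a hypothetical helper name; the cfg line layout
# is an assumption.
check_hadr_local_host() {
    expected=$1
    actual=$(sed -n 's/.*(HADR_LOCAL_HOST) = *\([^[:space:]]*\).*/\1/p')
    if [ "$actual" = "$expected" ]; then
        echo "OK: HADR_LOCAL_HOST is $actual"
    else
        echo "MISMATCH: HADR_LOCAL_HOST is '$actual', expected '$expected'"
        return 1
    fi
}

# Usage (as the instance user, on each node):
#   db2 get db cfg for TSADB | check_hadr_local_host "$(hostname)"
```

The comparison is an exact string match, so a short name on one side and a fully qualified name on the other is reported as a mismatch.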

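Issue 3: db2haicu fails at the end of resource creation on the standby (HA scripts not found)

Error message: DB2 HA scripts were not found in '/usr/sbin/rsct/sapolicies/db2'. Run the db2cptsa utility to install the scripts and then re-run the command.

A: Install the TSA failover scripts by running # $DB2DIR/install/tsamp/db2cptsa ($DB2DIR is the DB2 installation directory, /opt/ibm/db2/V10.5 in my case, so the scripts come from /opt/ibm/db2/V10.5/ha/tsa). All this utility really does is copy the scripts needed for failover into place.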
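To see whether the scripts are already in place before running db2haicu, a check along these lines may help. It only tests for the presence of .ksh files in a directory and does not verify their versions; ha_scripts_present is a made-up helper name.

```shell
# Succeed if the given directory contains at least one .ksh script.
# ha_scripts_present is a hypothetical helper name used for illustration.
ha_scripts_present() {
    dir=$1
    for f in "$dir"/*.ksh; do
        # If the glob matched nothing, $f is the literal pattern and -f fails.
        [ -f "$f" ] && return 0
    done
    return 1
}

# Usage (the directory is the one named in the error message):
#   ha_scripts_present /usr/sbin/rsct/sapolicies/db2 || echo "run db2cptsa first"
```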

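Issue 4: preprpnode reports an error

Error message: /opt/rsct/bin/lsrsrc-api: 2612-022 A session could not be established with the RMC daemon on tsa-svra.
preprpnode: Unable to obtain the public key from tsa-svra.

A: This kind of error is usually caused by the firewall. Stop the firewall and run the command again.
systemctl stop firewalld    (systems that use systemd and firewalld)

service iptables stop    (systems that use the iptables service, such as RedHat 6)

Which of the two commands applies depends on the operating system version; use the one that fits your system.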
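To confirm that the firewall really is the cause, you can probe the RMC port from the other node before and after stopping the firewall. This sketch assumes the RMC daemon listens on TCP port 657, which is its usual registered port but should be verified in /etc/services on your system; tcp_port_open is my own helper name.

```shell
# Succeed if a TCP connection to host:port can be opened within 3 seconds.
# tcp_port_open is a hypothetical helper name used for illustration.
tcp_port_open() {
    host=$1
    port=$2
    # /dev/tcp is a bash feature, so the probe runs in bash explicitly.
    # Inside the bash -c script, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage (657 is assumed to be the RMC port):
#   tcp_port_open tsa-svra 657 || echo "cannot reach RMC on tsa-svra"
```

If the probe succeeds but preprpnode still fails, the firewall is probably not the problem and the RSCT daemons on the target node deserve a look instead.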

Issue 5: Abnormal cluster node state (usually after an abnormal failover)

Error message:

Online IBM.ResourceGroup:db2_tsainst_tsa-svra_0-rg Nominal=Online
        '- Online IBM.Application:db2_tsainst_tsa-svra_0-rs
                '- Online IBM.Application:db2_tsainst_tsa-svra_0-rs:tsa-svra
Online IBM.ResourceGroup:db2_tsainst_tsa-svrb_0-rg Nominal=Online
        '- Online IBM.Application:db2_tsainst_tsa-svrb_0-rs
                '- Online IBM.Application:db2_tsainst_tsa-svrb_0-rs:tsa-svrb
Online IBM.ResourceGroup:db2_tsainst_tsainst_TSADB-rg Control=MemberInProblemState Nominal=Online
        |- Online IBM.Application:db2_tsainst_tsainst_TSADB-rs Control=MemberInProblemState
                |- Online IBM.Application:db2_tsainst_tsainst_TSADB-rs:tsa-svra
                '- Failed offline IBM.Application:db2_tsainst_tsainst_TSADB-rs:tsa-svrb
        '- Online IBM.ServiceIP:db2ip_192.168.0.10-rs
                |- Online IBM.ServiceIP:db2ip_192.168.0.10-rs:tsa-svra
                '- Offline IBM.ServiceIP:db2ip_192.168.0.10-rs:tsa-svrb
Online IBM.Equivalency:db2_tsainst_tsa-svra_0-rg_group-equ
        '- Online IBM.PeerNode:tsa-svra:tsa-svra
Online IBM.Equivalency:db2_tsainst_tsa-svrb_0-rg_group-equ
        '- Online IBM.PeerNode:tsa-svrb:tsa-svrb
Online IBM.Equivalency:db2_tsainst_tsainst_TSADB-rg_group-equ
        |- Online IBM.PeerNode:tsa-svra:tsa-svra
        '- Online IBM.PeerNode:tsa-svrb:tsa-svrb
Online IBM.Equivalency:db2_private_network_0
        |- Online IBM.NetworkInterface:eth1:tsa-svra
        '- Online IBM.NetworkInterface:eth1:tsa-svrb
Online IBM.Equivalency:db2_public_network_0
        |- Online IBM.NetworkInterface:eth0:tsa-svra
        '- Online IBM.NetworkInterface:eth0:tsa-svrb

A: When this happens the state of the affected cluster resource needs to be reset. Run the following command as root:
resetrsrc -s 'Name=="db2_tsainst_tsainst_TSADB-rs" && NodeNameList="tsa-svrb"' IBM.Application
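The resource name and node name needed for the reset can be read straight from the "Failed offline" lines of lssam. The sketch below only extracts those two values and leaves running resetrsrc to you; failed_offline_members is a made-up helper name, and it assumes member lines end in CLASS:RESOURCE:NODE as in the output above.

```shell
# Read lssam output on stdin and print "resource node" for every member
# that is in the Failed offline state.
# failed_offline_members is a hypothetical helper name used for illustration.
failed_offline_members() {
    # The last field of a member line is CLASS:RESOURCE:NODE, for example
    # IBM.Application:db2_tsainst_tsainst_TSADB-rs:tsa-svrb
    awk '/Failed offline/ {
        n = split($NF, part, ":")
        if (n == 3) print part[2], part[3]
    }'
}

# Usage (as root):
#   lssam | failed_offline_members
```

Each output line gives the values to put into the Name and NodeNameList parts of the selection string used with resetrsrc.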

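IV. Common Commands

Command: Description

preprpnode: Prepares the security settings for the nodes that will belong to the cluster. When it is issued, public keys are exchanged between the nodes and the RMC access control lists (ACLs) are modified so that every node in the cluster can access the cluster resources.

mkrpdomain: Creates a new cluster definition. It specifies the cluster name and the list of nodes to add to the cluster.

lsrpdomain: Lists information about the cluster to which the node running the command belongs.

startrpdomain / stoprpdomain: Bring the cluster online and take it offline, respectively.

addrpnode: Adds new nodes to a cluster that has already been defined and is running.

startrpnode / stoprpnode: Bring an individual node in the cluster online and take it offline, respectively. They are often used for system maintenance: stop the node, carry out the repair or maintenance, then restart the node, at which point it rejoins the cluster.

lsrpnode: Shows the list of nodes defined for the cluster and the operational state (OpState) of each node. Note that it works only on a node that is online in the cluster; on an offline node it does not display the node list.

rmrpdomain: Removes a defined cluster.

rmrpnode: Removes one or more nodes from the cluster definition.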
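As an example of putting these commands to work, the sketch below lists the nodes that lsrpnode does not report as Online. It assumes the output has a header row followed by one row per node with OpState in the second column; nodes_not_online is my own helper name.

```shell
# Read `lsrpnode` output on stdin and print the names of nodes whose
# OpState is not Online.
# nodes_not_online is a hypothetical helper name; the column layout
# (Name, OpState, RSCTVersion) is an assumption about the output.
nodes_not_online() {
    # NR > 1 skips the header row; column 1 is Name, column 2 is OpState.
    awk 'NR > 1 && NF >= 2 && $2 != "Online" { print $1 }'
}

# Usage (as root, on a node that is online in the domain):
#   lsrpnode | nodes_not_online
```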


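I put a great deal of effort into compiling this and hope you find it useful. Please point out anything I got wrong or left out.

Wishing everyone an early happy Spring Festival and a soaring Year of the Dragon!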
