Oracle 11g R2 Data Guard 容灾系统构建S

2019-06-18  本文已影响0人  chimpansee

目录

异地或同城备机故障模拟

异地或同城备机出现故障不会影响主系统的继续正常运行,中断现象出现后,主机能很快识别,一旦网络恢复,主机也能在短时间内恢复的。
我们重启异地和同城备机系统。

[oracle@beijing ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 17 11:33:18 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> create table tab1(c1 int);

Table created.

SQL> @/home/oracle/batchinsert.sql

PL/SQL procedure successfully completed.

SQL> /
[oracle@beijing_2 ~]$ su -
Password: 
Last login: Mon Jun 17 17:15:47 CST 2019 on tty1
[root@beijing_2 ~]# reboot
[oracle@chengdu dbs]$ su -
Password:
Last login: Sun Jun 16 20:10:40 CST 2019 on pts/1
[root@chengdu ~]# reboot
SQL> select dest_id,dest_name,status,error,valid_now,fail_date,fail_sequence,valid_type from v$archive_dest where dest_id in (2,3);

   DEST_ID DEST_NAME            STATUS    ERROR                                    VALID_NOW        FAIL_DATE FAIL_SEQUENCE VALID_TYPE
---------- -------------------- --------- ---------------------------------------- ---------------- --------- ------------- ---------------
         2 LOG_ARCHIVE_DEST_2   ERROR     ORA-03135: connection lost contact       INACTIVE         17-JUN-19            74 ONLINE_LOGFILE
         3 LOG_ARCHIVE_DEST_3   ERROR     ORA-03113: end-of-file on communication  INACTIVE         17-JUN-19            73 ONLINE_LOGFILE
                                          channel


SQL>

所有的传输中断了。停留在不同的日志上,同城的停止在73号日志,异地的停止在74号日志。同样reboot命令重启系统,主数据库系统显示的错误提示还是一样的。

我们重新启用异地和同城备用系统。

[oracle@chengdu ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 17 17:48:06 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 1135747072 bytes
Fixed Size                  2227664 bytes
Variable Size             704643632 bytes
Database Buffers          419430400 bytes
Redo Buffers                9445376 bytes
Database mounted.
Database opened.
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@chengdu ~]$ lsnrctl start

LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 17-JUN-2019 17:48:30

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Starting /home/oracle/app/oracle/product/11.2.0/dbhome_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.2.0.3.0 - Production
System parameter file is /home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Log messages written to /home/oracle/app/oracle/diag/tnslsnr/chengdu/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=chengdu)(PORT=1521)))

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                17-JUN-2019 17:48:30
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Listener Log File         /home/oracle/app/oracle/diag/tnslsnr/chengdu/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=chengdu)(PORT=1521)))
Services Summary...
Service "chengdu" has 1 instance(s).
  Instance "orcl", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
[oracle@chengdu ~]$
[oracle@beijing_2 ~]$ lsnrctl start

LSNRCTL for Linux: Version 11.2.0.3.0 - Production on 17-JUN-2019 17:49:46

Copyright (c) 1991, 2011, Oracle.  All rights reserved.

Starting /home/oracle/app/oracle/product/11.2.0/dbhome_1/bin/tnslsnr: please wait...

TNSLSNR for Linux: Version 11.2.0.3.0 - Production
System parameter file is /home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Log messages written to /home/oracle/app/oracle/diag/tnslsnr/beijing_2/listener/alert/log.xml
Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=beijing_2)(PORT=1521)))

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.3.0 - Production
Start Date                17-JUN-2019 17:49:47
Uptime                    0 days 0 hr. 0 min. 0 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora
Listener Log File         /home/oracle/app/oracle/diag/tnslsnr/beijing_2/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=beijing_2)(PORT=1521)))
Services Summary...
Service "beijing_2" has 1 instance(s).
  Instance "orcl", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
[oracle@beijing_2 ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 17 17:49:51 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area  709836800 bytes
Fixed Size                  2231752 bytes
Variable Size             440402488 bytes
Database Buffers          260046848 bytes
Redo Buffers                7155712 bytes
Database mounted.
Database opened.
SQL>
SQL> /

   DEST_ID DEST_NAME            STATUS    ERROR                                    VALID_NOW        FAIL_DATE FAIL_SEQUENCE VALID_TYPE
---------- -------------------- --------- ---------------------------------------- ---------------- --------- ------------- ---------------
         2 LOG_ARCHIVE_DEST_2   VALID                                              YES                                    0 ONLINE_LOGFILE
         3 LOG_ARCHIVE_DEST_3   VALID                                              YES                                    0 ONLINE_LOGFILE

SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     76
Next log sequence to archive   78
Current log sequence           78
SQL>

我们模拟的故障大约持续了短短的5分钟时间,主数据库系统提示,日志传送已经全部自动恢复了。这个容灾系统又重新开始正常工作了。

主数据库系统故障模拟

data guard 集群中的数据库就两种角色,主数据库,备用数据库。将某角色改变为另一角色,我们叫角色转换。角色转换有两种方式,一种是switchover,另一种是故障切换。
我们这里用switchover方式,这种方式更类似于一种计划好的用于灾备演练,或维护数据库的方式。是我们人为手工进行的角色切换。
switchover能保证角色转换时数据没有丢失。故障模拟前,我们要对主备系统做一系列的配置和检查。

[oracle@beijing dbs]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Mon Jun 17 22:34:52 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> alter system set standby_file_management=auto;

System altered.

SQL> alter system set fal_server=beijing_2;

System altered.

SQL> alter system set fal_client=beijing;

System altered.

SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

我们计划在同城主备数据库之间做switchover。所以我们在主数据库上将异地的日志传送停止下来。

SQL> alter system set log_archive_dest_state_2=defer;

System altered.

我们在主数据库也创建对应的standby redo log。切换后,可以用于接收beijing_2发送来的redo log。

SQL> host          
[oracle@beijing ~]$ cd $ORACLE_HOME/dbs
[oracle@beijing dbs]$ mkdir srl
[oracle@beijing dbs]$ exit
exit

SQL> alter database add standby logfile group 10 '?/dbs/srl/srl10.dbf' size 52428800;

Database altered.

SQL> alter database add standby logfile group 11 '?/dbs/srl/srl11.dbf' size 52428800;

Database altered.

SQL> 
SQL> alter database add standby logfile group 12 '?/dbs/srl/srl12.dbf' size 52428800;

Database altered.

SQL> 

我们在同城备用数据上增加primary相关参数。

SQL> alter system set log_archive_dest_1='location=USE_DB_RECOVERY_FILE_DEST db_unique_name=beijing_2';

System altered.

SQL> alter system set log_archive_dest_2='service=beijing valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) lgwr sync  affirm db_unique_name=beijing';

System altered.

SQL> alter system set log_archive_dest_state_1=enable;

System altered.

SQL> alter system set log_archive_dest_state_2=defer;

System altered.

主数据库检查。

[oracle@beijing dbs]$ tnsping beijing_2

TNS Ping Utility for Linux: Version 11.2.0.3.0 - Production on 18-JUN-2019 00:43:40

Copyright (c) 1997, 2011, Oracle.  All rights reserved.

Used parameter files:


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.35)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = beijing_2)))
OK (0 msec)

同城备机检查。

[oracle@beijing_2 dbs]$ tnsping beijing

TNS Ping Utility for Linux: Version 11.2.0.3.0 - Production on 18-JUN-2019 00:43:31

Copyright (c) 1997, 2011, Oracle.  All rights reserved.

Used parameter files:


Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.1.30)(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = beijing)))
OK (0 msec)
SQL> select dest_id,dest_name,status,gap_status from v$archive_dest_status;

   DEST_ID DEST_NAME            STATUS    GAP_STATUS
---------- -------------------- --------- ------------------------
         1 LOG_ARCHIVE_DEST_1   VALID
         2 LOG_ARCHIVE_DEST_2   DEFERRED
         3 LOG_ARCHIVE_DEST_3   VALID     NO GAP

SQL> select dest_id,dest_name,delay_mins from v$archive_dest;

   DEST_ID DEST_NAME            DELAY_MINS
---------- -------------------- ----------
         1 LOG_ARCHIVE_DEST_1            0
         2 LOG_ARCHIVE_DEST_2            0
         3 LOG_ARCHIVE_DEST_3            0
SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
TO STANDBY

具备转换条件。

SQL> alter database commit to switchover to physical standby;

Database altered.

SQL>
SQL> shutdown immediate;
ORA-01092: ORACLE instance terminated. Disconnection forced
SQL> startup mount;
ORA-24324: service handle not initialized
ORA-01041: internal error. hostdef extension doesn't exist
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@beijing ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Jun 18 01:05:41 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup mount;
ORACLE instance started.

Total System Global Area 1135747072 bytes
Fixed Size                  2227664 bytes
Variable Size             738198064 bytes
Database Buffers          385875968 bytes
Redo Buffers                9445376 bytes
Database mounted.
SQL> 
SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
NOT ALLOWED

不允许转换。

没有收到主句库送来的标记。

我们看看成都异地的备用数据库情况。

SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
TO PRIMARY

data guard默认switchover的备用数据库是log_archive_dest_2的指向的备用数据库。将错就错,我们继续switchover,将成都的备用数据库转换成主数据库。

SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
TO PRIMARY

SQL> alter database commit to switchover to primary;

Database altered.

SQL> shutdown immediate;
ORA-01109: database not open


Database dismounted.
ORACLE instance shut down.
SQL> startup;
ORACLE instance started.

Total System Global Area 1135747072 bytes
Fixed Size                  2227664 bytes
Variable Size             738198064 bytes
Database Buffers          385875968 bytes
Redo Buffers                9445376 bytes
Database mounted.
Database opened.
SQL> select protection_mode,database_role,guard_status from v$database;

PROTECTION_MODE      DATABASE_ROLE    GUARD_S
-------------------- ---------------- -------
MAXIMUM PERFORMANCE  PRIMARY          NONE

成功将异地成都的数据库转换为主数据库了。

我们将原来的北京主数据库重启。

SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
TO STANDBY

SQL> alter database commit to switchover to physical standby;

Database altered.

SQL> shutdown immediate;
ORA-01092: ORACLE instance terminated. Disconnection forced
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[oracle@chengdu ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Jun 18 12:52:42 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-32004: obsolete or deprecated parameter(s) specified for RDBMS instance
ORACLE instance started.

Total System Global Area 1135747072 bytes
Fixed Size                  2227664 bytes
Variable Size             704643632 bytes
Database Buffers          419430400 bytes
Redo Buffers                9445376 bytes
Database mounted.
Database opened.
SQL> select protection_mode,database_role,guard_status from v$database;

PROTECTION_MODE      DATABASE_ROLE    GUARD_S
-------------------- ---------------- -------
MAXIMUM PERFORMANCE  PHYSICAL STANDBY NONE
SQL> alter database recover managed standby database disconnect;

Database altered.

我们看到原先的主数据库也已经完成转换,成为了一台物理备机。

做一个日志切换,我们看看能否正常工作。

新主数据库chengdu。

SQL> alter system switch logfile;

System altered.

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     108
Next log sequence to archive   110
Current log sequence           110

新物理备机beijing。

SQL> archive log list;
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            USE_DB_RECOVERY_FILE_DEST
Oldest online log sequence     109
Next log sequence to archive   0
Current log sequence           110

日志传送没有问题了。
beijing_2在switchover中没有用上,完成转换后也没有在data guard集群中。因为我们在chengdu节点没有配置和beijing_2的连接。
至此,主数据库演练模拟基本完成。

主数据库故障切换(failover)模拟

上一节的演练,没有任何的数据丢失,一切在掌握之中,顺利的完成角色的switch。现实中,主数据库突然宕机,主数据库数据中心遭遇灾难等等突发事件,如何保证业务连续性,顺利启动备用节点呢?我们这节模拟故障的发生,并如何保证业务连续性。
灾难计划中,我们已经考虑到主业务节点故障的情况,这意味着不同的保护模式或多或少的会造成数据的丢失,故障切换意味着一些意想不到的事情已经发生了。

SQL> @/home/oracle/batchinsert.sql
[oracle@beijing ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on Tue Jun 18 18:05:31 2019

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> select count(*) from hem.tab1;

  COUNT(*)
----------
       943

SQL> /

  COUNT(*)
----------
       969

SQL> 
SQL> alter system switch logfile;

System altered.

SQL> shutdown abort;
ORACLE instance shut down.

-我们在同城和异地备份系统gap

同城是否有log gap


SQL> select thread#,low_sequence#,high_sequence# from v$archive_gap;

no rows selected

SQL> /

no rows selected

SQL> select unique thread# as thread,max(sequence#) over(partition by thread#) as last     from v$archived_log;

    THREAD       LAST
---------- ----------
         1        111

SQL> 

异地是否有log gap

SQL> select thread#,low_sequence#,high_sequence# from v$archive_gap;

no rows selected

SQL> select unique thread# as thread,max(sequence#) over(partition by thread#) as last     from v$archived_log;

    THREAD       LAST
---------- ----------
         1        111

-停止日志的应用(redo apply)

分别停止同城和异地备库的MRP

SQL> alter database recover managed standby database cancel;

Database altered.

SQL>

-完成所有日志的应用
分别完成同城和异地备库的MRP

SQL> alter database recover managed standby database finish;

Database altered.

SQL> 

这一步没出现提示错误,如果有错误提示,需要不同的处理方法。

-切换到主数据库
先看看状态是否正确
同城的。

SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
--------------------
TO PRIMARY

SQL

异地的

SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
--------------------
TO PRIMARY

SQL> 

都可以切换成主数据库系统。我们都试试。

SQL> SELECT SWITCHOVER_STATUS FROM V$DATABASE;

SWITCHOVER_STATUS
--------------------
TO PRIMARY

SQL> alter database commit to switchover to primary;

Database altered.

SQL> 

两台备机都切换成主数据库系统了。

SQL> alter database open;

Database altered.

SQL> select database_role from v$database;

DATABASE_ROLE
----------------
PRIMARY

SQL> 
SQL> select count(*) from hem.tab1;

  COUNT(*)
----------
      1395

SQL> 

chengdu的数据库

SQL> select count(*) from hem.tab1;

  COUNT(*)
----------
      1352

SQL> 

原来主数据库beijing

SQL> shutdown abort;
ORACLE instance shut down.
SQL> startup 
ORACLE instance started.

Total System Global Area 1135747072 bytes
Fixed Size                  2227664 bytes
Variable Size             738198064 bytes
Database Buffers          385875968 bytes
Redo Buffers                9445376 bytes
Database mounted.
Database opened.

SQL> select count(*) from hem.tab1;

  COUNT(*)
----------
      1395

SQL> 

同城的beijing_2的数据得到了最大的保护,而异地chengdu的数据丢失了部分。

SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
FAILED DESTINATION

错误的归档传送目标。我们停止它。

SQL> alter system set log_archive_dest_state_2=defer;

System altered.

SQL> alter system set log_archive_dest_state_3=defer;

System altered.
SQL> select switchover_status from v$database;

SWITCHOVER_STATUS
--------------------
NOT ALLOWED

SQL> 

对于beijing数据库来讲没有合适的physical standby 供它使用,所以这里无法将其自己转换为physical standby了。
可以参考上一篇文章的内容,重建beijing数据库,然后将其作为beijing_2的同城备机,这样在这两者之间有可以通过switchover 进行交涉交换了。

容灾系统的构建Step by Step 告一段落。
后续,我们利用Oracle 12c 来构建data guard集群。

上一篇下一篇

猜你喜欢

热点阅读