Switchover to a Physical Standby using SQL *Plus

A operação de switchover no Data Guard corresponde a uma mudança planejada/desejada de função, seja para testes periódicos de estratégia de DR, migrações ou upgrades. Nela, um dos ambientes physical standby recebe do primary os redo logs restantes, e trocam de role/função, ou seja, o standby passa a operar como primary.

Algumas etapas preliminares que seriam importantes antes de efetuar esta atividade:

Validar se o Redo Apply está operando normalmente, ou seja, sem nenhum gap ou erro/warning;
Certificar que no momento da mudança de role, não existe nenhum Scheduler Job ou RMAN em execução;
É indicado termos um Fashback Database Guaranteed Point (GRP) gerado, para termos um “rollback” rápido em caso de falha no switchover;
Antes de proceder com a tarefa, deixar o alert.log do primary e standby abertos para acompanhamento de cada etapa.

Verificando se todos os Redos foram recebidos no standby

Rodando o comando abaixo no primary, que reporta que a sequence do current é 48:

[oracle@fornix1 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 17 16:01:00 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> SELECT THREAD#,SEQUENCE#, STATUS FROM V$LOG;
 
   THREAD#  SEQUENCE# STATUS
---------- ---------- ----------------
         1         46 INACTIVE
         1         47 ACTIVE
         1         48 CURRENT

A consulta abaixo no ambiente standby nos reporta que não temos Lag:

[oracle@fornix2 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 17 16:02:50 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> SELECT CLIENT_PROCESS,SEQUENCE#,STATUS FROM V$MANAGED_STANDBY;
 
CLIENT_P  SEQUENCE# STATUS
-------- ---------- ------------
ARCH              0 CONNECTED
N/A               0 ALLOCATED
N/A               0 ALLOCATED
ARCH             45 CLOSING
ARCH              0 CONNECTED
ARCH             47 CLOSING
Archival          0 IDLE
LGWR             48 RECEIVING
N/A              48 APPLYING_LOG
UNKNOWN           0 IDLE
UNKNOWN           0 IDLE
 
11 rows selected.

A consulta abaixo no primary nos reporta que o Data Guard está configurado em maximum performance:

SQL> SELECT DB_UNIQUE_NAME, PROTECTION_MODE, SYNCHRONIZATION_STATUS, SYNCHRONIZED FROM V$ARCHIVE_DEST_STATUS
WHERE DB_UNIQUE_NAME='cortexDR';  2
 
DB_UNIQUE_NAME                 PROTECTION_MODE      SYNCHRONIZATION_STATUS SYN
------------------------------ -------------------- ---------------------- ---
cortexDR                       MAXIMUM PERFORMANCE  CHECK CONFIGURATION    NO

Verificando se há gap de Redo Apply

No physical standby, antes de efetuarmos o switchover, é bom termos o processo MRPn no status de “Applying_Log”, conforme exemplo abaixo no standby. Caso este status esteja como “Wait_for_gap”, o ideal é mister resolvê-lo.

[oracle@fornix2 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 17 16:13:32 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> SELECT STATUS FROM V$MANAGED_STANDBY WHERE PROCESS LIKE 'MRP%';
 
STATUS
------------
APPLYING_LOG

Realizando o Switchover

Temos a opçção de verificar se o standby está pronto para a operação, usando o comando abaixo no primary. Caso o output seja diferente do exemplo (como um ORA-16475 / ORA-16470), deve ser investigado e resolvido:

SQL> ALTER DATABASE SWITCHOVER TO cortexDR VERIFY;
 
Database altered.

Finalmente realizando o switchover no primary. Como este é um ambiente de laboratório, nao foi necessário checar se havia Scheduled ou RMAN jobs em execução, mas esses detalhes devem ser vistos em um banco produtivo.

SQL> ALTER DATABASE SWITCHOVER TO cortexDR;
 
Database altered.

Alert do primary. Destaco aqui as mensagens: “Waiting for target standby to receive all redo”, “Waiting for target standby to apply all redo”, “Converting the primary database to a new standby database”, “Switchover complete. Database shutdown required”:

ALTER DATABASE SWITCHOVER TO cortexDR
2021-04-17T16:21:01.074995-03:00
NET  (PID:5036): The Time Management Interface (TMI) is being enabled for role transition
NET  (PID:5036): information.  This will result in messages beingoutput to the alert log
NET  (PID:5036): file with the prefix 'TMI: '.  This is being enabled to make the timing of
NET  (PID:5036): the various stages of the role transition available for diagnostic purposes.
NET  (PID:5036): This output will end when the role transition is complete.
TMI: dbsdrv switchover to target BEGIN 2021-04-17 16:21:01.075367
NET  (PID:5036): Starting switchover [Process ID: 5036]
TMI: kcv_switchover_to_target convert to physical BEGIN 2021-04-17 16:21:01.324806
2021-04-17T16:21:01.324887-03:00
ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY [Process Id: 5036] (cortex)
NET  (PID:5036): Waiting for target standby to receive all redo
2021-04-17T16:21:01.389990-03:00
NET  (PID:5036): Waiting for all non-current ORLs to be archived
2021-04-17T16:21:01.391569-03:00
NET  (PID:5036): All non-current ORLs have been archived
2021-04-17T16:21:01.391633-03:00
NET  (PID:5036): Waiting for all FAL entries to be archived
2021-04-17T16:21:01.391677-03:00
NET  (PID:5036): All FAL entries have been archived
2021-04-17T16:21:01.391728-03:00
NET  (PID:5036): Waiting for LAD:2 to become synchronized
2021-04-17T16:21:02.394356-03:00
NET  (PID:5036): Active, synchronized Physical Standby switchover target has been identified
NET  (PID:5036): Preventing updates and queries at the Primary
2021-04-17T16:21:03.397979-03:00
NET  (PID:5036): Generating and shipping final logs to target standby
Switchover End-Of-Redo Log thread 1 sequence 48 has been fixed
Switchover: Primary highest seen SCN set to 0x00000000004d0639
NET  (PID:5036): Noswitch archival of T-1.S-48
NET  (PID:5036): End-Of-Redo Branch archival of T-1.S-48
NET  (PID:5036): LGWR is scheduled to archive to LAD:2 after log switch
NET  (PID:5036): SRL selected for T-1.S-48 for LAD:2
NET  (PID:5036): Archiving is disabled due to current logfile archival
Primary will check for some target standby to have received all redo
NET  (PID:5036): Waiting for target standby to apply all redo
2021-04-17T16:21:05.354878-03:00
Backup controlfile written to trace file /oracle/19.3.0/base/diag/rdbms/cortex/cortex/trace/cortex_ora_5036.trc
NET  (PID:5036): Converting the primary database to a new standby database
Clearing standby activation ID 548987031 (0x20b8e097)
The primary database controlfile was created using the
'MAXLOGFILES 16' clause.
There is space for up to 13 standby redo logfiles
Use the following SQL commands on the standby database to create
standby redo logfiles that match the primary database:
ALTER DATABASE ADD STANDBY LOGFILE 'srl1.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl2.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl3.f' SIZE 209715200;
ALTER DATABASE ADD STANDBY LOGFILE 'srl4.f' SIZE 209715200;
Archivelog for thread 1 sequence 48 required for standby recovery
Switchover: Primary controlfile converted to standby controlfile succesfully.
Switchover: Complete - Database shutdown required
TMI: kcv_switchover_to_target convert to physical END 2021-04-17 16:21:05.511197
NET  (PID:5036): Sending request(convert to primary database) to switchover target CORTEXDR
2021-04-17T16:21:09.103894-03:00
NET  (PID:5036): Switchover complete. Database shutdown required
USER (ospid: 5036): terminating the instance

Alert do Standby. Destaco as mensagens: “‘ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY’ from primary database”, “Standby became primary SCN: 5047863” e “Switchover: Complete – Database mounted as primary”

SWITCHOVER: received request 'ALTER DATABASE COMMIT TO SWITCHOVER  TO PRIMARY' from primary database.
2021-04-17T16:21:05.379653-03:00
ALTER DATABASE SWITCHOVER TO PRIMARY (CORTEXDR)
Maximum wait for role transition is 15 minutes.
TMI: kcv_commit_to_so_to_primary wait for MRP to finish BEGIN 2021-04-17 16:21:05.379734
Switchover: Media recovery is still active
 rmi (PID:6287): Role Change: Canceling MRP - no more redo to apply
2021-04-17T16:21:05.412861-03:00
MRP0 (PID:4394): MRP0: Background Media Recovery cancelled with status 16037
2021-04-17T16:21:05.412967-03:00
Errors in file /oracle/19.3.0/base/diag/rdbms/cortexdr/CORTEXDR/trace/CORTEXDR_mrp0_4394.trc:
ORA-16037: user requested cancel of managed recovery operation
MRP0 (PID:4394): Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
stopping change tracking
2021-04-17T16:21:05.441252-03:00
Errors in file /oracle/19.3.0/base/diag/rdbms/cortexdr/CORTEXDR/trace/CORTEXDR_mrp0_4394.trc:
ORA-16037: user requested cancel of managed recovery operation
2021-04-17T16:21:05.441292-03:00
Background Media Recovery process shutdown (CORTEXDR)
2021-04-17T16:21:06.393276-03:00
 rmi (PID:6287): Role Change: Canceled MRP
TMI: kcv_commit_to_so_to_primary wait for MRP to finish END 2021-04-17 16:21:06.393344
TMI: kcv_commit_to_so_to_primary Switchover from physical BEGIN 2021-04-17 16:21:06.394600
 rmi (PID:6287): Killing 3 processes (PIDS:6257,4533,4535) (all RFS) in order to disallow current and future RFS connections. Requested by OS process 6287
2021-04-17T16:21:06.397786-03:00
Process termination requested for pid 6257 [source = rdbms], [info = 2] [request issued by pid: 6287, uid: 54322]
2021-04-17T16:21:06.397900-03:00
Process termination requested for pid 4533 [source = rdbms], [info = 2] [request issued by pid: 6287, uid: 54322]
2021-04-17T16:21:06.398006-03:00
Process termination requested for pid 4535 [source = rdbms], [info = 2] [request issued by pid: 6287, uid: 54322]
2021-04-17T16:21:08.485740-03:00
Backup controlfile written to trace file /oracle/19.3.0/base/diag/rdbms/cortexdr/CORTEXDR/trace/CORTEXDR_rmi_6287.trc
SwitchOver after complete recovery through change 5047865
 rmi (PID:6287): ORL pre-clearing operation disabled by switchover
Online log +DG_FRA/CORTEXDR/ONLINELOG/group_1.261.1066479861: Thread 1 Group 1 was previously cleared
Online log +DG_DATA/CORTEXDR/ONLINELOG/group_1.263.1066479867: Thread 1 Group 1 was previously cleared
Online log +DG_FRA/CORTEXDR/ONLINELOG/group_2.262.1066479871: Thread 1 Group 2 was previously cleared
Online log +DG_DATA/CORTEXDR/ONLINELOG/group_2.262.1066479877: Thread 1 Group 2 was previously cleared
Online log +DG_FRA/CORTEXDR/ONLINELOG/group_3.263.1066479879: Thread 1 Group 3 was previously cleared
Online log +DG_DATA/CORTEXDR/ONLINELOG/group_3.261.1066479885: Thread 1 Group 3 was previously cleared
Standby became primary SCN: 5047863
 rmi (PID:6287): RT: Role transition work is not done
 rmi (PID:6287): The Time Management Interface (TMI) is being enabled for role transition
 rmi (PID:6287): information.  This will result in messages beingoutput to the alert log
 rmi (PID:6287): file with the prefix 'TMI: '.  This is being enabled to make the timing of
 rmi (PID:6287): the various stages of the role transition available for diagnostic purposes.
 rmi (PID:6287): This output will end when the role transition is complete.
 rmi (PID:6287): Redo network throttle feature is disabled at mount time
2021-04-17T16:21:08.936921-03:00
 rmi (PID:6287): Database role cleared from PHYSICAL STANDBY [kcvs.c:1030]
Switchover: Complete - Database mounted as primary
TMI: kcv_commit_to_so_to_primary Switchover from physical END 2021-04-17 16:21:08.937582
SWITCHOVER: completed request from primary database.
2021-04-17T16:22:01.293623-03:00
ARC0 (PID:3566): Becoming the 'no SRL' ARCH

Abrindo o novo primary database:

[oracle@fornix2 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 17 16:32:27 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> select instance_name,status from v$instance;
 
INSTANCE_NAME    STATUS
---------------- ------------
CORTEXDR         MOUNTED
 
SQL> ALTER DATABASE OPEN;
 
Database altered.

Montando o novo standby e habilitando o sincronismo:

[oracle@fornix1 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 17 16:33:39 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
Connected to an idle instance.
 
SQL> startup mount;
ORACLE instance started.
 
Total System Global Area 2583690520 bytes
Fixed Size                  8899864 bytes
Variable Size             553648128 bytes
Database Buffers         2013265920 bytes
Redo Buffers                7876608 bytes
Database mounted.
SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT;
 
Database altered.

Gerando redo no novo primary:

[oracle@fornix2 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Sat Apr 17 16:35:42 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> ALTER SYSTEM SWITCH LOGFILE;
 
System altered.

Checando que o alert do novo standby o recebeu e vai proceder com o processo:

Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT
2021-04-17T16:34:36.972515-03:00
MRP0 (PID:9153): Resetting standby activation ID 0 (0x0)
2021-04-17T16:34:36.985072-03:00
Media Recovery End-Of-Redo indicator encountered
2021-04-17T16:34:36.985125-03:00
Media Recovery Continuing
2021-04-17T16:34:36.990195-03:00
MRP0 (PID:9153): Media Recovery Log +DG_RECO/CORTEX/ARCHIVELOG/2021_04_17/thread_1_seq_49.300.1070123641
MRP0 (PID:9153): Media Recovery Waiting for T-1.S-50 (in transit)
2021-04-17T16:34:37.061145-03:00
Recovery of Online Redo Log: Thread 1 Group 4 Seq 50 Reading mem 0
  Mem# 0: +DG_FRA/CORTEX/ONLINELOG/group_4.261.1066367139
2021-04-17T16:35:55.941372-03:00
 rfs (PID:9244): Primary database is in MAXIMUM PERFORMANCE mode
 rfs (PID:9244): Re-archiving LNO:4 T-1.S-50
2021-04-17T16:35:56.544876-03:00
 rfs (PID:9244): Selected LNO:5 for T-1.S-51 dbid 548968087 branch 1039033628
2021-04-17T16:35:56.582024-03:00
ARC3 (PID:9106): Archived Log entry 59 added for T-1.S-50 ID 0x22930349 LAD:1
2021-04-17T16:35:56.686164-03:00
MRP0 (PID:9153): Media Recovery Waiting for T-1.S-51 (in transit)
2021-04-17T16:35:56.688068-03:00
Recovery of Online Redo Log: Thread 1 Group 5 Seq 51 Reading mem 0
  Mem# 0: +DG_FRA/CORTEX/ONLINELOG/group_5.262.1066367153

Obs: Este procedimento foi criado pelo senhor Ahmed Baraka (www.ahmedbaraka.com) e foi apenas reproduzido por mim em um laboratório pessoal para fins de aprendizado.

Verificando se todos os Redos foram recebidos no standby

Verificando se há gap de Redo Apply

Realizando o Switchover

Leave a Comment Cancel Reply