Monitoring Data Guard Physical Standby

Este artigo explorará algumas formas para monitorarmos o Data Guard, e para simular um mínimo de operações sendo executadas no primary, faremos a criação de alguns objetos para execução de DML randômico.

Criando a tabela no banco primary que sofrerá as alterações:

[oracle@fornix1 ~]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Apr 7 04:07:46 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> ALTER USER HR IDENTIFIED BY oracle;
 
User altered.
 
SQL> ALTER USER HR ACCOUNT UNLOCK;
 
User altered.
 
SQL> CONN HR/oracle
Connected.
SQL> SHOW USER;
USER is "HR"
SQL> CREATE TABLE HR.NAMES ( ID NUMBER, NAME VARCHAR2(50), HDATE DATE, SAL NUMBER, REGION VARCHAR2(1));
 
Table created.
 
SQL> CREATE SEQUENCE S CACHE 1000;
 
Sequence created.

Criando uma package que contém 2 procedures que executam os processos de DML, com o seguinte script:

/* Create Random Load Package */
CREATE OR REPLACE PACKAGE LOAD_GENERATOR 
IS 
   -- insert random rows in NAMES table
   PROCEDURE INSERT_NAMES ( P_ROWS IN NUMBER);
   -- random DML on NAMES 
   PROCEDURE RandomDML(P_ITERATION IN NUMBER, P_MAX IN NUMBER);   
END load_generator;
/
 
CREATE OR REPLACE PACKAGE Body LOAD_GENERATOR
IS
  -- generate random text: its lengnth between 4 and the passed value
   FUNCTION G_TEXT(P_SIZE IN NUMBER) RETURN VARCHAR2
    IS
     V VARCHAR2(2000);
    BEGIN
     FOR I IN 1..DBMS_RANDOM.VALUE(4,P_SIZE) LOOP
      V := V || CHR(ROUND(DBMS_RANDOM.VALUE(65,90)));  -- 122
     END LOOP;
     RETURN V;
   END;
 
   PROCEDURE INSERT_NAMES ( P_ROWS IN NUMBER)
   IS
    V1 VARCHAR2(15);
    V2 VARCHAR2(15);
   BEGIN
    FOR I IN 1..P_ROWS LOOP
        V1 := G_TEXT(15);
        V2 := G_TEXT(15);
        INSERT INTO NAMES VALUES ( S.NEXTVAL,       -- ID
         V1 || ' ' || V2,                           -- NAME
         TRUNC(SYSDATE-DBMS_RANDOM.VALUE(60,1800)), -- HDATE
         ROUND(DBMS_RANDOM.VALUE(1000,55000)),      -- SAL
         DECODE( TO_CHAR(ROUND(DBMS_RANDOM.VALUE(1,4))),   '1','N','2','W','3','E','4','S') ); -- REGION
      IF MOD(I,100) = 0 THEN
       COMMIT;
      END IF;
    END LOOP;
    COMMIT;
   END INSERT_NAMES;
 
   PROCEDURE RandomDML(P_ITERATION IN NUMBER, P_MAX IN NUMBER)
   IS
    N NUMBER;
    M NUMBER;
    V_NEW_SAL NUMBER;
    V1 VARCHAR2(15);
    V2 VARCHAR2(15);
   BEGIN
    FOR I IN 1.. P_ITERATION LOOP
     N := ROUND(DBMS_RANDOM.VALUE(1,3));
     IF N=1 THEN
        V1 := G_TEXT(15);
        V2 := G_TEXT(15);
        INSERT INTO NAMES VALUES ( S.NEXTVAL,       -- ID
         V1 || ' ' || V2,                           -- NAME
         TRUNC(SYSDATE)-DBMS_RANDOM.VALUE(60,1800), -- HDATE
         ROUND(DBMS_RANDOM.VALUE(1000,55000)),      -- SAL
         DECODE( TO_CHAR(ROUND(DBMS_RANDOM.VALUE(1,4))),   '1','N','2','W','3','E','4','S') ); -- REGION
     ELSIF N=2 THEN
       M := ROUND(DBMS_RANDOM.VALUE(1,P_MAX));
       V_NEW_SAL := ROUND(DBMS_RANDOM.VALUE(1000,55000));
       UPDATE NAMES SET SAL = V_NEW_SAL
        WHERE ID = M;
     ELSIF N=3 THEN
       M := ROUND(DBMS_RANDOM.VALUE(1,P_MAX));
       DELETE NAMES WHERE ID = M;
     END IF;     
     -- DBMS_LOCK.SLEEP(ROUND(DBMS_RANDOM.VALUE(0.1,2),2));
     COMMIT;
    END LOOP;    
   END RandomDML;
END load_generator;  
/
Package created.
 
SQL> SQL>   2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32   33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48   49   50   51   52   53   54   55   56   57   58   59   60   61   62   63   64   65   66
Package body created.

Criando um arquivo sql com o comando que dispara os processos criados acima:

[oracle@fornix1 ~]$ cd ~
[oracle@fornix1 ~]$ cat loaddml.sql
begin
hr.LOAD_GENERATOR.RANDOMDML (&1, &2);
end;
/
[oracle@fornix1 ~]$

Criando arquivo shell que aceita 3 argumentos: número de conexões, número de iterações e número de linhas:

[oracle@fornix1 ~]$ cd ~
[oracle@fornix1 ~]$ chmod +x loaddml.sh
[oracle@fornix1 ~]$ cat loaddml.sh
#!/bin/bash
# apply random DML load on Oracle DB
# parameters: 1 connections, 2 Iterations, 3 rows in names
users=$1
SRVC="cortex"
UNPW="hr/oracle"
SQLCMD="/home/oracle/loaddml.sql"
x=1
y=$users
ITER=$2
MAX=$3
 
while [ $x -le $y ]
    do
     sqlplus -s $UNPW@$SRVC @$SQLCMD $ITER $MAX &
     x=`expr $x + 1`
done
[oracle@fornix1 ~]$

Realizando teste de execução (com 5 conexões, 2000 iterações e 100000 linhas na tabela HR.NAMES):

[oracle@fornix1 ~]$ ./loaddml.sh 5 2000 100000
[oracle@fornix1 ~]$ old   2: hr.LOAD_GENERATOR.RANDOMDML (&1, &2);
new   2: hr.LOAD_GENERATOR.RANDOMDML (2000, 100000);
old   2: hr.LOAD_GENERATOR.RANDOMDML (&1, &2);
new   2: hr.LOAD_GENERATOR.RANDOMDML (2000, 100000);
old   2: hr.LOAD_GENERATOR.RANDOMDML (&1, &2);
new   2: hr.LOAD_GENERATOR.RANDOMDML (2000, 100000);
old   2: hr.LOAD_GENERATOR.RANDOMDML (&1, &2);
new   2: hr.LOAD_GENERATOR.RANDOMDML (2000, 100000);
old   2: hr.LOAD_GENERATOR.RANDOMDML (&1, &2);
new   2: hr.LOAD_GENERATOR.RANDOMDML (2000, 100000);
 
PL/SQL procedure successfully completed.
 
 
PL/SQL procedure successfully completed.
 
 
PL/SQL procedure successfully completed.
 
 
PL/SQL procedure successfully completed.
 
 
PL/SQL procedure successfully completed.

Monitorando o Data Guard usando o Alert.log:

Qualquer erro reportado pelo DG será apresentado no arquivo Alert.log. Na situação abaixo, deixei o ambiente stadby down, e fui observar o alert do primary. Geralmente os DBAs criam monitoramentos procurando pelo erro ORA-, ou em casos específicos abaixo sobre algum problema de “no listener”:

[grid@fornix2 ~]$ crsctl stat res -t
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Status failed, or completed with errors.
[grid@fornix2 ~]$
[oracle@fornix1 trace]$ cd /oracle/19.3.0/base/diag/rdbms/cortex/cortex/trace
[oracle@fornix1 trace]$ tail -20f alert_cortex.log
 
Fatal NI connect error 12541, connecting to:
 (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=fornix2.localdomain)(PORT=1522))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=cortexDR.localdomain)(CID=(PROGRAM=oracle)(HOST=fornix1)(USER=oracle))))
 
  VERSION INFORMATION:
        TNS for Linux: Version 19.0.0.0.0 - Production
        TCP/IP NT Protocol Adapter for Linux: Version 19.0.0.0.0 - Production
  Version 19.3.0.0.0
  Time: 07-APR-2021 04:28:40
  Tracing not turned on.
  Tns error struct:
    ns main err code: 12541
 
TNS-12541: TNS:no listener
    ns secondary err code: 12560
    nt main err code: 511
 
TNS-00511: No listener
    nt secondary err code: 111
    nt OS err code: 0

Monitorando o Data Guard usando a V$DATAGUARD_STATUS

Essa view nos reporta as últimas 256 mensagens (incluindo os erros) no alert.log que são relacionados com o physical standby, conforme exemplo abaixo:

ALTER SESSION SET NLS_DATE_FORMAT='DD-Mon-RR HH12:MI AM';
ALTER SESSION SET NLS_DATE_LANGUAGE=AMERICAN;
 
SELECT FACILITY, ERROR_CODE, TIMESTAMP, MESSAGE
FROM V$DATAGUARD_STATUS
WHERE TRUNC(TIMESTAMP)= TRUNC(SYSTIMESTAMP)
ORDER BY TIMESTAMP;
[oracle@fornix1 trace]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Apr 7 04:34:28 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> ALTER SESSION SET NLS_DATE_FORMAT='DD-Mon-RR HH12:MI AM';
ALTER SESSION SET NLS_DATE_LANGUAGE=AMERICAN;
 
SELECT FACILITY, ERROR_CODE, TIMESTAMP, MESSAGE
FROM V$DATAGUARD_STATUS
WHERE TRUNC(TIMESTAMP)= TRUNC(SYSTIMESTAMP)
ORDER BY TIMESTAMP;
Session altered.
 
SQL>
Session altered.
 
SQL> SQL>   2    3    4
 
FACILITY                 ERROR_CODE TIMESTAMP
------------------------ ---------- ------------------
MESSAGE
--------------------------------------------------------------------------------
Log Transport Services            0 07-Apr-21 03:49 AM
Redo network throttle feature is disabled at mount time
 
Log Transport Services            0 07-Apr-21 03:49 AM
STARTING ARCH PROCESSES
 
Log Transport Services            0 07-Apr-21 03:49 AM
STARTING ARCH PROCESSES COMPLETE
 
 
FACILITY                 ERROR_CODE TIMESTAMP
------------------------ ---------- ------------------
MESSAGE
--------------------------------------------------------------------------------
Log Transport Services        12541 07-Apr-21 04:33 AM
Error 12541 received logging on to the standby
 
Log Transport Services        12541 07-Apr-21 04:33 AM
Check whether the listener is up and running.
 
Log Transport Services        12541 07-Apr-21 04:34 AM
Error 12541 received logging on to the standby

Detectando Archive Log GAP

Executando o script abaixo no standby, podemos identificar se existe gaps de archive logs no Data Guard:

[oracle@fornix2 ~]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Apr 7 04:42:17 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> SELECT THREAD#, LOW_SEQUENCE#, HIGH_SEQUENCE# FROM V$ARCHIVE_GAP;
 
no rows selected

Os scripts abaixo são mais alternativas de encontrar gaps, sendo executados no primary:

Archive logs mais recentes no primary:

SQL> SELECT MAX(SEQUENCE#), THREAD#
FROM V$ARCHIVED_LOG
WHERE RESETLOGS_CHANGE# = (SELECT MAX(RESETLOGS_CHANGE#) FROM V$ARCHIVED_LOG)
GROUP BY THREAD#;  2    3    4
 
MAX(SEQUENCE#)    THREAD#
-------------- ----------
            67          1

Archive log mais recente a ser transferido para o Standby:

SQL> SELECT DESTINATION, STATUS, ARCHIVED_THREAD#, ARCHIVED_SEQ#
FROM V$ARCHIVE_DEST_STATUS WHERE STATUS <> 'DEFERRED' AND STATUS <> 'INACTIVE';  2
 
DESTINATION
--------------------------------------------------------------------------------
STATUS    ARCHIVED_THREAD# ARCHIVED_SEQ#
--------- ---------------- -------------
+DG_RECO
VALID                    1            67
 
cortexdr
VALID                    1            67

Validando o transport e apply Lag

Rodando o comando abaixo no standby, podemos checar se existe Lag na camada de transport ou apply (em meu exemplo, deixei o Redo apply como OFF e rodei o shell script mencionado no início do artigo para gerar um lag mínimo de Apply):

SQL> ALTER SESSION SET NLS_DATE_FORMAT='DD-Mon-RR HH12:MI AM';
ALTER SESSION SET NLS_DATE_LANGUAGE=AMERICAN;
 
SELECT NAME, VALUE, UNIT, TIME_COMPUTED
FROM V$DATAGUARD_STATS
WHERE NAME IN ('transport lag','apply lag','apply finish time');
Session altered.
 
SQL>
Session altered.
 
SQL> SQL>   2    3
 
NAME
--------------------------------
VALUE
----------------------------------------------------------------
UNIT                           TIME_COMPUTED
------------------------------ ------------------------------
transport lag
+00 00:00:00
day(2) to second(0) interval   04/07/2021 05:05:33
 
apply lag
+00 00:01:05
day(2) to second(0) interval   04/07/2021 05:05:33
 
NAME
--------------------------------
VALUE
----------------------------------------------------------------
UNIT                           TIME_COMPUTED
------------------------------ ------------------------------
 
apply finish time
+00 00:00:05.610
day(2) to second(3) interval   04/07/2021 05:05:33

Habilitando o Redo Apply:

[oracle@fornix2 ~]$ dgmgrl sys/oracle@CORTEXDR
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Wed Apr 7 05:00:04 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
Connected to "cortexDR"
Connected as SYSDBA.
DGMGRL> EDIT DATABASE CORTEXDR SET STATE=APPLY-ON;
Succeeded.

Realizando novamente a consulta:

SQL> ALTER SESSION SET NLS_DATE_FORMAT='DD-Mon-RR HH12:MI AM';
ALTER SESSION SET NLS_DATE_LANGUAGE=AMERICAN;
 
SELECT NAME, VALUE, UNIT, TIME_COMPUTED
FROM V$DATAGUARD_STATS
WHERE NAME IN ('transport lag','apply lag','apply finish time');
Session altered.
 
SQL>
Session altered.
 
SQL> SQL>   2    3
 
NAME
--------------------------------
VALUE
----------------------------------------------------------------
UNIT                           TIME_COMPUTED
------------------------------ ------------------------------
transport lag
+00 00:00:00
day(2) to second(0) interval   04/07/2021 05:06:46
 
apply lag
+00 00:00:00
day(2) to second(0) interval   04/07/2021 05:06:46
 
NAME
--------------------------------
VALUE
----------------------------------------------------------------
UNIT                           TIME_COMPUTED
------------------------------ ------------------------------
 
apply finish time
+00 00:00:00.000
day(2) to second(3) interval   04/07/2021 05:06:46

Também é possível checar a existência de Lags caso tenhamos configurado o Data Broker:

[oracle@fornix2 ~]$ dgmgrl sys/oracle@CORTEXDR
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Wed Apr 7 05:08:30 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
Connected to "cortexDR"
Connected as SYSDBA.
DGMGRL> SHOW DATABASE CORTEXDR;
 
Database - cortexdr
 
  Role:               PHYSICAL STANDBY
  Intended State:     APPLY-ON
  Transport Lag:      0 seconds (computed 0 seconds ago)
  Apply Lag:          0 seconds (computed 0 seconds ago)
  Average Apply Rate: 22.00 KByte/s
  Real Time Query:    OFF
  Instance(s):
    CORTEXDR
 
Database Status:
SUCCESS

De forma mais detalhada, podemos ver pelo Broker também os archive logs enviados pelo primary:

[oracle@fornix1 ~]$ dgmgrl sys/oracle@CORTEX
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Wed Apr 7 05:11:13 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
Connected to "cortex"
Connected as SYSDBA.
DGMGRL> SHOW DATABASE CORTEX SendQEntries;
PRIMARY_SEND_QUEUE
        STANDBY_NAME       STATUS     RESETLOGS_ID           THREAD              LOG_SEQ       TIME_GENERATED       TIME_COMPLETED        FIRST_CHANGE#         NEXT_CHANGE#       SIZE (KBs)
                          CURRENT       1039033628                1                   70  04/07/2021 05:05:24                                   5786739                                   275

Validando pelo Broker os archive logs recebidos pelo standby mas que não foram aplicados:

[oracle@fornix2 ~]$ dgmgrl sys/oracle@CORTEXDR
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Wed Apr 7 05:12:49 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
Connected to "cortexDR"
Connected as SYSDBA.
DGMGRL> SHOW DATABASE CORTEXDR RECVQENTRIES;
STANDBY_RECEIVE_QUEUE
              STATUS     RESETLOGS_ID           THREAD              LOG_SEQ       TIME_GENERATED       TIME_COMPLETED        FIRST_CHANGE#         NEXT_CHANGE#       SIZE (KBs)
 
DGMGRL>

Monitorando processos em execução do Data Guard

Ao executar o comando abaixo no standby, podemos ter os detalhes dos processos que estão compondo a solução de Data Guard em nosso ambiente:

[oracle@fornix2 ~]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Apr 7 05:16:14 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> SELECT PID, PROCESS, STATUS, CLIENT_PROCESS, CLIENT_PID, THREAD#, SEQUENCE# SEQ#, BLOCK#, BLOCKS
FROM V$MANAGED_STANDBY
ORDER BY PROCESS;  2    3
 
PID                      PROCESS   STATUS       CLIENT_P
------------------------ --------- ------------ --------
CLIENT_PID                                  THREAD#       SEQ#     BLOCK#
---------------------------------------- ---------- ---------- ----------
    BLOCKS
----------
10631                    ARCH      CLOSING      ARCH
10631                                             1         68       6144
      1017
 
10633                    ARCH      CONNECTED    ARCH
10633                                             0          0          0
         0
 
PID                      PROCESS   STATUS       CLIENT_P
------------------------ --------- ------------ --------
CLIENT_PID                                  THREAD#       SEQ#     BLOCK#
---------------------------------------- ---------- ---------- ----------
    BLOCKS
----------
 
10629                    ARCH      CLOSING      ARCH
10629                                             1         69      12288
      1371
 
10623                    ARCH      CONNECTED    ARCH
10623                                             0          0          0
 
PID                      PROCESS   STATUS       CLIENT_P
------------------------ --------- ------------ --------
CLIENT_PID                                  THREAD#       SEQ#     BLOCK#
---------------------------------------- ---------- ---------- ----------
    BLOCKS
----------
         0
 
10627                    DGRD      ALLOCATED    N/A
N/A                                               0          0          0
         0
 
10621                    DGRD      ALLOCATED    N/A
 
PID                      PROCESS   STATUS       CLIENT_P
------------------------ --------- ------------ --------
CLIENT_PID                                  THREAD#       SEQ#     BLOCK#
---------------------------------------- ---------- ---------- ----------
    BLOCKS
----------
N/A                                               0          0          0
         0
 
12501                    MRP0      APPLYING_LOG N/A
N/A                                               1         70        960
    409600
 
 
PID                      PROCESS   STATUS       CLIENT_P
------------------------ --------- ------------ --------
CLIENT_PID                                  THREAD#       SEQ#     BLOCK#
---------------------------------------- ---------- ---------- ----------
    BLOCKS
----------
10653                    RFS       IDLE         Archival
3699                                              1          0          0
         0
 
11330                    RFS       IDLE         LGWR
3713                                              1         70        960
         1
 
PID                      PROCESS   STATUS       CLIENT_P
------------------------ --------- ------------ --------
CLIENT_PID                                  THREAD#       SEQ#     BLOCK#
---------------------------------------- ---------- ---------- ----------
    BLOCKS
----------
 
 
9 rows selected.

Monitorando taxa do Redo Apply

Executando o comando abaixo no standy, podemos ter acesso a algumas taxas relacionadas com o Redo Apply:

[oracle@fornix2 ~]$ sqlplus / as sysdba
 
SQL*Plus: Release 19.0.0.0.0 - Production on Wed Apr 7 05:25:57 2021
Version 19.3.0.0.0
 
Copyright (c) 1982, 2019, Oracle.  All rights reserved.
 
 
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
 
SQL> SELECT START_TIME , ITEM, SOFAR || ' ' || UNITS Sofar
FROM V$RECOVERY_PROGRESS
WHERE ITEM IN ('Active Apply Rate', 'Average Apply Rate', 'Redo Applied');  2    3
 
START_TIME          ITEM
------------------- --------------------------------
SOFAR
-------------------------------------------------------------------------
2021-04-07:05:06:37 Active Apply Rate
0 KB/sec
 
2021-04-07:05:06:37 Average Apply Rate
2 KB/sec
 
2021-04-07:05:06:37 Redo Applied
3 Megabytes
 
 
START_TIME          ITEM
------------------- --------------------------------
SOFAR
-------------------------------------------------------------------------
2021-04-07:05:00:29 Active Apply Rate
1179 KB/sec
 
2021-04-07:05:00:29 Average Apply Rate
79 KB/sec
 
2021-04-07:05:00:29 Redo Applied
18 Megabytes
 
 
6 rows selected.

Obs: Este procedimento foi criado pelo senhor Ahmed Baraka (www.ahmedbaraka.com) e foi apenas reproduzido por mim em um laboratório pessoal para fins de aprendizado.

Leave a Comment

Your email address will not be published.