Artigo simples de apoio para aquelas ocasiões onde identifcamos um alto consumo de CPU em nosso ambiente, e precisamos partir para a próxima fase da investigação: possíveis processos ofensores, tanto no S.O quanto no database.
O modo mais simples e direto é utilizar o “top”, que por default já ordena a lista de processos por consumo de CPU, conforme exemplo abaixo:
[oracle@oel7 ~]$ top
top - 19:08:51 up 10 min, 2 users, load average: 1.24, 0.95, 0.58
Tasks: 354 total, 7 running, 261 sleeping, 0 stopped, 0 zombie
%Cpu(s): 66.9 us, 0.7 sy, 0.0 ni, 32.3 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem : 16150724 total, 10330188 free, 1223556 used, 4596980 buff/cache
KiB Swap: 1261564 total, 1261564 free, 0 used. 11574284 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7527 root 20 0 7324 100 0 R 100.0 0.0 0:12.13 stress
7525 root 20 0 7324 100 0 R 99.7 0.0 0:12.16 stress
7528 root 20 0 7324 100 0 R 98.3 0.0 0:11.98 stress
7526 root 20 0 7324 100 0 R 98.0 0.0 0:12.10 stress
3948 grid -2 0 1554740 64444 61216 S 1.7 0.4 0:06.41 asm_vktm_+asm
4092 oracle -2 0 2983028 61048 57784 S 1.3 0.4 0:06.21 ora_vktm_cortex
7576 oracle 20 0 163736 4688 3768 R 1.0 0.0 0:00.12 top
3600 grid 20 0 2143224 132860 89712 S 0.7 0.8 0:07.65 oraagent.bin
3074 grid 20 0 2574648 112892 68044 S 0.3 0.7 0:16.99 ohasd.bin
Também é possível identicar uma lista de processos a partir do comando ps abaixo, onde a primeira coluna aponta a % de consumo, a segunda o PID do processo (que nos pode ser insumo para outras análises):
ps -e -o pcpu,pid,user,tty,args | sort -n -k 1 -r | head
[oracle@oel7 ~]$ ps -e -o pcpu,pid,user,tty,args | sort -n -k 1 -r | head
103 9401 root pts/0 stress --cpu 4 --timeout 60s
103 9399 root pts/0 stress --cpu 4 --timeout 60s
103 9398 root pts/0 stress --cpu 4 --timeout 60s
101 9400 root pts/0 stress --cpu 4 --timeout 60s
2.3 3074 grid ? /grid/19.3.0/product/bin/ohasd.bin reboot
1.2 3600 grid ? /grid/19.3.0/product/bin/oraagent.bin
1.1 4092 oracle ? ora_vktm_cortex
1.1 3948 grid ? asm_vktm_+ASM
1.1 3173 gdm ? /usr/bin/gnome-shell
0.9 2586 root tty1 /usr/bin/X :0 -background none -noreset -audit 4 -verbose -auth /run/gdm/auth-for-gdm-XDgYhG/database -seat seat0 -nolisten tcp vt1
[oracle@oel7 ~]$
Caso o processo seja proveniente do banco de dados Oracle, através do PID e da consulta abaixo é possível termos mais detalhes do mesmo. Nota: a query foi retirada da obra “Linux and Solaris Recipes for Oracle DBAs”, de Darl Kuln:
SET LINES 200 PAGES 0 HEAD OFF LONG 100000
COL dummy_value NOPRINT
SELECT 'dummy_value' dummy_value,
'USERNAME : ' || s.username || CHR(10) ||
'SCHEMA : ' || s.schemaname || CHR(10) ||
'OSUSER : ' || s.osuser || CHR(10) ||
'MODULE : ' || s.program || CHR(10) ||
'ACTION : ' || s.schemaname || CHR(10) ||
'CLIENT INFO : ' || s.osuser || CHR(10) ||
'PROGRAM : ' || s.program || CHR(10) ||
'SPID : ' || p.spid || CHR(10) ||
'SID : ' || s.sid || CHR(10) ||
'SERIAL# : ' || s.serial# || CHR(10) ||
'KILL STRING : ' || '''' || s.sid || ',' || s.serial# || '''' || CHR(10) ||
'MACHINE : ' || s.machine || CHR(10) ||
'TYPE : ' || s.type || CHR(10) ||
'TERMINAL : ' || s.terminal || CHR(10) ||
'CPU : ' || q.cpu_time/1000000 || CHR(10) ||
'ELAPSED_TIME: ' || q.elapsed_time/1000000 || CHR(10) ||
'BUFFER_GETS : ' || q.buffer_gets || CHR(10) ||
'SQL_ID : ' || q.sql_id || CHR(10) ||
'CHILD_NUM : ' || q.child_number || CHR(10) ||
'START_TIME : ' || TO_CHAR(s.sql_exec_start,'dd-mon-yy hh24:mi') || CHR(10) ||
'STATUS : ' || s.status || CHR(10) ||
'SQL_TEXT : ' || q.sql_fulltext
FROM v$session s
JOIN v$process p ON (s.paddr = p.addr)
LEFT OUTER JOIN v$sql q ON (s.sql_id = q.sql_id)
WHERE s.username IS NOT NULL -- eliminates background procs
AND NVL(q.sql_text,'x') NOT LIKE '%dummy_value%' -- eliminates this query from output
AND p.spid = '&PID_FROM_OS'
ORDER BY q.cpu_time;
[oracle@oel7 admin]$ sqlplus / as sysdba
SQL*Plus: Release 19.0.0.0.0 - Production on Tue Jan 17 19:39:59 2023
Version 19.3.0.0.0
Copyright (c) 1982, 2019, Oracle. All rights reserved.
Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0
SQL> SET LINES 200 PAGES 0 HEAD OFF LONG 100000
COL dummy_value NOPRINT
SELECT 'dummy_value' dummy_value,
'USERNAME : ' || s.username || CHR(10) ||
'SCHEMA : ' || s.schemaname || CHR(10) ||
SQL> 'OSUSER : ' || s.osuser || CHR(10) ||
'MODULE : ' || s.program || CHR(10) ||
'ACTION : ' || s.schemaname || CHR(10) ||
'CLIENT INFO : ' || s.osuser || CHR(10) ||
SQL> 2 3 4 5 'PROGRAM : ' || s.program || CHR(10) ||
6 7 8 9 'SPID : ' || p.spid || CHR(10) ||
'SID : ' || s.sid || CHR(10) ||
10 11 'SERIAL# : ' || s.serial# || CHR(10) ||
12 'KILL STRING : ' || '''' || s.sid || ',' || s.serial# || '''' || CHR(10) ||
13 'MACHINE : ' || s.machine || CHR(10) ||
14 'TYPE : ' || s.type || CHR(10) ||
'TERMINAL : ' || s.terminal || CHR(10) ||
15 16 'CPU : ' || q.cpu_time/1000000 || CHR(10) ||
17 'ELAPSED_TIME: ' || q.elapsed_time/1000000 || CHR(10) ||
'BUFFER_GETS : ' || q.buffer_gets || CHR(10) ||
18 19 'SQL_ID : ' || q.sql_id || CHR(10) ||
20 'CHILD_NUM : ' || q.child_number || CHR(10) ||
'START_TIME : ' || TO_CHAR(s.sql_exec_start,'dd-mon-yy hh24:mi') || CHR(10) ||
21 22 'STATUS : ' || s.status || CHR(10) ||
'SQL_TEXT : ' || q.sql_fulltext
23 24 FROM v$session s
25 JOIN v$process p ON (s.paddr = p.addr)
LEFT OUTER JOIN v$sql q ON (s.sql_id = q.sql_id)
26 27 WHERE s.username IS NOT NULL -- eliminates background procs
28 AND NVL(q.sql_text,'x') NOT LIKE '%dummy_value%' -- eliminates this query from output
AND p.spid = '&PID_FROM_OS'
29 30 ORDER BY q.cpu_time;
Enter value for pid_from_os: 26405
old 29: AND p.spid = '&PID_FROM_OS'
new 29: AND p.spid = '26405'
USERNAME : SYSTEM
SCHEMA : SYSTEM
OSUSER : SWIV
MODULE : sqlplus.exe
ACTION : SYSTEM
CLIENT INFO : SWIV
PROGRAM : sqlplus.exe
SPID : 26405
SID : 93
SERIAL# : 22672
KILL STRING : '93,22672'
MACHINE : WORKGROUP\xxx
TYPE : USER
TERMINAL : xxx
CPU : .237271
ELAPSED_TIME: .386385
BUFFER_GETS : 5233
SQL_ID : 7xr6jha03c1wx
CHILD_NUM : 0
START_TIME : 17-jan-23 19:38
STATUS : INACTIVE
SQL_TEXT : SELECT * FROM DBA_OBJECTS
SQL>
A partir daqui podemos avançar na investigação dentro do banco de dados, por exemplo validando o plano de execução:
SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id',&child_num));
SQL> SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id',&child_num));
Enter value for sql_id: 7xr6jha03c1wx
Enter value for child_num: 0
old 1: SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('&sql_id',&child_num))
new 1: SELECT * FROM table(DBMS_XPLAN.DISPLAY_CURSOR('7xr6jha03c1wx',0))
SQL_ID 7xr6jha03c1wx, child number 0
-------------------------------------
SELECT * FROM DBA_OBJECTS
Plan hash value: 2114560210
------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 393 (100)| |
| 1 | VIEW | DBA_OBJECTS | 66791 | 30M| 393 (3)| 00:00:01 |
| 2 | UNION-ALL | | | | | |
| 3 | TABLE ACCESS BY INDEX ROWID | SUM$ | 1 | 26 | 0 (0)| |
|* 4 | INDEX UNIQUE SCAN | I_SUM$_1 | 1 | | 0 (0)| |
|* 5 | TABLE ACCESS BY INDEX ROWID BATCHED | USER_EDITIONING$ | 1 | 6 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | I_USER_EDITIONING | 2 | | 1 (0)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID BATCHED| OBJ$ | 1 | 40 | 3 (0)| 00:00:01 |
|* 8 | INDEX RANGE SCAN | I_OBJ1 | 1 | | 2 (0)| 00:00:01 |
|* 9 | TABLE ACCESS BY INDEX ROWID | SUM$ | 1 | 26 | 0 (0)| |
|* 10 | INDEX UNIQUE SCAN | I_SUM$_1 | 1 | | 0 (0)| |
|* 11 | FILTER | | | | | |
|* 12 | HASH JOIN | | 72419 | 10M| 332 (1)| 00:00:01 |
| 13 | TABLE ACCESS FULL | USER$ | 130 | 2340 | 4 (0)| 00:00:01 |
|* 14 | HASH JOIN | | 72419 | 9547K| 328 (1)| 00:00:01 |
| 15 | INDEX FULL SCAN | I_USER2 | 130 | 3250 | 1 (0)| 00:00:01 |
|* 16 | TABLE ACCESS FULL | OBJ$ | 72419 | 7779K| 327 (1)| 00:00:01 |
| 17 | NESTED LOOPS | | 1 | 32 | 4 (0)| 00:00:01 |
| 18 | NESTED LOOPS | | 1 | 23 | 3 (0)| 00:00:01 |
| 19 | TABLE ACCESS BY INDEX ROWID | IND$ | 1 | 10 | 2 (0)| 00:00:01 |
|* 20 | INDEX UNIQUE SCAN | I_IND1 | 1 | | 1 (0)| 00:00:01 |
|* 21 | TABLE ACCESS CLUSTER | TAB$ | 1 | 13 | 1 (0)| 00:00:01 |
|* 22 | INDEX RANGE SCAN | I_OBJ1 | 1 | 9 | 1 (0)| 00:00:01 |
|* 23 | TABLE ACCESS CLUSTER | TAB$ | 1 | 13 | 2 (0)| 00:00:01 |
|* 24 | INDEX UNIQUE SCAN | I_OBJ# | 1 | | 1 (0)| 00:00:01 |
|* 25 | TABLE ACCESS BY INDEX ROWID | SEQ$ | 1 | 8 | 1 (0)| 00:00:01 |
|* 26 | INDEX UNIQUE SCAN | I_SEQ1 | 1 | | 0 (0)| |
|* 27 | TABLE ACCESS BY INDEX ROWID BATCHED| USER_EDITIONING$ | 1 | 6 | 2 (0)| 00:00:01 |
|* 28 | INDEX RANGE SCAN | I_USER_EDITIONING | 2 | | 1 (0)| 00:00:01 |
|* 29 | TABLE ACCESS BY INDEX ROWID BATCHED| USER_EDITIONING$ | 1 | 6 | 2 (0)| 00:00:01 |
|* 30 | INDEX RANGE SCAN | I_USER_EDITIONING | 2 | | 1 (0)| 00:00:01 |
| 31 | NESTED LOOPS SEMI | | 1 | 29 | 2 (0)| 00:00:01 |
|* 32 | INDEX SKIP SCAN | I_USER2 | 1 | 20 | 1 (0)| 00:00:01 |
|* 33 | INDEX RANGE SCAN | I_OBJ4 | 1 | 9 | 1 (0)| 00:00:01 |
| 34 | NESTED LOOPS | | 1 | 36 | 3 (0)| 00:00:01 |
| 35 | TABLE ACCESS FULL | LINK$ | 1 | 18 | 2 (0)| 00:00:01 |
| 36 | TABLE ACCESS CLUSTER | USER$ | 1 | 18 | 1 (0)| 00:00:01 |
|* 37 | INDEX UNIQUE SCAN | I_USER# | 1 | | 0 (0)| |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("S"."OBJ#"=:B1)
5 - filter("UE"."TYPE#"=:B1)
6 - access("UE"."USER#"=:B1)
8 - access("EO"."OBJ#"=:B1)
9 - filter(BITAND("S"."XPFLAGS",34368126976)=0)
10 - access("S"."OBJ#"=:B1)
11 - filter(((("O"."TYPE#"<>1 AND "O"."TYPE#"<>2 AND "O"."TYPE#"<>6) OR ("O"."TYPE#"=1 AND IS
NULL) OR ("O"."TYPE#"=2 AND =1) OR ("O"."TYPE#"=6 AND =1)) AND (BITAND("U"."SPARE1",16)=0 OR
BITAND("O"."FLAGS",1048576)=1048576 OR ("O"."TYPE#"<>88 AND IS NULL) OR ( IS NOT NULL AND
((SYS_CONTEXT('userenv','current_edition_name')='ORA$BASE' AND "U"."TYPE#"<>2) OR ("U"."TYPE#"=2
AND "U"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id'))) OR IS NOT NULL)))))
12 - access("O"."SPARE3"="U"."USER#")
14 - access("O"."OWNER#"="U"."USER#")
16 - filter(("O"."NAME"<>'_NEXT_OBJECT' AND "O"."NAME"<>'_default_auditing_options_' AND
"O"."TYPE#"<>10 AND "O"."LINKNAME" IS NULL AND BITAND("O"."FLAGS",128)=0))
20 - access("I"."OBJ#"=:B1)
21 - filter(("I"."BO#"="T"."OBJ#" AND BITAND("T"."PROPERTY",36893488147419103232)=3689348814741910
3232))
22 - access("IO"."OBJ#"="I"."BO#" AND "IO"."TYPE#"=2)
filter("IO"."TYPE#"=2)
23 - filter(BITAND("T"."PROPERTY",36893488147419103232)=0)
24 - access("T"."OBJ#"=:B1)
25 - filter((BITAND("S"."FLAGS",1024)=0 OR "S"."FLAGS" IS NULL))
26 - access("S"."OBJ#"=:B1)
27 - filter("TYPE#"=:B1)
28 - access("UE"."USER#"=:B1)
29 - filter("UE"."TYPE#"=:B1)
30 - access("UE"."USER#"=:B1)
32 - access("U2"."TYPE#"=2 AND "U2"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id')
))
filter(("U2"."TYPE#"=2 AND "U2"."SPARE2"=TO_NUMBER(SYS_CONTEXT('userenv','current_edition_id'
))))
33 - access("O2"."DATAOBJ#"=:B1 AND "O2"."TYPE#"=88 AND "O2"."OWNER#"="U2"."USER#")
37 - access("L"."OWNER#"="U"."USER#")
Note
-----
- this is an adaptive plan
91 rows selected.
SQL>
Validando por exemplo os waitings da query:
SELECT sid,
DECODE(state, 'WAITING','Waiting', 'Working') state,
DECODE(state, 'WAITING', 'So far '||seconds_in_wait,
'Last waited '|| wait_time/100)|| ' seconds for '||event
FROM v$session
WHERE sid = '&&session_id'
AND serial# = '&&serial';
SQL> SELECT sid,
DECODE(state, 'WAITING','Waiting', 'Working') state,
DECODE(state, 'WAITING', 'So far '||seconds_in_wait,
'Last waited '|| wait_time/100)|| ' seconds for '||event
FROM v$session
WHERE sid = '&&session_id'
AND serial# = '&&serial'; 2 3 4 5 6 7
Enter value for session_id: 93
old 6: WHERE sid = '&&session_id'
new 6: WHERE sid = '93'
Enter value for serial: 22672
old 7: AND serial# = '&&serial'
new 7: AND serial# = '22672'
93 Waiting So far 0 seconds for SQL*Net message from client
SQL>