How to Identify Network Bottlenecks on Oracle Linux

Similar to the article on CPU, memory, and I/O consumption, today I will write about the network.

Netstat

The tool that gives us a preliminary look at network usage in our environment is netstat. With the “-ptc” options, we get the PID and program name (-p) of each TCP connection (-t), refreshed continuously (-c), as in the example below:

[root@oel7 ~]# netstat -ptc
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 oel7.localdomain:ssh    192.168.0.8:54791       ESTABLISHED 5219/sshd: oracle [
tcp        0      0 localhost:54829         localhost:11777         ESTABLISHED 3630/ocssd.bin
tcp        0      0 localhost:11777         localhost:54829         ESTABLISHED 3630/ocssd.bin
tcp        0      0 oel7.localdomain:ssh    192.168.0.8:55650       ESTABLISHED 28094/sshd: root@no
tcp        0    144 oel7.localdomain:ssh    192.168.0.8:55649       ESTABLISHED 28089/sshd: root@pt
tcp        0      0 oel7.localdomain:ssh    192.168.0.8:54790       ESTABLISHED 5215/sshd: oracle [
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 oel7.localdomain:ssh    192.168.0.8:54791       ESTABLISHED 5219/sshd: oracle [
tcp        0      0 localhost:54829         localhost:11777         ESTABLISHED 3630/ocssd.bin
tcp        0      0 localhost:11777         localhost:54829         ESTABLISHED 3630/ocssd.bin
tcp        0      0 oel7.localdomain:ssh    192.168.0.8:55650       ESTABLISHED 28094/sshd: root@no
tcp        0     48 oel7.localdomain:ssh    192.168.0.8:55649       ESTABLISHED 28089/sshd: root@pt
tcp        0      0 oel7.localdomain:ssh    192.168.0.8:54790       ESTABLISHED 5215/sshd: oracle [
^C

If the “Send-Q” column (bytes not acknowledged by the remote host) constantly shows high values, it may indicate heavy network usage or a congested link. If the queue is tied to a specific process, and that process belongs to Oracle, we can look up the PID inside the database. It could be a query running against a remote database through a DB_LINK, for example.
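To avoid watching the continuous output by eye, the Send-Q check can be sketched as a small filter. This is only an illustration: the function name high_sendq and the default threshold of 1000 bytes are assumptions of mine, not values from any Oracle or Linux recommendation, and the field positions assume the netstat layout shown above.

```shell
# Hypothetical helper: reads netstat output on stdin and prints
# Send-Q, local address, foreign address and PID/program for every
# TCP connection whose Send-Q is above the given threshold (bytes).
high_sendq() {
  threshold="${1:-1000}"   # assumed default, tune for your environment
  awk -v t="$threshold" '
    # keep only tcp/tcp6 rows with a numeric Send-Q (3rd column)
    $1 ~ /^tcp/ && $3 ~ /^[0-9]+$/ && $3 + 0 > t {
      print $3, $4, $5, $7
    }'
}

# Usage (as root, so that -p can show every PID):
#   netstat -pnt | high_sendq 1000
```

Running it a few times in a row, and seeing the same PID appear each time, is what would suggest a connection worth investigating; a single high reading can simply be a burst.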

SAR

The sar command gives us visibility into many areas of the system, but with the option below we get data on the network interfaces and their usage, such as the number of packets transmitted and received per second:

[oracle@oel7 ~]$ sar -n DEV
Linux 4.14.35-1902.3.2.el7uek.x86_64 (oel7.localdomain)         01/25/2023      _x86_64_        (6 CPU)

08:06:49 PM       LINUX RESTART

08:10:02 PM     IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s
08:20:01 PM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:20:01 PM    enp0s3      4.97      8.09      0.34      0.95      0.00      0.00      0.15
08:20:01 PM        lo      0.07      0.07      0.00      0.00      0.00      0.00      0.00
08:20:01 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:30:01 PM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:30:01 PM    enp0s3      4.99      8.25      0.34      0.95      0.00      0.00      0.14
08:30:01 PM        lo      0.07      0.07      0.00      0.00      0.00      0.00      0.00
08:30:01 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:40:01 PM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:40:01 PM    enp0s3      4.96      8.24      0.33      0.95      0.00      0.00      0.15
08:40:01 PM        lo      0.07      0.07      0.00      0.00      0.00      0.00      0.00
08:40:01 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:50:01 PM virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
08:50:01 PM    enp0s3      5.89      8.93      0.44      1.07      0.00      0.00      0.13
08:50:01 PM        lo      0.07      0.07      0.00      0.00      0.00      0.00      0.00
08:50:01 PM    virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:    virbr0-nic      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Average:       enp0s3      5.20      8.38      0.36      0.98      0.00      0.00      0.14
Average:           lo      0.07      0.07      0.00      0.00      0.00      0.00      0.00
Average:       virbr0      0.00      0.00      0.00      0.00      0.00      0.00      0.00

Note: “DEV” stands for network devices.
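When a server has many interfaces, the “Average:” lines are usually the quickest place to start. The sketch below filters them for interfaces whose average rxkB/s or txkB/s is above a limit. The function name busy_ifaces and the default limit of 1024 kB/s are hypothetical; the column positions assume the header shown above (IFACE, rxpck/s, txpck/s, rxkB/s, txkB/s).

```shell
# Hypothetical helper: reads "sar -n DEV" output on stdin and prints
# interface name, average rxkB/s and average txkB/s for every interface
# whose average receive or transmit rate exceeds the given limit (kB/s).
busy_ifaces() {
  limit_kb="${1:-1024}"   # assumed default, not an official threshold
  awk -v t="$limit_kb" '
    # Average rows: $2 = IFACE, $5 = rxkB/s, $6 = txkB/s
    $1 == "Average:" && $2 != "IFACE" && ($5 + 0 > t || $6 + 0 > t) {
      print $2, $5, $6
    }'
}

# Usage (LC_ALL=C keeps the "Average:" label in English):
#   LC_ALL=C sar -n DEV | busy_ifaces 0.5
```

With the sample above and a limit of 0.5, only enp0s3 would be listed, since its average txkB/s is 0.98.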

If a connection anomaly between servers is identified, we can use the “traceroute” command to capture the route taken by packets sent from machine A to machine B. That way, the network/OS team can step in if some configuration is incorrect (for example, a wrong router entry that causes extra hops):

[oracle@oel7 ~]$ ping 192.168.0.123
PING 192.168.0.123 (192.168.0.123) 56(84) bytes of data.
64 bytes from 192.168.0.123: icmp_seq=1 ttl=255 time=0.798 ms
64 bytes from 192.168.0.123: icmp_seq=2 ttl=255 time=0.418 ms
^C
--- 192.168.0.123 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1019ms
rtt min/avg/max/mdev = 0.418/0.608/0.798/0.190 ms
[oracle@oel7 ~]$ traceroute 192.168.0.123
traceroute to 192.168.0.123 (192.168.0.123), 30 hops max, 60 byte packets
 1  192.168.0.123 (192.168.0.123)  1.464 ms  1.390 ms  1.289 ms
[oracle@oel7 ~]$
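To compare routes between runs or between servers, the number of hops can be extracted from the traceroute output. This is a minimal sketch; the function name hop_count is hypothetical, and it simply takes the number of the last hop line printed.

```shell
# Hypothetical helper: reads traceroute output on stdin and prints
# the number of the last hop line (0 if no hop line was found).
hop_count() {
  awk '
    # hop lines start with the hop number; the header line does not
    $1 ~ /^[0-9]+$/ { n = $1 }
    END { print n + 0 }'
}

# Usage:
#   traceroute 192.168.0.123 | hop_count
```

For the session above the result would be 1, since the target is reached directly. A value that grows over time for the same destination is the kind of change worth reporting to the network team.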
